BuzzData: making data easy, engaging and effective. 

How sharing data can save lives

By Clay Heaton, data management specialist and co-founder of The Perihelion Group. In 2010, Heaton was dispatched to a field hospital in Fond Parisien, Haiti, where he built a simple Rails-based EMR system so emergency response staff and volunteers could track medical information, triage and treatment.

I’m a public health professional who specializes in data management and the humanitarian response to crises. It’s a field where timely and organized data-sharing positively impacts large groups of vulnerable people. I hope for a day when people who aren’t data experts have a go-to place for creating nimble data repositories others can access, query and update in real time, immediately following disasters like the recent ones in Japan and Haiti.

Field-based data management is a challenge because first-response teams focus almost exclusively on search, rescue and medical missions. They lack access to a network, and have little time (or patience) to learn new tools while surrounded by catastrophe. Excel is the go-to application for immediate data tracking needs, and, in the hands of the right person, holds up well in difficult conditions. 

However, humanitarian organizations frequently rotate their staff in and out of the field to avoid extensive fatigue. Consequently, spreadsheets created during the early hours of response efforts are usually passed around. Each subsequent staff member and volunteer has their own idea about how the data should be structured, what should be tracked, and how important it is to collect data at all. 

This lack of consistency in data workflow can cause major problems: A single inaccuracy in a spreadsheet can lead to dozens of hours lost trying to locate an unaccompanied child who supposedly is in the hospital, but actually has been discharged already. Worse, such mistakes can have terrifying legal implications for well-intentioned volunteers and staff in the field.

A physician on site in Haiti holds a copy of a medical record that came out of the EMR app built by Clay Heaton. It was the first medical record the patient ever had received. Photo by Clay Heaton.

Staffing logistics aren’t the only challenge, however. When you’re trying to manage the health and whereabouts of hundreds of patients and staff and you have limited (or no) Internet access, web-based solutions can be problematic. Given an Internet connection, Google Spreadsheets has been the best off-the-shelf tool that I have seen so far. But disagreements regarding sheet structure, coupled with rotating staff, often lead to unregulated cloning of datasets. It is all too common to see colleagues at the same field location working from two different copies of the same spreadsheet, unaware that they aren’t sharing the same data.
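To make the cloned-spreadsheet problem concrete, here is a minimal Python sketch of how two diverged copies of a patient roster could be compared. It is an illustration only, not a tool that was used in the field: the column names (patient_id, name, status), the sample rows, and the helper functions load_by_key and diff_copies are all hypothetical, and it assumes both copies have been exported as CSV with a shared ID column.

```python
import csv
import io


def load_by_key(csv_text, key):
    """Parse CSV text into a dict mapping each row's key value to the row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_copies(copy_a, copy_b, key="patient_id"):
    """Compare two CSV exports of what is supposed to be the same sheet.

    Returns (IDs only in A, IDs only in B, conflicting fields per shared ID).
    """
    a, b = load_by_key(copy_a, key), load_by_key(copy_b, key)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    conflicts = {}
    for k in sorted(a.keys() & b.keys()):
        changed = {
            col: (a[k][col], b[k].get(col))
            for col in a[k]
            if a[k][col] != b[k].get(col)
        }
        if changed:
            conflicts[k] = changed
    return only_a, only_b, conflicts


# Hypothetical sample data: two colleagues edited separate copies.
copy_a = (
    "patient_id,name,status\n"
    "101,Patient A,admitted\n"
    "102,Patient B,admitted\n"
)
copy_b = (
    "patient_id,name,status\n"
    "101,Patient A,discharged\n"
    "103,Patient C,admitted\n"
)

only_a, only_b, conflicts = diff_copies(copy_a, copy_b)
print("Only in copy A:", only_a)
print("Only in copy B:", only_b)
print("Conflicting fields:", conflicts)
```

In this made-up example the two copies disagree about whether patient 101 is still in the hospital, which is exactly the kind of discrepancy that costs hours to untangle by hand. A comparison like this only works if every copy keeps a stable ID column, which is itself one of the conventions that rotating staff tend to drop.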

Companies such as Google and groups like Crisis Commons work from headquarters locations, building grand tools that help reunite families and give field response teams an outlet for requesting supplies. But the single largest data management challenge during emergency response is not a grand “30,000-foot” problem. It is what I call the three-foot problem. 

Medical supplies in the Fond Parisien Hospital. Photo by Clay Heaton.

Given the best tools in the world, web-based, standalone, mobile, or otherwise, the biggest data challenge is informing responders of the existence of the tool, training them on it, and finally, giving them the incentive to actually use it. While professional staff are far more likely to use the tools than volunteer staff, volunteers often make up the vast majority of field-based responders. The American Red Cross volunteer-to-staff ratio, for instance, is on the order of 40:1. 

Training and encouraging volunteers to use data management tools during a crisis is extremely difficult. Many people volunteer with an adrenaline-based attitude that they are going to save the sick and weary. To those people, time spent entering data into a computer or telephone is equivalent to lives surrendered. 

I often deploy with field teams to build on-site data management tools, in an effort to keep response efforts consolidated and organized. I use the tools available to me, from Excel to Filemaker Pro, MS Access, or Open Office. I prefer to just roll a simple Rails app, however, because I can run it off of an ad-hoc network, anybody with a laptop or smart phone can access it (within range of the network), it doesn’t require commercial software, and most people are familiar with the use of a web browser.

In Haiti, for example, I built an EMR system that included pharmacy and warehouse supply tracking, full patient medical records, disease surveillance tools, and a requisition wish list for supplies. I deployed the components as they were ready, beginning with the pharmacy tracking tools. I built the entire system from scratch and completed it in under a week. 



We still had problems with poor data entry, volunteers who refused to use the system, a cranky generator that sent power spikes to the server, and teams of staff with 5 different native languages. You can see a small example of the data from the Rails app that I built in Haiti on my BuzzData account. While messy, it is one of the best records of medicines dispensed at a field hospital during humanitarian response efforts that has ever been recorded.

From the Red Cross to the UN to Google, even the big players in this field frequently botch data collection. Most frequently, it is because they work remotely on grand solutions, oblivious to the 3-foot problem. One large and well-respected NGO (name withheld) manages refugee camps and field hospitals with tens of thousands of residents and manages the roster with a simple Excel sheet into which they enter all demographics, medical and surgical treatment information, and contact information for relatives in the camp. Note: this is all on the same sheet, not even on multiple sheets in the same workbook.

Field hospitals, displaced persons camps, and emergency operations eventually close. The staff and volunteers that run them scatter back to the institutions, agencies, and companies whence they came. They spend a few weeks recovering and then they want to share their data, mine it for trends, and try to understand how they can improve their response efforts for the next crisis. Their data is often hardly better than garbage. Sometimes you just have to work with garbage because it’s the best that you have. This is pretty sad to a data nerd like me.

Nevertheless, field collaborators need to share their data so they can publish studies and improve response efforts. This often is done in Google Spreadsheets. But the scourge of any spreadsheet application is that it simplifies complex data relationships. Sure, you can flatten everything down into a dataset that has 500 columns, but it becomes so daunting that it’s nearly impossible to navigate. 

BuzzData wasn’t built to manage emergency response data but I think it has the potential to be a good tool for people working in the field to share data and cooperatively mine it for meaning post-emergency. However, field data can be pretty complex. It does the datasets a disservice to convert them to flat files, dropping relationships and leaving the burden of reestablishing them to people doing analysis or to statistical analysis packages. 

Why flatten them just to have to do the work to reestablish the relationships in other software? It’s much easier to poke around in data and explore a dataset when the relations are intact (if it was relational to begin with). This is a problem experienced in all industries. For me, taking the time to try to generate a meaningful query from a dataset and upload it to BuzzData for sharing ends up being an additional intermediate step between the data repository and the end user. 
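A small Python sketch can show what flattening costs. It uses the standard-library sqlite3 module with an invented two-table schema; the table names, column names, and sample rows are assumptions made up for illustration and are not the schema of the Haiti app.

```python
import sqlite3

# Hypothetical schema: one table of patients, one of medicines dispensed.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patients (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        status TEXT NOT NULL
    );
    CREATE TABLE dispensations (
        id INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patients(id),
        medicine TEXT NOT NULL,
        doses INTEGER NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO patients (id, name, status) VALUES (?, ?, ?)",
    [(1, "Patient A", "admitted"), (2, "Patient B", "discharged")],
)
conn.executemany(
    "INSERT INTO dispensations (patient_id, medicine, doses) VALUES (?, ?, ?)",
    [(1, "amoxicillin", 10), (1, "ibuprofen", 6), (2, "amoxicillin", 5)],
)

# Flattening: the join repeats each patient's fields on every dispensation row.
flat_rows = conn.execute(
    """
    SELECT p.id, p.name, p.status, d.medicine, d.doses
    FROM patients p
    JOIN dispensations d ON d.patient_id = p.id
    ORDER BY d.id
    """
).fetchall()

# With the relations intact, a per-patient question is a single query.
doses_per_patient = dict(
    conn.execute(
        """
        SELECT p.name, SUM(d.doses)
        FROM patients p
        JOIN dispensations d ON d.patient_id = p.id
        GROUP BY p.id
        ORDER BY p.id
        """
    ).fetchall()
)

# Recording a discharge: one row in the relational form,
# but every row that mentions the patient in the flat file.
rows_to_edit_flat = sum(1 for row in flat_rows if row[0] == 1)
rows_to_edit_relational = conn.execute(
    "UPDATE patients SET status = 'discharged' WHERE id = 1"
).rowcount

print(flat_rows)
print(doses_per_patient)
print(rows_to_edit_flat, rows_to_edit_relational)
```

Even with two tables and three rows, the flat version stores Patient A's status twice, so a discharge has to be corrected in two places instead of one. Scale that up to dozens of related tables and the 500-column sheet, and the appeal of sharing the relational structure itself becomes clearer.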

I hope that helps you to understand my perspective on BuzzData. As a data nerd, I find BuzzData exciting and I look forward to watching it grow.

This guest blog post evolved from a dialogue between one of our early users and our CTO Pete Forde. What I love most about this is the fact that the focus of our development is helping bring these critical issues to light.

-Momoko Price

What BuzzData will (and won’t) be

“If you want to build a ship, don’t drum up people together to collect wood and don’t assign them tasks and work, but rather, teach them to long for the endless immensity of the sea.”

-Antoine de Saint-Exupery

The BuzzData beta has been public for a few weeks now. Its general reception so far has ranged from evangelistic enthusiasm for its early activity to tentative, thoughtful speculation about its future direction. 

BlogPulse co-creator Matthew Hurst earlier this month attempted, understandably, to position BuzzData in the data value-chain alongside pre-existing startup models: “It is going to be very interesting to see how the site grows and evolves,” he wrote. “Is it a commercial version of IBM’s Many Eyes? A twist on DataMarket or InfoChimps? A re-implentation of Swivels (the YouTube of data)?”

Social datasets: so what?

A few of our early beta users have probably mulled over similar questions since we launched. Many of our early adopters (usually hackers who use Github) get it: they can see where we’re headed and have dived right in, while others, likely less familiar with collaborative workflow apps like Github, might upload a test dataset, follow a couple of other users, and then think: “okay, so now what?”

BuzzData’s social features and easy-to-use UI are familiar value-adds in a post-Twitter world, but we as a team have come to realize that our True Big-Picture Mission is not nearly as easy to recognize for our early adopters. To be clear, with these social features (and many more to come), the BuzzData master plan is nothing less than to gradually infuse the data community (and beyond) with the same real-time, social, collaborative energy that revolutionized innovation for web developers a decade ago. 

What BuzzData will (and won’t) be

In answer to Hurst’s above question, BuzzData is not going to be anything like ManyEyes (or InfoChimps or DataMarket, for that matter). Sometimes I personally think BuzzData might be better described as “ManyHands” for data: as in, “many hands make light work.” 

The real vision of BD’s co-founders Pete Forde and Mark Opausky is an online hub whose purpose goes far beyond that of any static catalogue or “data marketplace,” as existing data-startups are now called. Our goal is to create a place where users — whether they’re individuals, news agencies, science labs, governments — have the power to publish, build, revise and expand existing data into information that’s more current, accurate, accessible and ultimately useful than any version of data they might create alone.

In general, data management is still a relatively isolated, esoteric process. If only someone (hint hint) were focused on connecting people more intuitively and efficiently to their data, their interests and each other, future innovation and knowledge discovery might move more quickly and reliably, while requiring less unpleasant gruntwork per person.

Wouldn’t that be nice?

Keeping our eyes on the prize

To improve the speed of data collaboration on BuzzData, one user recently suggested we add Google-Spreadsheet-like editing functionality. We agree that this seems like an intuitive move, but we have our own plans in mind. Google Spreadsheets is great for on-the-spot, one-off group editing; we’re really bent on creating a place where the best, most current, most accurate data floats to the top, as easily accessible to its audience as it is attributable to its publisher.

That said, there are many ways to skin a cat, and problems can often be solved by multiple routes. We’re really looking forward to hearing what our users think of the route we’ve taken once it’s fully unveiled. 

Social functionality and easy dataset publishing are just Stage 1 of BuzzData’s ultimate vision. We really hope you’re enjoying it. Stay tuned, because there’s a lot more in store for you.

-Momoko Price

Got some ideas about improving data workflow? Try out the site (it’s free) and tell us your ideas at support@buzzdata.com (or feel free to bug me directly at momoko@buzzdata.com).