More and more government data is becoming available online, but the keys to unlocking stories belong to those who are willing to get their hands a little dirty in a spreadsheet program.
Several states and cities have started publishing data sets online, publicly available to anyone for review and download. The data sets cover a variety of topics, including farmers market locations, school absentee rates and crime statistics.
In Connecticut, for example, state agencies have started uploading data to the Open Data Portal, which had 676 data sets uploaded as of Aug. 1.
The federal government is posting its own data and providing a forum for other agencies, states and cities to share theirs at Data.gov. The site lists more than 158,000 data sets posted from 176 different agencies around the country.
Other agencies have hundreds of databases right on their own sites. Public transportation data for each state, for example, can be found at the National Transit Database.
The numbers and trends you can find in the data will lead you to new stories, or give you better perspective on news you already know is happening in your community. Even though the general public has the same access to these databases, your reporting outside the numbers will help people understand the data in new ways. Consider that most Americans are not searching through these databases on their own.
A recent Pew study found that fewer than half of Americans surveyed (49 percent) had tried to access governmental information or data available online on the federal, state or local levels. The most popular searches were for weather (38 percent), transportation (38 percent) and crime information (36 percent), the survey found.
The biggest hurdle to opening this resource to your readers is getting over a fear of numbers and spreadsheets. A lot of resources are available online to help get you started.
The downloadable data can be as simple as a spreadsheet chronicling one data point over time – for example, the amount of money the federal government has spent on school food programs since 1969. Or it can get complicated, using codes instead of plain English for column headers and filling thousands of rows with what looks like random numbers and letters.
Tackle simple data sets before you try to interpret the more complicated stuff. Look for anything that can be downloaded as a Microsoft Excel file or in CSV format. Those are spreadsheet files that can be opened easily with Excel or in Google’s spreadsheet program.
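For reporters curious about what happens beyond the spreadsheet program, here is a minimal sketch of reading a simple one-data-point-over-time CSV in Python and calculating the change from year to year. The column names (year, spending_millions) and the figures are made-up placeholders for illustration, not real government data, and the function names are hypothetical.

```python
import csv
import io

# Made-up placeholder figures for illustration only; not real spending data.
SAMPLE_CSV = """year,spending_millions
2010,100
2011,120
2012,90
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, converting the fields to numbers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "year": int(row["year"]),
            "spending_millions": float(row["spending_millions"]),
        }
        for row in reader
    ]


def year_over_year_change(rows):
    """Return (year, percent change from the previous year) pairs."""
    ordered = sorted(rows, key=lambda r: r["year"])
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        pct = (cur["spending_millions"] - prev["spending_millions"]) \
            / prev["spending_millions"] * 100
        changes.append((cur["year"], round(pct, 1)))
    return changes


rows = load_rows(SAMPLE_CSV)
changes = year_over_year_change(rows)
for year, pct in changes:
    print(year, pct)
```

With a real download, the same steps would start from a file opened with open("file.csv", newline="") instead of the sample text. The percent-change column is often where a story idea first shows up.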
Programs like OpenRefine can help you clean up data to make it more consistent. So, for example, if a town name is spelled wrong or abbreviated in some cells, OpenRefine can help you quickly find and fix the problem. That will help you analyze data more accurately.
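To show the kind of inconsistency such a cleanup looks for, here is a rough sketch in Python that groups town names differing only in capitalization, spacing or punctuation. It is loosely modeled on the general idea of comparing normalized "fingerprints" of each value; it is not OpenRefine's own code, and the function names and sample values are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Normalize a value: trim, lowercase, drop punctuation, sort unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def find_variants(values):
    """Group distinct spellings that share a fingerprint.

    Only groups with more than one distinct spelling are returned,
    since those are the cells that need a second look.
    """
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


# Hypothetical sample cells, as they might appear in a messy town column.
towns = ["New Haven", "new haven", "New Haven ", "Hartford", "NEW HAVEN.", "Hartfrod"]
variants = find_variants(towns)
for key, spellings in variants.items():
    print(key, spellings)
```

This simple approach flags the four versions of New Haven as one group, but it does not catch the transposed letters in "Hartfrod" or an abbreviation. Misspellings and abbreviations need approximate matching or a manual check against a list of correct names.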
Learning to find stories through data comes with practice and guidance. You can start on your own by analyzing stories that use data. Find the database associated with the story (often provided along with the article online) and practice sorting through the numbers until you see the same patterns the reporter found. Some sources for reporting that uses databases include the Texas Tribune, ProPublica, The Guardian and the CTMirror.
You could also take an online class.
Numerous online video tutorials explain how to master Excel. And several reputable agencies host online classes for journalists. The Knight Center for Journalism in the Americas held a Math for Journalists course and a Social Media Analytics course this summer, both digging into numbers in different ways in a journalistic context. Poynter also offers an online Math for Journalists course.
The Canvas Network hosted a Data for Journalists Massive Open Online Course last year, and Coursera has several MOOCs available that help you learn to work with data in different ways.
Lynda.com is another online resource, with video courses on a wide range of tech topics. Paid subscribers have access to 65 courses on spreadsheets and 45 courses on databases. Google’s new News Lab also has several tutorials on how to find and use data for your reporting.
Or ask your local SPJ chapter to host a program about data. Connecticut’s chapter brought in a data team from the CTMirror in May and gave journalists free one-on-one help digging through state numbers. We hosted a webinar beforehand to give a shorter overview of OpenRefine.
Resources like The Data Journalism Handbook, a free online open-source book on everything data journalism, can add depth to your understanding of working with numbers in your reporting.
Don’t Forget FOI
Just because government agencies post databases online doesn’t mean that’s all the data they possess. Anytime you see an online form, there’s most likely a database behind it.
If the data you need isn’t publicly available, file a freedom of information request for the full database.
Jodie Mozdzer Gil is an assistant professor of journalism at Southern Connecticut State University and the treasurer for the Connecticut pro chapter of SPJ. Interact on Twitter: @mozactly