When you're tracking a story that involves reading a huge volume of text documents, it's hard both to manage and process the documents and to see the big picture. Here's how two pieces of software can help.
A recent Knight Blog post explained how Associated Press reporter Jack Gillum used DocumentCloud and Overview to show that Congressman Paul Ryan, the former Republican vice presidential candidate, had requested funds for his district from many federal programs that he criticized on the campaign trail as wasteful.
According to Knight, Gillum relied on two tools to tackle this daunting task:
- DocumentCloud, to upload scans of documents, perform optical character recognition, and search the contents.
- Overview, a data visualization tool created by AP that expands the functionality of DocumentCloud to automatically sort documents into topics and visualize the contents.
Overview is designed primarily to process English-language text documents. It's not the tool to use to process tables, data that's primarily numeric, or records exported from a database (unless they include a field containing plain English text).
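If you're unsure whether a file is mostly prose or mostly numbers, a rough check before uploading can save time. The sketch below is not part of Overview or DocumentCloud; the function names `prose_ratio` and `looks_like_prose` and the 0.7 cutoff are assumptions made up for illustration. It simply counts what share of the whitespace-separated tokens in a text are made of letters.

```python
import string


def prose_ratio(text: str) -> float:
    """Return the share of whitespace-separated tokens that are made of letters.

    Hypothetical helper for illustration only; not an Overview feature.
    """
    tokens = text.split()
    if not tokens:
        return 0.0

    def is_word(token: str) -> bool:
        # Drop surrounding punctuation such as quotes, commas, and parentheses.
        core = token.strip(string.punctuation)
        # Allow internal apostrophes and hyphens ("don't", "English-language").
        return core.replace("'", "").replace("-", "").isalpha()

    return sum(is_word(t) for t in tokens) / len(tokens)


def looks_like_prose(text: str, threshold: float = 0.7) -> bool:
    """Guess whether a text is mostly prose; the 0.7 threshold is an arbitrary assumption."""
    return prose_ratio(text) >= threshold
```

A letter from a congressional office should score close to 1.0, while a budget table exported as text should score much lower. Treat the result as a hint about which files to look at by hand, not as a verdict.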
Both of these tools are web applications that journalists can use free of charge. Both are also previous Knight News Challenge winners. You need a DocumentCloud account to use Overview.