
AP's Overview tool helps DocumentCloud make sense of text documents

by: Amy Gahran

When you're tracking a story that involves reading a huge volume of text documents, it's a challenge to both manage and process the documents and see the big picture. Here's how two pieces of software can help.

A recent Knight Blog post explained how Associated Press reporter Jack Gillum used DocumentCloud and Overview to show that Congressman Paul Ryan, the former Republican vice presidential candidate, had requested funds for his district from many federal programs that he criticized on the campaign trail as wasteful.

According to Knight, Gillum relied on two tools to tackle this daunting task:

  • DocumentCloud, to upload scans of documents, perform optical character recognition, and search the contents.
  • Overview, a data visualization tool created by AP that expands the functionality of DocumentCloud to automatically sort documents into topics and visualize the contents.

Overview is designed primarily to process English-language text documents. It's not the tool to use to process tables, data that's primarily numeric, or records exported from a database (unless they include a field containing plain English text).

Both tools are web applications that journalists can use free of charge. Both are also previous Knight News Challenge winners. You need a DocumentCloud account to use Overview.


Amy Gahran

Amy Gahran is a journalist, editor, trainer, entrepreneur, strategist, and media consultant based in Boulder, Colorado.

