
Newspaper = Data Warehouse: A Different Sort Of Journalism

·688 words·4 mins
Articles Culture Development Django Tech
Daniel Andrlik
Daniel Andrlik lives in the suburbs of Philadelphia. By day he manages product teams. The rest of the time he is a podcast host and producer, writer of speculative fiction, a rabid reader, and a programmer.

Adrian Holovaty, one of the creators of Django (which you will read my raves about again and again), has posted a wonderful meditation on how online newspapers need to change. His argument rests on one central point:

One of those important shifts is: Newspapers need to stop the story-centric worldview.

Holovaty explains quite succinctly that journalists gather structured data every day. Instead of focusing on composing this information into a static story, he argues, they should focus on storing it in a machine-readable format so that it can be used again and again for a variety of purposes. He also provides concrete examples of why this would be such a powerful and useful change.

For example, say a newspaper has written a story about a local fire. Being able to read that story on a cell phone is fine and dandy. Hooray, technology! But what I really want to be able to do is explore the raw facts of that story, one by one, with layers of attribution, and an infrastructure for comparing the details of the fire – date, time, place, victims, fire station number, distance from fire department, names and years experience of firemen on the scene, time it took for firemen to arrive – with the details of previous fires. And subsequent fires, whenever they happen.

That’s what I mean by structured data: information with attributes that are consistent across a domain. Every fire has those attributes, just as every reported crime has many attributes, just as every college basketball game has many attributes.
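To make the idea concrete, here is a minimal Python sketch of what one such fire record might look like. Every field name and every sample value is my own illustration, not something taken from Holovaty's post or from any real newsroom system. The point is only that once each fire shares the same attributes, comparing one fire with previous ones becomes a short query rather than a re-read of old articles.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class FireReport:
    """One fire, stored as consistent attributes (all field names are hypothetical)."""
    reported_at: datetime
    address: str
    station_number: int
    distance_from_station_km: float
    response_minutes: float
    source: str  # attribution for the facts in this record


def average_response_minutes(reports, station_number):
    """Average arrival time for one station, or None if it has no recorded fires."""
    times = [r.response_minutes for r in reports if r.station_number == station_number]
    return mean(times) if times else None


# Made-up sample data, purely for illustration.
reports = [
    FireReport(datetime(2006, 9, 1, 14, 30), "12 Elm St", 7, 2.5, 6.0, "fire dept. log"),
    FireReport(datetime(2006, 9, 3, 2, 15), "98 Oak Ave", 7, 4.0, 9.0, "reporter on scene"),
    FireReport(datetime(2006, 9, 4, 19, 5), "5 Main St", 3, 1.0, 4.0, "fire dept. log"),
]

print(average_response_minutes(reports, 7))
```

The same records could just as easily feed a map, a per-station comparison, or the story itself, which is the re-use Holovaty is describing.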

Now, if you read his post, Holovaty is really focusing on the ability of newspapers to “repurpose” their data in order to rapidly develop new and powerful features for their own services. What excites me about this idea, though, is the potential for marketing that kind of information. Let’s say newspapers take his suggestion and begin specializing in what they do best, the rapid collection of structured data. The journalists will still produce stories (it’s necessary, as Holovaty explains), but the focus of the organization is to fill its servers with as much granular data about each event as possible. What if the newspapers then provide an API for other applications or organizations to access that raw data?

I’m not saying that they would give it away for free. Quite the opposite: that information in its structured form is worth far more than the articles it generates, because it can be re-used. Charge organizations a subscription fee for direct access to the data, and those subscribers can then use the API to develop powerful products and services that the news gathering organization does not have the resources to pursue. Rather than focusing their business solely on the final presentation of news (although there will still be plenty of that), newspapers would share that focus with the aggregation of the source data, and then serve as a supplier of that information to other vendors. The possibilities it would open for development are really staggering, and I suspect that data-subscriber revenues for the newspapers would be substantial.
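As a rough sketch of the subscriber side of that idea, the Python below gates access to structured records behind a subscription key and returns the results as JSON. The key store, the `fetch_fires` function, the record fields, and the sample data are all hypothetical names I made up for illustration. A real service would sit behind a web framework and a proper billing and authentication system.

```python
import json

# Hypothetical set of keys belonging to paying subscribers.
SUBSCRIBER_KEYS = {"demo-key-123"}

# Made-up structured records, standing in for the newspaper's data store.
FIRE_RECORDS = [
    {"date": "2006-09-01", "place": "12 Elm St", "station_number": 7, "response_minutes": 6.0},
    {"date": "2006-09-03", "place": "98 Oak Ave", "station_number": 7, "response_minutes": 9.0},
    {"date": "2006-09-04", "place": "5 Main St", "station_number": 3, "response_minutes": 4.0},
]


def fetch_fires(api_key, station_number=None):
    """Return matching fire records as a JSON string, but only for subscribers."""
    if api_key not in SUBSCRIBER_KEYS:
        return json.dumps({"error": "subscription required"})
    results = [
        record for record in FIRE_RECORDS
        if station_number is None or record["station_number"] == station_number
    ]
    return json.dumps({"count": len(results), "results": results})


print(fetch_fires("demo-key-123", station_number=7))
```

A subscriber building, say, a neighborhood safety site would call something like this and get back raw facts to present however they like, rather than scraping finished articles.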

As our demand for online services increases, I suspect this type of model will become absolutely necessary, and it is thus inevitable. If the newspapers don’t go for it, some other business will rise to fulfill the same function. Newspapers have the advantage though, since each newspaper is admirably suited to provide highly specialized data for its location. In fact, it is likely that most of those external developers will need to subscribe to several companies’ data streams in order to get the totality of information needed to meet the demand of a diverse online user base.

Don’t get me wrong: the role and importance of traditional journalism, providing analysis and organizing that data into a meaningful story, will always remain. However, I think if newspapers are to survive this New Media transition, it will be essential to pursue this as a parallel business model. With today’s emphasis on rapid development, the early adopters are likely to net a large number of start-ups as subscribers, and I for one think the sooner they get started the better. :)

