Published on May 13th, 2013 | by EJC
Revisiting The Age Of ‘Big Data’
The topic of ‘Big Data’ has been creating quite a buzz recently. But what does it really mean? And what possibilities and challenges does it pose?
In an article from 2012, the New York Times described big data as “a meme and marketing term, for sure, but also shorthand for advancing trends in technology that open the door to a new approach to understanding the world and making decisions.” The term ‘big data’ refers not only to the mere existence of data, but also to the fact that more data is generated every second, and that entirely new kinds of data and information are being uncovered with new technologies. Social media, for instance, constitute an enormous mine of data: their contents can be aggregated and, using algorithms, analysed to determine the content, sentiment and weight of each message. Digital sensors in other technologies detect and communicate entire streams of data of their own.
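As a rough illustration of the kind of algorithmic analysis described above, here is a minimal sketch of scoring the sentiment of messages with a toy word lexicon. The lexicon and the example messages are invented for this sketch; real systems rely on trained models and far larger vocabularies.

```python
import re

# Toy sentiment lexicon, invented for this sketch.
POSITIVE = {"great", "good", "safe", "relief"}
NEGATIVE = {"flood", "damage", "trapped", "outage"}

def sentiment_score(message):
    """Crude score: +1 per positive word, -1 per negative word."""
    words = re.findall(r"[a-z]+", message.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

messages = [
    "Relief teams arrived, everyone safe",
    "Major flood damage, power outage downtown",
]
# Aggregating the per-message scores gives a rough signal of
# overall sentiment across a stream of updates.
scores = [sentiment_score(m) for m in messages]
print(scores)
```

Scaled up to millions of messages, even such crude per-message scores become useful in aggregate – which is precisely why this kind of analysis is left to machines.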
“Big” data literally refers to the relatively large amount of data that can be created, collected and analysed: structured and unstructured data that can only be analysed by machine-based systems and technologies, because the amount of data, as well as the speed at which it is created and collected, is too vast for human analysis. Aside from its velocity and size, big data also involves a wide variety of kinds of information. As writers Kenneth Cukier and Viktor Mayer-Schönberger (“Big Data: A Revolution That Will Transform How We Live, Work and Think”) say in a WIRED interview, the actual size of the data or information sets is not what is striking about big data. With big data, “we have more data about a phenomenon relative to the total amount of data that is out there”.
As the Nieman Journalism Lab sees it, the emergence of data in our everyday lives, in big sets (and smaller ones), can have a great impact on the way we think about things and interact with them. Data will also be used more and more in journalism – if journalists know how to use it.
While the use of ‘big data’ is facilitated by open-source online tools and can be a great asset, it is important to realise that it does not give all the answers. There are gaps in the aggregation, collection and contextualisation of data. As Kate Crawford writes, the collection of big data is ultimately shaped by humans – and is therefore biased. Nor does the data that is available show everything there is to know.
There can be various blind spots in big data. There is a digital gap, with some groups of people less connected than others. This can be an age divide (older generations are less likely to be connected via social media) or a socioeconomic divide: certain regions might be economic hubs, so more data is sent out because more people have access to digital and social media and communication channels (think of smartphones). In emergencies, actual connectivity might be badly affected – disasters or conflict can damage telecommunication channels, or networks can become blocked through excessive use of, for instance, SMS or phone calls.
The above-mentioned blind spots are the result of flaws or irregularities in the gathering of the data. But the potential difficulties of using big data do not stop there: actually distilling the information you are after can be difficult. Even with a small data pool, it can be hard to find the kind of information that answers your question directly.
The answers that big data sets can provide are useful – but they can also mean an overload of information that overwhelms us when we are looking for specific, detailed information. Dealing with big data involves several skills and steps: the data is crunched, analysed in mathematical models, and finally turned into narratives. (Also read this article by The New York Times on the emerging field of data science.)
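The crunch → model → narrative sequence can be sketched in miniature. The district names, report counts and spike rule below are all invented for illustration; a real pipeline would use proper statistical models rather than a fixed threshold.

```python
from statistics import mean

# Invented hourly counts of storm-related social media reports per district.
reports = {"north": [3, 5, 40, 55], "south": [2, 4, 3, 5]}

# Crunch: reduce each raw series to summary figures.
summary = {d: {"total": sum(c), "avg": mean(c)} for d, c in reports.items()}

# Model: a crude rule standing in for a statistical model —
# flag districts whose latest count is more than double their average.
spikes = [d for d, c in reports.items() if c[-1] > 2 * summary[d]["avg"]]

# Narrative: turn the numbers into a sentence a reader can use.
if spikes:
    print(f"Report volume is spiking in: {', '.join(spikes)}.")
else:
    print("No unusual report activity detected.")
```

The point of the sketch is the shape of the work, not the arithmetic: each stage reduces raw volume into something closer to a story.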
On the other hand, seeing data or information in things we had previously not considered can be a great find. If we think of such things as behaviour, sentiments or opinions as ‘data’, that also means we can organise them in order to extract the insights that interest us.
The key to big data is the realisation that, in aggregate, the public is more knowledgeable than any single newsroom. Journalists can make use of this pool of knowledge.
Digital Newsroom Strategy
Merely collecting big data does not suffice for journalistic coverage. At the very least, journalists and editors have to analyse or curate the information being aggregated. Newsrooms have to consider their digital strategy when covering events and bringing (breaking) news, as well as when engaging with users, contributors (the crowd) and readers. It can be crucial to take an analytical stance towards the information you have collected, keeping in mind the blind spots of big data.
Journalists working in emergency/disaster situations can consider several solutions or approaches to these gaps in big data:
If information or data from a specific region is missing, this might indicate a significant gap in the knowledge of what has happened there. In a storm or natural disaster, updates from one neighbourhood may be absent. Why are no media updates being sent from that location? It might be where your story is.
Local media and national or broader media might have different insights into, and access to, the region. The same goes for private, public and community media organisations.
The verification of information is of crucial importance. Cross-checking different data layers – combining meteorological data and social media updates, for instance – can furthermore provide insights into the development of a situation, or into the organisation and adequacy of the relief being provided.
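One way to picture this cross-checking of layers is a minimal sketch that compares a meteorological layer against a social media layer and flags places where heavy rainfall coincides with few or no updates – a possible blind spot rather than an absence of need. All district names, figures and thresholds below are invented; real layers would come from weather services and platform data, verified before use.

```python
# Invented data layers: rainfall in mm and social media update
# counts per district, standing in for real meteorological and
# social media feeds.
rainfall_mm = {"riverside": 120, "hilltop": 15, "old_town": 95}
updates = {"riverside": 340, "hilltop": 12, "old_town": 3}

HEAVY_RAIN = 80   # mm threshold, chosen arbitrarily for the sketch
FEW_UPDATES = 10  # update-count threshold, likewise arbitrary

# Flag districts hit hard by rain but nearly silent online.
blind_spots = [
    d for d in rainfall_mm
    if rainfall_mm[d] >= HEAVY_RAIN and updates.get(d, 0) <= FEW_UPDATES
]
print(blind_spots)  # districts worth a reporter's attention
```

The same comparison works with any pair of layers – damage assessments against relief deliveries, for example – whenever both can be joined on a common location key.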
For an interpretation of the use of big data with conflict prevention, take a look at this blog by Patrick Meier.
Photo: infocux Technologies