It can be argued that technologies define their times - and that, by extension, big thinkers use the language of technology to both unearth and define the metaphors of their times. Descartes, Hobbes and Newton all used the vocabulary of mechanical engineering to describe the self, the body politic and the universe. Indeed, phrases such as “ghost in the machine” and “things working like clockwork” are still used and understood today. In our times, scientists like Richard Dawkins insist that we describe the genome as a collection of “bits of information”, and the historian Niall Ferguson recently framed cross-cultural political debate by speaking of “killer apps” that allow non-Western cultures to “download civilisational software.”
It seems, then, that the vocabulary of information technology and data processing is increasingly used to describe the world we currently live in.
Most of us probably use this language ourselves. We are also increasingly aware of how much information and data we have to wade through, and of the means at our disposal for doing so - websites, apps, tweets and so on, whether we are finding bus times or looking for football scores.
But are we also aware of how much information we create ourselves?
To clarify, let me speak of our “information footprint”. With all our tweeting and Facebooking, Googling, emailing and texting, we are creating a mass of information the likes of which has never been seen before. These footprints are only getting bigger and more global.
All of this information has a value, and there are countless projects underway to scrutinise these information footprints; to gather, scrape, compile, aggregate, cross-reference and benchmark this information - and from there to repackage it and sell it on to those who are willing to pay for it. The buyers in this scenario are mostly large corporations looking for consumer feedback and information to enhance and develop old and new products and services, adjust pricing and so on.
I have been in the business of gathering data for over a decade and have seen that information that was once hard to come by - literally begged and bribed out of people - is now being freely volunteered thanks to new technology platforms and a new understanding of what privacy means. People’s chit-chat has moved onto the social web and has thereby become more open and documented: it can be data-processed, cross-tabulated, integrated, pattern-analysed, et cetera. All of this can sound “creepy” - and indeed, speaking of Google’s stance on these matters, ex-CEO Eric Schmidt claimed six months ago that “Google policy is to get right up to the creepy line and not cross it.”
It is equally clear, however, that the benefits, the convenience and even the revolutionary political potential of these information platforms cannot be overlooked, from keeping in touch with friends over Facebook to using Twitter as a documentation tool that shines a light on the black spots in old media coverage. The “creepy” is contrasted with the “liberating”. In short, there is a risk/reward trade-off at play here.
An information gathering technique particularly close to my heart is “crowd-sourcing”. Put simply, information crowd-sourcing encourages mass-contribution to collect data from the web, so the “wisdom” of this crowd can be expressed and measured.
When done right, information crowd-sourcing should have several key attributes: it should be a rewarding, open and transparent exercise, only minimally edited, and it should provide a robust, user-friendly platform that helps lend a voice to its data contributors. In this way, a pool of information is created which is actually useful, meaningful and interesting to everyone involved - both the contributors and those who consume the data. It should lead to a win-win situation for all.
In terms of consuming data, there are encouraging signs that transparency and free access to information are growing trends. In 2009, Tim Berners-Lee, the inventor of the web, was put in charge of data.gov.uk, a project tasked with making government data more transparent and accessible on the web. This constitutes a shift in thinking and in expectation: the default position now is that data “should be in the public domain unless there is a good reason not to - not the other way around.” Some of this government data has already been processed, visualized and published - perhaps most notably by the Guardian’s datastore.
I hope this trend of transparency and free access continues and extends to the collection of data. I look forward to seeing more good crowd-sourcing projects in future that collect, analyze and display interesting, meaningful data, so that we can all benefit from the information footprints we create.
Steven Drost is CEO of Edinburgh social web startup Stipso. www.stipso.com/students