Right now, you are in many — who knows how many — databases run by companies, institutions, and even individuals. The last invoice you paid, your last check-in on social media, your latest search on Google, that pair of shoes you bought using your loyalty card… let’s face it, you and your behaviors are stored on multiple servers around the globe.
Even though some efforts have been made in Europe with the recent launch of the (in)famous GDPR, digital privacy is a dream of the past. Companies collect, exchange, sell, re-sell and re-use our data all the time. Our behaviors are not only tracked, they are also predicted with such accuracy that yes, some databases might actually know us better than we know ourselves!
Few of us are fully data literate, whatever that means, and most are sloppy when it comes to checking the use of data: where they come from, why they were collected in the first place, how they will be used, etc. Each time we give consent for using our data, do we think about the implications of that consent? When we accept cookies, general conditions, single sign-on via our Facebook account? Do we? Most probably not, and that is hardly surprising. Some patterns in the data have yet to be discovered by even the most brilliant data scientists. How on Earth could we imagine our data being used in a way that hasn’t even been invented yet, or that technology doesn’t yet allow?
Whether we are willing to admit it or not, most of us are data-illiterates in a world that is more and more data-hungry. And we should start to worry…
The case for literacy
Let’s think about it. Chances are high that your organization would not want to work with illiterate people: in most administrations, businesses, and industries, we expect people to be able to read and write, don’t we? We even expect them to master more than the basics… we implicitly expect deeper skills including semantics and grammar, but also accuracy and coherence, critical analysis and thinking, and the ability to synthesize and to make decisions based on available input.
All the above seems normal. Literacy is key in our education. It starts early on at school, and we’ve come to consider it as a given and a fundamental right.
The implications for business
But if we move from language to data, who can confidently state they are literate? We may think that we are data literate because we understand a bell curve or we can discuss an infographic. But do we really, deeply understand the data representation? How do we know that the data themselves, the way they have been retrieved and analyzed, or even the presentation itself are not biased? Do we (and can we) check if we can trust the data sources? If it were up to us, could we find the data, retrieve them, analyze them, present them, synthesize them, and explain the consequences and outcomes?
Many are convinced this has to remain a data scientist’s job. Fair enough: not everybody has to be as data-savvy as a data specialist, just as not everyone becomes an acclaimed author. Still, just as we have to understand the symbols, rules, and theories of language, we have to understand the specific rules and theories of data. Otherwise, we will have NO capability for critical analysis, we will NOT be able to summarize and synthesize, and we will therefore NOT be able to make decisions based on the data available to us.
Is this lack of data literacy real? Certainly! Data scientists often complain about their clients’ lack of data literacy. They usually spend more time educating the organizations they work with than actually working with the data. This is a burning problem in modern organizations, one that is too often underestimated, if not unrecognized.
At both the personal and the professional level, it has become not only a duty, but also a right to get the basic data education we all need to understand the implications of data at scale. You can’t have an impact on a world you can’t read, just as a company cannot thrive in a data environment its own employees don’t understand!
Please feel free to comment below or via Twitter – @cdn and @MarieLaenen