CAN THE VALUES OF OPEN SOURCE BE APPLIED IN THE CONTEXT OF BIG DATA AND ANALYTICS?
[This article was published in Engineering Group’s Ingenium Magazine on April 11th, 2017. The Italian original text is available here.]
“The pure and simple truth is rarely pure and never simple.” (Oscar Wilde)
I accepted the invitation to a briefing by Gartner entitled “How to use Data & Analytics to drive Strategic Business Value”. The main theme was the role of the Chief Data Officer in organisations. Interested and even a little curious, I attended the seminar and, to my surprise, after an interesting two-hour presentation I heard about open source! Or at least that was my perception, though certainly not that of the analyst I interviewed after the meeting!
At this point it is appropriate to ask why I reached this conclusion and what reflections followed from it.
One of the talks was entitled “how to establish a data-driven culture”, and here I agreed with the speaker when he stated that the task of those who work in big data and analytics today is to help organisations rethink several aspects with an approach radically different from the traditional one: how they are organised, how they tackle business, how they set objectives and evaluate their achievement, and how they implement developments.
This means disseminating a new culture that is born from and driven by information
The data analyst, now commonly referred to as a data scientist, is the individual who, when facing a business opportunity and possessing the “appropriate data,” knows how to ask the right questions before even knowing how to find the answers. In this role he or she has to be an “agent of change”: knowing how to take the initiative in proposing specific actions, working with agility, and knowing how to highlight, justify and quantify the results, always avoiding expressing them in purely qualitative terms. Above all, he or she must be able to spread this culture among customers, in the development teams and among colleagues.
During the presentation a few themes appeared that are well known to those who work in open source: ecosystem, collaboration, distributed development, but also values and ethics. It was interesting to hear a Scottish speaker turn to classical Greek terms when he addressed the themes of ethos (here expressed as reputation, another key value of open source) and pathos: “what the heart says is more important than that recommended by the mind.” In contrast to the “on-time, on-scope, on-budget” triad, it was a really nice surprise at a Gartner briefing!
The conclusion was timely: “The data-driven approach requires a change in thinking and behaviour: from ownership to stewardship; from control to facilitation; from securing to sharing; from governance to curation; from legalities to ethics.”
Digital ethics is an emerging theme
It consists in establishing a system of values and moral principles for the management of digital interactions between people, objects and activities. Here the important elements are transparency and control, understood as open and public. Think of automated algorithms which, through the development of artificial intelligence techniques, condition our lives, are set to do so increasingly, and are generally available in “black box” mode. Research has shown that some algorithms used, for example, to grant a property mortgage or to interpret the sentiment of texts posted online reproduce the typical stereotypes of discriminatory human behaviour (towards ethnic groups, non-residents, etc.). The aggravating factor in this case is that automation perpetuates itself and is devoid of the corrections that human reflection and empathy might introduce. The risk is that, with machine learning or deep learning techniques, we are creating systems that we do not totally understand or control!
Another issue is that of trust
This has always been a crucial and thoroughly studied aspect in the open source environment. Today it is clear that without trust there can be no progress. In the digital world data is more valuable if it is shared, and sharing means giving access to people both inside and outside the organisation without activating the traditional methods of control. This requires trust: a trust that does not involve emotion but is an economic factor that drives progress, the basis of solid business relationships. Without trust there is no sharing, and the only possible model is control, which is not applicable in a context where relationships are activated in a widespread fashion and expand exponentially.
The future will ask us to take a stand on what we consider to be the acceptable level of trust, in the presence of open or public data that may affect our choices, and on the need to demand transparency about how the algorithms that guide the choices affecting our lives are operated.
George Dyson’s book, Turing’s Cathedral, says: “Facebook defines who we are, Amazon defines what we want, Google defines what we think. You can extend this statement to include finance, which defines what we have (at least materially) and reputation, which increasingly defines the opportunities that we can access. In every sector, leaders aspire to take decisions in the absence of rules or the need for applications or explanations. If they get their way, our basic freedoms and opportunities will be delegated to systems driven by values that do little more than enrich top managers and shareholders.”
This vision may well be considered “apocalyptic”, but it stresses the urgency of a demand not only for transparency but also for “intelligibility”.
From open software to open algorithms: good health and long life, then, to open source values in the era of the digital economy!