The economics of (personal) data
Are data more like goods or rights? 5 models for personal data management
On the first day of the Next Generation Internet Summit, which took place last September (online, of course), one of the key topics on the agenda was the future of the so-called “European data economy”. What is it about, and why is it relevant today?
As often happens in the digital sector, there is no universally agreed-upon definition of the data economy. One of the best is that by Jani Koskinen, an Information Systems researcher at the University of Turku, Finland: the data economy is an ecosystem of organizations for whom data is the main source or object of their business. Since the Facebook-Cambridge Analytica scandal broke and the EU General Data Protection Regulation (GDPR) became applicable in 2018, the business models of these organizations have increasingly come under the scrutiny of policymakers and regulators. In particular, the (ab)use of personal data for commercial and political purposes has become a subject of legislative debate in and beyond the EU.
Indeed, over the last couple of years, legislative proposals, reforms and innovations in this field have sprung up across the world, from the California Consumer Privacy Act (CCPA) in the US and the Act on the Protection of Personal Information in Japan, to the Personal Information Protection Law in China and the Personal Data Protection Bill in India. More recently, the Netflix documentary The Social Dilemma has brought the privacy problems that come with the monetization schemes of some of the most popular digital platforms of our time to the attention of a record number of viewers.
Now that more and more people are becoming aware of the extent to which the success of companies like Facebook and Google and, to a lesser degree, Amazon, Apple and many others, rests on the exploitation of their personal data, some questions are becoming increasingly pressing: Where should we store our personal data? Who should we share them with? Should we profit from them, too?
Understanding the nature of data
To answer these questions, we first need to understand what data are in economic terms. You might recall that in 2017 The Economist called data “the new oil”. In 2020, they are not so sure anymore.
The main problem is that the very nature of data challenges the traditional classification into private, public, common-pool or club goods. Data are non-rival, that is, they can be copied, shared and reused an unlimited number of times by different people without decreasing in either amount or quality; at the same time, data may be personal or confidential, so they had better stay private. Data can be made accessible to some and kept from others with techniques like encryption, but they can also be made available to everyone as open data. It seems, therefore, that different types of data fall into different categories.
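To make the classification concrete, here is a minimal sketch of the textbook two-question test economists use: is the good rival, and is it excludable? The `Resource` type and the example entries are illustrative assumptions, not something taken from the sources discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A hypothetical resource described by the two textbook properties."""
    name: str
    rival: bool        # does one person's use reduce what is left for others?
    excludable: bool   # can people be prevented from accessing it?


def classify(resource: Resource) -> str:
    """Map the rival/excludable pair onto the four classic categories."""
    if resource.rival and resource.excludable:
        return "private good"
    if resource.rival:
        return "common-pool good"
    if resource.excludable:
        return "club good"
    return "public good"


# Illustrative examples only: both are non-rival, as data always are.
examples = [
    Resource("encrypted health record", rival=False, excludable=True),
    Resource("open government dataset", rival=False, excludable=False),
]
labels = {r.name: classify(r) for r in examples}
```

Under these assumptions the encrypted record comes out as a club good and the open dataset as a public good. Because data are non-rival, this test can only ever separate them by excludability, and it says nothing about whether they ought to stay private, which is one way of seeing why the traditional scheme fits data so awkwardly.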
Another challenge is that data, as pieces of information, are not valuable in themselves, but in the (re)uses we make of them: even if we wanted to put a price on data, that price would be subjective, as the same information might be worth different amounts to different people. At the same time, we can’t deny that personal and non-personal data are a valuable economic resource, underlying business models in sectors as diverse as agriculture, banking, and manufacturing. This has led the EU to conceive Data Spaces where businesses can share non-personal data among themselves and with governments, in an attempt to boost economic growth, interoperability, and innovation. But could we safely share and profit from personal data in the same way?
Managing personal data
As there is no consensus on the economic nature of data, neither is there agreement on the best way to administer them, especially when it comes to personal data. To bring some clarity, Ingrid Schneider, senior IT ethics researcher at the University of Hamburg, Germany, has developed a taxonomy of personal data management models based on different economic interpretations of data.
If data are private goods, then individuals exercise commercial rights over them: they can choose which data points to sell, set the price through a normal supply and demand mechanism and be remunerated by means of microtransactions. While intuitively attractive, this model entails serious risks: data “owners” might underestimate the value of their data, or sell more data than is in their interest in order to make more money; worse still, individuals would have very little bargaining power vis-à-vis internet giants.
Another option is to treat data as public goods. In this case, it falls on the State to store, manage, and share data, as well as to collect payments from firms to access and use them. But there are risks to this model, too: if data are centralized, then the exposure and vulnerability to cyberattacks, the potential and accuracy of State surveillance and the likelihood of corruption and mismanagement all increase considerably.
Why not consider data as common-pool goods, then? This is the idea behind data commons like MIDATA, which shares the health information of its members with a selected group of IT companies and research centers working on data-driven health services, medical research and clinical trials. Members delegate the management of their personal data to the non-profit cooperative, and jointly decide which organizations to share them with, in line with the social cause they are pursuing. But this option, too, has limitations: as it is based on consensus, it is difficult to scale in terms of both membership and type of data.
If instead we considered data as “assets on which we exercise rights”, then an alternative to data commons would be data trusts, legal entities managing personal data on behalf of individuals according to the “terms of the trust”, the equivalent of a cooperative’s social cause. For example, some trusts might prioritize privacy and thus only share data when strictly necessary; others, instead, might support cancer research and thus share data with medical laboratories. Unlike commons, however, trusts may be public or private, and trustees are professionals rather than members. In addition, trustees have legal responsibilities (fiduciary duties) to promote the interests of data subjects. This means that while they may be remunerated for their work, they mustn’t profit from it at the expense of data subjects. While theoretically sound, this model faces important regulatory hurdles: as explained by professors Sylvie Delacroix and Neil Lawrence in a recent paper, trust law does not exist in all jurisdictions; moreover, under the GDPR, personal data rights such as access, portability and erasure cannot be delegated to a third party at all.
But this is hardly the end of the story. The EU is in fact investing in a fifth option, compatible with the notion of personal data as “assets on which we exercise rights”: self-sovereign identities (SSIs). Linked to a distributed ledger or blockchain like the open source Sovrin network, self-sovereign identities rely on encrypted data stores to securely gather a person’s identity attributes. These “wallets” enable individuals to decide, on a case-by-case basis, which pieces of information to share and with whom. For example, if you needed to demonstrate that you are older than 13 to create a social media account, your SSI could use the distributed ledger to generate a trustworthy “proof of age” without having to disclose any detail about yourself, not even your date of birth: the nodes of the network (which may include trusted organizations such as public institutions) would verify and validate the information for you. SSIs thus give individuals (or, in GDPR parlance, “data subjects”) full control over their data.
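The “proof of age” idea can be illustrated with a deliberately simplified sketch. It is not how Sovrin or any real SSI stack works: there is no ledger and no zero-knowledge proof here, and a shared-secret HMAC stands in for the public-key signature a real issuer would use. All names (`ISSUER_KEY`, `issue_age_claim`, `verify_age_claim`) are hypothetical. The only point it shows is that the verifier receives a signed yes/no answer and never the date of birth.

```python
import hashlib
import hmac
import json
from datetime import date

# Hypothetical demo secret. Assumption: a real system would use the issuer's
# public/private key pair, so verifiers would never hold a signing secret.
ISSUER_KEY = b"demo-issuer-key"


def age_on(birth: date, today: date) -> int:
    """Full years elapsed between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


def issue_age_claim(birth: date, min_age: int, today: date) -> dict:
    """Issuer side: sees the birth date, emits only a signed predicate."""
    claim = {
        "predicate": f"age_over_{min_age}",
        "value": age_on(birth, today) >= min_age,
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    tag = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "tag": tag}


def verify_age_claim(credential: dict, min_age: int) -> bool:
    """Verifier side: checks integrity and the predicate, not the birth date."""
    payload = json.dumps(credential["claim"], sort_keys=True).encode()
    expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, credential["tag"]):
        return False  # the claim was altered after it was issued
    claim = credential["claim"]
    return claim["predicate"] == f"age_over_{min_age}" and claim["value"] is True


# Example: the platform learns "over 13" and nothing else.
credential = issue_age_claim(date(2000, 5, 17), min_age=13, today=date(2020, 10, 1))
is_old_enough = verify_age_claim(credential, min_age=13)
```

In this sketch the credential contains only the predicate name, a boolean and an integrity tag, so the data subject decides what to hand over and the platform has nothing further to store or leak.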
If SSIs are so incredibly good, you might be wondering, how come we don’t all have one by now? According to a recent working paper by NGI Forward, the problem is that the developer ecosystem around SSIs is too fragmented and decentralized for any player to reach critical mass. One way to address this issue, the study suggests, would be for the developer and standard-setting community to choose a champion and collaborate on the continuous maintenance and upgrade of its solution.
Even if this vision materialized, however, there is no guarantee that SSIs would be the preferred personal data management model of the future: legislative and technological developments might finally make models like data trusts viable, or lead to the creation of brand new models.
One thing, however, seems certain: whichever personal data management model prevails, it will be about giving individuals more control over their data, thus empowering them as agents, and not just subjects, of the data economy of tomorrow.