From Russian hackers interfering in foreign elections to high-profile corporate breaches, people are waking up to some harsh realities around data. We look at what the European call for a ‘Magna Carta’ for data has led to, and explore the investment implications of the growing scrutiny over data privacy.
Tim Berners-Lee, credited with inventing the worldwide web, has described himself as “an optimist standing at the top of a hill with the wind blowing in my face”.1 And when the web’s inventor points out the challenges facing its openness, it is wise to take note. His words reflect the recent groundswell of concern over online data privacy.
Built on a culture of openness, something embraced by the likes of Facebook and Google, the internet was later exploited for its inherent network effects. Now, from a point where large digital enablers enjoy almost monopolistic advantages, digital citizens are looking more closely at the trade-offs they are being asked to make online. On one hand, they benefit from the extraordinary convenience of free services, from messaging to maps. On the other, intensive tracking sustains the dominant business model, with free services paid for in the form of users’ personal search and consumption data, collected year after year.
You get what you pay for
Essentially, these are barter-type arrangements that minimise direct costs to consumers, but allow tech companies to reach into personal lives in exchange. What might not have been immediately clear at the outset is the scale of the transaction – just how much information might be siphoned from a digital identity into the emerging data industry, and just how powerful the platform winners would become. At stake are vast caches of granular information that can be mined to inform new services, for directing content and differential pricing.
From today’s starting point, trying to assess the true ‘value’ of digital services that the majority of users access ‘for free’ is quite a conundrum. Professor Erik Brynjolfsson’s Discrete Choice Experiments at the Massachusetts Institute of Technology (MIT), where cash was offered in exchange for giving up a digital good, suggested users would need around $14 a month to give Facebook a pass, but as much as $500 for email and $1,300 for search engines.2 This suggests quite some appetite for convenience and interconnectedness.
Claiming space in a data world
Although the right to control the use of personal data lies with the individual under European law, carefully-worded terms and conditions have enabled wide-scale data gathering. Significantly, much of the data collated is not obtained from ‘first party’ sites. Online newspaper readers won’t be surprised to discover that a publication, whose website asked for and received their consent, knows exactly which pages were read. Much more surprising is the other, more granular information gathered by data harvesters. Recently, Princeton’s Web Transparency and Accountability Project found 80,000 third-party services collating information on the most commonly-visited websites worldwide.3 The majority of apps gather data which is fed to third parties as well.
Despite rules to protect personal information with a shield of anonymity – for example, by ‘scrubbing’ sensitive details or using numerical identifiers, not names – layering data from different sources is making it increasingly difficult to act incognito. In fact, one member of the US Federal Communications Commission has described the belief that one’s personal information cannot be traced as “laughable”.4
Research suggests surveillance has chilling effects
This is important, because information is power. Data is an essential input in the information age, and there will inevitably be tensions between the rights of citizens and others interested in acquiring and analysing their information. Unwanted ‘reveals’ can have long-term implications for job seekers or borrowers, for example. Growing awareness of the erosion of privacy is problematic as well, inhibiting healthy debate that enables societies to change. Research suggests surveillance has “chilling effects”, deterring people from exercising their rights – even when they are going about legal activities.5
As we look ahead, technologies embedded in everyday devices will mean a step change in the amount of information being generated. With the Internet of Things, the volume, velocity and variety of information recorded will surge. Some of the data will come from targeted information gathering, but there will also be dense layers that spill out as a side-effect of digital interactions. If a single UK start-up has 1.1 billion proximity sensors interacting with smartphones through commercial tie-ups,6 what is the scale of content that might be generated in the future?
A world defined by sensors and criss-crossed by GPS boundaries is always on, and suggests quite specific problems of privacy and data security. It also raises much bigger questions of power and system design – whose values are reflected, who is included and who will benefit in the future, according to Dr. Jathan Sadowski, a postdoctoral research fellow at the University of Sydney, in a yet-to-be-published paper.7 “If we understand data as capital and a valuable asset, as a source of power, then it starts raising questions about extraction,” according to Sadowski. So, as private spaces shrink, where does this leave the rights of the citizen?
Privacy – an age-old problem
The idea of privacy is tricky, culturally relative and shape-shifting. It is broadly agreed to mean the right to be left alone and free from unwanted intrusion. But what counts as unwanted intrusion in one culture does not in another. From talking about your pay to appearing in public unclothed, no two societies have the same level of comfort.
When the UN Declaration of Human Rights sought to establish shared ground rules in 1948, it set out the idea that no-one should be subject to “arbitrary interference” with their privacy, family and home, or in correspondence.8 In broad terms, that sentiment has been carried forward in law, although Europe has chosen to take a more robust stance on privacy protection than the US.
The world has moved on since 1948. Technology has become ubiquitous – in domestic life, the workplace and on the move. Anyone happy to share the minutiae of life in a blog, keep an open video-link to chat to friends and post selfies with embedded location data might think the concept of privacy old hat.
“People have really gotten comfortable not only sharing more information and different kinds, but more openly and with more people," said Mark Zuckerberg, CEO of Facebook, back in 2010. "That social norm is just something that has evolved.”9
However, appetite for openness comes in many forms. In some countries, like Germany and France, there is less willingness to trade privacy for convenience (as shown in figure 1). In emerging markets like India, China and much of the Middle East, convenience seems to rank more highly.
Zuckerberg’s relaxed view (which – significantly – the company now seems to be rowing back from) is not held by Professor Barry O’Sullivan, director of the Insight Centre for Data Analytics at University College, Cork. As data is an increasingly valuable currency for research and industry, he believes privacy and the need to ensure age-appropriate content need much closer attention.
“We are moving into a world where people are concerned about privacy. They are concerned about the impact of technology on children. They are very worried about the wider impact of technology on societies. At the base of that, the thing that is enabling it is the sharing of personal data. And the technology doing it is doing it in a sneaky way.”
Rebuilding trust in the data industry
‘Sneakiness’ undermines trust, which is potentially troubling for researchers seeking to draw on troves of data to generate new insights in a Big Data age. Analysing swathes of information from large numbers of people might bring new insights to practical problems – in medical diagnostics, in managing the flows of patients in hospital admissions, in traffic and crowd control and so on. If confidence in the way in which data is collected or stored and protected breaks down, it could potentially limit enthusiasm to share information and scupper positive developments in the future.
“We are entering the ‘Data Economy’, in which data turns into value, after being processed by artificial intelligence, which has potential to create better outcomes,” explains Jason Bohnet, head of technology, media and telecoms research at Aviva Investors. “Users need confidence their data is being used in a reasonable and secure way. Otherwise, it will not really matter how good the offerings are, because ultimately they will get stymied. If we get too cynical and stymie data sharing and growth, we risk holding back transformational innovation. If we push too fast, with disregard for cyber security and civil liberties, we risk losing confidence in the data, which makes any results meaningless.”
In terms of game theory, this is a classic Nash equilibrium problem. How will it be possible to create an optimal outcome for society, based on what the individual players might do? Some important suggestions have flowed out of the predicament, intended to rebalance interests in the data chain.
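The game-theoretic framing above can be made concrete with a stylised sketch. The toy game below uses purely hypothetical payoffs (none come from the article or any study) to show how each player’s individually rational choice can settle on a Nash equilibrium that differs from the outcome best for society as a whole:

```python
# Illustrative only: a stylised two-player 'data sharing' game with
# hypothetical payoffs, showing how individually rational choices can
# settle on an equilibrium that is worse for everyone collectively.
from itertools import product

# Payoffs (player_a, player_b) for strategies 'share' / 'withhold'.
# Assumed numbers: sharing improves services for both, but withholding
# while the other shares lets you free-ride on their contribution.
PAYOFFS = {
    ("share", "share"): (2, 2),       # rich services, some privacy cost
    ("share", "withhold"): (0, 3),    # the withholder free-rides
    ("withhold", "share"): (3, 0),
    ("withhold", "withhold"): (1, 1), # poor services, privacy intact
}

def best_response(player, other_strategy):
    """Return the strategy maximising this player's payoff, holding the
    other player's strategy fixed."""
    idx = 0 if player == "a" else 1
    def payoff(s):
        profile = (s, other_strategy) if player == "a" else (other_strategy, s)
        return PAYOFFS[profile][idx]
    return max(["share", "withhold"], key=payoff)

def nash_equilibria():
    """A profile is a Nash equilibrium if each strategy is a best
    response to the other player's strategy."""
    return [
        (a, b)
        for a, b in product(["share", "withhold"], repeat=2)
        if best_response("a", b) == a and best_response("b", a) == b
    ]

print(nash_equilibria())  # the stable outcome
print(max(PAYOFFS, key=lambda profile: sum(PAYOFFS[profile])))  # the social optimum
```

With these assumed payoffs the only equilibrium is mutual withholding, while the socially optimal outcome is mutual sharing – precisely the kind of misalignment that proposals to rebalance the data chain aim to address.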
In the US, MIT Professor Alex Pentland has proposed a ‘New Deal on Data’, suggesting individuals should own their own data10 – far removed from the current US regime. In the UK, Berners-Lee’s ‘Magna Carta for Data’ sought to crowd-source ideas to improve the nature of the web, ultimately reinforcing the idea established in 1215 that everyone should be subject to the law. It went on to inform the new EU General Data Protection Regulation (GDPR) that came into effect on May 25, 2018, one of the most significant changes in data privacy regulation for years.
GDPR will apply to any business that handles personal data for European residents – their addresses, bank account details, web search histories and so on. Like Pentland’s ‘New Deal’, at its heart is the idea control over personal data lies with the citizen. Under GDPR, data can only be harvested for specific purposes and not aggregated for incompatible projects. The individual retains the right to withdraw consent for use of their data and can request to be forgotten. The legislation includes a requirement for data keepers to maintain accurate records and keep information secure, and a clause covering the right to redress. Large-scale data breaches could prove costly; companies face fines of up to four per cent of annual turnover, a significant step up from current penalties.
For European regulators, GDPR is particularly important as it aims to establish a framework to future-proof and formalise requirements for the data economy. “The AI industry today is very much dominated by access to data,” explains O’Sullivan. “GDPR is a piece of privacy legislation and a piece of data protection legislation, but I also see it as the world’s first piece of AI legislation.”
China is using its data in its bid to become a global leader in AI
GDPR sits within a very specific ethical culture. If Europe is positioning itself at the conservative end of the scale in terms of personal-data treatment, China is placing itself towards the other. China monitors its citizens closely and comprehensively from birth, and it seems likely data will be used in its own bid to become a global leader in AI. “If one looks at ethical cultures – Europe versus the United States versus China – these are three very different cultures of how to use and access data,” says O’Sullivan. “In the US, the company can own one’s data. That’s not the case in Europe. In Europe, control is always under the ownership of the individual.”
O’Sullivan believes this could prove to be an important differentiator. “We have a very solid base for developing technologies that are privacy-preserving and protective of citizens’ rights – more so than in the United States or China. I think there will be a market for privacy – a business model around privacy. Those three cultures will become key cogs in that industry.”
Nevertheless, this remains a highly contested area. “What happens when you follow the European privacy model and take information out of the information economy?” asked US Republican Marsha Blackburn in 2010.11 “Revenues fall, innovation stalls and you lose out to innovators who choose to work elsewhere.”
GDPR has been portrayed as a burden to companies that fall under the EU regime, possibly the final nail in the coffin for European businesses harnessing third-party data. An alternative view is that GDPR could encourage much higher standards of data management and a form of privacy arbitrage, where consumers choose service providers in one regime over another. It could also influence investment decisions, such as where companies locate their data centres.
For anyone concerned about online disclosures, the values driving privacy-protected search company MetaGer, a non-profit organisation spun out of Leibniz University in Germany, are clearly different to the dominant players. Its proposition includes no recording or storage of IP addresses or private data, and all of its servers are located in Germany.
So what might all this mean for the global incumbents? Firstly, regulatory risk is rising; even Zuckerberg has conceded more ‘rules of the road’ for Big Tech seem ‘inevitable’.12 Meanwhile, privacy concerns seem to be driving traffic towards companies whose strategies prioritise anonymity – hence the 55 per cent year-on-year growth for the search engine DuckDuckGo in 2017.13
However, the platforms whose business models have historically relied on tracking – like Facebook14 and Google’s parent company Alphabet15 – are not reporting impacts from consumers turning away. In fact, quite the opposite – both reported markedly-higher revenue and profits in 2018.
“I don’t think people will stop searching for things, using maps and translations,” says Bohnet. “At this point, the digital infrastructure has become part of our everyday lives: I believe it is structurally here to stay.” The calls for consumers to think about a ‘digital detox’ are perhaps not being heard.
Putting the individual back in the value chain
With European moves afoot to reinforce respect for the ownership of personal data by the individual, the question that inevitably follows is “who might benefit?”
At the moment, the majority of value coming from the data harvested from billions of individuals is accruing to a relatively small number of companies, some with an iron-like grip on their platforms. While individual users have the benefit of online services ‘for free’, they are effectively sliced out of the value chain. Bruce Schneier, security expert and best-selling author, sees this model as “feudal”.16 He likens it to tenant farmers who have the right to inhabit the digital space, but at a cost. Individuals have no rights to any further value that may accrue from those that aggregate and analyse ‘their’ data.
One way to reimagine the model would be to pay individuals for the right to access their information. This is exactly what sites like CitizenMe envisage, promising individuals a monetary reward for revealing specific preferences and pieces of information to businesses. It is early days, but O’Sullivan believes broking personal information – with users giving their informed consent – could eventually morph into a whole new industry.
Another option for those who want to break free of feudal-type data relationships is to opt out of free services, and ‘pay-to-play’ through subscription. Dedicated services like Netflix and Spotify show subscription-based services can work, although they are yet to be proven in search and social media.
Privacy: a killer application for blockchain?
Meanwhile, it is still not entirely clear how companies will deliver privacy and data security at scale. One possibility would be to use blockchain, the system of distributed ledgers. In theory, this has some obvious advantages; it would no longer be necessary for the users of online services to keep inputting sensitive information, such as their personal bank account details, into digital applications again and again. It would also cut the risk of a fundamental systems failure, as risk would be dispersed among multiple ledger keepers.
The privacy and security conundrum has been engaging minds at MIT. Its specialists proposed a protocol that might sit on top of existing blockchains. The idea was to employ ‘secret contracts’ that could compute with data without ever actually ‘seeing’ it. The researchers suggest this might make it possible for users to lock in their own information, preventing it being monetised or analysed without their consent.
The clearest way to imagine this would be to think of a prospective user of a service whose personal information might be encrypted and locked into the shaded area shown in the chart above. When the user wished to access a service, a request to do that would be sent, assessed digitally (but not ‘read’), and that query would generate an encrypted reply – so the service provider would never view the sensitive data itself.17
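As a rough illustration of that access pattern only – not of the Enigma protocol itself, which relies on blockchain and secure multi-party computation – the toy sketch below uses a simple gatekeeper object standing in for the encrypted store. The class, query names and birth date are all hypothetical; the point is that the service provider receives only a yes/no answer to a consented query, never the underlying data:

```python
# A toy sketch of the query-without-seeing pattern. A gatekeeper object
# stands in for the encrypted data store: sensitive values never leave
# it, and only approved queries return (boolean) answers.
from datetime import date

class PersonalDataVault:
    """Holds sensitive data privately; answers only consented queries."""

    def __init__(self, birth_date, consented_queries):
        self._birth_date = birth_date          # never exposed directly
        self._consented = set(consented_queries)

    def query(self, name):
        if name not in self._consented:
            raise PermissionError(f"no consent recorded for '{name}'")
        if name == "is_adult":
            age_years = (date.today() - self._birth_date).days // 365
            return age_years >= 18             # a boolean, not the birth date
        raise ValueError(f"unknown query '{name}'")

# The service provider learns eligibility without learning the birth date.
vault = PersonalDataVault(date(1990, 6, 1), consented_queries={"is_adult"})
print(vault.query("is_adult"))   # True
```

A real secret-contract system would replace the gatekeeper object with cryptography distributed across many nodes, so that no single party – not even the query evaluator – could read the raw data.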
Somewhat ironically, the MIT project is called ‘Enigma’ (‘riddle’ in Greek), just like the cipher machine invented by the German engineer Arthur Scherbius. Breaking Enigma and cracking its code became a focus of Allied efforts in World War II – eventually achieved by cryptographers in Poland and the UK. MIT’s Enigma has been hacked too, shortly before the planned launch of a cryptocurrency in 2017.18
Nevertheless, there are still many who believe a role as a privacy enabler could be transformative for blockchain – a killer application. “This is not just about making people’s data private,” explains O’Sullivan. “It’s about the monetisation, trading and, crucially, protection of personal data.” But the pathway for delivery is not yet clear.
Opportunities in an age of Big Data
The opportunities implied by the torrent of information being generated in a closely-connected digital world are immense. The information might give a fundamentally different and more granular level of understanding of what’s happening, extending the boundaries of what we know. But there is a whole cluster of sensitive issues that need to be addressed before that point – ethics among them.
Companies that have aggressively followed ‘grab-all’ data strategies are fully aware of the value that might accrue to them, but their rights to do so are now being challenged. To extend Berners-Lee’s analogy; note the wind, grab a coat and buckle up. Now’s the time to do the thinking.
Opportunities and threats in the data economy
The immense scale of change in the technology industry is matched by interest in the potential investment implications. Tech trades helped drive markets in 2017, and seven of the world’s 10 most valuable companies are technology stocks.
That being the case, it may be worth keeping an eye on valuations at a time when analysts are warning of a ‘techlash’.19 From possible regulatory challenges to the accelerating use of ad blockers by millennials20, there are reasons for caution in some areas.
Cybersecurity – an investment theme with a multi-year horizon?
Nevertheless, some investment themes may yet prove resilient over multi-year timescales; one of these is cybersecurity. Consider the scale of the problem. It is fair to assume there will always be bad actors probing systems of any kind for potential points of vulnerability. In the past, that might have meant a large corporate IT network with between 50,000 and 500,000 end points that needed to be secured.21
The problem scales up considerably when technology is embedded in everyday objects. A single network might have millions or even tens of millions of end points. From each device to the power supply and heating and cooling systems embedded within a data centre, there are multiple locations from which disruptive actors can access the network. This is why IT research companies like Gartner are anticipating a strong step-up in security spending in the next few years. It could mean IoT-related spending almost tripling in scale by 2021.22
One implication is that the companies taking effective steps to enhance controls and minimise cyber threats might be worth exploring as potential value generators. Not surprising, perhaps, that US-listed prime cybersecurity companies have significantly outperformed the NASDAQ in 2018.23
Scaling up data infrastructure
More applied technologies spilling out data create practical problems; consider that a single self-driving car generates nearly one gigabyte of data per second.24 So, although it is almost impossible to estimate how fast the uptake of new technologies will be, it is fair to assume there is likely to be more data and metadata to process in future. Indeed, it could well be the case that the amount of data being generated will be far greater than our ability to store it.
For example, it would take around 13.6 billion of today’s largest 12 terabyte enterprise hard-disk drives to store the 163 zettabytes of data that might be created in a single year by 2025. To put that in perspective, the disk-drive industry has shipped less than four zettabytes of capacity over the past two decades. A single exabyte – equivalent to storing all the words ever spoken throughout history – is dwarfed by comparison.
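A quick back-of-the-envelope check on those storage figures, using decimal (SI) units; the exact drive count shifts with the capacity and unit conventions assumed:

```python
# Back-of-the-envelope check on the storage figures above, in decimal
# (SI) units: 1 zettabyte = 10^21 bytes = 1e9 terabytes. The precise
# drive count depends on the drive capacity assumed.
ZB_IN_TB = 1_000_000_000     # 10^21 / 10^12

data_zb = 163                # projected annual data creation by 2025
drive_tb = 12                # a large enterprise hard-disk drive

drives_needed = data_zb * ZB_IN_TB / drive_tb
print(f"{drives_needed / 1e9:.1f} billion drives")
```

At 12 TB per drive this works out to roughly 13.6 billion drives; the often-quoted figure of around 16 billion corresponds to the slightly smaller drives shipping when the 163-zettabyte projection was first published.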
For investors seeking to take advantage of the scale of change, one possibility might be to explore companies making the memory chips to store or process code, or controlling other parts of the physical infrastructure that enable the transfer and storage of data. Some have high barriers to entry and comparatively-durable business models. Pairing the companies that control telecom towers and data centres with those that deliver the componentry for digital devices themselves would be one way to combine businesses with different risk profiles and relatively uncorrelated returns, while still sharing a growth bias.
Traffic routed through cloud data centres – which enable the remote delivery of IT applications and other resources – is expected to grow rapidly.25 Given the expected volume growth in data moving between clouds, in copying content from one site to multiple data centres and streaming video, successful cloud operators could be strong cash generators. Conversely, the profitability of legacy hardware and traditional IT services companies may be challenged.
To detox or not to detox?
Renewed scrutiny of the way in which social networks operate, of the psychological tools used to increase engagement and of the methods of digital marketing has led to calls for users to think more carefully about how much time they spend online.
From more extreme calls, like VR pioneer Jaron Lanier’s ‘Ten Arguments for Deleting Your Social Media Accounts Right Now’, to more measured actions – the rules of engagement are under scrutiny.
So what do we know about the health impacts of using digital technology?
- Brain scans suggest internet usage does change the structure of the brain, according to research at UCLA’s Memory and Aging Research Center. The impact can be positive, enhancing the parts of the brain used for cognitive processing. Changes can be seen in as little as a week in first-time internet users aged 55-78.26
- However, there is evidence that young people whose attention is divided by using smartphones are less-effective learners.
- Having conversations without using mobile devices tends to result in higher levels of empathy.27
- Higher usage of social networking sites has also been associated with depression and greater mood swings.28 But the evidence seems mixed: a study of more than 120,000 UK adolescents in 2017 found no association between mental well-being and ‘moderate’ use of digital technology. There were measurable, ‘albeit small’ negative associations for people who had ‘high levels’ of engagement.29
- One in three internet users worldwide is a child; algorithmically-selected content raises concerns about responsibility and agency.30
1 'Tim Berners-Lee on the future of the web: 'The system is failing'’, The Guardian, November 2017
2 ‘Improving GDP: demolishing, repointing or extending?’, written by a team led by Jonathan Haskel, The Conference Board and Georgetown Center on Business and Public Policy, September 2017
3 The Princeton Web Transparency and Accountability Project, Arvind Narayanan and Dillon Reisman, May 2017
4 ‘Getting to know you’, The Economist, September 2014
5 ‘Internet surveillance, regulation, and chilling effects online: a comparative case study’, Jonathon W. Penney, Journal On Internet Regulation, Volume 6 Issue 2, May 2017
6 ‘Sam Amrani tracks you in Pret. And at Starbucks. And down the pub’, Wired, February 2018
7 ’A digital deal for the smart city: participation, protection, progress’, Jathan Sadowski, Awaiting publication in 2018
8 1948 UN Declaration of Human Rights, United Nations
9 ‘Facebook CEO Challenges the Social Norm of Privacy’, Reuters, January 2010
10 ‘With Big Data comes Big Responsibility’, Harvard Business Review, November 2014
11 ‘Why does privacy matter?’ The Atlantic, February 2013
12 ‘Zuckerberg: Federal regulation of Facebook 'inevitable'’, USA Today, 11 April 2018
13 ‘This search engine is profitable without tracking you online. And Google and Facebook could do it too’, Time.com, May 2016
14 ‘Facebook sales top estimates, fueled by ads; shares jump’, Bloomberg, April 2018
15 ‘Ad sales surge at Google parent Alphabet, but so do costs’, Reuters, April 2018
16 ‘The battle for power on the internet’, Bruce Schneier, October 2013
17 ‘Decentralizing privacy: using blockchain to protect personal data’, Guy Zyskind, Oz Nathan and Alex ’Sandy’ Pentland, 2015 IEEE CS Security and Privacy Workshops
18 ‘Enigma will refund ICO investors who lost $500,000 to scammers’, TechCrunch, August 2017
19 ‘“Tech Wreck”, “Techlash”, “Techmaggedon” – whatever you call it, Wall Street is terrified of it’, Seeking Alpha, March 2018
20 2017 Adblock report. PageFair. The use of adblockers grew 30 per cent globally in 2016, the last full year for which data is available. 11 per cent of the global internet population now blocks ads on the web.
21 ’A new posture for cybersecurity in a networked world’, McKinsey, March 2018
22 ‘Gartner says worldwide IoT security spending will reach $1.5 billion in 2018‘, Gartner, March 2018
23 Based on ETFMG Prime Cyber Security ETF (HACK) vs. NASDAQ, June 2018
24 ‘Self-driving cars will create 2 petabytes of data. What are the big data opportunities for the car industry?’, Datafloq, July 2017
25 Cisco Global Cloud Index: Forecast and Methodology, 2015–2020
26 ‘First-time Internet Users Find Boost In Brain Function After Just One Week’, Science Daily, October 2009
27 ‘The iPhone Effect: the quality of in-person social interactions in the presence of mobile devices’, Shalini Misra, Lulu Cheng, Jamie Genevie and Miao Yuan, Environment and Behavior 1–24, 2014
28 ‘Social networking sites, depression, and anxiety: a systematic review’, Elizabeth M. Seabrook, Margaret L. Kern, Nikki S. Rickard, JMIR Mental Health, 2016
29 ‘Smartphones are bad for some teens, not all’, Candace Odgers, Nature, February 2018
30 ‘Detecting depression and mental illness on social media: an integrative review’, Sharath Chandra Guntuku, David B. Yaden, Margaret L. Kern, Lyle H. Ungar and Johannes C. Eichstaedt, Current Opinion in Behavioural Science, 2017.