
WhatsApp and political instability in Brazil: targeted messages and political radicalisation


This paper is part of Data-driven elections, a special issue of Internet Policy Review guest-edited by Colin J. Bennett and David Lyon.

Introduction

After 21 years of military dictatorship, followed by a short period of political instability, the political scene in Brazil was dominated by two major parties that, between them, held the presidency until 2018. Both were moderate with a large membership base, had many representatives in Congress and received constant coverage in the legacy media as representatives of a more westernised democratic process. In the 2018 elections, however, the country elected as president a fringe congressman, Jair Bolsonaro, a member of a small party (PSL) with almost no registered supporters, who had been relatively unknown until some four years earlier, when he started to make appearances on popular and comic TV shows, on which he combined extremist rhetoric with praise for the military dictatorship. Bolsonaro's election surprised local and international politicians and intellectuals, in part because his campaign lacked a traditional political structure, but mainly because of his radical rhetoric, which frequently included misogynistic and racist statements that would be sufficient to shake the public image of any candidate anywhere in the world (Lafuente, 2018), but which were even more shocking in a country marked by social inequalities and racial diversity.

One of the hypotheses for Bolsonaro's electoral success is that his campaign and some supporters developed a specific communication strategy based on the intense use of social media, in which WhatsApp chat groups, micro-targeting and disinformation aimed at different groups of voters played a significant role. Albeit not always in a coordinated way, several platforms were used: YouTube, with videos of alt-right political analyses, lectures about "politically incorrect" history and amateur journalism; Facebook, with its pages and groups for distributing content and memes; and Twitter/Instagram, especially as sites for posting political and media content (the last three platforms mentioned were also widely used by the candidate himself to post messages and live videos on his official profiles). Davis and Straubhaar (2019) point out that "legacy media, popular right-wing Facebook groups, and networks formed around the communication network WhatsApp fueled 'antipetismo'" 1, stressing that WhatsApp was particularly instrumental in cementing Bolsonaro's victory. Addressing the emergence of what she calls "digital populism", Cesarino (2019) discusses the formation of a polarised digital bubble largely anchored in WhatsApp chat groups.

We focus our analysis on WhatsApp, examining the use of encrypted groups of up to 256 members organised around specific interests (religious, professional, regional, etc.). Smartphones and WhatsApp "were not as extensively available in Brazil during the previous [2014] presidential election" (Cesarino, 2019), and we aim to show that WhatsApp has technical specificities, and was subject to an electoral strategy, that justify particular attention. Several media reports stress the role played by WhatsApp in the 2018 elections. By analysing the Brazilian elections, our goal is also to contribute to the decoupling of "research from the US context" and help with the understanding of "macro, meso and micro level factors that affect the adoption and success of political micro-targeting (PMT) across different countries" (Bodó, Helberger, & de Vreese, 2017). The Global South in general has been associated with an increase in "computational propaganda" in recent years (Woolley & Howard, 2017; Bradshaw & Howard, 2018).

Because of its closed, encrypted architecture, which restricts visibility for researchers and public authorities, the relative anonymity afforded by the use of telephone numbers as the only identifiers available to group administrators, and the limited number of members in a group, which favours audience segmentation, WhatsApp is the platform that poses the greatest challenges for those investigating information dynamics during elections. It has also been shown that, because of these characteristics, WhatsApp played a crucial role in the spread of misinformation during the 2018 Brazilian elections (Tardáguila, Benevenuto, & Ortellado, 2018; Davis & Straubhaar, 2019; Junge, 2019) and in the development of a disinformation strategy that not only was on the edge of the law (Melo, 2019; Avelar, 2019; Magenta, Gragnani, & Souza, 2018) but also exploited features of the platform's architecture that help to render coordinated strategies invisible and favour group segmentation.

Although widely used in Brazil as a means of communication, WhatsApp has only recently been tackled as a research subject in elections. Moura and Michelson (2017) evaluated its use as a tool for mobilising voters, and Resende et al. (2019) conducted groundbreaking research on the dynamics of political groups on WhatsApp. However, little has been said about the interplay between a historical context and new platforms, media technologies and novel campaign strategies that rely on surveillance. In this sense, surveillance emerges as a new mode of power (Haggerty & Ericson, 2006) with a direct impact on the election process and implications for democracy (Bennett, 2015). This article challenges the idea that political micro-targeting (PMT) is elections as usual (Kreiss, 2017), showcasing its connection with disinformation practices and a process of political radicalisation in a specific empirical context, and stresses that PMT functions as part of a (mis)information ecosystem.

In this article, we discuss the Brazilian institutional, political and media context that paved the way for Jair Bolsonaro to become president in what was an atypical election result that surprised the vast majority of political analysts. We describe and discuss the use of a particular social media platform, WhatsApp, which instead of functioning as an instant messaging application was weaponised as social media during the elections. Based on an analysis of a sample of the most widely distributed images on this platform during the month immediately prior to the first round of the elections, in which Bolsonaro won 46.03% of the valid votes, we argue that messages were partially distributed using a centralised structure, built to manage and to stimulate members of discussion groups, which were treated as segmented audiences. Our ambition is to correctly address a specific and concrete use of data in an electoral campaign and to avoid any type of hype around PMT or data-driven strategies (Bodó et al., 2017; Baldwin-Philippi, 2017). In this case, platforms and data are not used as much to scientifically inform broad campaign strategies (Issenberg, 2012), but are more connected to disinformation/misinformation processes.

The Brazilian context and the rise of Bolsonaro as a political and media “myth”

Brazil is a federal republic with relatively independent states but considerable power centralised in the federal executive and legislatures. The regime is presidential, and an election is held every four years for president (who may run once for re-election), state governors, state congressmen and federal deputies and senators; federal deputies and senators represent their states in two legislative houses, the Chamber of Deputies and the Federal Senate. The president of the Chamber of Deputies is the second in line to the presidency (after the vice president, who is elected on the same slate as the president), and it is the responsibility of the legislature to investigate and, if necessary, try the president of the republic for “crimes of responsibility”, which can lead to impeachment and removal from office.

Voting is compulsory and electors do not need to be members of a party to vote. Failure to vote is punished with a small fine (approximately $2.00 USD). Abstentions are typically in the region of 20%; in the last elections the figure reached 20.3%, the highest in the last 20 years.

Federal and state elections are held every four years, and municipal elections occur in between. Candidates must be members of legally constituted political parties to stand for election. Elections for executive office are held in two rounds, and the two candidates with the most votes in the first round compete in a run-off unless one of them has 50% + 1 of the valid votes in the first round.

The political system is extremely fragmented (Nascimento, 2018), and there is frequent switching between parties, particularly between the smaller ones. In general, congressional representatives who belong to these smaller parties are known as the "lower clergy" and form a bloc characterised by clientelism (Hunter & Power, 2005) and cronyism. Armijo & Rhodes (2017) argue that "cronyism may tip into outright corruption, or direct payments by private actors for preferential treatment by state officials", pointing out that Brazilian elections are very expensive and were, at least until recently, heavily funded by the private sector. Parties form coalitions to take part in the elections, and the seats in the Chamber of Deputies are distributed according to the number of votes the coalitions or parties receive and are then allocated within the coalitions according to the number of votes each candidate receives. In his almost 30 consecutive years as a federal deputy, Jair Bolsonaro was known as a member of the "lower clergy" and was affiliated with no fewer than nine different political parties. A former member of the armed forces, Bolsonaro, once elected congressman, drew his votes mainly from efforts that benefited that sector (Foley, 2019), from his criticism of human rights and from his position in favour of harsher criminal laws and more vigorous police action.

Brazilian elections historically were financed with a combination of public funds and limited private donations from individuals and companies 2. Public funds are shared between the parties mainly according to the number of seats they have in Congress. Parties are also entitled to television and radio airtime, for which broadcasting companies receive tax exemptions. Political advertisements paid for by parties are prohibited.

Radio and TV airtime was traditionally considered one of the most important factors in a presidential election. However, in the last election, the candidate with the most time in these media ended up in fourth position (Geraldo Alckmin, PSDB, with 44.3% of the airtime), while Bolsonaro, who had little more than 1% of the total airtime, won the first round. Fernando Haddad (PT), who came second, had 19.1% of the airtime (Ramalho, 2018; Machado, 2018).

The Brazilian broadcasting system is concentrated in the hands of a few family groups and, more recently, an evangelical minister from a non-denominational church (Davis & Straubhaar, 2019). These groups own TV and radio networks, newspapers and websites around the country. The editorial line is economically conservative, although some of the companies (e.g., Rede Globo) take a more liberal attitude on social mores (Joyce, 2013).

An appreciation of the Brazilian political scene is also important to understand Bolsonaro's rise. From the middle of Luiz Inácio Lula da Silva's second term in office, when the first accusations of corruption involving members of the government appeared, much of the domestic press became more critical in its tone and adopted a more aggressive posture clearly aligned with the opposition. Even when the government was at the height of its popularity, during Lula's second mandate, political commentators who were either clearly anti-Workers' Party (PT) or more scathing in their criticism tended to be in the majority in much of the media (Carvalho, 2016).

This change generally helped to create quite a negative feeling toward the PT, primarily among the middle classes, the main consumers of this political journalism. This feeling was to intensify after Dilma Rousseff was re-elected by a small margin, with voters clearly divided by class and region. Rousseff's votes came mainly from the lower income classes and from the north-east of the country (Vale, 2015). Figure 1, from Vale (2015), shows the distribution of votes per state in the second round of the 2014 presidential election (% of the vote).

Figure 1: Distribution of votes in the 2014 Brazilian presidential election (per state) (Vale, 2015).

At the same time, the consequences of the 2008 global crisis began to be felt across the country. Until Rousseff's first mandate, the country had managed to balance the public accounts. However, after the government boosted internal consumption by granting tax exemptions to certain industrial and infrastructure sectors, by the end of her second term public debt had surged and become the target of strong criticism by financial commentators. At this point the PT lost most of the support it might have had among the middle classes and in the industrialised south and south-east. Later, when the second major political scandal under PT rule broke (the discovery of a corruption scheme involving mainly allies of the PT, but also marked by the participation and/or connivance of the PT itself), a political process was set in motion that included huge street demonstrations, extensively covered and encouraged by the media. The government began to lose its support in Congress, support that had never been very ideologically solid given the strength of the "lower clergy". The end result was that Rousseff was impeached for a "crime of responsibility" on controversial grounds.

The political climate that ensued, however, did not help to restore peace. Accusations of corruption were levelled against a wide range of parties, including those that had actively supported impeachment, such as the PT's long-standing adversary, the Brazilian Social Democracy Party (PSDB). Nowadays a centre-right party, the PSDB had held the presidency from 1995 to 2002 and had faced the PT in run-offs in every presidential election since then. Historically a centre-left party, the PSDB had moved toward neoliberalism, although its membership later included more conservative candidates, some in favour of more vigorous police action and tougher laws, as well as a morally conservative agenda. This slow ideological transformation was initially welcomed by part of the electorate, particularly middle and upper-class voters, but later proved insufficient to ensure the party's victory.

Following Rousseff’s impeachment, extensive reporting of the widespread involvement of political parties in the Petrobras scandal appears to have helped produce a profound aversion to politicians as a whole. The Petrobras scandal was revealed in 2014 by Operation Car Wash, but the investigations lasted for another five years and attracted intense media attention throughout that period. Investigators revealed that a “cartel of construction companies had bribed a small number of politically appointed directors of Petrobras in order to secure a virtual monopoly of oil-related contracts” (Boito & Saad-Filho, 2016).

In parallel with this, Jair Bolsonaro became increasingly well known among a large swathe of the public. As previously mentioned, although he had been in politics for almost 30 years, Bolsonaro remained a minor figure until 2011, known only in his state, Rio de Janeiro, and popular only among the military (the armed forces as well as police and firefighters, who are members of the military in Brazil) and a small niche of voters in favour of repressive police action. In 2011, however, Bolsonaro took part in CQC (the acronym for "Whatever It Takes" in Portuguese), a TV programme that combines humour and political news, and answered questions sent in by members of the public. At the time, one of the presenters described him as "polemical" and another said he had not the slightest idea who that congressman was. The objective appeared to be to ridicule him, and the programme featured humorous sound effects. The then congressman's answers ranged from homophobic to racist, and he praised the period of military dictatorship in Brazil. The interview sparked controversy, and Bolsonaro was invited to appear on the programme again the following week. He became a regular attraction on CQC and other programmes that also exploit the impact of the politically incorrect.

The congressman was gradually given more airtime on CQC and taken more seriously, while at the same time his legitimacy increased because of his popularity with the audience. A few months before Bolsonaro was elected, a former CQC reporter recalled with regret the visibility the programme had given the congressman. She said they were "clueless that a good chunk of the population would identify with him, with such a vile human being", admitting they "played a part" in the process ("A gente, infelizmente, contribuiu", 2018).

In addition to CQC, other free-to-air TV programmes gave the then federal deputy airtime. In a recent study, Santos (2019) shows how, from 2010 onwards, Bolsonaro made an increasing number of appearances on the free-to-air TV programmes with the largest audiences. CQC was particularly important in bringing Bolsonaro closer to a younger audience, together with the programme Pânico na Band, which also trades on politically incorrect humour and in 2017 created a special segment on the congressman consisting of 33 nine-minute episodes.

In social media, Bolsonaro and his opinions became the subject of increasing debate and were vigorously rejected by the left but adopted as a symbol of the politically incorrect by sectors of the non-traditional right. The figure of a talkative, indiscreet congressman became a symbol for a subculture of youth groups, similar to those who frequent websites such as 4chan (Nagle, 2017), for whom he became known as "the Myth", a mixture of iconoclasm, playful humour and conservatism. This connection with the politically incorrect was exploited by the administrators of the congressman's Facebook page from the time it was set up in June 2013 (Carlo & Kamradt, 2018).

WhatsApp and the spread of misinformation / disinformation

The political year of the last presidential election, 2018, was unusually troubled. Involved in various court cases, the then leader in the polls, former president Lula, was arrested in early April. The party was adamant that he should run for election even though there was little chance of his being allowed to do so by the legal authorities. Without Lula, Bolsonaro was ahead in the polls from the start — with a lead that varied but was always at least 20% — although he was considered a passing phenomenon by most analysts and was expected to disappear as soon as the TV and radio campaigns started because he had very little airtime compared with the candidates for the traditional parties.

Another important event that helps describe the scenario in which WhatsApp came to play a significant role in politics is the strike/lockout organised by truck drivers in May 2018. Dissatisfied with the almost daily variations in fuel prices, which had begun to be adjusted in line with the US dollar and the price of oil, self-employed truck drivers and logistics companies started a protest that ended up bringing the country to a prolonged and economically damaging standstill (Demori & Locatelli, 2018). Using mainly WhatsApp, drivers organised roadblocks on the main highways, causing severe disruption to the supply of goods, including fuel and food (Rossi, 2018). Radical right-wing activists infiltrated these online groups, praising militarism as a solution for the country and sometimes clamouring for military intervention to depose the executive, legislature and judiciary.

Interested in understanding how fake news spread through political groups on WhatsApp, Resende et al. (2019) collected data from two periods of intense activity in Brazil: the truck drivers' strike and the month prior to the first round of the elections. They sampled chat groups that did not necessarily have any association with a particular candidate, were open to the public (although the administrators could choose who was accepted or removed), had invitation links shared on websites or social networks, and could be found through URLs of the form chat.whatsapp.com. Groups were selected by matching those invitations against a dictionary containing the names of politicians and political parties, as well as words associated with political extremism. In all, they analysed 141 groups during the truck drivers' strike and 364 during the elections. The results show that 2% of the messages shared in these groups during the elections were audio messages, 9% were videos, 9% contained URLs to other sites and 15% were images. The other 65% were text messages with no additional multimedia content.
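
As an illustration of this kind of selection step, the sketch below filters publicly shared invite links by matching the advertised group name and surrounding text against a small political dictionary. It is a minimal sketch built on our own assumptions: the dictionary contents, the ScrapedGroup structure and the function names are ours, not Resende et al.'s (2019).

```python
import re
from dataclasses import dataclass

# Illustrative dictionary only; Resende et al. (2019) describe a larger one with
# politicians' names, party names and extremism-related terms.
POLITICAL_TERMS = ["bolsonaro", "haddad", "lula", "psl", "pt", "eleições", "intervenção militar"]

INVITE_RE = re.compile(r"https://chat\.whatsapp\.com/\S+")

@dataclass
class ScrapedGroup:
    name: str      # group title as advertised alongside the invite link
    context: str   # text surrounding the shared link on the page
    url: str       # the invite URL itself

def mentions_political_term(text: str) -> bool:
    """Whole-word match of any dictionary term in the given text."""
    lowered = text.lower()
    return any(re.search(rf"\b{re.escape(term)}\b", lowered) for term in POLITICAL_TERMS)

def select_political_groups(groups: list[ScrapedGroup]) -> list[ScrapedGroup]:
    """Keep groups with a valid invite link and a politically themed description."""
    return [g for g in groups
            if INVITE_RE.fullmatch(g.url) and mentions_political_term(f"{g.name} {g.context}")]
```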

Resende et al. (2019) also developed an automated method for determining whether the images shared in the analysed groups had already been reviewed and rejected by fact-checking services. To the images flagged in this way they added 15 more that had previously been identified as misinformation by the Brazilian fact-checking agency Lupa. The resulting 85 images containing misinformation were shared eight times more often than the other 69,590 images, which were either truthful or had not been submitted to any independent checking agency.
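
Resende et al. (2019) do not spell out their matching procedure in the passage above; one common way to check whether a newly shared image matches a previously fact-checked one, sketched below purely as an assumption, is to compare perceptual hashes, which are robust to re-compression and small edits. The threshold and helper names are ours.

```python
# Hedged sketch: flag a newly shared image when it is perceptually close to an
# image already debunked by fact-checkers. This is one plausible way to do the
# matching; it is an assumption on our part, not Resende et al.'s documented method.
from pathlib import Path
from PIL import Image
import imagehash

def build_reference_hashes(checked_dir: str):
    """Perceptually hash every image previously flagged by fact-checkers."""
    return [(imagehash.phash(Image.open(p)), p.name) for p in Path(checked_dir).glob("*.jpg")]

def matches_known_misinformation(image_path: str, reference, max_distance: int = 8) -> bool:
    """True if the image is within a small Hamming distance of a flagged image."""
    h = imagehash.phash(Image.open(image_path))
    return any(h - ref_hash <= max_distance for ref_hash, _name in reference)
```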

Although the total number of images labelled as misinformation is relatively low (only 1% of all images shared), these images were seen in 44% of the groups monitored during the election campaign period, which indicates a broad reach. Investigating these images further, the researchers identified the groups in which each image appeared first and remarked that a small number of groups seemed to account for the dissemination of a large share of the images containing misinformation. In our view, this indicates a more centralised and less distributed dissemination structure.

Another fact revealing a dynamic of relatively centralised dissemination is that the propagation "behaviour" of images containing disinformation (images deliberately produced and/or tampered with) is significantly different from that of unchecked images. Comparing the propagation structure of these two sets, particularly the order in which the images appeared on the web and on WhatsApp, the authors noticed that 95% of the images with unchecked content were posted first on the web and then in the monitored WhatsApp groups. Only 3% of these images took the opposite route, and 2% appeared on both the web and WhatsApp on the same day. In contrast, only 45% of the images with misinformation appeared first on the web, 35% were posted first on WhatsApp and 20% were shared on both platforms on the same day. According to the authors, this suggests "that WhatsApp acted as a source of images with misinformation during the election campaign period" (Resende et al., 2019, p. 9). Considering that an image with disinformation is deliberately produced and tampered with, the fact that WhatsApp is its first source of sharing far more often than for images with unchecked content (35% versus 3%) is one more element indicating a relatively centralised and not fully spontaneous organisation of the propagation of this type of content.
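
The comparison described above amounts to a timestamp check per image: given the first time an image was seen on the open web and the first time it was seen in the monitored groups, each image is labelled web-first, WhatsApp-first or same-day, and the shares are compared across the two image sets. The sketch below illustrates that logic with hypothetical field names; it is not the authors' code.

```python
# Illustrative sketch with hypothetical field names: label each image by where it
# was first observed, then compute the shares of web-first, WhatsApp-first and
# same-day appearances for a set of images, mirroring the comparison reported above.
from collections import Counter
from datetime import date

def first_seen_label(web_first: date, whatsapp_first: date) -> str:
    if web_first < whatsapp_first:
        return "web_first"
    if whatsapp_first < web_first:
        return "whatsapp_first"
    return "same_day"

def direction_shares(images):
    """images: iterable of (web_first, whatsapp_first) date pairs for one image set."""
    counts = Counter(first_seen_label(w, z) for w, z in images)
    total = sum(counts.values())
    return {label: counts[label] / total
            for label in ("web_first", "whatsapp_first", "same_day")}
```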

Disinformation in WhatsApp groups

As to their content, the disinformation images reproduce many of the elements that were key in the rise of Bolsonaro and, later, in his election campaign. In this section we analyse the eight most widely shared disinformation images in the month before the first round, using the same groups monitored by Resende et al. (2019) as our source. Our analysis is based on investigative work by Agência Lupa and Revista Época, in partnership with the research project "Eleições sem Fake" (Resende et al., 2019; Marés, Becker, & Resende, 2018). The news piece points out that none of the eight images analysed mentions the presidential candidates directly. All of them refer to topics that reinforce beliefs, perspectives and feelings that shaped the ideological base of Jair Bolsonaro's campaign. Anti-PT-ism, strongly boosted by the legacy media over the last few years, was one of the pillars of Bolsonaro's campaign, and it is the content of the most widely shared disinformation image in the monitored groups in the month before the first round. As Figure 2 shows, it is a photo-montage that inserts a photo of a young Dilma Rousseff, the future president of Brazil, beside the former president of Cuba, Fidel Castro.

Figure 2: First most shared disinformation image on WhatsApp.

The original photo of Fidel Castro was taken by John Duprey for the NY Daily News in 1959, when Dilma Rousseff, the future president, was only 11 years old. It is therefore clearly a tampered image intended to associate the PT with communism and Castroism. Such an association was recurrent among Bolsonaro supporters during his campaign, and antipetismo appears directly in three of the eight most widely shared misinformation images over the period under analysis.

Another image with clearly anti-PT content (the fourth most widely shared in the monitored groups) is an alleged criminal record of ex-president Dilma Rousseff from the time of the military dictatorship, in which she is accused of being a terrorist and bank robber (Figure 3). This record was never issued by any official agency of the military government, and it has the same format as the third most widely shared image, which this time shows José Serra, currently a senator for the PSDB party 3.

Figure 3: Fourth most shared disinformation image on WhatsApp.

Lastly, the third image with directly anti-PT content (the eighth most widely shared in the monitored WhatsApp groups) reproduces a graph with false information comparing household consumption over the then last five years of PT government with the expenditure of the government itself (Figure 4). Contrary to what the graph shows, household consumption did not decrease; it grew 1.8% between 2011 and 2016, whereas public administration expenditure rose 3.1% over the period, and not 4% as the graph indicates.

Figure 4: Eighth most shared disinformation image on WhatsApp.

The second most frequent topic in the most widely shared misinformation images is attacks on the rights of LGBT people and women, which appear in three of the eight most shared images. This kind of content, although not directly antipetismo, denies rights that Bolsonaro's campaign symbolically associated with leftist parties.

The fifth and sixth most shared images link these rights to the sexualisation of children and to disrespect for religious beliefs, as shown in Figures 5 and 6, respectively. Moreover, in the context of their sharing via WhatsApp, such images were associated with Rede Globo, the largest commercial free-to-air television network in the country. In Figure 5, the caption of the image (which is in fact a photo of the Heritage Pride March in New York in 2015) reads: "People from Globo-Trash who do not support Bolsonaro!!!". In Figure 6, the image is shown with the sentence "Globo today". However, the image is a record of the Burning Man festival held in the Nevada desert, in the US, in 2011. It shows a man dressed as Jesus kissing Benjamin Rexroad, director of the "Corpus Christi" production. This image was published in the newspaper O Globo at the time of the festival, not before the first round of the 2018 elections. Linking these images to O Globo is part of a campaign to discredit the television network belonging to the same group, Rede Globo. The strategy of Bolsonaro supporters seems to be to legitimise the WhatsApp groups as more reliable sources of information than the legacy media. The Globo network, historically linked to conservative economic and political interests in the country, was here associated with the propagation of hyper-sexualised and anti-Christian content.

The other image in this thematic group attacking LGBT and women's rights (Figure 7) is a montage of photos of different protests in churches. In one of them, a couple has sex inside a church in Oslo, Norway, in 2011. In the other, a woman defecates on the stairway of a church in Buenos Aires, Argentina, an episode that occurred when Mauricio Macri won the 2015 presidential election. The caption of the false image reads "Feminists invade church, defecate and have sex", and is in clear opposition to the #EleNão movement. #EleNão (#NotHim) was a movement led by women that denounced Bolsonaro as a misogynist, bringing thousands of people onto Brazilian streets on the eve of the first round of the elections (Uchoa, 2018).

Figure 5: Fifth most shared disinformation image on WhatsApp.
Figure 6: Sixth most shared disinformation image on WhatsApp.
Figure 7: Seventh most shared disinformation image on WhatsApp.

Use of illegal tactics

In the second round of the elections, the newspaper Folha de S.Paulo discovered that businessmen had signed contracts of up to US$3 million each with marketing agencies specialised in automated bulk messaging on WhatsApp (Melo, 2018). The best known of the accused businessmen owns a retail company, which suggests that the marketing methods used in his business could also be applied to politics. The practice is illegal, as donations from companies were forbidden in the last election. Furthermore, the businessmen were alleged to have acquired user lists sold by agencies as well as lists supplied by candidates; this practice is also illegal, as only the candidates' own user lists may be used. According to a report by Coding Rights (2018), based on interviews with and document analyses of marketing agencies operating in Brazil, election campaigns in general combine a series of databases, associating public information, such as the Census, with data purchased from private companies such as Serasa Experian and Vivo (a telecom company owned by Telefónica). These databases include demographic data and telephone numbers (including the respective WhatsApp contacts).

According to Folha de S.Paulo's article, the marketing agencies are able to generate international numbers, which employees use to get around restrictions on spam and to administer or participate in groups. Off-the-record statements obtained by the newspaper from former employees and clients of the agencies reveal that the administrators used algorithms that segmented group members into supporters, detractors and neutral members, and tailored the content sent accordingly. The most active Bolsonaro supporters were also allegedly contacted so that they could create even more groups in favour of the candidate (Melo, 2018). In another article, the newspaper noted that some of these groups behaved like a military organisation and referred to themselves as a "virtual army" in favour of the candidate. According to the newspaper, the groups are divided into "brigades", "commands" and "battalions" and are formed mainly of young people, some under 18 years of age (R. Valente, 2018).

Campaign spending was directed to several forms of digital advertising. Besides being less expensive, digital advertising can be an alternative to limited TV time, particularly for small parties (Coding Rights, 2018). Digital campaigns included WhatsApp mainly because it is a platform with deep penetration among the population, owing among other things to zero-rating practices. Zero-rated data is data that does not count toward the user's data cap (Galpaya, 2017). Telecom operators commonly offer free use of WhatsApp on pre-paid plans, which are the plans most commonly taken up by the lower classes. Even if users have no credit left for accessing the internet, they can keep sending and receiving text and media content in their WhatsApp chat groups and from individual contacts. Luca Belli describes the situation accurately: "fact-checking is too expensive for the average Brazilian" (2018).

Rules approved for the Brazilian 2018 elections permitted candidates to buy advertisements on social media and to spread content using message platforms. However, WhatsApp does not offer micro-segmentation as a service, which would allow advertisements to be directed to a certain audience, like Facebook does. Marketing agencies ended up playing that role, not always using legally collected information on voters.

Audi & Dias (2018) reported that agencies in the business of political advertising use software that monitors different interest groups, not only those devoted to political discussion, to prospect for voters and supporters. Users are measured in terms of their mood and receptivity towards campaign messages. In this way, these agencies manage to identify the ideal target population and the right time to send each type of content. According to the article, "messages that reach a diabetic 60-year old man from São Paulo State are different from those sent to a north-eastern woman who lives on minimum wage". Audi & Dias (2018) had access to one of the software programmes used during the last elections in Brazil, WNL, in a version used for the campaign of an unidentified politician. The programme monitored and distributed content in over 100 WhatsApp groups, ranging from diabetes discussion groups and soccer team supporters to Uber drivers, job-vacancy boards and even groups of workmates and neighbours.

Such segmentation was refined by monitoring reactions to the content posted, rated as positive, negative or neutral. Users rated as positive keep receiving similar information in favour of the candidate. Those rated as neutral mostly receive material against the opponent. Users rated as negative receive a more incisive treatment, getting content that tends to "target values dear to the person, such as family and religion, in an attempt to inflate rejection towards the candidate's competitor". By monitoring these reactions, users are recorded in individual files and then classified into groups according to specific topics, such as church, the "gay kit", family, communism, weapons, privatisation, etc. Moreover, the software enables those monitoring the groups to collect and select keywords in order to discover specific interests: "For example, a patient with cancer speaks about his/her condition and treatment. The system collects these data and finds other people in similar conditions. They start to get content with the policies of the candidate for health, for example" (Audi & Dias, 2018).
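
The routing logic reported by Audi & Dias (2018) can be summarised as a simple mapping from a user's observed reaction to the type of content sent next. The sketch below is only an illustration of that description; the segment labels, thresholds and function names are ours, and the actual WNL software is not publicly documented.

```python
# Illustrative sketch of the reaction-based routing reported by Audi & Dias (2018).
# Segment names, thresholds and function names are ours; WNL itself is not public.
from enum import Enum

class Reaction(Enum):
    POSITIVE = "positive"   # engages favourably with the campaign's content
    NEUTRAL = "neutral"     # shows no clear alignment
    NEGATIVE = "negative"   # rejects the campaign's content

CONTENT_PLAN = {
    Reaction.POSITIVE: "pro_candidate",   # keep reinforcing existing support
    Reaction.NEUTRAL: "anti_opponent",    # mostly material against the opponent
    Reaction.NEGATIVE: "values_wedge",    # content targeting family and religion themes
}

def classify_reaction(score: float) -> Reaction:
    """Map an observed reaction score (e.g., derived from replies and emoji) to a segment."""
    if score > 0.2:
        return Reaction.POSITIVE
    if score < -0.2:
        return Reaction.NEGATIVE
    return Reaction.NEUTRAL

def next_content_type(score: float) -> str:
    """Decide which kind of content a user receives next, given their latest reaction score."""
    return CONTENT_PLAN[classify_reaction(score)]
```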

It should be noted that this micro-segmentation and micro-targeting are integral to the way advertisements on platforms work. Facebook, for instance, announced special transparency policies for political ads during the election period (J. Valente, 2018). However, because of WhatsApp's nature and architecture, the visibility of content-spreading strategies on such a platform is minimal, which prevents users from realising that they are the target of persuasion strategies. We will return to this issue in the conclusion, but it must be pointed out that using this platform for election campaigns is structurally questionable. If we consider only the methods used by the company AM4, which openly worked for Jair Bolsonaro's campaign spreading content to 1,500 WhatsApp groups, there are already reasons for concern, since such content is not explicitly distributed as part of an election advertising campaign. Rather, it is distributed as content shared by ordinary users in groups with a supposedly symmetrical interaction structure. According to a statement by the company's founding partner: "what we do is curatorship of content created by supporters" (Soares, 2018). The company owner also stated that the stream of content that fed 1,500 WhatsApp groups on a daily basis was part of the strategy of the company, hired by the PSL and operating since the pre-campaign, to turn negative episodes in favour of Bolsonaro's campaign.

WhatsApp group dynamics

Before developing a formal research interest in the use of WhatsApp during the elections, we began participant observation of political chat groups on WhatsApp. Initially, we were interested in understanding how the app was being used by truck drivers to organise their protests. Later, we noticed that many of the groups we found were also occupied by radicals in favour of a return to military dictatorship and by supporters of Jair Bolsonaro. This helped us to understand the dynamics of those groups and how some of the more prominent members acted to manage the discussions or the posting of content.

Various groups were short-lived and rapidly replaced by other groups advertised in the "dying" ones. From some weeks before election day until the vote itself, we followed the discussions in at least three groups on WhatsApp and two on Telegram. Some of the groups were occasionally invaded by users who posted pornographic content or advertised illegal services, such as pirate IPTV, false diplomas, cloned credit cards and counterfeit money. In one case observed during fieldwork, a Telegram group became a pirate IPTV sales channel as soon as the elections were over.

Once the elections were over, we observed that new groups were set up with a new mission: to act as direct communication channels between supporters of the new president. One of these went by the name of Bolsonews TV. There is little discussion in these groups, and only a few members are responsible for almost all the content sent to the groups or forwarded from other groups. A frequently repeated claim is that one should not believe the legacy media because it is controlled by communists and left-wing individuals; according to the people who send these messages, only certain YouTubers, journalists, bloggers and politicians can be trusted. Before the elections, material from the legacy media that was highly critical of the PT was frequently shared, particularly if it was produced by commentators considered to be right wing. After the elections, when the criticism was aimed more at the new government, this type of content became less common, even when it came from commentators considered to be right wing. Any critical comment in groups clearly identified with Bolsonaro led to the author being removed from the group and accused of being a PT supporter who had infiltrated it. The telephone numbers of people accused of supporting the PT circulate regularly, and group moderators are warned to exclude these people from their groups.

Analysing the flow of messages between political groups during the elections, Resende et al. (2019) identified characteristics that indicated the presence of clusters of groups with members in common. They constructed a graph of the relationships between members, which revealed a network of users connected because they shared images in at least one common group. “We note a large number of users blending together connecting to each other inside those groups. Most users indeed form a single cluster, connecting mostly to other members of the same community. On the other hand, there are also a few users who serve as bridges between two or more groups linked by multiple users at the same time. Furthermore, a few users work as big central hubs, connecting multiple groups simultaneously. Lastly, some groups have a lot of users in common, causing these groups to be strongly inter-connected, making it even difficult to distinguish them” (2019, p. 6). This suggests that WhatsApp is working not so much as an instant messaging app but as a social network, like Twitter and Facebook. Other evidence, presented above, allows us to conclude that these groups may be centrally managed, although this is invisible to the ordinary user.
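
A structure of this kind can be reconstructed from message logs as a user-to-user graph in which two users are linked when they appear in at least one common group, with hubs and bridges then read off standard centrality measures. The sketch below, using networkx, illustrates the idea; it is our own reconstruction, not the code used by Resende et al. (2019).

```python
# Sketch of the group co-membership graph described by Resende et al. (2019):
# users are nodes, and an edge links two users who appear together in at least
# one group. This is our own illustrative reconstruction, not the authors' code.
from itertools import combinations
import networkx as nx

def build_co_membership_graph(group_members: dict) -> nx.Graph:
    """group_members maps a group id to the set of user ids observed in it."""
    g = nx.Graph()
    for members in group_members.values():
        for u, v in combinations(sorted(members), 2):
            g.add_edge(u, v)
    return g

def likely_hubs(g: nx.Graph, top_n: int = 10):
    """Users bridging many groups tend to score highest on betweenness centrality."""
    centrality = nx.betweenness_centrality(g)
    return sorted(centrality, key=centrality.get, reverse=True)[:top_n]
```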

Conclusion

Commenting on the use of micro-targeting in campaigns, Kreiss points out that it is “likely most effective in the short run when campaigns use them to mobilize identified supporters or partisans” (2017). This seems to be what happened in Brazil in the 2018 elections, in which a candidate was able to tap into a conservative sentiment and harness it against the progressive field.

Even though it is not possible to fully confirm the hypothesis that WhatsApp was used as an effective tool to direct messages to micro-segmented voters, we have shown that Jair Bolsonaro's campaign used the app to deliver messages (and disinformation) that exacerbated political feelings already present in the political debate of the legacy media, notably antipetismo (Davis & Straubhaar, 2019), and added to them much more conservative elements in the moral field (anti-feminism and anti-LGBT positions) that brought back topics from the times of the military dictatorship (anti-communism). Beyond its effects on the left, the radicalisation promoted by Bolsonaro's campaign was able to neutralise any other candidate in the centre, or even on the centre-right, by associating them with the establishment and with the notion of a corrupt political system. In the symbolic assemblage (Norton, 2017) that was formed, the elected candidate ended up representing the most effective answer against the political system, although many voted for him for different reasons. In a similar fashion to Trump, Bolsonaro "ran an insurgent campaign that called for shaking up the system" (Kivisto, 2019, p. 212).

There is ample evidence that the WhatsApp chat group feature was weaponised by Bolsonaro supporters. Although WhatsApp does not provide a micro-targeting service, there is evidence that third-party companies dedicated to non-political marketing campaigns provided that kind of service in the context of the elections, sometimes using illegal databases. There are reports that Haddad's campaign also used WhatsApp to deliver messages to voters (Rebello, Costa, & Prazeres, 2018). However, as the sample collected by Resende et al. (2019) suggests, there is no evidence that the left coalition employed the same tactics as Bolsonaro's campaign in secretly managing multiple WhatsApp chat groups.

Among the many problems involved in the use of a platform like WhatsApp in an election campaign, we would like to point out one in particular: the invisibility of the actors who produce, monitor, distribute and/or direct the content viewed and/or shared by most users. Once the platform is appropriated for election campaigns and micro-targeting, its current architecture does not allow users to notice or become aware that they are being monitored and managed. Writing on voter surveillance in Western societies, Bennett reminds us that “surveillance has particular, and somewhat different, effects depending on whether we are consumers, employees, immigrants, suspects, students, patients or any number of other actors” (2015, p. 370). The use of WhatsApp in the Brazilian elections shows how a surveillant structure was built on top of a group messaging service that allegedly uses cryptography to protect its users' privacy.

Resende et al. (2019) characterised a network structure of the monitored WhatsApp groups that evidences coordinated activity by some members. There are no clear means for regular WhatsApp chat group members to notice whether they are being monitored or laterally surveilled by other group members, or even by second-hand observers outside the groups. Studies on the perception and experience of Facebook users show that when they notice that a post is sponsored they tend to be less persuaded than when exposed to a regular post by a friend or acquaintance (Kruikemeier et al., 2016). But unlike Facebook, where users can have a huge number of connections, many of which may not be close at all, most contacts of WhatsApp users belong to a personal circle, which establishes a relationship of trust with the content received. Writing on family WhatsApp chat groups in Brazil, Benjamin Junge classifies them as both a “public” of sorts, an “open space for the sharing of information and opinions” (2019, pp. 13-14), and closed in the strict sense, because they are accessible only to family members. Although this trust-based relation may be transformed when the user is a member of larger groups, the experience of proximity and connection with the members of a group is greater than, for instance, among Facebook friends and Twitter followers. WhatsApp favours a stronger relationship of trust between group members and the content shared, which makes it a field more susceptible to the spread of misinformation. Cesarino (2019a) posits “trust” as one of the affordances of the WhatsApp platform, affirming that most of the political content forwarded to its users during the 2018 election was pushed by friends or family members.

Possible asymmetries of information, persuasion tactics and/or influence strategies within chat groups are hard to detect. In countries like Brazil, this condition is reinforced by the fact that many users cannot reach beyond the platform to check the information shared, something that might provide context or additional information about the content circulating there. Under the zero-rating plans offered by telecom companies, users face tariffs they cannot afford if they seek other sources of information. This perceptual confinement is particularly worrying in a context of wide dissemination of disinformation, as happened during the 2018 election period, since most users are not only unaware of the authorship of the content that reaches them but also unable to reasonably check and verify it. The "near-sighted" environment of WhatsApp (a more precise analogy would be the loss of peripheral vision) is also favoured by its one-to-one communication structure, which prevents lateral, transversal or common visibility within the platform. The lack of a common field of visibility would not be a problem if WhatsApp were restricted to its stated or projected technical functionality, that of an instant messenger. However, when the tool begins to function as a typical social network, as shown by Resende et al. (2019), and starts to be massively appropriated for political campaigns, it becomes critical to have more symmetrical relationships of visibility, as well as the possibility of building a common visible field that can be debated, examined and audited.

At least since the 2014 elections, and especially after the contested impeachment of President Dilma Rousseff (PT), Brazil has been living through a period of political and institutional instability. Recently leaked messages exchanged by prosecutors and judges involved in the investigation of corruption scandals help to draw a picture of a justice system contaminated by political goals (Fishman et al., 2019). That struggle certainly played a role in the failure of electoral legislation to curb the illegal use of WhatsApp in the 2018 elections. We have described in this article many of the illegalities that surrounded the electoral process. In 2019, the Brazilian Congress approved a data protection law that is in many respects compliant with the EU’s General Data Protection Regulation (GDPR) and that can help to strengthen the fairness of future elections (if and when the country restores its political and institutional normalcy).

However, as we hope to have shown here, there is a complex dynamic between the legacy media and what is created and shared by political actors and supporters. Much of the disinformation content we have analysed was produced against the background of a radicalisation trend already noticeable in the legacy media. The fact that the means of communication in Brazil are highly concentrated in the hands of a few groups and lack political diversity certainly played an important role in paving the way for political radicalisation. Zero-rating policies that fuel the popularity of one specific platform (WhatsApp) and prevent users from accessing a fully functioning internet are an obvious practical impediment for voters who might otherwise learn to adequately research and check the news stories they receive.

References

“A gente, infelizmente, contribuiu”, diz Monica Iozzi sobre popularidade de Bolsonaro - Emais ["Unfortunately, we contributed," says Monica Iozzi about Bolsonaro's popularity]. (2018, April 4). O Estado de S.Paulo. Retrieved April 1, 2019 from https://emais.estadao.com.br/noticias/gente,a-gente-infelizmente-contribuiu-diz-monica-iozzi-sobre-popularidade-de-bolsonaro,70002254686

Armijo, L. E., & Rhodes, S. D. (2017). Explaining infrastructure underperformance in Brazil: Cash, political institutions, corruption, and policy Gestalts. Policy Studies, 38(3), 231–247. https://doi.org/10.1080/01442872.2017.1290227

Audi, A., & Dias, T. (2018, October 22). VÍDEO: Seu número de telefone vale 9 centavos no zap dos políticos [VIDEO: Your phone number is worth 9 cents at the politicians' zap]. The Intercept. Retrieved July 15, 2019, from https://theintercept.com/2018/10/22/whatsapp-politicos/

Avelar, D. (2019, October 30). WhatsApp fake news during Brazil election ‘favoured Bolsonaro’. The Guardian. Retrieved from https://www.theguardian.com/world/2019/oct/30/whatsapp-fake-news-brazil-election-favoured-jair-bolsonaro-analysis-suggests

Baldwin-Philippi, J. (2017). The Myths of Data-Driven Campaigning. Political Communication, 34(4), 627–633. https://doi.org/10.1080/10584609.2017.1372999

Belli, L. (2018, December 5). WhatsApp skewed Brazilian election, proving social media’s danger to democracy. The Conversation. Retrieved December 4, 2019, from http://theconversation.com/whatsapp-skewed-brazilian-election-proving-social-medias-danger-to-democracy-106476

Bennett, C. J. (2015). Trends in Voter Surveillance in Western Societies: Privacy Intrusions and Democratic Implications. Surveillance & Society, 13(3/4), 370–384. https://doi.org/10.24908/ss.v13i3/4.5373

Bodó, B., Helberger, N., & de Vreese, C. H. (2017). Political micro-targeting: A Manchurian candidate or just a dark horse? Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.776

Boito, A., & Saad-Filho, A. (2016). State, State Institutions, and Political Power in Brazil. Latin American Perspectives, 43(2), 190–206. https://doi.org/10.1177/0094582X15616120

Bradshaw, S., & Howard, P. N. (2018). Challenging Truth and Trust: A Global Inventory of Organized Social Media Manipulation [Report]. Oxford: Project on Computational Propaganda, Oxford Internet Institute. Retrieved from https://comprop.oii.ox.ac.uk/research/cybertroops2018/

Carlo, J. D., & Kamradt, J. (2018). Bolsonaro e a cultura do politicamente incorreto na política brasileira [Bolsonaro and the culture of the politically incorrect in Brazilian politics]. Teoria e Cultura, 13(2). https://doi.org/10.34019/2318-101X.2018.v13.12431

Carvalho, R. (2016). O governo Lula e a mídia impressa: Estudo sobre a construção de um pensamento hegemônico [The Lula government and the printed media: Study on the construction of hegemonic thinking]. São Paulo: Pontifícia Universidade Católica de São Paulo. Retrieved from https://tede2.pucsp.br/handle/handle/3708

Cesarino, L. (2019). On Digital Populism in Brazil. PoLAR: Political and Legal Anthropology Review. Retrieved from https://polarjournal.org/2019/04/15/on-jair-bolsonaros-digital-populism/

Cesarino, L. (2019a). Digitalização da política: reaproximando a cibernética das máquinas e a cibernética da vida [Digitization of policy: bringing cybernetics closer to machines and cybernetics to life]. Manuscript submitted for publication.

Coding Rights. (2018). Data as a tool for political influence in the Brazilian elections. Retrieved from Coding Rights website: https://www.codingrights.org/data-as-a-tool-for-political-influence-in-the-brazilian-elections/

Davis, S., & Straubhaar, J. (2019). Producing Antipetismo: Media activism and the rise of the radical, nationalist right in contemporary Brazil. International Communication Gazette, 82(1), 82–100. https://doi.org/10.1177/1748048519880731

Demori, L., & Locatelli, P. (2018, June 5). Massive Truckers’ Strike Exposes Political Chaos as Brazil Gears Up for Elections in October. The Intercept. Retrieved November 28, 2019, from https://theintercept.com/2018/06/05/brazil-truckers-strike/

Fishman, A., Martins, R. M., Demori, L., Greenwald, G., & Audi, A. (2019, June 17). “Their Little Show”: Exclusive: Brazilian Judge in Car Wash Corruption Case Mocked Lula’s Defense and Secretly Directed Prosecutors’ Media Strategy During Trial. The Intercept. Retrieved July 12, 2019, from https://theintercept.com/2019/06/17/brazil-sergio-moro-lula-operation-car-wash/

Foley, C. (2019). Balls in the air: The macho politics of Brazil’s new president plus ex-president Dilma Rousseff’s thoughts on constitutional problems. Index on Censorship, 48(2), 26–28. https://doi.org/10.1177/0306422019858496

Galpaya, H. (2017, February) Zero-rating in Emerging Economies. [Paper No. 47]. Waterloo, Ontario; London: Global Commission on Internet Governance; Centre for International Governance Innovation; Chatham House. Retrieved December 14, 2019 from https://www.cigionline.org/sites/default/files/documents/GCIG%20no.47_1.pdf

Haggerty, K., & Ericson, R. (Eds.). (2006). The New Politics of Surveillance and Visibility. Toronto; Buffalo; London: University of Toronto Press. Retrieved from http://www.jstor.org/stable/10.3138/9781442681880

Hunter, W., & Power, T. J. (2005). Lula’s Brazil at Midterm. Journal of Democracy, 16(3), 127–139. https://doi.org/10.1353/jod.2005.0046

Issenberg, S. (2012). The Victory Lab: The Secret Science of Winning Campaigns. New York: Crown Publishers.

Joyce, S. N. (2013). A Kiss Is (Not) Just a Kiss: Heterodeterminism, Homosexuality and TV Globo Telenovelas. International Journal of Communication, 7. Retrieved from https://ijoc.org/index.php/ijoc/article/view/1832

Junge, B. (2019). “Our Brazil Has Become a Mess”: Nostalgic Narratives of Disorder and Disinterest as a “Once-Rising Poor” Family from Recife, Brazil, Anticipates the 2018 Elections. The Journal of Latin American and Caribbean Anthropology. https://doi.org/10.1111/jlca.12443

Kivisto, P. (2019). Populism’s Efforts to De-legitimize the Vital Center and the Implications for Liberal Democracy. In J. L. Mast & J. C. Alexander (Eds.), Politics of Meaning/Meaning of Politics: Cultural Sociology of the 2016 U.S. Presidential Election (pp. 209–222). https://doi.org/10.1007/978-3-319-95945-0_12

Kreiss, D. (2017). Micro-targeting, the quantified persuasion. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.774

Kruikemeier, S., Sezgin, M., & Boerman, S. C. (2016). Political Microtargeting: Relationship Between Personalized Advertising on Facebook and Voters’ Responses. Cyberpsychology, Behavior, and Social Networking, 19(6), 367–372. https://doi.org/10.1089/cyber.2015.0652

Lafuente, J. (2018, October 9). Bolsonaro’s surprise success in Brazil gives new impetus to the rise of the far right. El País. Retrieved from https://elpais.com/elpais/2018/10/09/inenglish/1539079014_311747.html

Machado, C. (2018, November 13). WhatsApp’s Influence in the Brazilian Election and How It Helped Jair Bolsonaro Win [Blog Post]. Council on Foreign Relations. Retrieved from https://www.cfr.org/blog/whatsapps-influence-brazilian-election-and-how-it-helped-jair-bolsonaro-win

Magenta, M., Gragnani, J., & Souza, F. (2018, October 24). WhatsApp ‘weaponised’ in Brazil election. BBC News. Retrieved from https://www.bbc.com/news/technology-45956557

Marés, C., Becker, C., & Resende, L. (2018, October 18). Imagens falsas mais compartilhadas no WhatsApp não citam presidenciáveis, mas buscam ratificar ideologias [WhatsApp's most shared fake images don't quote presidential, but seek to ratify ideologies]. Retrieved July 15, 2019, from Agência Lupa website: https://piaui.folha.uol.com.br/lupa/2018/10/18/imagens-falsas-whatsapp-presidenciaveis-lupa-ufmg-usp/

Moura, M., & Michelson, M. R. (2017). WhatsApp in Brazil: Mobilising voters through door-to-door and personal messages. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.775

Melo, P. C. (2018, October 18). Empresários bancam campanha contra o PT pelo WhatsApp [Business owners campaign against PT through WhatsApp]. Folha de S.Paulo. Retrieved from https://www1.folha.uol.com.br/poder/2018/10/empresarios-bancam-campanha-contra-o-pt-pelo-whatsapp.shtml

Melo, P. C. (2019, October 9). WhatsApp Admits to Illegal Mass Messaging in Brazil’s 2018. Folha de S.Paulo. Retrieved from https://www1.folha.uol.com.br/internacional/en/brazil/2019/10/whatsapp-admits-to-illegal-mass-messaging-in-brazils-2018.shtml

Nagle, A. (2017). Kill All Normies: Online Culture Wars From 4Chan And Tumblr To Trump And The Alt-Right. John Hunt Publishing.

Nascimento, W. (2019). Fragmentação partidária e partidos pequenos no Brasil (1998-2014) [Party fragmentation and small parties in Brazil (1998-2014)]. Conversas & Controvérsias, 5(2), 285–305. https://doi.org/10.15448/2178-5694.2018.2.31837

Norton, M. (2017). When voters are voting, what are they doing?: Symbolic selection and the 2016 U.S. presidential election. American Journal of Cultural Sociology, 5(3), 426–442. https://doi.org/10.1057/s41290-017-0040-z

Ramalho, R. (2018, August 23). TSE apresenta previsão do tempo de propaganda no rádio e na TV para cada candidato à Presidência [TSE presents the projected radio and TV advertising time for each presidential candidate]. Retrieved 27 November 2019, from G1 website: https://g1.globo.com/politica/eleicoes/2018/noticia/2018/08/23/tse-apresenta-previsao-do-tempo-de-propaganda-no-radio-e-na-tv-para-cada-candidato-a-presidencia.ghtml

Rebello, A., Costa, F., & Prazeres, L. (2018, October 26). PT usou sistema de WhatsApp; campanha de Bolsonaro apagou registro de envio [PT used WhatsApp system; Bolsonaro campaign deleted submission record]. Retrieved 5 December 2019, from UOL Eleições 2018 website: https://noticias.uol.com.br/politica/eleicoes/2018/noticias/2018/10/26/bolsonaro-apagou-registro-whatsapp-pt-haddad-usou-sistema-mensagens.htm

Resende, G., Melo, P., Sousa, H., Messias, J., Vasconcelos, M., Almeida, J., & Benevenuto, F. (2019). (Mis)Information Dissemination in WhatsApp: Gathering, Analyzing and Countermeasures. WWW’ 19: The World Wide Web Conference, 818–828. https://doi.org/10.1145/3308558.3313688

Rossi, A. (2018, June 2). Como o WhatsApp mobilizou caminhoneiros, driblou governo e pode impactar eleições [How WhatsApp mobilised truckers, dodged the government and could impact elections]. BBC News Brazil. Retrieved March 18, 2019, from https://www.bbc.com/portuguese/brasil-44325458

Soares, J. (2018, October 7). Time digital de Bolsonaro distribui conteúdo para 1.500 grupos de WhatsApp [Bolsonaro's digital team distributes content to 1,500 WhatsApp groups]. O Globo. Retrieved from https://oglobo.globo.com/brasil/time-digital-de-bolsonaro-distribui-conteudo-para-1500-grupos-de-whatsapp-23134588

Tardáguila, C., Benevenuto, F., & Ortellado, P. (2018, October 19). Opinion | Fake News Is Poisoning Brazilian Politics. WhatsApp Can Stop It. The New York Times. Retrieved from https://www.nytimes.com/2018/10/17/opinion/brazil-election-fake-news-whatsapp.html

Uchoa, P. (2018, September 21). Why Brazilian women are saying #NotHim. BBC News. Retrieved from https://www.bbc.com/news/world-latin-america-45579635

Vale, H. F. D. (2015). Territorial Polarization in Brazil’s 2014 Presidential Elections. Regional & Federal Studies, 25(3), 297–311. https://doi.org/10.1080/13597566.2015.1060964

Valente, J. (2018, July 24). Facebook vai dar transparência para anúncios eleitorais no Brasil [Facebook to provide transparency for electoral ads in Brazil]. Retrieved December 4, 2019, from Agência Brasil website: http://agenciabrasil.ebc.com.br/politica/noticia/2018-07/facebook-vai-dar-transparencia-para-anuncios-eleitorais-no-brasil

Valente, R. (2018, October 26). Grupos de WhatsApp simulam organização militar e compartilham apoio a Bolsonaro [WhatsApp Groups simulate military organization and share support for Bolsonaro]. Folha de S.Paulo. Retrieved from https://www1.folha.uol.com.br/poder/2018/10/grupos-de-whatsapp-simulam-organizacao-militar-e-compartilham-apoio-a-bolsonaro.shtml

Woolley, S. C., & Howard, P. N. (2017). Computational Propaganda Worldwide: Executive Summary [Working Paper No. 2017.11]. Oxford: Project on Computational Propaganda, Oxford Internet Institute. Retrieved from https://comprop.oii.ox.ac.uk/research/working-papers/computational-propaganda-worldwide-executive-summary/

Footnotes

1. We describe antipetismo as “an intensely personal resentment of the Workers’ Party (PT)”.

2. Donations from companies were, however, not allowed in the last election; only donations from individuals were permitted.

3. According to Agência Lupa, both fake criminal records first circulated during the 2010 presidential election, when Dilma Rousseff (PT) became the first woman to be elected president of Brazil, beating José Serra (PSDB). It must be pointed out that, before that, the printed newspaper with the greatest circulation in Brazil, Folha de S.Paulo, published in 2009 a version of the false criminal record of Dilma Rousseff, who at the time was Chief of Staff of the government of then-president Luiz Inácio Lula da Silva (PT). The newspaper corrected the mistake 20 days after publishing the false information. Cf. https://www1.folha.uol.com.br/folha/brasil/ult96u556855.shtml


Disinformation optimised: gaming search engine algorithms to amplify junk news


This paper is part of Data-driven elections, a special issue of Internet Policy Review guest-edited by Colin J. Bennett and David Lyon.

Introduction

Did the Holocaust really happen? In December 2016, Google’s search engine algorithm determined the most authoritative source to answer this question was a neo-Nazi website peddling Holocaust denialism (Cadwalladr, 2016b). For any inquisitive user typing this question into Google, the first website recommended by Search linked to an article entitled: “Top 10 reasons why the Holocaust didn’t happen”. The third article, “The Holocaust Hoax; IT NEVER HAPPENED”, was published by another neo-Nazi website, while the fifth, seventh, and ninth recommendations linked to similar racist propaganda pages (Cadwalladr, 2016b). Until Google started demoting websites committed to spreading anti-Semitic messages, anyone asking whether the Holocaust actually happened would have been directed to neo-Nazi websites, rather than to one of the many credible sources about the Holocaust and the tragedy of World War II.

Google’s role in shaping the information environment and enabling political advertising has made it a “de facto infrastructure” for democratic processes (Barrett & Kreiss, 2019). How its search engine algorithm determines authoritative sources directly shapes the online information environment for more than 89 percent of the world’s internet users who trust Google Search to quickly and accurately find answers to their questions. Unlike social media platforms that tailor content based on “algorithmically curated newsfeeds” (Golebiewski & boyd, 2019), the logic of search engines is “mutually shaped” by algorithms — that shape access — and users — who shape the information being sought (Schroeder, 2014). By facilitating information access and discovery, search engines hold a unique position in the information ecosystem. But, like other digital platforms, the digital affordances of Google Search have proved to be fertile ground for media manipulation.

Previous research has demonstrated how large volumes of mis- and disinformation were spread on social media platforms in the lead up to elections around the world (Hedman et al., 2018; Howard, Kollanyi, Bradshaw, & Neudert, 2017; Machado et al., 2018). Some of this disinformation was micro-targeted towards specific communities or individuals based on their personal data. While data-driven campaigning has become a powerful tool for political parties to mobilise and fundraise (Fowler et al., 2019; Baldwin-Philippi, 2017), the connection between online advertisements and disinformation, foreign election interference, polarisation, and non-transparent campaign practices has caused growing anxieties about its impact on democracy.

Since the 2016 presidential election in the United States, public attention and scrutiny have largely focused on the role of Facebook in profiting from and amplifying the spread of disinformation via digital advertisements. However, less attention has been paid to Google, which, along with Facebook, commands more than 60% of the digital advertising market share. At the same time, a multi-billion-dollar search engine optimisation (SEO) industry has been built around understanding how technical systems rank, sort, and prioritise information (Hoffmann, Taylor, & Bradshaw, 2019). The purveyors of disinformation have learned to exploit social media platforms to engineer content discovery and drive “pseudo-organic engagement”. 1 These websites — which do not employ professional journalistic standards, report on conspiracy theories, counterfeit professional news brands, and mask partisan commentary as news — have been referred to as “junk news” domains (Bradshaw, Howard, Kollanyi, & Neudert, 2019).

Together, the role of political advertising and the matured SEO industry make Google Search an interesting and largely underexplored case to analyse. Considering the importance of Google Search in connecting individuals to news and information about politics, this paper examines how junk news websites generate discoverability via Google Search. It asks: (1) How do junk news domains optimise content, through both paid and SEO strategies, to increase their discoverability and website value? (2) What strategies are effective at growing discoverability and/or website value? And (3) what are the implications of these findings for ongoing discussions about the regulation of social media platforms?

To answer these questions, I analysed 29 junk news domains and their advertising and search engine optimisation strategies between January 2016 and March 2019. First, junk news domains make use of a variety of SEO keyword strategies in order to game Search, generate pseudo-organic clicks, and grow their website value. The keywords that generated the highest placements on Google Search focused on (1) navigational searches for known brand names (such as searches for “breitbart.com”) and (2) carefully curated keyword combinations that fill so-called “data voids” (Golebiewski & boyd, 2018), or gaps in search engine queries (such as searches for “Obama illegal alien”). Second, there was a clear correlation between the number of clicks that a website receives and the estimated value of the junk news domains. The most profitable timeframes correlated with important political events in the United States (such as the 2016 presidential election and the 2018 midterm elections), and the value of the domain increased based on SEO-optimised — rather than paid — clicks. Third, junk news domains were relatively successful at generating top placements on Google Search before and after the 2016 US presidential election. However, their discoverability declined abruptly beginning in August 2017, following major announcements from Google about changes to its search engine algorithms, as well as other initiatives to combat the spread of junk news in search results. This suggests that Google can, and has, measurably impacted the discoverability of junk news on Search.

This paper proceeds as follows: The first section provides background on the vocabulary of disinformation and ongoing debates about so-called fake news, situating the terminology of “junk news” used in this paper in the scholarly literature. The second section discusses the logic and politics of search, describing how search engines work and reviewing the existing literature on Google Search and the spread of disinformation. The third section outlines the methodology of the paper. The fourth section analyses 29 prominent junk news domains to learn about their SEO and advertising strategies, as well as their impact on content discoverability and revenue generation. This paper concludes with a discussion of the findings and implications for future policymaking and private self-regulation.

The vocabulary of political communication in the 21st century

“Fake news” gained significant attention from scholarship and mainstream media during the 2016 presidential election in the United States as viral stories pushing outrageous headlines — such as Hillary Clinton’s alleged involvement in a paedophile ring in the basement of a DC pizzeria — were prominently displayed across search and social media news feeds (Silverman, 2016). Although “fake news” is not a new phenomenon, the spread of these stories — which are both enhanced and constrained by the unique affordances of internet and social networking technologies — has reinvigorated an entire research agenda around digital news consumption and democratic outcomes. Scholars from diverse disciplinary backgrounds — including psychology, sociology and ethnography, economics, political science, law, computer science, journalism, and communication studies — have launched investigations into the circulation of so-called “fake news” stories (Allcott & Gentzkow, 2017; Lazer et al., 2018), their role in agenda-setting (Guo & Vargo, 2018; Vargo, Guo, & Amazeen, 2018), and their impact on democratic outcomes and political polarisation (Persily, 2017; Tucker et al., 2018).

However, scholars at the forefront of this research agenda have continually identified several epistemological and methodological challenges around the study of so-called “fake news”. A commonly identified concern is the ambiguity of the term itself, as “fake news” has come to be an umbrella term for all kinds of problematic content online, including political satire, fabrication, manipulation, propaganda, and advertising (Tandoc, Lim, & Ling, 2018; Wardle, 2017). The European High-Level Expert Group on Fake News and Disinformation recently acknowledged the definitional difficulties around the term, recognising it “encompasses a spectrum of information types…includ[ing] low risk forms such as honest mistakes made by reporters…to high risk forms such as foreign states or domestic groups that would try to undermine the political process” (European Commission, 2018). And even when the term “fake news” is simply used to describe news and information that is factually inaccurate, the binary distinction between what is true and what is false has been criticised for not adequately capturing the complexity of the kinds of information being shared and consumed in today’s digital media environment (Wardle & Derakhshan, 2017).

Beyond the ambiguities surrounding the vocabulary of “fake news”, there is growing concern that the term has begun to be appropriated by politicians to restrict freedom of the press. A wide range of political actors have used the term “fake news” to discredit, attack, and delegitimise political opponents and mainstream media (Farkas & Schou, 2018). Donald Trump’s (in)famous invocations of the term “fake news” are frequently deployed to “deflect” criticism and to erode the credibility of established media and journalist organisations (Lakoff, 2018). And many authoritarian regimes have followed suit, adopting the term into a common lexicon to legitimise further censorship and restrictions on media within their own borders (Bradshaw, Neudert, & Howard, 2018). Given that most citizens perceive “fake news” to describe “partisan debate and poor journalism”, rather than a discursive tool to undermine trust and legitimacy in media institutions, there is general scholarly consensus that the term is highly problematic (Nielsen & Graves, 2017).

Rather than chasing a definition of what has come to be known as “fake news”, researchers at the Oxford Internet Institute have produced a grounded typology of what users actually share on social media (Bradshaw et al., 2019). Drawing on Twitter and Facebook data from elections in Europe and North America, they developed this typology of online political communication (Bradshaw et al., 2019; Neudert, Howard, & Kollanyi, 2019) and identified a growing prevalence of “junk news” domains, which publish a variety of hyper-partisan, conspiracy-theory or click-bait content designed to look like real news about politics. During the 2016 presidential election in the United States, social media users on Twitter shared as much “junk news” as professionally produced news about politics (Howard, Bolsover, Kollanyi, Bradshaw, & Neudert, 2017; Howard, Kollanyi, et al., 2017). And voters in swing states tended to share more junk news than their counterparts in uncontested ones (Howard, Kollanyi, et al., 2017). In countries throughout Europe — in France, Germany, the United Kingdom and Sweden — junk news inflamed political debates around immigration and amplified populist voices across the continent (Desiguad, Howard, Kollanyi, & Bradshaw, 2017; Kaminska, Galacher, Kollanyi, Yasseri, & Howard, 2017; Neudert, Howard, & Kollanyi, 2017).

According to researchers on the Computational Propaganda Project, junk news is defined as exhibiting at least three out of five elements: (1) professionalism, where sources do not employ the standards and best practices of professional journalism, including information about real authors, editors, and owners; (2) style, where emotionally driven language, ad hominem attacks, mobilising memes and misleading headlines are used; (3) credibility, where sources rely on false information or conspiracy theories, and do not post corrections; (4) bias, where sources are highly biased, ideologically skewed and publish opinion pieces as news; and (5) counterfeit, where sources mimic established news reporting, including fonts, branding and content strategies (Bradshaw et al., 2019).

In a complex ecosystem of political news and information, junk news provides a useful point of analysis because, rather than focusing on individual stories that may contain honest mistakes, it examines the domain as a whole and looks for various elements of deception, which underscores the definition of disinformation. The concept of junk news is also not tied to a particular producer of disinformation, such as foreign operatives, hyper-partisan media, or hate groups, who, despite their diverse goals, deploy the same strategies to generate discoverability. Given that the literature on disinformation is often siloed around one particular actor, rarely crosses platforms, and seldom integrates a variety of media sources (Tucker et al., 2018), the junk news framework can be useful for taking a broader look at the ecosystem as a whole and the digital techniques producers use to game search engine algorithms. Throughout this paper, I use the term “junk news” to describe the wide range of politically and economically motivated disinformation being shared about politics.

The logic and politics of search

Search engines play a fundamental role in the modern information environment by sorting, organising, and making visible content on the internet. Before the search engine, anyone who wished to find content online would have to navigate “cluttered portals, garish ads and spam galore” (Pasquale, 2015). This didn’t matter in the early days of the web when it remained small and easy to navigate. During this time, web directories were built and maintained by humans who often categorised pages according to their characteristics (Metaxas, 2010). By the mid-1990s it became clear that the human classification system would not be able to scale. The search engine “brought order to chaos by offering a clean and seamless interface to deliver content to users” (Hoffman, Taylor, & Bradshaw, 2019).

Simplistically speaking, search engines work by crawling the web to gather information about online webpages. Data about the words on a webpage, links, images, videos, or the pages they link to are organised into an index by an algorithm, analogous to an index found at the end of a book. When a user types a query into Google Search, machine learning algorithms apply complex statistical models in order to deliver the most “relevant” and “important” information to a user (Gillespie, 2012). These models are based on a combination of “signals” including the words used in a specific query, the relevance and usability of webpages, the expertise of sources, and other information about context, such as a user’s geographic location and settings (Google, 2019).
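To make these two stages concrete, the toy sketch below builds a tiny inverted index and ranks results by combining a crude relevance score with a page-level quality weight. The pages, weights and scoring rule are invented for illustration and bear no relation to Google's proprietary signals.

```python
# A toy illustration (not Google's actual algorithm) of the two stages described
# above: building an index from crawled pages, then ranking results for a query
# by combining simple "signals". Page contents and weights here are invented.
from collections import defaultdict

pages = {
    "example.org/a": "election results and polling data for the presidential election",
    "example.org/b": "conspiracy theory about the election being rigged",
    "example.org/c": "official election commission results and turnout statistics",
}

# Hypothetical per-page quality signals standing in for link authority, usability, etc.
authority = {"example.org/a": 0.8, "example.org/b": 0.1, "example.org/c": 0.9}

# Stage 1: index -- map each word to the pages that contain it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Stage 2: rank -- combine a crude relevance signal (query-term overlap)
# with the page-level authority signal.
def search(query, top_n=3):
    terms = query.lower().split()
    scores = defaultdict(float)
    for term in terms:
        for url in index.get(term, set()):
            scores[url] += 1.0           # relevance: one point per matching term
    for url in scores:
        scores[url] *= authority[url]    # weight by page "authority"
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(search("election results"))
```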

Google’s search rankings are also influenced by AdWords, which allow individuals or companies to promote their websites by purchasing “paid placement” for specific keyword searches. Paid placement is conducted through a bidding system, where rankings and the number of times the advertisement is displayed are prioritised by the amount of money spent by the advertiser. For example, a company that sells jeans might purchase AdWords for keywords such as “jeans”, “pants”, or “trousers”, so when an individual queries Google using these terms, a “sponsored post” will be placed at the top of the search results. 2 AdWords also make use of personalisation, which allow advertisers to target more granular audiences based on factors such as age, gender, and location. Thus, a local company selling jeans for women can specify local female audiences — individuals who are more likely to purchase their products.
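The sketch below illustrates this logic with a simplified keyword auction: an ad competes only if one of its keywords matches the query and the user fits its targeting criteria, and eligible ads are ranked by bid multiplied by a quality score. This is a common textbook model of search advertising rather than a description of Google's actual auction mechanics, and the advertisers, bids and targeting attributes are invented.

```python
# A simplified sketch of keyword-based paid placement with audience targeting.
# The ranking rule (bid multiplied by a quality score) is a common textbook
# model of search ad auctions, not a description of Google's exact mechanics.
ads = [
    {"advertiser": "jeans-shop", "keywords": {"jeans", "pants"},
     "bid": 1.20, "quality": 0.7, "target": {"gender": "any", "location": "any"}},
    {"advertiser": "local-denim", "keywords": {"jeans"},
     "bid": 0.80, "quality": 0.9, "target": {"gender": "female", "location": "Toronto"}},
]

def eligible(ad, query_terms, user):
    """An ad competes only if a keyword matches and the user fits the target."""
    if not (ad["keywords"] & query_terms):
        return False
    for attr, value in ad["target"].items():
        if value != "any" and user.get(attr) != value:
            return False
    return True

def run_auction(query, user):
    terms = set(query.lower().split())
    candidates = [ad for ad in ads if eligible(ad, terms, user)]
    # Higher bid * quality wins the top "sponsored" slot.
    return sorted(candidates, key=lambda ad: ad["bid"] * ad["quality"], reverse=True)

print([ad["advertiser"] for ad in run_auction("jeans", {"gender": "female", "location": "Toronto"})])
```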

The way in which Google structures, organises, and presents information and advertisements to users is important because these technical and policy decisions embed a wide range of political issues (Granka, 2010; Introna & Nissenbaum, 2000; Vaidhyanathan, 2011). Several public and academic investigations auditing Google’s algorithms have documented various examples of bias in Search or problems with the autocomplete function (Cadwalladr, 2016a; Pasquale, 2015). Biases designed into algorithms have been shown to disproportionately marginalise minority communities, women, and the poor (Noble, 2018).

At the same time, political advertisements have become a contentious issue. While digital advertising can generate significant benefits for democracy, by democratising political finance and assisting in political mobilisation (Fowler et al., 2019; Baldwin-Philippi, 2017), it can also be used to selectively spread disinformation and messages of demobilisation (Burkell & Regan, 2019; Evangelista & Bruno, 2019; Howard, Ganesh, Liotsiou, Kelly, & Francois, 2018). Indeed, Russian AdWord purchases in the lead-up to the 2016 US election demonstrate how foreign state actors can exploit Google Search to spread propaganda (Mueller, 2019). But the general lack of regulation around political advertising has also raised concerns about domestic actors and the ways in which legitimate politicians campaign in increasingly opaque and unaccountable ways (Chester & Montgomery, 2017; Tufekci, 2014). These concerns are underscored by the rise of the “influence industry”: commercial political technology firms that sell various ‘psychographic profiling’ technologies to craft, target, and tailor messages of persuasion and demobilisation (Chester & Montgomery, 2019; McKelvey, 2019; Bashyakarla, 2019). For example, during the 2016 US election, Cambridge Analytica worked with the Trump campaign to implement “persuasion search advertising”, where AdWords were bought to strategically push pro-Trump and anti-Clinton information to voters (Lewis & Hilder, 2018).

Given growing concerns over the spread of disinformation online, scholars are beginning to study the ways in which Google Search might amplify junk news and disinformation. One study by Metaxa-Kakavouli and Torres-Echeverry examined the top ten results from Google searches about congressional candidates over a 26-week period in the lead-up to the 2016 presidential election. Of the URLs recommended by Google, only 1.5% came from domains that were flagged by PolitiFact as being “fake news” domains (2017). Metaxa-Kakavouli and Torres-Echeverry suggest that the low levels of “fake news” are the result of Google’s “long history” combatting spammers on its platform (2017). Another research paper by Golebiewski and boyd looks at how gaps in search engine results lead to strategic “data voids” that optimisers exploit to amplify their content (2018). Golebiewski and boyd argue that there are many search terms where data is “limited, non-existent or deeply problematic” (2018). Although these searches are rare, if a user types these search terms into a search engine, “it might not give a user what they are looking for because of limited data and/or limited lessons learned through previous searches” (Golebiewski & boyd, 2018).

The existence of biases, disinformation, or gaps in authoritative information on Google Search matters because Google directly impacts what people consume as news and information. Most of the time, people do not look past the top ten results returned by the search engine (Metaxas, 2010). Indeed, eye-tracking experiments have demonstrated that the order in which Google results are presented to users matters more than the actual relevance of the page abstracts (Pan et al., 2007). However, it is important to note that the logic of higher placements does not necessarily translate to search engine advertising listings, where users are less likely to click on advertisements if they are familiar with the brand or product they are searching for (Narayanan & Kalyanam, 2015).

Nevertheless, the significance of the top ten placement has given rise to the SEO industry, whereby optimisers use digital keyword strategies to move webpages higher in Google’s rankings and thereby generate higher traffic flows. There is a long history of SEO dating back to the 1990s when the first search engine algorithms emerged (Metaxas, 2010). Since then, hundreds of SEO pages have published guesses about the different ranking factors these algorithms consider (Dean, 2019). However, the specific signals that inform Google’s search engine algorithms are dynamic and constantly adapting to the information environment. Google makes hundreds of changes to its algorithm every year to adjust the weight and importance of various signals. While most of these changes are minor updates designed to improve the speed and performance of Search, sometimes Google makes more significant changes to its algorithm to elude optimisers trying to game the system.

Google has taken several steps to combat people seeking to manipulate Search for political or economic gain (Taylor, Walsh, & Bradshaw, 2019). This involves several algorithmic changes to demote sources of disinformation as well as changes to their advertising policies to limit the extent to which users can be micro-targeted with political advertisements. In one study, researchers interviewed SEO strategists to audit how Facebook and Google’s algorithmic changes impacted their optimisation strategies (Hoffmann, Taylor, & Bradshaw, 2019). Since the purveyors of disinformation often rely on the same digital marketing strategies used by legitimate political candidates, news organisations, and businesses, the SEO industry can offer unique, but heuristic, insight into the impact of algorithmic changes. Hoffmann, Taylor and Bradshaw (2019) found that despite more than 125 announcements over a three-year period, the algorithmic changes made by the platforms did not significantly alter digital marketing strategies.

This paper hopes to contribute to the growing body of work examining the effect of Search on the spread of disinformation and junk news by empirically analysing the strategies — paid and optimised — employed by junk news domains. By performing an audit of the keywords junk news websites use to generate discoverability, this paper evaluates the effectiveness of Google in combatting the spread of disinformation on Search.

Methodology

Conceptual Framework: The Techno-Commercial Infrastructure of Junk News

The starting place for this inquiry into the SEO infrastructure of junk news domains is grounded conceptually in the field of science and technology studies (STS), which provides a rich literature on how infrastructure design, implementation, and use embeds politics (Winner, 1980). Digital infrastructure — such as physical hardware, cables, virtual protocols, and code — operates invisibly in the background, which can make it difficult to trace the politics embedded in technical coding and design (Star & Ruhleder, 1994). As a result, calls to study internet infrastructure have engendered digital research methods that shed light on the less visible areas of technology. One growing and relevant body of research has focused on the infrastructure of social media platforms and the algorithms and advertising systems that invisibly operate to amplify or spread junk news to users, or to micro-target political advertisements (Kim et al., 2018; Tambini, Anstead, & Magalhães, 2017). Certainly, the affordances of technology — both real and imagined — mutually shape social media algorithms and their potential for manipulation (Nagy & Neff, 2015; Neff & Nagy, 2016). However, the proprietary nature of platform architecture has made it difficult to operationalise studies in this field. Because junk news domains operate in a digital ecosystem built on search engine optimisation, page ranks, and advertising, there is an opportunity to analyse the infrastructure that supports the discoverability of junk news content, which could provide insights into how producers reach audiences, grow visibility, and generate domain value.

Junk news data set

The first step of my methodology involved identifying a list of junk news domains to analyse. I used the Computational Propaganda Project’s (COMPROP) data set on junk news domains in order to analyse websites that spread disinformation about politics. To develop this list, researchers on the COMPROP project built a typology of junk news based on URLs shared on Twitter and Facebook relating to the 2016 US presidential election, the 2017 US State of the Union Address, and the 2018 US midterm elections. 3 A team of five rigorously trained coders labelled the domains contained in tweets and on Facebook pages based on a grounded typology of junk news that has been tested and refined over several elections around the world between 2016 and 2018. 4 A domain was labelled as junk news when it failed on three of the five criteria of the typology (style, bias, credibility, professionalism, and counterfeit, as described in section one). For this analysis, I used the most recent 2018 midterm election junk news list, which comprises the top-29 most shared domains that were labelled as junk news by researchers. This list was selected because all 29 domains were active during the US presidential election in November 2016 and the 2017 US State of the Union Address, which provides an opportunity to comparatively assess how both the advertising and optimisation strategies, as well as their performance, changed over time.
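The labelling rule itself is easy to express in code. The minimal sketch below applies the three-of-five threshold described above; the example codings are invented and are not the COMPROP coders' judgements.

```python
# A minimal sketch of the labelling rule described above: a domain is coded as
# junk news when it fails at least three of the five typology criteria.
# The example codings below are invented, not the COMPROP coders' judgements.
CRITERIA = {"professionalism", "style", "credibility", "bias", "counterfeit"}

def is_junk(failed_criteria):
    failed = set(failed_criteria) & CRITERIA
    return len(failed) >= 3

print(is_junk(["style", "bias", "credibility"]))  # True: fails three criteria
print(is_junk(["style", "bias"]))                 # False: fails only two
```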

SpyFu data and API queries

The second step of my methodology involved collecting data about the advertising and optimisation strategies used by junk news websites. I worked with SpyFu, a competitive keyword research tool used by digital marketers to increase website traffic and improve keyword rankings on Google (SpyFu, 2019). SpyFu collects, analyses and tracks various data about the search optimisation strategies used by websites, such as organic ranks, paid keywords bought on Google AdWords, and advertisement trends.

To shed light onto the optimisation strategies used by junk news domains on Google, SpyFu provided me with: (1) a list of historical keywords and keyword combinations used by the top-29 junk news domains that led to a domain appearing in Google Search results; and (2) the position the domain appeared in on Google as a result of those keywords. The historical keywords were provided from January 2016 until March 2019. Only keywords that led to the junk news domains appearing in the top-50 positions on Google were included in the data set.

In order to determine the effectiveness of the optimisation and advertising strategies used by junk news domains to either grow their website value and/or successfully appear in the top positions on Google Search, I wrote a simple Python script to connect to the SpyFu API service. This script collected and parsed the following data from SpyFu for each of the top-29 junk news domains in the sample: (1) the number of keywords that show up organically in Google searches; (2) the estimated sum of clicks a domain receives, based on factors including its organic keywords, the rank of those keywords, and their search volume; (3) the estimated organic value of a domain, based on the same factors; (4) the number of paid advertisements a domain purchased through Google AdWords; and (5) the number of paid clicks a domain received from the advertisements it purchased from Google AdWords.
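A minimal sketch of that kind of collection script is shown below. The endpoint URL, parameter names and response fields are placeholders for illustration only; SpyFu's actual API, and the script used for this paper, may differ.

```python
# A minimal sketch of the kind of script described above. The endpoint URL,
# parameter names, and response fields are placeholders for illustration only;
# SpyFu's actual API (and the author's script) may differ.
import csv
import requests

API_KEY = "YOUR_SPYFU_API_KEY"                                   # hypothetical credential
BASE_URL = "https://api.example-seo-tool.com/v1/domain_stats"    # placeholder endpoint

DOMAINS = ["breitbart.com", "dailycaller.com", "zerohedge.com"]  # subset of the 29

def fetch_domain_stats(domain):
    """Request monthly SEO statistics for one domain (assumed JSON list of records)."""
    params = {"domain": domain, "api_key": API_KEY}
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

with open("junk_news_seo_stats.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[
        "domain", "month", "organic_keywords", "estimated_clicks",
        "estimated_value", "paid_ads", "paid_clicks"], extrasaction="ignore")
    writer.writeheader()
    for domain in DOMAINS:
        for record in fetch_domain_stats(domain):
            record["domain"] = domain
            writer.writerow(record)
```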

Data and methodology limitations

There are several data and methodology limitations that must be noted. First, the junk news domains identified by the Computational Propaganda Project represent only a small sample of the wide variety of websites that peddle disinformation about politics. The researchers also do not differentiate between the different actors behind the junk news websites — such as foreign states or hyper-partisan media — nor between the political leanings of the junk news outlets — such as left- or right-leaning domains. Thus, the outcomes of these findings cannot be described in terms of the strategies of different actors. Further, given that the majority of junk news domains in the top-29 sample lean politically to the right and far right, these findings might not be applicable to the hyper-partisan left and its optimisation strategies. Finally, the junk news domains identified in the sample were shared on social media in the lead-up to important political events in the United States. A further research question could examine the SEO strategies of domains operating in other country contexts.

When it comes to working with the data provided by SpyFu (and other SEO optimisation tools), there are two limitations that should be noted. First, SpyFu records historical keywords only when they appear in the top-50 Google Search results. This is an important limitation because news and information producers are constantly adapting keywords based on the content they are creating. Keywords may be modified by the source website dynamically to match news trends, and low-performing keywords might be changed or altered in order to make content more visible via Search. Thus, the SpyFu data might not capture all of the keywords used by junk news domains. However, the collection strategy will have captured many of the most popular keywords used by junk news domains to get their content appearing in Google Search. Second, because SpyFu is a company, there are proprietary factors that go into measuring a domain’s SEO performance (in particular, the data points collected via the API on the estimated sum of clicks and the estimated organic value). Nevertheless, considering that Google Search is a prominent avenue for news and information discovery, and that few studies have systematically analysed the effect of search engine optimisation strategies on the spread of disinformation, this study provides an interesting starting point for future research questions about the impact SEO can have on the spread and monetisation of disinformation via Search.

Analysis: optimizing disinformation through keywords and advertising

Junk news advertising strategies on Google

Junk news domains rarely advertise on Google. Only two out of the 29 junk news domains (infowars.com and cnsnews.com) purchased Google advertisements (See Figure 1: Advertisements purchased vs. paid clicks). The advertisements purchased by infowars.com were all made prior to the 2016 election in the United States (from the period of May 2015 to March 2016). cnsnews.com made several advertisement purchases over the three-year time period.

Figure 1: Advertisements purchased vs. paid clicks received: infowars.com and cnsnews.com (May 2015-March 2019)

Looking at the total number of paid clicks received, junk news domains generated only a small amount of traffic using paid advertisements. Infowars, on average, received about 2,000 clicks as a result of its paid advertisements. cnsnews.com peaked at approximately 1,800 clicks, but on average generated only about 600 clicks per month over the course of three years. Comparing paid clicks with clicks generated as a result of SEO keyword optimisation reveals a significant difference: during the same time period, cnsnews.com and infowars.com were generating on average 146,000 and 964,000 organic clicks respectively (see Figure 2: Organic vs. paid clicks (cnsnews.com and infowars.com)). Although it is hard to make generalisations about how junk news websites advertise on Google based on a sample of two, the lack of data suggests that advertising on Google Search might not be as popular as advertising on social media platforms. Moreover, the return on investment (i.e., paid clicks generated as a result of Google advertisements) was very low compared to the organic clicks these junk news domains received for free. Factors other than advertising seem to drive the discoverability of junk news on Google Search.

Figure 2: Organic vs. paid clicks (cnsnews.com and infowars.com)

Junk news keyword optimisation strategies

In order to assess the keyword optimisation strategies used by junk news websites, I worked with SpyFu, which provided historical keyword data for the 29 junk news domains whenever those keywords made it into the top-50 results on Google between January 2016 and March 2019. In total, there were 88,662 unique keywords in the data set. Given the importance of placement on Google, I looked specifically at keywords that indexed junk news websites in the first — and most authoritative — position. Junk news domains had different aptitudes for generating placement in the first position (see Table 1: Junk news domains and number of keywords found in the first position on Google). Breitbart, DailyCaller and ZeroHedge had the most successful SEO strategies, respectively having 1006, 957 and 807 keywords lead to top placements on Google Search over the 39-month period. In contrast, six domains (committedconservative.com, davidharrisjr.com, reverbpress.news, thedailydigest.org, thefederalist.com, thepoliticalinsider.com) had no keywords reach the first position on Google. The remaining 20 domains had anywhere from 1 to 253 keywords place between positions 2 and 10 on Google Search over the same timeframe.

Table 1: Junk news domains and number of keywords found in the first position on Google

Domain | Keywords reaching position 1
breitbart.com | 1006
dailycaller.com | 957
zerohedge.com | 807
infowars.com | 253
cnsnews.com | 228
dailywire.com | 214
thefederalist.com | 200
rawstory.com | 199
lifenews.com | 156
pjmedia.com | 140
americanthinker.com | 133
thepoliticalinsider.com | 111
thegatewaypundit.com | 105
barenakedislam.com | 48
michaelsavage.com | 15
theblacksphere.net | 9
truepundit.com | 8
100percentfedup.com | 5
bigleaguepolitics.com | 3
libertyheadlines.com | 2
ussanews.com | 2
gellerreport.com | 1
truthfeednews.com | 1
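Given a keyword-level export of the sort described in the methodology, per-domain counts like those in Table 1 can be reproduced with a few lines of pandas. The column names and sample rows below are assumptions for illustration, not SpyFu's actual schema.

```python
# A sketch of how per-domain counts like those in Table 1 could be computed,
# assuming a keyword-level export with columns "domain", "keyword", "position",
# and "month" (column names and rows are assumptions, not SpyFu's actual schema).
import pandas as pd

keywords = pd.DataFrame([
    {"domain": "breitbart.com",   "keyword": "breitbart",           "position": 1, "month": "2017-01"},
    {"domain": "dailycaller.com", "keyword": "daily caller",        "position": 1, "month": "2017-01"},
    {"domain": "infowars.com",    "keyword": "obama illegal alien", "position": 3, "month": "2017-01"},
])

first_position = keywords[keywords["position"] == 1]
counts = (first_position.groupby("domain")["keyword"]
          .nunique()
          .sort_values(ascending=False))
print(counts)
```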

Different keywords also generate different kinds of placement over the 39-month period. Table 2 (see Appendix) provides a sample list of up to ten keywords from each junk news domain in the sample when the keyword reached the first position.

First, many junk news domains appear in the first position on Google Search as a result of “navigational searches”, whereby a user enters a query with the intent of finding a specific website. A search for a specific brand of junk news could happen naturally for many users, since the Google Search function is built into the address bar in Chrome and is sometimes set as the default search engine in other browsers. In particular, terms like “infowars”, “breitbart”, “cnsnews” and “rawstory” were navigational keywords users typed into Google Search. The performance of brand searches over time consistently places junk news webpages in the number one position (see Figure 3: Brand-related keywords over time). This suggests that brand recognition plays an important role in driving traffic to junk news domains.

Figure 3: The performance of brand-related keywords over time: top-5 junk news websites (January 2016-March 2019)

There is one outlier in this analysis: keyword searches for “breitbart” dropped to position two in January 2017 and September 2017. This drop could have been a result of mainstream media coverage of Steve Bannon assuming (and eventually leaving) his position as White House Chief Strategist during those respective months. The fact that navigational searches are one of the main drivers behind generating a top-ten placement on Search suggests that junk news websites rely heavily on developing a recognisable brand and a dedicated readership that actively seeks out content from these websites. However, this also demonstrates that a complicated set of factors goes into determining which keywords from which websites make the top placement in Google Search, and that coverage of news events by mainstream professional news outlets can alter the discoverability of junk news via Search.

Second, many keywords that made it to the top position in Google Search results are what Golebiewski and boyd (2018) would call terms that fill “data voids”: gaps in search engine queries where there is limited authoritative information about a particular issue. These keywords tended to focus on conspiratorial information, especially around President Barack Obama (“Obama homosexual” or “stop Barack Obama”), gun rights (“gun control myths”), pro-life narratives (“anti-abortion quotes” or “fetus after abortion”), and xenophobic or racist content (“against Islam” or “Mexicans suck”). Unlike brand-related keywords, these problematic search terms did not achieve a consistently high placement on Google Search over the 39-month period. Keywords that ranked number one for more than 30 months include: “vz58 vs. ak47”, “feminizing uranium”, “successful people with down syndrome”, “google ddrive”, and “westboro[sic] Baptist church tires slashed”. This suggests that, for the most part, data voids are either being filled by more authoritative sources, or Google Search has been able to demote websites attempting to generate pseudo-organic engagement via SEO.
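As a rough illustration of the distinction between navigational and data-void keywords, the heuristic below splits a keyword list on whether it contains a known junk news brand name. The brand list and sample keywords are assumptions; the classification in this paper was made by inspecting the SpyFu keyword data, not by this rule.

```python
# A rough heuristic, for illustration only, separating navigational (brand)
# keywords from other queries that may point at data voids. The brand list and
# sample keywords are assumptions; the paper's classification was done manually.
BRANDS = {"infowars", "breitbart", "cnsnews", "rawstory", "zerohedge"}

def classify(keyword):
    tokens = set(keyword.lower().replace(".com", "").split())
    return "navigational" if tokens & BRANDS else "other"

for kw in ["breitbart.com", "infowars", "obama homosexual", "gun control myths"]:
    print(kw, "->", classify(kw))
```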

The performance of junk news domains on Google Search

After analysing which keywords got junk news websites into the number one position, the second half of my analysis looks at larger trends in SEO strategies over time. What is the relationship between organic clicks and the value of a junk news website? How has the effectiveness of SEO keywords changed over the past 48 months? And have the changes made by Google to combat the spread of junk news on Search had an impact on its discoverability?

Junk news, organic clicks, and the value of the domain

There is a close relationship between the number of clicks a domain receives and the estimated value of that domain. Comparing Figures 4 and 5 shows that the more clicks a website receives, the higher its estimated value. A domain is generally considered more valuable when it generates large amounts of traffic, because advertisers see it as an opportunity to reach more people. Thus, the higher the value of a domain, the more likely it is to generate revenue for the operator. The median estimated value of the top-29 most popular junk news domains was $5,160 USD during the month of the 2016 presidential election, $1,666.65 USD during the 2018 State of the Union, and $3,906.90 USD during the 2018 midterm elections. Infowars.com and breitbart.com were the two highest performing junk news domains in terms of clicks and domain value. While breitbart.com maintained a more stable readership, especially around the 2016 US presidential election and the 2018 US State of the Union Address, its estimated organic click rate has steadily decreased since early 2018. In contrast, infowars.com has a more volatile readership. The spikes in clicks to infowars.com could be explained by media coverage of the website, including the defamation case filed in April 2018 against Alex Jones, who had claimed the shooting at Sandy Hook Elementary School was “completely fake” and a “giant hoax”. Since then, several internet companies — including Apple, Twitter, Facebook, Spotify, and YouTube — have banned Infowars from their platforms, and the domain has not been able to regain its clicks or its value since. This demonstrates the powerful role platforms play not only in making content visible to users, but also in controlling who can grow their website value — and ultimately generate revenue — from the content they produce and share online.

Figure 4: Estimated organic value for the top 29 junk news domains (May 2015 – March 2019)
Figure 5: Estimated organic clicks for the top 29 junk news domains (May 2015-April 2019)
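The relationship between clicks and estimated value reported above can be checked with a simple correlation, as sketched below. The monthly numbers are invented placeholders; the paper's figures rely on SpyFu's proprietary estimates.

```python
# A sketch of the comparison described above: correlating monthly organic clicks
# with estimated domain value. The numbers are invented placeholders; the paper's
# figures rely on SpyFu's proprietary estimates.
import pandas as pd

monthly = pd.DataFrame({
    "organic_clicks":  [120_000, 340_000, 910_000, 640_000, 280_000],
    "estimated_value": [1_400,   3_900,   9_800,   6_900,   3_100],
})

print(monthly["organic_clicks"].corr(monthly["estimated_value"]))  # Pearson r
```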

Junk news domains, search discoverability and Google’s response to disinformation

Figure 6 shows the estimated organic results of the top 29 junk news domains over time. The estimated organic results are the number of keywords that would organically appear in Google searches. Since August 2017, there has been a sharp decline in the number of keywords that would appear in Google. The four top-performing junk news websites (infowars.com, zerohedge.com, dailycaller.com, and breitbart.com) all appeared less frequently in top positions on Google Search based on the keywords they were optimising for. This is an interesting finding and suggests that the changes Google made to its search algorithm did indeed have an impact on the discoverability of junk news domains after August 2017. In comparison, other professional news sources (washingtonpost.com, nytimes.com, foxnews.com, nbcnews.com, bloomberg.com, bbc.co.uk, wsj.com, and cnn.com) did not see substantial drops in their search visibility during this timeframe (see Figure 7). In fact, after August 2017 there was a gradual increase in the organic results of mainstream news media.

Figure 6: Estimated organic results for the top 29 junk news domains (May 2015- April 2019)
Figure 7: Estimated organic results for mainstream media websites in the United States (May 2015-April 2019)
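A simple way to summarise the break visible in Figures 6 and 7 is to compare average monthly organic results before and after Google's August 2017 announcements, as sketched below. The monthly series is invented for illustration; the real analysis uses SpyFu's monthly estimates for each domain.

```python
# A sketch of the before/after comparison in Figures 6 and 7: average monthly
# organic results before and after Google's August 2017 algorithm announcements.
# The series below is invented; the real analysis uses SpyFu's monthly estimates.
import pandas as pd

organic = pd.Series(
    [5200, 5100, 4900, 5000, 3100, 2400, 2200, 2300],
    index=pd.period_range("2017-04", periods=8, freq="M"),
)

cutoff = pd.Period("2017-08", freq="M")
print("pre-August 2017 mean: ", organic[organic.index < cutoff].mean())
print("post-August 2017 mean:", organic[organic.index >= cutoff].mean())
```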

After almost a year, the top-performing junk news websites have regained some of their organic results, but the levels are not nearly as high as they were leading up to and preceding the 2016 presidential election. This demonstrates the power of Google’s algorithmic changes in limiting the discoverability of junk news on Search. But it also shows how junk news producers learn to adapt their strategies in order to extend the visibility of their content. In order to be effective at limiting the visibility of bad information via search, Google must continue to monitor the keywords and optimisation strategies these domains deploy — especially in the lead-up to elections — when more people will be naturally searching for news and information about politics.

Conclusion

In conclusion, the spread of junk news on the internet and the impact it has on democracy has certainly been a growing field of academic inquiry. This paper has looked at a small subset of this phenomenon, in particular the role of Google Search in assisting in the discoverability and monetisation of junk news domains. By looking at the techno-commercial infrastructure that junk news producers use to optimise their websites for paid and pseudo-organic clicks, I found:

  1. Junk news domains do not rely on Google advertisements to grow their audiences and instead focus their efforts on optimisation and keyword strategies;
  2. Navigational searches drive the most traffic to junk news websites, and data voids are used to grow the discoverability of junk news content to mostly small, but varying, degrees;
  3. Many junk news producers place advertisements on their websites and grow their value particularly around important political events; and
  4. Over time, the SEO strategies used by junk news domains have decreased in their ability to generate top placements in Google Search.

For millions of people around the world, the information Google Search recommends directly shapes how ideas and opinions about politics are formulated. The powerful role of Google as an information gatekeeper has meant that bad actors have tried to subvert its technical systems for political or economic gain. For quite some time, Google’s algorithms have come under attack by spammers and other malign actors who wish to spread disinformation, conspiracy theories, spam, and hate speech to unsuspecting users. The rise of “computational propaganda” and the variety of bad actors exploiting technology to influence political outcomes has also led to the manipulation of Search. Google’s response to the optimisation strategies used by junk news domains has had a positive effect on limiting the discoverability of these domains over time. However, the findings of this paper also show an upward trend, as junk news producers find new ways to optimise their content for higher search rankings. This game of cat and mouse is one that will continue for the foreseeable future.

While it is hard to reduce the visibility of junk news domains when individuals actively search for them, more can be done to limit the ways in which bad actors might try to optimise content to generate pseudo-organic engagement, especially around disinformation. Google can certainly do more to tweak its algorithms in order to demote known disinformation sources, as well as identify and limit the discoverability of content seeking to exploit data voids. However, there is no straightforward technical patch that Google can implement to stop various actors from trying to game their systems. By co-opting the technical infrastructure and policies that enable search, the producers of junk news are able to spread disinformation — albeit to small audiences who might use obscure search terms to learn about a particular topic.

There have also been growing pressures on regulators to force social media platforms to take greater action to limit the spread of disinformation online. But the findings of this paper hold two important lessons for policymakers. First, the disinformation problem — through both optimisation and advertising — on Google Search is not as dramatic as it is sometimes portrayed. Most of the traffic to junk news websites is generated by users performing navigational searches to find specific, well-known brands. Only a limited number of placements — as well as clicks — to junk news domains come from pseudo-organic engagement generated by data voids and other problematic keyword searches. Thus, requiring Google to take a heavy-handed approach to content moderation could do more harm than good, and might not reflect the severity of the problem. Second, the reasons why disinformation spreads on Google are reflective of deeper systemic problems within democracies: growing levels of polarisation and distrust in the mainstream media are pushing citizens towards fringe and highly partisan sources of news and information. Any solution to the spread of disinformation on Google Search will require thinking about media and digital literacy and about programmes to strengthen, support, and sustain professional journalism.

References

Allcott, H., & Gentzkow, M. (2017). Social Media and Fake News in the 2016 Election. Journal of Economic Perspectives, 31(2), 211–236. https://doi.org/10.1257/jep.31.2.211

Barrett, B., & Kreiss, D. (2019). Platform transience: Changes in Facebook’s policies, procedures, and affordances in global electoral politics. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1446

Bradshaw, S., Howard, P., Kollanyi, B., & Neudert, L.-M. (2019). Sourcing and Automation of Political News and Information over Social Media in the United States, 2016-2018. Political Communication. https://doi.org/10.1080/10584609.2019.1663322

Bradshaw, S., & Howard, P. N. (2018). Why does Junk News Spread So Quickly Across Social Media? Algorithms, Advertising and Exposure in Public Life [Working Paper]. Miami: Knight Foundation. Retrieved from https://kf-site-production.s3.amazonaws.com/media_elements/files/000/000/142/original/Topos_KF_White-Paper_Howard_V1_ado.pdf

Bradshaw, S., Neudert, L.-M., & Howard, P. (2018). Government Responses to the Malicious Use of Social Media. NATO.

Burkell, J., & Regan, P. (2019). Voting Public: Leveraging Personal Information to Construct Voter Preference. In N. Witzleb, M. Paterson, & J. Richardson (Eds.), Big Data, Privacy and the Political Process. London: Routledge.

Cadwalladr, C. (2016a, December 4). Google, democracy and the truth about internet search. The Observer. Retrieved from https://www.theguardian.com/technology/2016/dec/04/google-democracy-truth-internet-search-facebook

Cadwalladr, C. (2016b, December 11). Google is not ‘just’ a platform. It frames, shapes and distorts how we see the world. The Guardian. Retrieved from https://www.theguardian.com/commentisfree/2016/dec/11/google-frames-shapes-and-distorts-how-we-see-world

Chester, J. & Montgomery, K. (2019). The digital commercialisation of US politics—2020 and beyond. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1443

Dean, B. (2019). Google’s 200 Ranking Factors: The Complete List (2019). Retrieved April 18, 2019, from Backlinko website: https://backlinko.com/google-ranking-factors

Desiguad, C., Howard, P. N., Kollanyi, B., & Bradshaw, S. (2017). Junk News and Bots during the French Presidential Election: What are French Voters Sharing Over Twitter In Round Two? [Data Memo No. 2017.4]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved May 19, 2017, from http://comprop.oii.ox.ac.uk/wp-content/uploads/sites/89/2017/05/What-Are-French-Voters-Sharing-Over-Twitter-Between-the-Two-Rounds-v7.pdf

European Commission. (2018). A multi-dimensional approach to disinformation: report of the independent high-level group on fake news and online disinformation. Luxembourg: European Commission.

Evangelista, R., & Bruno, F. (2019). WhatsApp and political instability in Brazil: Targeted messages and political radicalisation. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1435

Farkas, J., & Schou, J. (2018). Fake News as a Floating Signifier: Hegemony, Antagonism and the Politics of Falsehood. Journal of the European Institute for Communication and Culture, 25(3), 298–314. https://doi.org/10.1080/13183222.2018.1463047

Gillespie, T. (2012). The Relevance of Algorithms. In T. Gillespie, P. J. Boczkowski, & K. Foot (Eds.), Media Technologies: Essays on Communication, Materiality and Society (pp. 167–193). Cambridge, MA: The MIT Press. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.692.3942&rep=rep1&type=pdf

Golebiewski, M., & Boyd, D. (2018). Data voids: where missing data can be easily exploited. Retrieved from Data & Society website: https://datasociety.net/wp-content/uploads/2018/05/Data_Society_Data_Voids_Final_3.pdf

Google. (2019). How Google Search works: Search algorithms. Retrieved April 17, 2019, from https://www.google.com/intl/en/search/howsearchworks/algorithms/

Granka, L. A. (2010). The Politics of Search: A Decade Retrospective. The Information Society, 26(5), 364–374. https://doi.org/10.1080/01972243.2010.511560

Guo, L., & Vargo, C. (2018). “Fake News” and Emerging Online Media Ecosystem: An Integrated Intermedia Agenda-Setting Analysis of the 2016 U.S. Presidential Election. Communication Research. https://doi.org/10.1177/0093650218777177

Hedman, F., Sivnert, F., Kollanyi, B., Narayanan, V., Neudert, L. M., & Howard, P. N. (2018, September 6). News and Political Information Consumption in Sweden: Mapping the 2018 Swedish General Election on Twitter [Data Memo No. 2018.3]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from https://comprop.oii.ox.ac.uk/wp-content/uploads/sites/93/2018/09/Hedman-et-al-2018.pdf

Hoffmann, S., Taylor, E., & Bradshaw, S. (2019, October). The Market of Disinformation. [Report]. Oxford: Oxford Information Labs; Oxford Technology & Elections Commission, University of Oxford. Retrieved from https://oxtec.oii.ox.ac.uk/wp-content/uploads/sites/115/2019/10/OxTEC-The-Market-of-Disinformation.pdf

Howard, P., Ganesh, B., Liotsiou, D., Kelly, J., & Francois, C. (2018). The IRA and Political Polarization in the United States, 2012-2018 [Working Paper No. 2018.2]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from https://comprop.oii.ox.ac.uk/research/ira-political-polarization/

Howard, P. N., Bolsover, G., Kollanyi, B., Bradshaw, S., & Neudert, L.-M. (2017). Junk News and Bots during the U.S. Election: What Were Michigan Voters Sharing Over Twitter? [Data Memo No. 2017.1]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from http://comprop.oii.ox.ac.uk/2017/03/26/junk-news-and-bots-during-the-u-s-election-what-were-michigan-voters-sharing-over-twitter/

Howard, P. N., Kollanyi, B., Bradshaw, S., & Neudert, L.-M. (2017). Social Media, News and Political Information during the US Election: Was Polarizing Content Concentrated in Swing States? [Data Memo No. 2017.8]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from http://comprop.oii.ox.ac.uk/wp-content/uploads/sites/93/2017/09/Polarizing-Content-and-Swing-States.pdf

Introna, L., & Nissenbaum, H. (2000). Shaping the Web: Why the Politics of Search Engines Matters. The Information Society, 16(3), 169–185. https://doi.org/10.1080/01972240050133634

Kaminska, M., Galacher, J. D., Kollanyi, B., Yasseri, T., & Howard, P. N. (2017). Social Media and News Sources during the 2017 UK General Election. [Data Memo No. 2017.6]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from https://www.oii.ox.ac.uk/blog/social-media-and-news-sources-during-the-2017-uk-general-election/

Kim, Y. M., Hsu, J., Neiman, D., Kou, C., Bankston, L., Kim, S. Y., … Raskutti, G. (2018). The Stealth Media? Groups and Targets behind Divisive Issue Campaigns on Facebook. Political Communication, 35(4), 515–541. https://doi.org/10.1080/10584609.2018.1476425

Lakoff, G. (2018, January 2). Trump uses social media as a weapon to control the news cycle. Retrieved from https://twitter.com/GeorgeLakoff/status/948424436058791937

Lazer, D. M. J., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., … Zittrain, J. L. (2018). The science of fake news. Science, 359(6380), 1094–1096. https://doi.org/10.1126/science.aao2998

Lewis, P. & Hilder, P. (2018, March 23). Leaked: Cambridge Analytica’s Blueprint for Trump Victory. The Guardian. Retrieved from: https://www.theguardian.com/uk-news/2018/mar/23/leaked-cambridge-analyticas-blueprint-for-trump-victory

Machado, C., Kira, B., Hirsch, G., Marchal, N., Kollanyi, B., Howard, Philip N., … Barash, V. (2018). News and Political Information Consumption in Brazil: Mapping the First Round of the 2018 Brazilian Presidential Election on Twitter [Data Memo No. 2018.4]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from http://blogs.oii.ox.ac.uk/comprop/wp-content/uploads/sites/93/2018/10/machado_et_al.pdf

McKelvey, F. (2019). Cranks, Clickbaits and Cons: On the acceptable use of political engagement platforms. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1439

Metaxa-Kakavouli, D., & Torres-Echeverry, N. (2017). Google’s Role in Spreading Fake News and Misinformation. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3062984

Metaxas, P. T. (2010). Web Spam, Social Propaganda and the Evolution of Search Engine Rankings. In J. Cordeiro & J. Filipe (Eds.), Web Information Systems and Technologies (Vol. 45, pp. 170–182). https://doi.org/10.1007/978-3-642-12436-5_13

Nagy, P., & Neff, G. (2015). Imagined Affordance: Reconstructing a Keyword for Communication Theory. Social Media + Society, 1(2). https://doi.org/10.1177/2056305115603385

Narayanan, S., & Kalyanam, K. (2015). Position Effects in Search Advertising and their Moderators: A Regression Discontinuity Approach. Marketing Science, 34(3), 388–407. https://doi.org/10.1287/mksc.2014.0893

Neff, G., & Nagy, P. (2016). Talking to Bots: Symbiotic Agency and the Case of Tay. International Journal of Communication,10, 4915–4931. Retrieved from https://ijoc.org/index.php/ijoc/article/view/6277

Neudert, L.-M., Howard, P., & Kollanyi, B. (2017). Junk News and Bots during the German Federal Presidency Election: What Were German Voters Sharing Over Twitter? [Data Memo 2 No. 2017.2]. Oxford: Project on Computational Propaganda, Oxford University. Retrieved from http://comprop.oii.ox.ac.uk/wp-content/uploads/sites/89/2017/03/What-Were-German-Voters-Sharing-Over-Twitter-v6-1.pdf

Nielsen, R. K., & Graves, L. (2017). “News you don’t believe”: Audience perspectives on fake news. Oxford: Reuters Institute for the Study of Journalism, University of Oxford. Retrieved from https://reutersinstitute.politics.ox.ac.uk/sites/default/files/2017-10/Nielsen&Graves_factsheet_1710v3_FINAL_download.pdf

Noble, S. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press.

Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., & Granka, L. (2007). In Google We Trust: Users’ Decisions on Rank, Position, and Relevance. Journal of Computer-Mediated Communication, 12(3), 801–823. https://doi.org/10.1111/j.1083-6101.2007.00351.x

Pasquale, F. (2015). The Black Box Society. Cambridge: Harvard University Press.

Persily, N. (2017). The 2016 U.S. Election: Can Democracy Survive the Internet? Journal of Democracy, 28(2), 63–76. https://doi.org/10.1353/jod.2017.0025

Schroeder, R. (2014). Does Google shape what we know? Prometheus, 32(2), 145–160. https://doi.org/10.1080/08109028.2014.984469

Silverman, C. (2016, November 16). This Analysis Shows How Viral Fake Election News Stories Outperformed Real News On Facebook. Buzzfeed. Retrieved July 25, 2017 from https://www.buzzfeed.com/craigsilverman/viral-fake-election-news-outperformed-real-news-on-facebook

SpyFu. (2019). SpyFu - Competitor Keyword Research Tools for AdWords PPC & SEO. Retrieved April 19, 2019, from https://www.spyfu.com/

Star, S. L., & Ruhleder, K. (1994). Steps Towards an Ecology of Infrastructure: Complex Problems in Design and Access for Large-scale Collaborative Systems. Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, 253–264. New York: ACM.

Tambini, D., Anstead, N., & Magalhães, J. C. (2017, June 6). Labour’s advertising campaign on Facebook (or “Don’t Mention the War”) [Blog Post]. Retrieved April 11, 2019, from Media Policy Blog website: http://blogs.lse.ac.uk/mediapolicyproject/

Tandoc, E. C., Lim, Z. W., & Ling, R. (2018). Defining “Fake News”: A typology of scholarly definitions. Digital Journalism, 6(2). https://doi.org/10.1080/21670811.2017.1360143

Tucker, J. A., Guess, A., Barberá, P., Vaccari, C., Siegel, A., Sanovich, S., Stukal, D., & Nyhan, B. (2018, March). Social Media, Political Polarization, and Political Disinformation: A Review of the Scientific Literature [Report]. Menlo Park: William and Flora Hewlett Foundation. Retrieved from https://eprints.lse.ac.uk/87402/1/Social-Media-Political-Polarization-and-Political-Disinformation-Literature-Review.pdf

Vaidhyanathan, S. (2011). The Googlization of Everything (1st ed.). Berkeley: University of California Press.

Vargo, C. J., Guo, L., & Amazeen, M. A. (2018). The agenda-setting power of fake news: A big data analysis of the online media landscape from 2014 to 2016. New Media & Society, 20(5), 2028–2049. https://doi.org/10.1177/1461444817712086

Bashyakarla, V. (2019). Towards a holistic perspective on personal data and the data-driven election paradigm. Internet Policy Review, 8(4). Retrieved from https://policyreview.info/articles/news/towards-holistic-perspective-personal-data-and-data-driven-election-paradigm/1445

Wardle, C. (2017, February 16). Fake news. It’s complicated. First Draft News. Retrieved July 20, 2017, from https://firstdraftnews.com:443/fake-news-complicated/

Wardle, C., & Derakhshan, H. (2017). Information Disorder: Toward an interdisciplinary framework for research and policy making [Report No. DGI(2017)09]. Strasbourg: Council of Europe. Retrieved from https://rm.coe.int/information-disorder-report-november-2017/1680764666

Winner, L. (1980). Do Artifacts Have Politics? Daedalus, 109(1), 121–136. Retrieved from http://www.jstor.org/stable/20024652

Appendix 1

Junk news seed list (Computational Propaganda Project’s top-29 junk news domains from the 2018 US midterm elections).

www.americanthinker.com, www.barenakedislam.com, www.breitbart.com, www.cnsnews.com, www.dailywire.com, www.infowars.com, www.libertyheadlines.com, www.lifenews.com, www.rawstory.com, www.thegatewaypundit.com, www.truepundit.com, www.zerohedge.com, 100percentfedup.com, bigleaguepolitics.com, committedconservative.com, dailycaller.com, davidharrisjr.com, gellerreport.com, michaelsavage.com, newrightnetwork.com, pjmedia.com, reverbpress.news, theblacksphere.net, thedailydigest.org, thefederalist.com, ussanews.com, theoldschoolpatriot.com, thepoliticalinsider.com, truthfeednews.com.

Appendix 2

Table 2: A sample list of up to ten keywords from each junk news domain in the sample when the keyword reached the first position.

100percentfedup.com: gruesome videos (6); snopes exposed (5); gruesome video (4); teendreamers (2); bush cheney inauguration (2)

americanthinker.com: medienkritic (23); problem with taxes (22); janet levy (19); article on environmental protection (18); maya angelou criticism (18); supply and demand articles 2011 (17); ezekiel emanuel complete lives system (16); articles on suicide (12); American Thinker Coupons (11); truth about obama (10)

barenakedislam.com: berg beheading video (11); against islam (11); beheadings (10); iraquis beheaded (10); muslim headgear (8); torture clips (7); los angeles islam pictures (7); beheaded clips (7); berg video (7); hostages beheaded (6)

bigleaguepolitics.com: habermans (1); fbi whistleblower (1); ron paul supporters (1)

breitbart.com: big journalism (39); big government breitbart (39); breitbart blog (39); www.breitbart.com (39); big hollywood (39); breitbart hollywood (39); breitbart.com (39); big hollywood blog (39); big government blog (39); breitbart big hollywood (39)

cnsnews.com: cns news (39); cnsnews (39); conservative news service (39); christian news service (21); cns (20); major corporations (20); billy graham daughter (18); taxing the internet (17); pashtun sexuality (15); record tax (15)

dailycaller.com: the daily caller (37); vz 58 vs ak 47 (33); condition black (28); patriot act changes (26); 12 hour school (25); common core stories (25); courtroom transcript (23); why marijuana shouldnt be legal (22); why we shouldnt legalize weed (22); why shouldnt marijuana be legalized (22)

dailywire.com: states bankrupt (22); ms 13 portland oregon (15); the gadsen flag (12); f word on tv (12); against gun control facts (10); end of america 90 (9); racist blacks (8); associates clinton (8); diebold voting machine (8); diebold machines (8)

gellerreport.com: geller report (1)

infowars.com: www infowars (39); infowars com (39); info wars (39); infowars (39); www infowars com (39); al-qaeda 100 pentagon run (38); info war today (35); war info (34); infowars moneybomb (34); feminizing uranium (33)

libertyheadlines.com: accusers dod (2); liberty security guard bucks country (1)

lifenews.com: successful people with down syndrome (39); life news (35); lifenews.com (35); fetus after abortion (26); anti abortion quotes (21); pro life court cases (17); rescuing hug (16); process of aborting a baby (15); different ways to abort a baby (14); adoption waiting list statistics (14)

michaelsavage.com: www michaelsavage com (19); michaelsavage com (19); michaelsavage (18); michael savage com (18); michaelsavage radio (17); michael savage (17); savage nation (15); michael savage nation (14); michael savage savage nation (13); the savage nation (12)

pjmedia.com: belmont club (39); belmont club blog (39); pajamas media (39); dr helen (38); instapundit blog (38); instapundit (33); pj media (33); instapundit. (32); google ddrive (28); instapundits (27)

rawstory.com: the raw story (39); raw story (39); rawstory (39); rawstory.com (39); westboro baptist church tires slashed (35); the raw (25); mormons in porn (22); norm colemans teeth (19); xe services sold (18); duggers (17)

theblacksphere.net: black sphere (28); dwayne johnson gay (10); george soros private security (1); bombshell barack (1); madame secretary (1); head in vagina (1); mexicans suck (1); obama homosexual (1); comments this (1)

thefederalist.com: the federalist (39); federalist (30); gun control myths (26); considering homeschooling (23); why wont it work technology (22); debate iraq war (21); lesbian children (20); why homeschooling (19); home economics course (18); iraq war debate (17)

thegatewaypundit.com: thegatewaypundit.com (39); civilian national security force (10); safe school czar (8); hillary clinton weight gain 2011 (8); RSS Pundit (7); hillary clinton weight gain (7); all perhaps hillary (4); hillary clinton gained weight (4); london serendip i tea camp (4); whoa it (4)

thepoliticalinsider.com: obama blames (19); michael moore sucks (14); marco rubio gay (11); weapons mass destruction iraq (10); weapons of mass destruction found (10); wmd iraq (10); obama s plan (9); chuck norris gay (9); how old is bill clinton (8); stop barack obama (7)

truepundit.com: john kerrys daughter (8); john kerrys daughters (5); sex email (2); poverty warrior (2); john kerry daughter (1); RSS Pundit (1); whistle new (1); pay to who (1)

truthfeednews.com: nfl.comm (5)

ussanews.com: imigration expert (2); meabolic syndrome (1)

zerohedge.com: zero hedge (33); unempolyment california (24); hayman capital letter (24); dennis gartman performance (24); the real barack obama (23); meredith whitney blog (22); weaight watchers (22); 0hedge (22); doug kass predictions (19); usa hyperinflation (17)

Footnotes

1. Organic engagement is used to describe authentic user engagement, where an individual might click a website or link without being prompted. This is different from “transactional engagement”, where a user engages with content through prompting via paid advertising. In contrast, I use the term “pseudo-organic engagement” to capture the idea that SEO practitioners are generating clicks through the manipulation of keywords that move websites closer to the top of search engine rankings. An important aspect of pseudo-organic engagement is that these results are indistinguishable from those that have “earnt” their search ranking, meaning users may be more likely to treat the source as authoritative despite the fact that their ranking has been manipulated.

2. It is important to note that AdWord purchases can also be displayed on affiliate websites. These “display ads” appear on websites and generate revenue for the website operator.

3. For the US presidential election, 19.53 million tweets were collected between 1 November 2016, and 9 November 2016; for the State of the Union Address 2.26 million tweets were collected between 24 January 2018, and 30 January 2018; and for the 2018 US midterm elections 2.5 million tweets were collected between 21-30 September 2018 and 6,986 Facebook groups between 29 September 2018 and 29 October 2018. For more information see Bradshaw et al., 2019.

4. Elections include: 2016 United States presidential election, 2017 French presidential election, 2017 German federal election, 2017 Mexican presidential election, 2018 Brazilian presidential election, and the 2018 Swedish general election.

Data-driven political campaigns in practice: understanding and regulating diverse data-driven campaigns


This paper is part of Data-driven elections, a special issue of Internet Policy Review guest-edited by Colin J. Bennett and David Lyon.

Introduction

Data has become an important part of how we understand political campaigns. In reviewing coverage of elections – particularly in the US – the idea that political parties and campaigners now utilise data to deliver highly targeted, strategic and successful campaigns is readily found. In academic and non-academic literature, it has been argued that “[i]n countries around the world political parties have built better databases, integrated online and field data, and created more sophisticated analytic tools to make sense of these traces of the electorate” (Kreiss and Howard, 2010, p. 1; see also in t’Veld, 2017, pp. 2-3). These tools are reported to allow voters to “be monitored and targeted continuously and in depth, utilising methods intricately linked with and drawn from the commercial sector and the vast collection of personal and individual data” (Kerr Morrison, Naik, and Hankey, 2018, p. 11). The Trump campaign in 2016 is accordingly claimed to have “target[ed] 13.5 million persuadable voters in sixteen battleground states, discovering the hidden Trump voters, especially in the Midwest” (Persily, 2017, p. 65). On the basis of such accounts, it appears that data-driven campaigning is coming to define electoral practice – especially in the US – and is now key to understanding modern campaigns.

Yet, at the same time, important questions have been raised about the sophistication and uptake of data-driven campaign tools. As Baldwin-Philippi (2017) has argued, there are certain “myths” about data-driven campaigning. Studying campaigning practices Baldwin-Philippi has shown that “all but the most sophisticated digital and data-driven strategies are imprecise and not nearly as novel as the journalistic feature stories claim” (2017, p. 627). Hersh (2015) has also shown that the data that parties possess about voters is not fine-grained, and tends to be drawn from public records that contain certain standardised information. Moreover, Bennett has highlighted the significant incentive that campaign consultants and managers have to emphasise the sophistication and success of their strategies, suggesting that campaigners may not be offering an accurate account of current practices (2016, p. 261; Kreiss and McGregor, 2018).

These competing accounts raise questions about the nature of data-driven campaigning and the extent to which common practices in data use are found around the globe. These ideas are conceptually important for our understanding of developments in campaigning, but they also have significance for societal responses to the practice of data-driven campaigning. With organisations potentially adopting different data-driven campaigning practices it is important to ask which forms of data use are seen to be democratically acceptable or problematic. 1 These questions are particularly important given the recent interest from international actors and politicians in understanding and responding to the use of data analytics (Information Commissioner’s Office, 2018a), and specifically practices at Facebook (Kang et al., 2018). Despite growing pressure from these actors to curtail problematic data-driven campaigning practices, it is as yet unclear precisely what is unacceptable and how prevalent these practices are in different organisations and jurisdictions. For these reasons there is a need to understand more about data-driven campaigning.

To generate this insight, in this article I pose the question: “what practices characterise data-driven campaigning?” and develop a comparative analytical framework that can be used to understand, map and consider responses to data-driven campaigning. Identifying three facets of this question, I argue that there can be variations in who is using data in campaigns, what the sources of data are, and how data informs communication in campaigns. Whilst not exhaustive, these questions and the categories they inspire are used to outline the diverse practices that constitute data-driven campaigning within single and different organisations in different countries. It is argued that our understanding of who, what and how data is being used is critical to debates around the democratic acceptability of data-driven campaigning and provides essential insights required when contemplating a regulatory response.

This analysis and the frameworks it inspires have been developed following extensive analysis of the UK case. Drawing on a three-year project exploring the use of data-driven campaigning within political parties, the analysis discusses often overlooked variations in how data is used. In highlighting these origins I contend that these questions are not unique to the UK case, but can inspire analysis around the globe and in different organisations. Indeed, as I will discuss below, this form of inquiry is to be encouraged as comparative analysis makes it possible to explore how different legal, institutional and cultural contexts affect data-driven campaigning practices. Furthermore, analysis of different kinds of organisation makes it possible to understand the extent to which party practices are unique. Although this article is therefore inspired by a particular context and organisational type, the questions and frameworks it provides can be used to unpack and map the diversity of data-driven campaigning practices, providing conceptual clarity able to inform a possible regulatory response.

Data and election campaigns

The relationship between data and election campaigns is well established, particularly in the context of political parties. Describing the focus of party campaigning, Dalton, Farrell and McAllister (2013) outline the longstanding interest parties have in collecting data that can be analysed to (attempt to) achieve electoral success. In their account, “candidates and party workers meet with individual voters, and develop a list of people’s voting preferences. Then on election day a party worker knocks on the doors of prospective supporters at their homes to make sure they cast their ballot and often offers a ride to the polls if needed” (p. 56). Whilst parties in different contexts are subject to different regulations and norms that affect the data they can collect and use (Kreiss and Howard, 2010), it is common for them to be provided with information by the state about voters’ age, registered status and turnout history (Hersh, 2015). In addition, parties then tend to gather their own data about voter interests, voting preferences and degree of support, allowing them to build large data sets and email lists at national and local levels. Although regulated – most notably through the General Data Protection Regulation (GDPR), which outlines rules in Europe for how data can be collected, used and stored – parties’ use of data is often seen to be democratically permissible as it enables participation and promotes an informed citizenry.

In recent history, the use of data by parties is seen to have shifted significantly, making it unclear how campaigns are organised and whether they are engaging in practices that may not be democratically appropriate. In characterising these practices, two very different accounts of data use have emerged. On the one hand, scholars such as Gibson, Römmele and Williamson (2014) have argued that parties now adopt data-driven campaigns that “focus on mining social media platforms to improve their voter profiling efforts” (p. 127). From this perspective, parties are now often seen to be routinely using data to gain information, communicate and evaluate campaign actions.

In terms of information, it has been argued that data-driven campaigning draws on new sources of data (often from social media and online sources) to allow parties to search for patterns in citizens’ attitudes and behaviours. Aggregating data from many different sources at a level hitherto impossible, data-driven campaigning techniques are seen to allow parties to use techniques common in the commercial sector to “construct predictive models to make targeting campaign communications more efficient” (Nickerson and Rogers, 2014, p. 54; Castleman, 2016; Hersh, 2015, p. 28). Similarly, attention has been directed to the capacity to use algorithms to identify “look-alike audiences” (Tactical Tech, 2019, pp. 37-69), 2 allowing campaigners to find new supporters who possess the same attributes as those already pledged to a campaign (Kreiss, 2017, p. 5). Data-driven campaigning techniques are therefore seen to offer campaigns additional information with minimal investment of resources (as one data analyst becomes able to find as many target voters as an army of grassroots activists) (Dobber et al., 2017, p. 4).
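
To make the logic of such modelling more concrete, the short sketch below illustrates the basic idea in Python: a toy propensity model is fitted to a handful of invented canvassing records and then used to score voters the campaign has never contacted, mimicking the “look-alike” principle of finding people who resemble known supporters. All records, field names and figures are hypothetical, and real campaign systems are considerably more elaborate; this is a minimal illustration, not a description of any party’s actual tooling.

```python
# Minimal, hypothetical sketch of the predictive-modelling logic described above:
# score voters the campaign has never contacted by how closely they resemble known
# supporters. All records, field names and figures are invented for illustration.
from sklearn.linear_model import LogisticRegression

# Toy canvassing returns: [age, lives in a city (1/0), elections voted in (0-4)]
canvassed_voters = [
    [68, 0, 4], [55, 0, 3], [61, 1, 4],   # recorded as supporters
    [23, 1, 0], [31, 1, 1], [27, 0, 0],   # recorded as non-supporters
]
is_supporter = [1, 1, 1, 0, 0, 0]

model = LogisticRegression().fit(canvassed_voters, is_supporter)

# "Look-alike" step: estimate a support propensity for uncontacted voters.
uncontacted = [[64, 0, 3], [29, 1, 1]]
for record, score in zip(uncontacted, model.predict_proba(uncontacted)[:, 1]):
    print(record, f"estimated support propensity: {score:.2f}")
```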

In addition, data-driven campaigning has facilitated targeted communication (Hersh, 2015, pp. 1-2), allowing particular messages to be conveyed to certain kinds of people. These capacities are seen to enable stratified campaign messaging, allowing personalised messages that can be delivered fast through cheap and easy to use online (and offline) interfaces. Data-driven campaigning has therefore been reported to allow campaigners to “allocate their finite resources more efficiently” (Bennett, 2016, p. 265), “revolutioniz[ing] the process” of campaigning (International IDEA, 2018, p. 7; Chester and Montgomery, 2017).

It has also been claimed that data-driven campaigning enables parties to evaluate campaign actions and gather feedback in a way previously not possible. Utilising message-testing techniques such as A/B testing, and monitoring response rates and social media metrics, campaigners are seen to be able to use data to analyse – in real time – the impact of campaign actions. Whether monitoring the effect of an email title on the likelihood that it is opened by recipients (Nickerson and Rogers, 2014, p. 57), or testing the wording that makes a supporter most likely to donate funds, data can be gathered and analysed by campaigns seeking to test whether their interventions work (Kreiss and McGregor, 2018, pp. 173-4; Kerr Morrison et al., 2018, p. 12; Tactical Tech, 2019). 3
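
As a simple illustration of the message-testing logic described here, the sketch below compares the open rates of two invented email subject lines using a standard two-proportion z-test. The counts are fabricated for illustration only; a campaign platform would log such figures automatically and may use quite different statistics.

```python
# Illustrative A/B test on two email subject lines, mirroring the message-testing
# logic described above. The open counts are invented.
from math import sqrt
from statistics import NormalDist

def ab_test(opens_a, sent_a, opens_b, sent_b):
    """Two-proportion z-test: did subject line B outperform subject line A?"""
    p_a, p_b = opens_a / sent_a, opens_b / sent_b
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    se = sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return p_a, p_b, z, p_value

p_a, p_b, z, p = ab_test(opens_a=420, sent_a=5000, opens_b=510, sent_b=5000)
print(f"Open rate A: {p_a:.1%}, open rate B: {p_b:.1%}, z = {z:.2f}, p = {p:.3f}")
```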

These new capacities are often highlighted in modern accounts of campaigning and suggest that there has been significant and rapid change in the activities of campaigning organisations. Whilst prevalent, this idea has, however, been challenged by a small group of scholars who have offered a more sceptical account, arguing that “the rhetoric of data-driven campaigning and the realities of on-the-ground practices” are often misaligned (Baldwin-Philippi, 2017, p. 627).

The sceptical account

A number of scholars of campaign practice have questioned the idea that elections are characterised by data-driven campaigning and have highlighted a gulf between the rhetoric and reality of practices here. Nielsen, for example, has shown that whilst data-driven tools are available, campaigns continue to rely primarily on “mundane tools” (2010, p. 756) such as email to organise their activities. Hersh also found that, in practice, campaigns do not possess “accurate, detailed information about the preference and behaviours of voters” (2015, p. 11), but rely instead on relatively basic, publically available data points. Similar observations led Baldwin-Philippi to conclude that the day-to-day reality of campaigning is “not nearly as novel as the journalistic feature stories claim” as “campaigns often do not execute analytic-based campaigning tactics as fully or rigorously as possible” (2017, p. 631). In part the gulf between possible and actual practice has emerged because parties – especially at a grassroots level – lack the capacity and expertise to utilise data-driven campaigning techniques (Ibid., p. 631). There is accordingly little evidence that parties are routinely using data to gain more information about voters, to develop new forms of targeted communication or to evaluate campaign interventions. Indeed, in a study of the UK, Anstead et al. found no evidence “that campaigns were seeking to send highly targeted but contradictory messages to would-be supporters”, with their study of Facebook advertisements showing that parties placed adverts that reflected “the national campaigns parties were running” (unpublished, p. 3).

Other scholars have also questioned the scale of data-use by highlighting the US-centric focus of much scholarship on political campaigns (Kruschinski and Haller, 2017; Dobber et al., 2017). Kreiss and Howard (2010) have highlighted important variations in campaign regulation that restrict the practices of data-driven campaigns (see also: Bennett, 2016). In this way, a study of German campaigning practices by Kruschinski and Haller (2017) highlights how regulation of data collection, consent and storage means that “German campaigners cannot build larger data-bases for micro-targeting” (p. 8). Elsewhere Dobber et al. (2017, p. 6) have highlighted how different electoral systems, regulatory systems and democratic cultures can inform the uptake of data-driven campaigning tools. This reveals that, whilst often discussed in universal terms, there are important country and party level variations that reflect different political, social and institutional contexts. 4 These differences are not, however, often highlighted in existing accounts of data-driven campaigning.

Reflecting on reasons for this gulf in rhetoric and practice, some attention has been directed to the incentives certain actors have to “sell” the sophistication and success of data-driven campaigning practices. For Bennett, political and technical consultants “are eager to tout the benefits of micro-targeting and data-driven campaigning, and to sell a range of software applications, for both database and mobile environments” (2016, p. 261). Indeed, with over 250 companies operating worldwide that specialise in the use of individual data in political campaigns (Kerr Morrison, Naik, and Hankey, 2018, p. 20), there is a clear incentive for many actors to “oversell” the gains to be achieved through the use of data-targeting tools (a behaviour Cambridge Analytica has, for example, been accused of). Whatever the causes of these diverging narratives, it is clear that our conceptual understanding of the nature of data-driven campaigning, and our empirical understanding of how extensively different practices are found, are underdeveloped. We therefore currently lack clear benchmarks against which to monitor the form and extent of data-driven campaigning.

These deficiencies in our current conceptualisation of data-driven campaigning are particularly important because there has been recent (and growing) attention paid to the need to regulate data-use in campaigns. Indeed, around the globe calls for regulation have been made citing concerns about the implications of data-driven campaigning for privacy, political debate, transparency and social fragmentation (Dobber et al, 2017, p. 2). In the UK context, for example, the Information Commissioner, Elizabeth Denham, launched an inquiry into the use of data analytics for political purposes by proclaiming:

[w]hat we're looking at here, and what the allegations have been about, is mashing up, scraping, using large amounts of personal data, online data, to micro target or personalise or segment the delivery of the messages without individuals' knowledge. I think the allegation is that fair practices and fair democracy is under threat if large data companies are processing data in ways that are invisible to the public (quoted in Haves, 2018, pp. 2-3).

Similar concerns have been raised by the Canadian Standing Committee on Access to Information, Privacy and Ethics, the US Senate Select Committee on Intelligence, and by international bodies such as the European Commission. These developments are particularly pertinent because the conceptual and empirical ambiguities highlighted above make it unclear which data-driven campaign practices are problematic, and how extensively they are in evidence.

It is against this backdrop that I argue there is a need to unpack the idea of data-driven campaigning by asking “what practices characterise data-driven campaigning?”. Posing three supplementary questions, in the remainder of the article I provide a series of conceptual frameworks that can be used to understand and map a diversity of data use practices that are currently obscured by the idea of data-driven campaigning. This intervention aims not only to clarify our conceptual understanding of data-driven campaigning practices, and to provide a template for future empirical research, but also to inform debate about the democratic acceptability of different practices and the form any regulatory response should take.

Navigating the practice of data-driven campaigns

Whilst often spoken about in uniform terms, data-driven campaigning practices come in a variety of different forms. To begin to understand the diversity of different practices, it is useful to pose three questions:

  1. Who is using data in campaigns?
  2. What are the sources of campaign data?
  3. How does data inform communication?

For each question, I argue that it is possible to identify a range of answers rather than single responses. Indeed, different actors, sources and communication strategies can be associated with data use within single as well as between different campaigns. Recognising this, I develop three analytical frameworks (one for each question) that can be used to identify, map and contemplate different practices.

These frameworks have been designed to enable comparative analysis between different countries and organisations, highlighting the many different ways in which data is used. Whilst not applied empirically within this article, the ideal type markers outlined below can be operationalised to map different practices. In doing so it should be expected that a spectrum of different positions will be found within any single organisation. Whilst it is not within the scope of this paper to fully operationalise these frameworks, methods of inquiry are discussed to highlight how data may be gathered and used in future analysis. In the discussion below, I therefore offer these frameworks as a conceptual device that can be built upon and extended in the future to generate comparative empirical insights. This form of empirical analysis is vital because it is expected that answers to the three questions will vary depending on the specific geographic or organisational context being examined, highlighting differences in data driven campaigning that need to be recognised by those considering regulation and reform.

Who is using data in campaigns?

When imagining the orchestrators of data-driven campaigning the actors that come to mind are often data specialists who provide insights for party strategists about how best to campaign. Often working for an external company or hired exclusively for their data expertise, these actors have received much coverage in election campaigns. Ranging from the now notorious Cambridge Analytica, to established companies such as BlueStateDigital and eXplain (formerly Liegey Muller Pons), there is often evidence that professional actors facilitate data-driven campaigns. Whilst the idea that parties utilise professional expertise is not new (Dalton et al., 2001, p. 55; Himmelweit et al., 1985, pp. 222-3), data professionals are seen to have gained particular importance because “[n]ew technologies require new technicians” (Farrell et al., 2001). This means that campaigners require external, professional support to utilise new techniques and tools (Kreiss and McGregor, 2018; Nickerson and Rogers, 2014, p. 70). Much commentary therefore gives the impression that data-driven campaigning is being facilitated by an elite group of professional individuals with data expertise. For those concerned about the misuse of data and the need to curtail practices seen to have negative democratic implications, this conception suggests that it is the actions of a very small group that are of concern. And yet, as the literature on campaigns demonstrates, parties are reliant on the activism of local volunteers (Jacobson, 2015), and often lack the funds to pay for costly data expertise (indeed, in many countries spending limits prevent campaigners from paying for such expertise). As a result, much data-driven campaigning is not conducted by expert data professionals.

In thinking through this point, it is useful to note that those conducting data-driven campaigning can have varying professional status and levels of expertise. These differences need to be recognised because they affect both who researchers study when they seek to examine data-driven campaigning, but also whose actions need to be regulated or overseen to uphold democratic norms. 5 Noting this, it is useful to draw two conceptual distinctions between professional and activist data users, and between data novices and experts. These categories interact, allowing four “ideal type” positions to be identified in Figure 1.

Figure 1: Who is using data in campaigns? 6

Looking beyond the “expert data professionals” who often spring to mind when discussing data-driven campaigning, Figure 1 demonstrates that there can be different actors using data in campaigns. It is therefore common to find “professionals without data expertise” who are employed by a party. Whilst often utilising or collecting data, these individuals do not possess the knowledge to analyse data or develop complex data-driven interventions. Interestingly, this group has been understudied in the context of campaigns, meaning the precise differences between external and internal professionals are not well understood.

In addition to professionals, Figure 1 also shows that data-driven campaigning is performed by activists who can vary in their degree of expertise. Some, described here as “expert data activists”, can possess specialist knowledge – often having many of the same skills as expert data professionals. Others, termed “activists without data expertise”, lack even basic understandings of digital technology (let alone data-analysis) (Nielsen, 2012). Some attention has been paid to activists’ digital skills in recent elections with, for example, coverage of digital expertise amongst Momentum activists in the UK (Zagoria and Schulkind, 2017) and Bernie Sanders activists in the US (Penney, 2017). And yet, other studies have suggested that such expertise is not common amongst activists (Nielsen, 2012).

These classifications therefore suggest that data-driven campaigning can and is being conducted by very different actors who vary in their relationship with the party, and in their expertise. Currently we have little insight into the extent to which these different actors dominate campaigns, making it difficult to determine who is using data, and hence whose activities (if any) are problematic. This indicates the need for future empirical analysis that sets out to determine the prevalence and relative power of these different actors within different organisations. Whilst space prevents a full elucidation of the markers that could be used for this analysis, it would be possible to map organisational structures and use surveys to gauge the extent of data-expertise present amongst professionals and activists. In turn, these insights could be mapped against practices to determine who was using data in problematic ways. It may, for example, be that whilst “expert data professionals” are engaging in practices that raise questions about the nature of democratic debate (such as micro-targeting), “activists without data expertise” may be using data in ways that raise concerns about data security and privacy.

Knowing who is using data how is critical for thinking about where any response may be required, but also when considering how a response can be made. Far from being subject to the same forms of oversight these different categories of actors are subject to different forms of control. Whilst professionals tend to be subject to codes of conduct that shape data use practices, or can be held accountable by the threat of losing their employment, the activities of volunteers can be harder to regulate. As shown by Nielsen (2012), even when provided with central guidance and protocols, local activists often diverge from central party instructions, reflecting a classic structure/agency dilemma. This suggests not only that the activities of different actors may require monitoring and regulation, but also that different responses may be required. The question “who is using data in campaigns?” therefore spotlights a range of practices and democratic challenges that are often overlooked, but which need to be appreciated in developing our understanding and any regulatory response.

What are the sources of campaign data?

Having looked at who is using data in campaigns, it is, second, important to ask what are the sources of campaign data? The presumption inherent in much coverage of data-driven campaigning is that campaigners possess complex databases that hold numerous pieces of data about each and every individual. The International Institute for Democracy and Electoral Assistance (IDEA), for example, has argued that parties “increasingly use big data on voters and aggregate them into datasets” which allow them to “achieve a highly detailed understanding of the behaviour, opinions and feelings of voters, allowing parties to cluster voters in complex groups” (2018, p. 7; p. 5). It therefore often appears that campaigns use large databases of information composed of data from different (and sometimes questionable) sources. However, as suggested above, the data that campaigns possess is often freely disclosed (Hersh, 2015), and many campaigners are currently subject to privacy laws around the kind of data they can collect and utilise (Bennett, 2016; Kruschinski and Haller, 2017).

To understand variations and guide responses, four more categories are identified. These are determined by thinking about variations in the form of data (differentiating between disclosed and inferred data) and in the conditions under which data is made available (highlighting differences between data that is made available without charge and data that is purchased).

Figure 2: The sources of campaigning data

As described in Figure 2, much of the data that political parties use is provided to them without charge, but it can come in two forms. The first category “free data disclosed by individuals” refers to data divulged to a campaign without charge, either via official state records or directly by an individual to a campaign. The official data provided to campaigns varies from country to country (Dobber et al., 2017, p. 7; Kreiss and Howard, 2010, p. 5) but can include information on who is registered to vote, a voter’s date of birth, address and turnout record. In the US it can even include data on the registered partisan preference of a particular voter (Bennett, 2016, p. 265; Hersh, 2015). This information is freely available to official campaigners and citizens are often legally required to divulge it (indeed, in the UK it is compulsory to sign up to the Electoral Register). In addition, free data can also be more directly disclosed by individuals to campaigns through activities such as voter canvassing and surveys that gather data about individuals’ preferences and concerns (Aron, 2015, pp. 20-1; Nickerson and Rogers, 2014, p. 57). The second category “free inferred data” identifies data available without charge, but which is inferred rather than divulged. These deductions can occur through contact with a campaign. Indeed, research by the Office of the Information and Privacy Commissioner for British Columbia, Canada describes how party canvassers often collect data about ethnicity, age, gender and the extent of party support by making inferences that the individual themselves is unaware of (2019, p. 22). It is similarly possible for data that campaigns already possess to be used to make inferences. Information gathered from a petition, for example, can be used to make suppositions about an individual’s broader interests and support levels. Much of the data campaigners use is therefore available without charge, but differs in form.
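
A minimal sketch may help to illustrate this inference step. The fragment below, which uses entirely invented petition data and voter identifiers, derives an issue interest for each voter from the petitions they have signed, an attribute the individuals never disclosed directly; it is offered only as an illustration of the category, not as a description of any party’s practice.

```python
# Invented sketch of "free inferred data": deriving an issue interest for each voter
# from petitions they have signed. Petition names and voter identifiers are hypothetical.
petition_signatures = {
    "stop_the_hospital_closure": [101, 102, 103],
    "freeze_fuel_duty": [102, 104],
}

inferred_interests = {}
for petition, voter_ids in petition_signatures.items():
    issue = "healthcare" if "hospital" in petition else "cost of living"
    for voter_id in voter_ids:
        inferred_interests.setdefault(voter_id, set()).add(issue)

print(inferred_interests)  # e.g. voter 102 is inferred to care about both issues
```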

In addition, Figure 2 captures the possibility that campaigns purchase data. This data can be classified in two ways. The category “purchased data disclosed by individuals” describes instances in which parties buy data that was not disclosed directly to them, but was provided to other actors. This data can come in the form of social media data (which parties can buy access to rather than possess), or include data such as magazine subscription lists (Chester and Montgomery, 2017, pp. 3-4; Nickerson and Rogers, 2014, p. 57). Figure 2 also identifies “purchased inferred data”. This refers to modelled data whereby inferences are made about individual preferences on the basis of available data. This kind of modelling is frequently accomplished by external companies using polling data or commercially available insights, but it can also be done on social media platforms, with features such as look-a-like audiences on Facebook selling access to inferred data about individuals’ views.

Campaigns can therefore use different types of data. Whilst the existing literature has drawn attention to the importance of regulatory context in shaping the data parties in different countries are legally able to use (Kruschinski and Haller, 2017), there are remarkably few comparative studies of data use in different countries. This makes it difficult to determine not only how places vary in their regulatory tolerance of these different forms of data, but also how extensively parties actually use them. Such analysis is important because parties’ activities are not only shaped by laws, but can also be informed by variables such as resources or available expertise (Hersh, 2015, p. 170). This makes it important to map current practices and explore if and why data is used in different ways by parties around the world. In envisioning such empirical analysis, it is important to note that parties are likely to be sensitive to the disclosure of data sources. However a mix of methods - including interviews with those using data within parties and data subject access requests - can be used to gain insights here.

In the context of debates around data-driven campaigning and democracy, these categories also prompt debate about the acceptability of different practices. Whilst the idea that certain forms of disclosed data should be available without charge is relatively established as an acceptable component of campaigns, it appears there are concerns over the purchase of data and the collection of inferred data. Indeed, in Canada the Office of the Information and Privacy Commissioner for British Columbia recommended that “[a]ll political parties should ensure door-to-door canvassers do not collect the personal information of voters, including but not limited to gender, religion, and ethnicity information unless that voter has consented to its collection” (2019, p. 41). By acknowledging the different sources of data used for data-driven campaigning it is therefore possible to not only clarify what is happening, but also to think about which forms of data can be acceptably used by campaigns.

How does data inform communication?

Finally, in thinking about data-driven campaigning much attention has been paid to micro-targeting and the possibility that data-driven campaigning allows parties to conduct personalised campaigns. IDEA has therefore argued that micro-targeting allows parties to “reach voters with customized information that is relevant to them…appealing to different segments of the electorate in different ways” with new degrees of precision (2018, p. 7). In the context of digital politics, micro-targeting is seen to have led parties to:

…try to find and send messages to their partisan audiences or intra-party supporters, linking the names in their databases to identities online or on social media platforms such as Facebook. Campaigns can also try to find additional partisans and supporters by starting with the online behaviours, lifestyles, or likes or dislikes of known audiences and then seeking out “look-alike audiences”, to use industry parlance (Kreiss, 2017, p. 5).

In particular, platforms such as Facebook are seen to provide parties with a “powerful ‘identity-based’ targeting paradigm” allowing them to access “more than 162 million US users and to target them individually by age, gender, congressional district, and interests” (Chester and Montgomery, 2017, p. 4). These developments have raised important questions about the inclusivity of campaign messaging and the degree to which it is acceptable to focus on specific segments of the population. Indeed, some have highlighted risks relating to mis-targeting (Hersh and Schaffner, 2013) and privacy concerns (Kim et al., 2018, p. 4). However, as detailed above, there are questions about the extent to which campaigns are sending highly targeted messages (Anstead et al., unpublished).

In order to understand different practices, Figure 3 differentiates between audience size (specifying between wide and narrow audiences) and message content (noting differences between generic and specialised messages).

Figure 3: How data informs communication

Much campaigning activity comprises generic messages, with content covering a broad range of topics and ideas. By using data (often generated through polling or in focus groups) parties can determine the form of messaging likely to win them appeal. The category “general message to all voters” describes instances in which a general message is broadcast to a wide audience, something that often occurs via party political TV broadcasts or political speeches (Williamson, Miller and Fallon, 2010, p. iii). In contrast “generic message to specific voters” captures instances in which parties limit the audience, but maintain a general message. Such practices often emerge in majoritarian electoral systems where campaigners want to appeal to certain voters who are electorally significant, rather than communicating with (and potentially mobilising) supporters of other campaigns (Dobber et al., 2017, p. 6). Parties therefore often gather data to identify known supporters or sympathisers who are then sent communications that offer a general overview of the party’s positions and goals.

Figure 3 also spotlights the potential for parties to offer more specialised messages, describing a campaign’s capacity to cover only certain issues or aspects of an issue (focusing, for example, on healthcare rather than all policy realms, or healthcare waiting lists rather than plans to privatise health services). These messages can, once again, be deployed to different audiences. The category “specialised message to all voters” describes instances in which parties use data to identify a favourable issue (Budge and Farlie, 1983) that is then emphasised in communications with all citizens. In the UK, for example, the Labour Party often communicates its position on the National Health Service, whereas the Conservative Party focuses on the economy (as these are issues which, respectively, the two parties are positively associated with). Finally, “specialised message to specific voters” captures the much discussed potential for data to be used to identify a particular audience that can then be contacted with a specific message. This means that parties can speak to different voters about different issues – an activity that Williamson, Miller and Fallon describe as “segmentation” (2010, p. 6).
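
To illustrate what a “specialised message to specific voters” might look like in practice, the sketch below assigns one of several invented message variants to voters on the basis of a recorded or inferred issue interest, with voters lacking such a record receiving the generic message. The voter records and message texts are hypothetical and deliberately simplified.

```python
# Hypothetical sketch of a "specialised message to specific voters" (segmentation)
# workflow: each voter receives the message variant matching their recorded or
# inferred issue interest. Voter records and message texts are invented.
MESSAGES = {
    "healthcare": "Our plan will cut hospital waiting lists.",
    "economy": "We will keep taxes low for working families.",
    "default": "Read our full programme for the country.",  # generic fallback
}

voters = [
    {"id": 1, "top_issue": "healthcare"},
    {"id": 2, "top_issue": "economy"},
    {"id": 3, "top_issue": None},  # no recorded interest: gets the generic message
]

for voter in voters:
    message = MESSAGES.get(voter["top_issue"] or "default", MESSAGES["default"])
    print(voter["id"], "->", message)
```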

These variations suggest that campaigners can use data to inform different communication practices. Whilst much attention has been paid to segmented micro-targeting (categorised here as “specialised messages to specific voters”), there is currently little data on the degree to which each approach characterises different campaigns (either in single countries or different nations). This makes it difficult to determine how extensive different practices are, and whether the messaging conducted under each heading is taking a problematic form. It may, for example, be that specialised messaging to specific voters is entirely innocuous, or it could be that campaigners are offering contradictory messages to different voters and hence potentially misleading people about the positions they will take (Kreiss, 2017, p. 5). Empirically, this form of analysis can be pursued in different ways. As above, interviews with campaign practitioners can be used to explore campaign strategies and targeting, but it is also important to look at the actual practices of campaigns. Resources such as online advertising libraries and leaflet repositories are therefore useful in monitoring the content and focus of campaign communications. Using these methods, a picture of how data informs communication can be developed.

Thinking about the democratic implications of these different practices, it should be noted that message variation by audience size and message scope is not new - campaigns have often varied in their communication practices. And yet digital micro-targeting and voter segmentation has been widely greeted with alarm. This suggests the importance of thinking further about the precise cause of concern here, determining which democratic norms are being violated, and whether this is only occurring in the digital realm. It may, for example, be that concerns do not only reflect digital practices, suggesting that regulation is needed for practices both online and offline. These categories therefore help to facilitate debate about the democratic implications of different practices, raising questions about precisely what it is that is the cause for concern and where a response needs to be made.

Discussion

The above discussion has shown that data-driven campaigning is not a homogenous construct but something conducted by different actors, using different data, adopting different strategies. To date much existing discussion of data-driven campaigning has focused on the extent to which this practice is found. In contrast, in this analysis I have explored the extent to which different data-driven campaigning practices can be identified. Highlighting variations in who is using data in campaigns, what the sources of campaign data are, and how data informs campaign communication, I argue that there are a diverse range of possible practices.

What is notable in posing these questions and offering these frameworks is that whilst there is evidence to support these different conceptual categories, at present there is little empirical data on the extent to which each practice exists in different organisations. As such, it is not clear what proportion of campaign activity is devoted to targeting specific voters with specific messages as opposed to all voters with a general message. Moreover, it is not clear to what extent parties rely on different actors for data-driven campaigning, nor how much power and scope these actors have within a single campaign. At present, therefore, there is considerable ambiguity about the type of data-driven campaigns that exist. This suggests the urgent need for new empirical analysis that explores the practice of data-driven campaigning in different organisations and different countries. By operationalising the categories proposed here and using methods including interviews, content analysis and data subject access requests, I argue that it is possible to build up a picture of who is using what data how.

Of particular interest is the potential to use these frameworks to generate comparative insights into data-driven campaigning practice. At present studies of data use have tended to be focused on one country, but in order to understand the scope of data-driven campaigning it is necessary to map the presence of different practices. This is vital because, as previous comparative electoral research has revealed, the legal, cultural and institutional norms of different countries can have significant implications on campaigning practices. In this way it would be expected that a country such as Germany with a history of strong data protection law would exhibit very different data-driven campaigning practices to a country such as Australia. In a similar way, it would be expected that different institutional norms would lead a governmental organisation, charity or religious group to use data differently to parties. At present, however, the lack of comparative empirical data makes it difficult to determine what influences the form of data-driven campaigning and how different regulatory interventions affect campaigning practices. This framework therefore enables such comparative analysis, and opens the door to future empirical and theoretical work.

One particularly valuable aspect of this approach is the potential to use these questions and categories to contribute to existing debates around data-driven campaigning and democracy. Throughout the discussion, I have argued that many commentators have voiced concerns. These relate variously to privacy, the inclusivity of political debate, misinformation and disinformation, political finance, external influence and manipulation, transparency and social fragmentation (for more see Zuiderveen Borgesius et al., 2018, p. 92; Chester and Montgomery, 2017, p. 8; Dobber et al., 2017, p. 2; Hersh, 2015, p. 207; Kreiss and Howard, 2010, p. 11; International IDEA, 2018, p. 19). Such concerns have led to calls for regulation, and, as detailed above, many national governments, regulators and international organisations have moved to make a response. And yet, before creating new regulations and laws, it is vital for these actors to possess accurate information about how precisely data-driven campaigning is being conducted, and to reflect on which democratic ideals these practices violate or uphold. Data-driven campaigning is not an inherently problematic activity, indeed, it is an established feature of democratic practice. However, our understanding of the acceptability of this practice will vary dependent on our understanding of who, what and how data is being used (as whilst some practices will be viewed as permissible, others will not). This makes it important to reflect on what is happening and how prevalent these practices are in order to determine the nature and urgency of any regulatory response. Importantly, these insights need to be gathered in the specific regulatory context of interest to policy makers, as it should not be presumed that different countries or institutions will use data in the same way, or indeed have the same standards for acceptable democratic conduct.

The frameworks presented in this article therefore provide an important means by which to consider the nature, prevalence and implications of data-driven campaigning for democracy, and can be operationalised to produce vital empirical insights. Such data and conceptual clarification together can ensure that any reaction to data-driven campaigning takes a consistent, considered approach and reflects the practice (rather than the possibility) of this activity. Given that, as a report from Full Fact (2018, p. 31) makes clear, there is a danger of “government overreaction” based on limited information and self-evident assumptions (Ostrom, 2000) about how campaigning is occurring, it is vital that such insights are gathered and utilised in policy debates.

Conclusion

This article has explored the phenomenon of data-driven campaigning. Whilst the phenomenon has received increased attention over recent years, existing debate has tended to focus on the extent to which the practice can be found. In this article, I present an alternative approach, seeking to map the diversity of data-driven campaigning practices to understand the different ways in which data can be and is being used. This has shown that, far from being characterised by uniform data-driven campaigning practices, data use can vary in a number of ways.

In classifying variations in who is using data in campaigns, what the sources of campaign data are, and how data informs campaign communication, I have argued that there are diverse practices that can be acceptable to different actors to different degrees. At an immediate level, there is a need to gain greater understanding of what is happening within single campaigns and how practices vary between different political parties around the globe. More widely, there is a need to reflect on the implications of these trends for democracy and the form that any regulatory response may need to take. As democratic norms are inherently contested, there is no single roadmap for how to make a response, but the nature of any response will likely be affected by our understanding of who, what and how data is being utilised. This suggests the need for new conceptual and empirical understanding of data-driven campaigning practices amongst both academics and regulators alike.

References

Anstead, N., et al. (2018). Facebook Advertising in the 2017 United Kingdom General Election: The Uses and Limits of User-Generated Data. Unpublished manuscript. Retrieved from https://targetingelectoralcampaignsworkshop.files.wordpress.com/2018/02/anstead_et_al_who_targets_me.pdf

Aron, J. (2015, May 2). Mining for Every Vote. New Scientist, 226(3019), 20–21. https://doi.org/10.1016/S0262-4079(15)30251-7

Baldwin-Philippi, J. (2017). The Myths of Data-Driven Campaigning. Political Communication, 34(4), 627–633. https://doi.org/10.1080/10584609.2017.1372999

Bennett, C. (2016). Voter Databases, micro-targeting, and data protection law: can political parties campaign in Europe as they do in North America? International Data Privacy Law, 6(4), 261–275. https://doi.org/10.1093/idpl/ipw021

Budge, I., & Farlie, D. (1983). Explaining and Predicting Elections. London: Allen and Unwin.

Castleman, D. (2016). Essentials of Modelling and Microtargeting. In A. Therriault (Ed.), Data and Democracy: How Political Data Science is Shaping the 2016 Elections (pp. 1–6). Sebastopol, CA: O’Reilly Media. Retrieved from https://www.oreilly.com/ideas/data-and-democracy/page/2/essentials-of-modeling-and-microtargeting

Chester, J., & Montgomery, K.C. (2017). The role of digital marketing in political campaigns. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.773

Dalton, R. J., Farrell, D. M., & McAllister, I. (2013). Political Parties and Democratic Linkage. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:osobl/9780199599356.001.0001

Dobber, T., Trilling, D., Helberger, N., & de Vreese, C. H. (2017). Two Crates of Beer and 40 pizzas: The adoption of innovative political behavioral targeting techniques. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.777

Dommett, K., & Temple, L. (2018). Digital Campaigning: The Rise of Facebook and Satellite Campaigns. Parliamentary Affairs, 71(1), 189–202. https://doi.org/10.1093/pa/gsx056

Farrell, D., Kolodny, R., & Medvic, S. (2001). Parties and Campaign Professionals in a Digital Age: Political Consultants in the United States and Their Counterparts Overseas. The International Journal of Press/Politics, 6(4), 11–30. https://doi.org/10.1177/108118001129172314

Full Fact. (2018). Tackling Misinformation in an Open Society [Report]. London: Full Fact. Retrieved from https://fullfact.org/blog/2018/oct/tackling-misinformation-open-society/

Gibson, R., Römmele, A., & Williamson, A. (2014). Chasing the Digital Wave: International Perspectives on the Growth of Online Campaigning. Journal of Information Technology & Politics, 11(2), 123–129. https://doi.org/10.1080/19331681.2014.903064

Haves, E. (2018). Personal Data, Social Media and Election Campaigns. House of Lords Library Briefing. London: The Stationery Office.

Hersh, E. (2015). Hacking the Electorate: How Campaigns Perceive Voters. Cambridge: Cambridge University Press.

Hersh, E. & Schaffner, B. (2013). Targeted Campaign Appeals and the Value of Ambiguity. The Journal of Politics, 75(2), 520–534. https://doi.org/10.1017/S0022381613000182

Himmelweit, H., Humphreys, P., & Jaeger, M. (1985). How Voters Decide. Open University Press.

in ‘t Veld, S. (2017). On Democracy. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.779

Information Commissioner’s Office. (2018a). Investigation into the use of data analytics in political campaigns. London: ICO.

Information Commissioner’s Office. (2018b). Notice of Intent. Retrieved from https://ico.org.uk/media/2259363/emmas-diary-noi-redacted.pdf

International IDEA. (2018). Digital Microtargeting. Stockholm: International IDEA.

Jacobson, G. (2015). How Do Campaigns Matter? Annual Review of Political Science, 18(1), 31–47. https://doi.org/10.1146/annurev-polisci-072012-113556

Kang, C., Rosenberg, M., & Frenkel, S. (2018, July 2). Facebook Faces Broadened Federal Investigations Over Data and Privacy. New York Times. Retrieved from https://www.nytimes.com/2018/07/02/technology/facebook-federal-investigations.html?module=inline

Kerr Morrison, J., Naik, R., & Hankey, S. (2018). Data and Democracy in the Digital Age. London: The Constitution Society.

Kim, T., Barasz, K., & John, L. (2018). Why Am I Seeing this Ad? The Effect of Ad Transparency on Ad Effectiveness. Journal of Consumer Research, 45(5), 906–932. https://doi.org/10.1093/jcr/ucy039

Kreiss, D., & Howard, P. N. (2010). New challenges to political privacy: Lessons from the first US Presidential race in the Web 2.0 era. International Journal of Communication, 4(19), 1032–1050. Retrieved from https://ijoc.org/index.php/ijoc/article/view/870

Kreiss, D. (2017). Micro-targeting, the quantified persuasion. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.774

Kreiss, D., & McGregor, S. (2018). Technology Firms Shape Political Communication: The Work of Microsoft, Facebook, Twitter and Google with Campaigns During the 2016 US Presidential Cycle. Political Communication, 35(2), 155–177. https://doi.org/10.1080/10584609.2017.1364814

Kruschinski, S., & Haller, A. (2017). Restrictions on data-driven political micro-targeting in Germany. Internet Policy Review, 6(4). https://doi.org/10.14763/2017.4.780

Nickerson, D., & Rogers, T. (2014). Political Campaigns and Big Data. Journal of Economic Perspectives, 28(2), 51–74. https://doi.org/10.1257/jep.28.2.51

Nielsen, R. (2010). Mundane internet tools, mobilizing practices, and the coproduction of citizenship in political campaigns. New Media and Society, 13(5), 755–771. https://doi.org/10.1177/1461444810380863

Nielsen, R. (2012). Ground Wars. Princeton: Princeton University Press.

Office of the Information and Privacy Commissioner for British Columbia. (2019). Investigation Report P19-01, Full Disclosure: Political Parties, Campaign Data and Voter Consent. Retrieved from https://www.oipc.bc.ca/investigation-reports/2278

Ostrom, E. (2000). The Danger of Self-Evident Truths. Political Science and Politics, 33(1), 33–44. https://doi.org/10.2307/420774

Penney, J. (2017). Social Media and Citizen Participation in “Official” and “Unofficial” Electoral Promotion: A Structural Analysis of the 2016 Bernie Sanders Digital Campaign. Journal of Communication, 67(3), 402–423. https://doi.org/10.1111/jcom.12300

Persily, N. (2017). Can Democracy Survive the Internet? Journal of Democracy, 28(2), 63–76. https://doi.org/10.1353/jod.2017.0025

Tactical Tech. (2019). Personal Data: Political Persuasion – Inside the Influence Industry. How it works. Berlin: Tactical Technology Collective.

Williamson, A., Miller, L., & Fallon, F. (2010). Behind the Digital Campaign: An Exploration of the Use, Impact and Regulation of Digital Campaigning. London: Hansard Society.

Zagoria, T., & Schulkind, R. (2017). How Labour activists are already building a digital strategy to win the next election. New Statesman. Retrieved from https://www.newstatesman.com/politics/elections/2017/07/how-labour-activists-are-already-building-digital-strategy-win-next

Zuiderveen Borgesius, F., Möller, J., Kruikemeier, S., Ó Fathaigh, R., Irion, K., Dobber, T., Bodó, B., & de Vreese, C. H. (2018). Online Political Microtargeting: Promises and Threats for Democracy. Utrecht Law Review, 14(1), 82–96. https://doi.org/10.18352/ulr.420

Footnotes

1. This question is important because it is to be expected that universal responses to this question do not exist, and that different actors in different countries will view and judge practices in different ways (against different democratic standards).

2. See the report from Tactical Tech (2019), Personal Data: Political Persuasion, for a range of examples of how data can be used to gain “political intelligence” about voters.

3. Importantly, this data use is not guaranteed to persuade voters. Campaigns can identify the type of campaign material viewers are more likely to watch or engage with, but this does not necessarily mean that those same viewers are persuaded by that content.

4. Similarly there are likely to be variations between parties and other types of organisation such as campaign groups or state institutions.

5. It should be noted that these democratic norms are not universal, but are expected to vary dependent on context and the perspective of the particular actor concerned.

6. For more on local expert activism in the UK, see Dommett and Temple, 2018. In the US, see Penney, 2017.

Want to open the budget now? Ask me how! Budget data literacy in Israel - a case study


This commentary is part of Digital inclusion and data literacy, a special issue of Internet Policy Review guest-edited by Elinor Carmi and Simeon J. Yates.

A particularly useful type of data literacy, instrumental for civic participation, for the ability to hold governments accountable, and for monitoring both policy implementation and the fairness of public funds allocation, is budget and spending data literacy. These types of data have a long-lasting aura of intimidating spreadsheets and convoluted bureaucratic jargon, in addition to often being technically inaccessible and not complying with open data standards.

In this essay I’m sharing my experience of working for the Israeli civic-tech non-profit organisation The Public Knowledge Workshop, and what I learned from creating a budget data literacy programme, especially about what it takes to achieve this type of literacy and how it further affects the government itself.

The PKW (Hasadna, in Hebrew) is part of the global movement for Open Government Data. For the past ten years its volunteer community has been developing open source technological tools for visualising, analysing and making government data accessible to the public, along with pushing for the publication of more open data by the central government and municipalities, and advising on open data standards. Its projects analyse data such as parliamentary activity and legislation, road accident data published by the National Bureau of Statistics, real-time public transport location data, financial data on citizens' pension fund management, and more.

The BudgetKey is one of PKW's flagship projects: a web-based tool with joyful colours and beautiful visualisations that demystifies the undecipherable spreadsheets of financial data. It combines over 30 different data sources (and still counting) from a plethora of agencies to provide a searchable one-stop shop for all budgetary interests citizens may have:

Figure 1: The BudgetKey homepage: visualisation of the budget breakdown. Source: https://obudget.org/

The BudgetKey walks the user through the money pipeline, from the budget allocation to the funds recipient at the other end. One can find how much money the Ministry of Education has allocated for youth movements and then see how it was actually spent: which movements received how much funding, and how this funding changed over the years. One can find government contractors, and check how spending on outsourced consultants has changed or how the privatisation of social services has played out in recent years. One can also follow up on funds that municipalities receive from government offices, and tell whether cities take full advantage of bike lane construction subsidies. Finally, it can simply be used as a search engine for government tenders and calls for proposals, which are otherwise scattered around the web.
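To make the idea of the money pipeline concrete, the sketch below shows the kind of join such a tool performs behind the scenes: linking budget allocations to the payments actually made against them. This is a minimal illustration only; the file names, column names and the filtering step are hypothetical and do not reflect the BudgetKey's actual data model or code.

```python
# Hypothetical sketch: follow a budget line from allocation to recipients.
# File and column names are illustrative assumptions, not the BudgetKey schema.
import pandas as pd

# Budget allocations per budget line (e.g., Ministry of Education -> youth movements)
allocations = pd.read_csv("budget_allocations.csv")  # budget_code, year, title, allocated_amount
# Payments actually made against those budget lines
transfers = pd.read_csv("support_payments.csv")      # budget_code, year, recipient, paid_amount

# Join allocation and spending records on the shared budget code and year
pipeline = allocations.merge(transfers, on=["budget_code", "year"], how="left")

# Example question: which recipients received funding from a given budget line, per year?
youth_movements = pipeline[pipeline["title"].str.contains("youth movements", case=False, na=False)]
print(
    youth_movements.groupby(["year", "recipient"])["paid_amount"]
    .sum()
    .sort_values(ascending=False)
)
```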

Figure 2: The BudgetKey homepage. A chatbot-styled introductory tutorial on the budget. Source: https://obudget.org/

It's hard to overemphasise the role that the BudgetKey played in giving the public access to the budget. For decades the Israeli state budget had been dwelling in the archives of the Knesset (Israeli parliament), available only as a bundle of hard copy booklets. Full procurement data was not published until 2016. Other data sets were scattered around the net, in every format imaginable. The full saga of the fiscal data "liberation" by the PKW is described in a case study published by the Open Government Partnership.

The unavailability of quality open data went hand in hand with a lack of interest, a lack of understanding and outright intimidation by the budget on the part of civil society organisations (CSOs), journalists and the general public. This was not surprising given the government's history of holding budget and spending data close to its chest and the entrenched culture of centralising these processes within the Ministry of Finance. For years, the Ministry controlled the budgeting process almost entirely, with very little consultation even with other government offices and almost no oversight by the Knesset. Within the government offices, only budget and accountant officers had access to the government's financial system, and only to the data concerning their own office. Other civil servants, as well as members of the Knesset, had as little access to the budget as a lay person on the street. For decades, deliberation on fiscal matters was regarded as strictly professional rather than political, and reserved for economists only. This severely hindered the ability of civil society to participate in budgeting processes and to monitor government fund allocation and spending.

However, contrary to the expectation that "if we build it they will come", the BudgetKey did not provide an instant remedy. The BudgetKey, a love child1 of a team of open data activists, though very powerful, was still too complex for the uninitiated citizen. Without training, users were unable to take advantage of the data and to understand the story it could be telling. So the PKW started offering individual training on demand, explaining the inner workings of the budget and the intricacies of the data. Soon, this proved to be highly time consuming and unsustainable, given the small team at the organisation. This is how the idea of a MOOC was born: a sustainable model for a mass training programme that also provided a flexible study framework for participants. They could study on their own schedule and their own computer, no travel needed, except for a single face-to-face session, indispensable for building a community.

The programme was named Budgetism - an amalgam of budget and activism. PKW developed it together with two other transparency NGOs - the Movement for Freedom of Information (FOIM) and the Social Guard (SG).

CSOs started showing interest in the curriculum and the course took off. After several terms it became clear that it was useful not only for advocacy organisations and think tanks, but for any non-profit interested in exploring government funding opportunities. Since the BudgetKey contains data about government money received by non-profits and, vice versa, data about the non-profits themselves and all types of funds they have received, it could serve as a tool for researching and analysing the field, discovering funding opportunities and identifying potential partners or competitors.

The programme is a six-week-long online course based on the Open edX platform, consisting of eight chapters for self-study and one face-to-face session. The study material is designed to accommodate diverse learning styles, and includes short video tutorials, articles, podcasts, toolkits and hands-on exercises. It covers the following topics:

  1. The budgeting process and the main actors in the government and the Knesset.
  2. The role of the Knesset Finance Committee in approving ongoing changes to the budget.
  3. Understanding the data and navigating the BudgetKey website.
  4. Managing freedom of information (FOI) requests to obtain missing information.
  5. Gender based budget analysis.
  6. Monitoring implementation of government resolutions with budgetary implications.
  7. Reading municipal budgets.

The programme is aimed mostly at civil society, but has also, unintentionally, attracted central and local government officials, journalists and researchers.

Its core goals are to promote greater government transparency and accountability and to increase the capacity of activists and CSOs in Israel to act as watchdogs against unfair or corrupt spending of public funds, with the long-term vision of democratising budgeting processes and opening them to public participation. The programme equips participants with tools and skill sets for effective monitoring of government policy implementation and the redistribution of public funds. This is particularly relevant to CSOs representing minority and vulnerable social groups, which are often subject to budgetary discrimination.

A secondary goal was to forge a broad community of informed and involved citizens, who could build on the mutual experience, turn to each other for advice, and further promote public discussion on budgetary topics.

An important byproduct of the programme, however, is its impact on the government itself, as the data producer and publisher as much as the data user.

One of the setbacks for open data publication is that many of the published data sets are never used. The BudgetKey provided a powerful demonstration of the real-life value of open data, and could now be used as an incentive for demanding the release of still more data. On the other hand, the fact that so many eyes were now looking at the data served as a catalyst for improving the quality of the data already being published. When presented at various governmental fora, it served as an inspiration for government officials and showcased the importance of the core principles of open data, such as the need for interoperability (the ability to combine data sets from different sources), standardisation, completeness, etc.

To sum up, data literacy is a chicken-and-egg problem. First you need data, then you need technology to process it, and then you need to teach fellow citizens how to make sense of it. But without educated citizens to begin with, it is hard to get data published and to get citizens interested in learning how to use it. Thus, to really achieve the goal of data literacy it is necessary to act simultaneously on all fronts.

Acknowledgement

Special thanks to Shevy Korzen, the visionary behind Budgetism, who made the project a reality.

Footnotes

1. Adam Kariv, Mushon Zer-Aviv, Saar Alon Barkat and a long list of volunteers and activists who created the BudgetKey.

What is critical big data literacy and how can it be implemented?


This paper is part of Digital inclusion and data literacy, a special issue of Internet Policy Review guest-edited by Elinor Carmi and Simeon J. Yates.

Introduction

With the increasing ubiquity of big data systems, awareness of citizens’ need for ‘digital skills’ and ‘data literacy’ has been growing among scholars, activists and political decision-makers. What type of skills and knowledge do people need to ‘be digital’, to use the internet and related technologies in an informed manner? While media, digital and data literacy approaches have been refined and expanded to accommodate for the changing landscape of digital technologies (see below), one aspect is only recently beginning to attract more attention: people’s awareness and understanding of big data practices (e.g., report on ‘Digital Understanding’ by Miller, Coldicutt, & Kitcher, 2018). Even though various skills to use digital media, data sets and the internet for specific purposes are without doubt important, citizens in today’s datafied societies need more than skills. They need to be able to understand and critically reflect upon the ubiquitous collection of their personal data and the possible risks and implications that come with these big data practices (e.g., Pangrazio & Selwyn, 2019), challenging common ‘myths’ around big data’s objective nature (boyd & Crawford, 2012). This is essential to foster an informed citizenry in times of increasing profiling and social sorting of citizens and other profound economic, political and social implications of big data systems. These systems come with manifold risks, such as threatening individual privacy, increasing and transforming surveillance, fostering existing inequalities and reinforcing discrimination (e.g., Eubanks, 2018; O’Neil, 2016; Redden & Brand, 2017).

At the moment, however, people’s knowledge and understanding of these issues seem fragmented. Recent research has shown Europeans’ lacking knowledge about algorithms (Grzymek & Puntschuh, 2019); Germans’ knowledge deficits with regards to digital interconnectivity and big data (Müller-Peters, 2019); and Americans’ misunderstandings and misplaced confidence in privacy policies (Turow, Hennessy, & Draper, 2018). The British organisation Doteveryone identified a “major understanding gap around technologies”, finding that only a third of people know that data they have not actively chosen to share has been collected (Doteveryone, 2018, p. 5), and that less than half of British adult internet users are aware that apps collect their location and information on their personal preferences (Ofcom, 2019, p. 14).

Moreover, research has found that even when people are aware of the collection of their personal data online, they often only have a vague idea of the general system of big data practices and how this might impact their lives and our societies (Turow, Hennessy, & Draper, 2015, p. 20). Nevertheless, many are uncomfortable about the collection and use of their personal data (e.g., Bucher, 2017; Miller et al., 2018; Ofcom, 2019) and they find big data practices less acceptable the more they learn about them (Worledge & Bamford, 2019). Alongside many other scholars, I argue that it is essential that citizens in datafied societies learn more about and begin to understand big data practices in order to enable them “to form considered opinions and debate the issue in a factually informed way” (Grzymek & Puntschuh, 2019, p. 11). While aspects such as general media literacy, fake news and disinformation are slowly receiving more attention (e.g., the European Commission’s ‘European Media Literacy Week’ or UNESCO’s ‘Global Alliance for Partnerships on Media and Information Literacy’), there is still a lack of engagement with the impacts of big data systems. People’s education about big data needs to be strengthened, with more “investment in new forms of public engagement and education” (Doteveryone, 2018, p. 6) that also “address structural features of media systems” (Turow et al., 2018, p. 461), such as the structural and systemic levels of big data practices. This study provides insight into ways to achieve these goals by: a) conceptualising critical big data literacy and b) drawing on empirical research to discuss the effects of data literacy tools.

After presenting the concept of critical big data literacy and its theoretical grounding, the paper details existing examples of initiatives and resources that aim to foster such literacy. An analysis of 40 data literacy tools is used to present a typology of their content and design approaches. A small selection of these tools is then evaluated by examining users’ perspective on them: their perception of the tools’ short and mid-term effects, reflections on their suitability to teach data literacy and feedback on their content and design. Finally, based on the participants’ feedback and ideas and complemented by further research findings, suggestions for the creation of future data literacy efforts are made.

1. What is critical big data literacy?

I argue here for critical big data literacy, that is, literacy efforts that aim to go beyond the skills of using data. While learning to use digital media and data productively without doubt constitutes a relevant skill for today’s digital citizens, only a small body of research around digital or data literacy concepts focuses on the many potentially problematic issues related to big data. As often highlighted before, big data constitute a “socio-technical phenomenon” that entails the “capacity to search, aggregate, and cross-reference large data sets” (boyd & Crawford, 2012, p. 663), to, among others, search for patterns and create categories, profiles or scores often used for decision-making and predictive analyses. The increasing use of these systems in areas such as banking, employment, policing, and social services comes with profound economic, political, and, importantly, social implications (Eubanks, 2018; O’Neil, 2016; Redden & Brand, 2017).

In today’s datafied societies, citizens need to be aware of these systems that affect so many areas of their lives, and a critical public debate about data practices is essential. Therefore, I argue that data literacy today needs to go beyond data skills in order to foster such awareness and public debate of the datafication of our societies. Citizens need to recognise risks and benefits related to the increasing implementation of data collection, analytics, automation, and predictive systems and need to be able to scrutinise the structural and systemic levels of these changing big data systems. Thus, I suggest that critical big data literacy in practice should mean an awareness, understanding and ability to critically reflect upon big data collection practices, data uses and the possible risks and implications that come with these practices, as well as the ability to implement this knowledge for a more empowered internet usage (see also Sander, 2020).

The concept of critical big data literacy combines approaches from a variety of research fields. First, critical data studies not only aim to understand and critically examine the “importance of data” and how they have become “central to how knowledge is produced, business conducted, and governance enacted” (Kitchin & Lauriault, 2014, p. 2), but many critical data scholars also call for more societal and public involvement (Marwick & Hargittai, 2018, p. 14; O’Neil, 2016, p. 210; Zuboff, 2015, p. 86). They hope that more “reflexive, active and knowing publics” (Kennedy & Moss, 2015, p. 1) might not only empower citizens, but also “open up discussion of policy solutions to regulate such [big data] practices” (Marwick & Hargittai, 2018, p. 14). However, concrete suggestions on how to implement such a transfer of academic knowledge are rare.

Secondly, the concept of critical big data literacy builds on a long history of critical media literacy and critical digital literacy research. While literacy was long understood to include mainly the skills to use (digital) media productively, this understanding has been criticised as too uncritical in recent years. Scholars have questioned this “technocratic or functional perspective” and have called for a critical perspective that includes analysis and judgement of the “content, usage and artefacts” of digital technology as well as the “development, effects and social relations bound in technology” (Hinrichsen & Coombs, 2013, p. 2, p. 4). Thus, an increased emphasis on critical approaches has emerged (e.g., Hammer, 2011; Garcia et al., 2015; Pangrazio, 2016). Yet only a few literacy initiatives and concepts include critiquing the “platform or modality relationships to information and communication” (Mihailidis, 2018, p. 4), and aspects such as the impacts of datafication or an understanding of risks around privacy and surveillance are often omitted. However, there are some particularly relevant concepts of critical digital literacy that “take structural aspects of the technology into account”, considering “issues of exploitation, commodification, and degradation in digital capitalism” (Pötzsch, 2019, p. 221) and aiming for a “more nuanced understanding of power and ideology within the digital medium” (Pangrazio, 2016, p. 168).

Thirdly, many data literacy approaches and concepts have developed in recent years. While a large part of these studies understand data literacy in an active and creative way and aim to teach citizens to read, use and work with data – often in an empowering way (e.g., D’Ignazio, 2017), fostering people’s “data mindset” (D’Ignazio & Bhargava, 2018) – there are a number of concepts that include a critical reflection of (big) data systems either while or as a result of working with data. For some, this critical approach falls under the term “data literacy” (e.g., Crusoe, 2016), others use concepts such as “big data literacy” (D’Ignazio & Bhargava, 2015; Philip, Schuler-Brown, & Way, 2013), “digital understanding” (Miller et al., 2018), or “algorithmic literacy” (Grzymek & Puntschuh, 2019). Particularly relevant approaches are “data infrastructure literacy” that aims to promote “critical inquiry into datafication” (Gray, Gerlitz, & Bounegru, 2018, p. 3) and an “extended definition of Big Data literacy” by D’Ignazio and Bhargava, aiming to educate about “when and where data is being passively collected”, “algorithmic manipulation” and the “ethical impacts” of “data-driven decisions for individuals and for society” (2015, p. 1, p. 3).

These concepts focus on fostering a critical view through active usage of data tools in classroom and workshop environments. Thus, while they add highly relevant insights into conceptualising critical big data literacy, they lack focus on a broader audience as well as critical inquiry of the structural level of big data systems. However, some very relevant approaches go beyond formal education and argue for a broader conceptualisation of critical data literacy. For example, Fotopoulou conceptualises data literacy for civil society organisations and argues for “data literacies as agentic, contextual, critical, multiple, and inherently social”, raising “awareness about the ideological, ethical and power aspects of data” (2020, p. 2f). Similarly, Pangrazio and Selwyn call for “personal data literacies” that “include conceptualisations of the inherently political nature of the broader data assemblage” and aim to build “awareness of the social, political, economic and cultural implications of data” (2019, p. 426), while Pybus, Coté and Blanke work towards a “holistic approach to data literacy”, including an understanding of meaning making through data, of big data’s opaque processes and an “active (re)shaping of data infrastructures” (2015, p. 4). A recent report by the ‘Me and My Big Data’-Project further suggests “data citizenship” as a new data literacy framework that combines skills with critical understanding. This framework consists of “Data thinking” – critically understanding the world through data; “Data doing” – learning practical skills around everyday engagements with data; and “Data participation” – aiming to “examine the collective and interconnected nature of data society” (Yates et al., 2020, p. 10).

Critical big data literacy builds on such conceptualisations as well as contributes to research from all of these fields: It aims to communicate critical data studies’ findings to the public; to learn from critical media and digital literacy approaches; and to build on and advance data literacy concepts by working to foster citizens’ critical understanding of datafication. Following Mulnix’s considerations on critical thinking, critical big data literacy aims at the “development of autonomy”, or the “ability to decide for ourselves what we believe”, through our own critical deliberations on the use of our data and the impact of datafied systems (2012, p. 473). Importantly, just as critical thinking is “not directed at any specific moral ends” and does not intrinsically contain a certain set of beliefs as its natural outcomes (ibid., p. 466), so is being critically data literate not understood as necessarily taking a ‘negative’ stance to all data practices.

The aim of critical big data literacy is not to affect internet users in a way that leaves them feeling negatively about all data collection and analysis or even resigned about such big data practices (for resignation, see below and, e.g., Turow et al., 2015). Rather, the goal is to foster users’ awareness and understanding about what happens to their data and thus to enable them to question and scrutinise the socio-technical systems of big data practices, to weigh the evidence, to build informed opinions on current debates around data analytics as well as to allow them to make informed decisions on personal choices such as which data to share or which services to use. While it is important not to merely shift responsibility to individuals, as citizens’ agency in this field and their possibilities to opt out of these systems are limited, it is nevertheless crucial that citizens are able to learn more about these debates and how to use the internet in a more critical and empowered way, for example through using alternative services.

2. How to research critical big data literacy

This study investigates online resources that foster critical big data literacy and focuses on two perspectives: gaining insight into existing examples of such online resources, and understanding how they affect those who use them as well as how these users perceive them.

To investigate the first research question – which and what kind of examples for online critical big data literacy tools already exist? – I conducted a snowball sampling of available tools, using the interactive web-series ‘Do Not Track’ (2015) as a starting point. This tool was developed by a wide range of “public media broadcasters, journalists, developers, graphic designers and independent media makers from different parts of the world” (‘About Do Not Track - Who We Are’, 2015). ‘Do Not Track’ is often used in teaching and offers a large number of follow-up links for every episode, including further information and articles, but also other resources that inform or teach about big data, constituting the ideal starting point for this snowball sampling. In snowball sampling, a group of respondents – or, in this case, online data literacy resources – is initially selected because of their relevance to the research objectives (Hansen & Machin, 2013, p. 217). These are then asked to identify other potential respondents “of similar relevance to the research objectives” (ibid.). In this case, this meant following the many links and mentions of other resources on the ‘Do Not Track’ website, as well as further links and mentions on any website identified in this way.

All English-language online resources that aim to educate about big data and data collection and that were identified through this snowball sampling were included in the initial sample. Thus, when I talk about critical big data literacy tools, the term ‘tool’ is used in a very open way, and the sampling included various different kinds of informational and educational resources on issues related to big data (see below). However, individual news articles on the topic, sites that cover only one aspect (e.g., cookie tracking), data visualisation tools, and resources that do not include any constructive suggestions for ways to protect one’s data online were excluded. It is further important to emphasise that, as very little research on these tools exists as yet, this study did not aim at a comprehensive overview, but rather at an adequate one that provides a first insight into the field.
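As a rough illustration of this snowballing logic, the sketch below collects outbound links wave by wave, starting from a seed page. It is a simplified, hypothetical rendering only: the study's sampling and the application of the inclusion criteria were carried out manually, and the seed URL, depth and filtering step shown here are assumptions made for illustration.

```python
# Illustrative sketch of snowball sampling of online resources: start from a
# seed page, collect the links it points to, then repeat for each new page.
# Seed URL, depth and filtering are assumptions, not the study's procedure.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def collect_links(url):
    """Return the set of absolute links found on a page."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {urljoin(url, a["href"]) for a in soup.find_all("a", href=True)}

def snowball(seed, depth=2):
    """Follow links from the seed page for `depth` waves; return all pages visited."""
    seen, frontier = set(), {seed}
    for _ in range(depth):
        new_frontier = set()
        for page in frontier - seen:
            seen.add(page)
            try:
                new_frontier |= collect_links(page)
            except requests.RequestException:
                continue  # skip pages that fail to load
        frontier = new_frontier
    return seen

candidates = snowball("https://donottrack-doc.com/", depth=2)  # seed URL assumed
# The inclusion and exclusion criteria described above (English-language,
# educates about big data, offers constructive advice) would then be applied
# manually to the collected candidates.
```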

To deepen this insight, I conducted a comparative analysis of the identified tools, focusing on the tools’ formats and their production origins. Based on this, I developed a typology reflecting the different content and design approaches that tools in the sample apply. This typology further provided an initial basis on which to select tools to test in the second part of the study. The three selected tools should: (1) apply different design approaches according to this typology; (2) not only outline critical aspects of big data, but also provide easy-to-follow constructive suggestions for ways to protect one’s data online; (3) be oriented towards findings of media literacy evaluation or effectiveness research.

The second part of this study tested the three selected tools, researching the questions: how do critical big data literacy tools change people’s privacy attitudes and behaviour in the short and mid-term? How do people perceive these tools? In this qualitative multi-methods approach, critical big data literacy was operationalised by investigating changes in people’s concern for privacy and in their privacy-sensitive internet usage at three points in time: before using the tools, one week after this ‘intervention’ and finally eight months later. While concern for privacy and privacy-sensitive internet usage are likely insufficient to measure all aspects of the complex concept of critical big data literacy as outlined above, an increase in these two areas certainly indicates a generally increased awareness and critical reflection of issues around big data and online privacy as well as the ability to implement these concerns into one’s daily internet usage. For example, critically data literate citizens should be aware of the way Google and WhatsApp, among others, collect, use and sell their data and what it could be used for, and they should know about alternative and privacy-sensitive search engines and messaging apps (e.g., DuckDuckGo, Telegram or Signal) to be able to make an informed decision on which service to use in the future.

The study’s sample was selected in consideration of ‘data adequacy’ (Vasileiou et al., 2018). Both the sample size and the participants’ demographic were chosen to be adequate for this study’s design and research objectives. As many studies with larger sample sizes have indicated the complexity of people’s attitudes towards data usage (see above), the aim of this study was to complement such quantitative studies by conducting a small and focused qualitative analysis that provides in-depth information for each participant. Thus, considering the length of the study and the depth of the analyses, a small sample size was adequate for this design. As each participant contributed to the study at several points in time and the study’s instruments included various open questions and investigated different perspectives on data literacy, each participant’s responses and reflections yielded very information-rich data. Therefore, a small sample size of ten participants was adequate for this multi-methods study.

Data adequacy was also considered in terms of the demographics of the sample. Here, two aspects played a key role. First, as the goal of this study was to identify whether data literacy tools would lead to a change in online privacy attitudes and online behaviours, and the interactive tools employed in the study required a certain level of digital ability, it was decided to recruit participants with some level of digital skills. Second, given the small sample size and the study’s objective to understand individual privacy concerns, potentially intricate reasonings for online behaviour and individual users’ perceptions of data literacy tools in depth, the sample aimed for as few variables in the sample composition as possible. In light of these two aspects, focussing on a narrow demographic with a certain level of digital skills was adequate for this study. Therefore, university students constituted an ideal group of participants for this study, as many of them already possess the basic level of digital ability required to engage with the tools and recruiting them was resource-friendly through formal and informal university networks. However, this narrow sample also came with some limitations, which will be discussed in the conclusion.

The study’s participants were identified through purposive sampling, a strategy whereby particularly “information-rich cases” are selected (Vasileiou et al., 2018, p. 2). In this case, the purposive sampling aimed for Cardiff students with average ‘digital literacy’ but no previous knowledge of big data. Therefore, certain courses of study were excluded from the sampling, and a question that tested previous knowledge in the field was added to the first questionnaire. There was also an effort to balance gender, course and degree type. The final sample consisted of five undergraduate and five postgraduate students who each studied a different university course in Cardiff. The six women and four men, most aged in their twenties, were predominantly of British nationality, with one Canadian and one Hungarian participant.

Figure 1: The three stages of research.

At each of the three points in time when testing took place, the participants completed questionnaires with open and closed questions, which examined their concern for privacy and various aspects of their internet usage. Furthermore, after the intervention, open questions asked for a reflection on the tools applied in the study. The questionnaires were designed building on established instruments that measure privacy attitudes and concerns, adapting them to this study’s research question (Chellappa & Sin, 2005; Malhotra, Kim, & Agarwal, 2004; Smith, Milberg, & Burke, 1996). In the 40-minute intervention, the participants were invited to use the three tools and also to navigate freely around further links and resources they found. This took place individually and was not closely supervised, aiming at a ‘natural browsing behaviour’ and catering for the needs of different learning styles (see for example Pritchard, 2008, p. 41ff). The intervention was examined by noting particularly striking initial reactions of the participants to the tools (e.g., exclamations of surprise about certain methods of tracking online) as well as by using a screen recording tool; both took place transparently with the participants’ knowledge and consent. Even though the screen recording may have restricted the natural browsing situation described above because some participants may have felt observed, it also allowed for interesting analyses of which tools were used, how much time was spent with each, and how people’s browsing behaviours differed, which would not have been possible otherwise.

Finally, qualitative one-hour interviews were conducted with five of the participants eight months after they had used the tools. These aimed at gaining a more in-depth understanding of the participants’ critical big data literacy and of patterns found by prior research; the potential effect the tools had on their privacy concern and behaviour; and the participants’ perceptions of and reflections on the three tools. Using a structured interview guide, which was partly adapted to each participant based on their prior findings, I was further able to clarify minor ambiguities that had arisen earlier and to inquire about intentions participants had previously expressed.

3. How could critical big data literacy be implemented?

3.1 Examples for data literacy tools

The first part of this study aimed at an overview of examples of existing data literacy tools, and its main finding was the large number and variety of tools identified. The snowball sampling process proved effective, as many organisations and websites seem to be highly connected and interrelated. In total, nearly 40 examples of data literacy tools could be identified. A comparative analysis gave further insight into how critical big data literacy is being implemented in each tool’s design, revealing a variety of different approaches.

Figure 2: Typology with categories.

Firstly, the analysis demonstrated the sample’s diversity in the tools’ formats. Fifteen distinct categories of content and design approaches could be identified (see figure 2), which included approaches that were to be expected, such as ‘(multimedia) websites’, ‘short videos’, or ‘text-based information’, but also some emerging findings, such as a graphic novel, a game, and an audio story used to communicate a critical view on big data1. Many of these are interactive in design, meaning that the tools require an action from the user, such as the click of a button or entering some information, before the content continues to the next section.

Moreover, the typology identified several ‘collections of resources’: websites that collate a variety of self-produced and external resources that educate about big data. ‘Toolkits’, on the other hand, describe resources that provide constructive advice, such as services and software to use, or easy-to-follow steps to take to protect one’s data online. As figure 3 depicts, some tools apply more than one of the identified approaches. For example, the website Privacy International includes short videos as well as text-based information. An overview of all identified tools, including a brief assessment of each tool’s content, design and suitability for different purposes of teaching data literacy, geared towards practitioners, can also be found in the “Critically Commented Guide to Data Literacy Tools” (Sander, 2019).

Figure 3: Typology of all identified data literacy tools and associated categories.

Secondly, the analysis revealed a variety of different production origins. As was to be expected, a large number of tools originated from NGOs such as the ‘Tactical Technology Collective’ or the ‘Electronic Frontier Foundation’. Yet further, less expected actors were also identified, such as the “secure mobile communications” company ‘Silent Circle’, which produced the documentary “The Power of Privacy” (Silent Circle, n.d.), or several private individuals who developed the website “I have something to hide”. Also somewhat surprisingly, the sample did not include any resources from governmental or public service institutions, nor from traditional educational avenues, and it contained only very few efforts on the part of academia. Moreover, even though this sampling only included English-language tools, the identified online resources were not only developed by actors from English-speaking countries, but also, for example, by the French ‘La Quadrature du Net’, the Berlin-based ‘Tactical Technology Collective’, and the aforementioned individuals from Portugal who developed ‘I have something to hide’.

Thirdly and finally, the sampling and first analysis of tools also revealed that while many useful tools for educating about big data were identified, not all of them ideally fitted the above definition of critical big data literacy, and many required a certain level of previous knowledge. Thus, I decided to differentiate between the wider category of data literacy tools and the more specific critical big data literacy tools. While the latter were found to implement all aspects of critical big data literacy (as defined above) and to address a general public, data literacy tools would, for example, focus on providing resources such as teaching material about big data or technological tools to help users improve their digital security and protect their data online. The targeted audience of the tools, in particular, was one of the key distinguishing factors between the two categories. Many data literacy tools did not seem to address the general public, as they often lacked a general introduction to big data and skipped directly to issues such as encryption, digital security or online tracking. These tools seemed to aim at already interested individuals with pronounced prior knowledge of issues related to big data (e.g., ‘Ononymous’; ‘Exposing the Invisible’) or at those planning to teach about big data (e.g., ‘learning.mozilla.org’). Thus, they often lacked the broader perspective and awareness-raising aspect that resources categorised as critical big data literacy tools provided.

3.2 Testing the tools

The second part of this study tested three critical big data literacy tools: the interactive web-series ‘Do Not Track’, the website ‘Me and My Shadow’ and the short video ‘Reclaim our Privacy’. Overall, the three tools predominantly had an awareness-raising and privacy-enhancing influence on the participants, leading to a generally increased concern for privacy and distinctly more privacy-sensitive internet usage2. This effect was nearly unambiguous in the short term (one week after using the tools), with more diverse findings in the mid-term. After eight months, some participants showed interest and concern only about some aspects of privacy (mainly about data security, see also 3.3, as well as their personal security in relation to their traceability), one had ‘defaulted back’ to their original attitude and behaviour, but two others showed a persistent and even growing increase in privacy concern and behaviour. Furthermore, all participants stated increased caution in at least some situations of data disclosure online. Thus, this first testing demonstrated that while such literacy resources are of course no ultimate and perfect solution to all broader problems around big data, critical big data literacy tools can nevertheless be effective ways to increase internet users’ awareness, understanding and critical reflection of big data practices as well as their ability to protect their data online.

This effect was further confirmed by the participants’ reflections on the tools: they described the insights they had gained, stated that they needed to “think about and reflect” (Participant 01) on what they had learned, and said that they wanted to change their internet usage: “I’m going to go home and clear all my cookies now” (P09). They further stressed that they may have known about data collection but had not been aware of the impact this can have on them, arguing that issues around big data are too removed from individuals and that people “just don’t necessarily realise the weight of this” (P10) or “don’t think it impacts their lives, but it really, really does” (P05). This confirms people’s lack of knowledge as outlined above and reaffirms that even if they are aware of data collection, internet users often only have a vague idea of how “‘the system’—the opaque under-the-hood predictive analytics regimes that they know are tracking their lives but to which they have no access” is operating (Turow et al., 2015, p. 20).

Appreciation for the tools

In general, the participants felt very positive about the tools. Both in the questionnaire after one week and also in the follow up interviews after eight months, they expressed their excitement about the tools, praised the tools’ accessibility: they are “accessible to people” (P10) and “easy to read through” (P05), and found them well-suited for “educational purposes” (P10) as they constitute “a good way to reach people” and to “spark a little interest in them, and a little concern” (P05). Moreover, two participants emphasised how great it is that the tools “gave you all the technical information in a way that wasn’t technical” (P05). Finally, the tools’ interactivity was praised repeatedly: “They were all good, it’s just the interactive parts of it that were better than any of the other ones” (P07).

The interactive web-series ‘Do Not Track’ was particularly popular with the participants. They spent the most time with this tool, highlighted its appealing visualisations and, above all, praised its interactivity. They believed the series was “the sort of thing that everybody – no matter what age you are would be able to grasp and be intrigued with” (P05) as “you have to be a part of it” (P04). Interactivity in general was repeatedly emphasised as a core strength of the tools, and lacking interactivity was criticised as a major weakness of the short video. This highlights that, in terms of user engagement, popularity and learning effect, the importance of the interactivity of critical big data literacy tools cannot be overstated, especially when addressing younger demographics.

The website ‘Me and My Shadow’ was also popular, and participants liked the “unique view on data stored about people” it provided (P04) and how it gave “straightforward advice” (P08). They particularly liked a visualisation of data traces and the constructive advice in the form of a five-step list. The short video ‘Reclaim our Privacy’, however, was not as popular. While some participants liked it, the video and its contents were largely forgotten after eight months, except by one participant who criticised its “scare-tactics” (P05; see below for more). Other criticism of the tools was also voiced. For example, participants found ‘Do Not Track’ “a bit hard to navigate” (P10), disliked that ‘Me and My Shadow’ used “lots of text” (P05) or criticised the video’s lack of interactivity as “all just information” (P07).

Overall, however, the participants’ reflections emphasised how well-suited they found these tools – particularly ‘Do Not Track’ and ‘Me and My Shadow’ – to inform and educate people about big data practices, especially stressing the importance of interactive design, appealing visualisations and accessibility – the ability to convey complex information in a concise and easily understandable manner. Thus, these findings also imply that some design approaches may be more or less suitable for certain goals of teaching data literacy – at least in the context of this study’s sample of university students – and that it might make sense to combine some of the identified tools when educating about big data. For example, while a short video in itself may lack interactivity and vigour, it could be a great first introduction to the topic, whereas toolkits often lack an introduction to the topic but provide great constructive advice for a more empowered internet usage. Such a combination of resources could also be a workaround for the issue of high production costs. Particularly suitable resources such as ‘Do Not Track’, which include various formats, are engaging and interactive, and offer high-quality design and content, come with very high production costs and are therefore rare. Combining several tools with different formats and adding interactive elements could thus constitute an alternative way to advance critical big data literacy in practice.

Calls for more education and resources

Finally, one key finding of this study was that the participants repeatedly and distinctly called for more education on big data and more data literacy resources like the ones applied in this study. Many stressed that “people are not educated enough” (P04) and that everybody should be aware of issues around big data (P05, P10). Some also explained that they “want it to be second hand nature for me to think about that issue” (P04).

As one solution, participants called for “more disclaimers” (P06) to explain data usage and for a reliable source of information on big data practices, such as a “government or independent organisation” (P04). Moreover, many called for teaching “digital awareness” (P04) from a young age in schools in order for children to learn “the ins and outs of technology” and develop a “thirst for knowledge” (P05), but also to question and critically reflect on these issues (P04). Participant P04 even considered teaching data literacy in his free time in the future, which again emphasises the participants’ enthusiasm and their serious calls for more education and resources in this field. As will be further discussed below, these findings, albeit from a small and specific sample, constitute an important counterargument to discourses around people’s alleged indifference about the use of their data and their supposed unwillingness to understand complex issues around big data.

3.3 Ideas and suggestions for future tools

Finally, the participants of this study, particularly those who took part in the in-depth follow-up interviews after eight months, developed numerous creative ideas on how to improve existing data literacy tools and design future resources. When asked about one of the original goals of this stage of the study – to learn more about participants’ perceptions of and reflections on the three tools – many independently expressed ideas for future tools that often resembled each other. Complemented by further findings on people’s nuanced and complex attitudes towards privacy and big data, these ideas can be summarised into something of a tentative ‘guideline’ for designing future critical big data literacy tools.

Format of the tool

To begin, the participants argued, such tools should include an attention catcher, a video or “kind of poster that attracts you to the website” (P04). Ideally, this could easily be shared on social media and would thus have the potential to ‘go viral’. The actual tool – a website or an app – should have a “catchy name or slogan […] that would get people’s attention” (P04). The content of the tool should be interactive and personalised to each user and should make use of multimedia, such as good visualisations and appealing short videos.

Examples of data harm

Moreover, future resources should include “real stories from real people” (P04) who have experienced negative consequences of big data practices. These could include not seeing job advertisements online due to discriminatory ad settings, being denied credit because of erroneous credit scoring or innocently becoming a police suspect based on biased predictive policing mechanisms, but also harms such as identity theft, data breaches or hacks (for examples, see Redden & Brand, 2017). Such stories would speak to users “on an emotional level” (P04) and address the problem that the negative impacts of data disclosure are currently often too removed from individuals – an issue that emerged with nearly every interviewee. Their lack of awareness of the potential negative consequences of data disclosure online was expressed, for example, in statements that they found it “quite hard to think of any situations where it would be properly terrible” (P06) if their information was used.

Related to this, many expressed the feeling that their data would not be of interest to anyone. During the interviews, every participant expressed this attitude, arguing that their information was irrelevant as they are “just talking about my day-to-day life” (P04) and: “they’re really big companies, why would they be that bothered about this data” (P07). It seems that the participants still lack an understanding of the value of their data and its potential uses. Thus, future data literacy efforts should address this gap by, for example, presenting real-life examples of data harm that demonstrate how even ‘harmless’ data can be used for questionable purposes and could potentially have negative consequences.

Beyond data security

Moreover, such testimonials of data harm could also be used to foster an understanding of issues of big data practices beyond security. The participants of this study often expressed complex, fluctuating or partly contradictory attitudes. For example, while many were concerned about problems such as data breaches, data hacks and cyber security, they were less aware of longer-term impacts of big data practices such as tracking, scoring or surveillance. The identified data literacy tools likewise showed a prevalence of data security aspects. Thus, future data literacy efforts should aim to address this imbalance.

The ‘shock value’

Besides, the “shock value” (P10) of the tools’ content was a point of contention among the participants. Some said they wanted to be “scare[d] into it” (P07) and that this is “always necessary” (P10), whereas another argued that while it might work to scare people, “it’s not the right approach”, and tools should work towards building awareness rather than people “doing the right thing because they’re scared” (P05). Thus, it is important to find the right balance on the “narrow path” (P10) between providing factual knowledge and constructive advice on the one hand, and emphasising the severity of the topic enough to get and keep people engaged on the other.

Constructive advice to avoid digital resignation

Furthermore, as already included in the concept of critical big data literacy, constructive advice on how to protect one’s data online should play a big part in any data literacy tool. Several participants stressed its importance, highlighting that easy-to-follow advice would not only keep people engaged and immediately give them a starting point to change their behaviour, but could also help prevent resignation towards privacy in light of the new information received. As other scholars have identified (Dencik & Cable, 2017; Draper, 2017; Draper & Turow, 2019; Turow et al., 2015), some internet users have ‘given up’ on their data as they regard any efforts to protect it as futile. Importantly, this attitude does not express consent to data collection but rather a feeling of being unable to protect one’s data and thus a resignation to any such efforts.

In this study, too, resignation towards privacy and data collection was a critical issue. In particular, one participant was identified as a ‘typical case’ of resignation: they were consistently aware of the importance of privacy, yet the new information they gained from the tools made them feel “depressed” in light of the many years of data on “every part of my life” and thus “so much knowledge” these companies had about them (P10). However, despite this initially resigned reaction, the participant later took action and made various changes to their internet usage, again emphasising the importance of privacy and explaining: “I don’t know if anything I do is actually helping anything, but I try”. This again stresses the relevance of easy-to-follow advice not only in preventing but also in countering resignation. Nevertheless, it is important to clarify that this should not entail a shift of responsibility to individual users. While it is important and necessary that citizens of datafied societies obtain a certain ability to protect their data online – providing them with an, albeit limited, sense of agency and control and circumventing resignation – these digital skills do not constitute the key goal of critical big data literacy. Instead, this literacy’s main objective lies in enabling citizens to develop an informed opinion and take part in the public debate around datafied systems.

Moreover, this participant highlighted that in order to become active and start making changes, it takes “initial investments” of time and energy (P10). This constitutes a useful insight for further constructive advice to be included in data literacy tools: the initial investment required to take action should be as small as possible, and it should be stressed that this is only a short, initial effort. The participants suggested that one way to do this would be to develop a simple checklist with several easy, low-threshold first steps to take in order to protect one’s data online. This would aim to build new habits, for example by suggesting alternative services people can start using, but it should also – wherever possible – provide instructions on how to remove data that have already been disclosed. One example of such a ‘to-do list’ is the “data protection toolkit” on the website ‘I have something to hide’.

Options for sharing and a regular reminder

Another idea that was very popular with several participants was that of a regular reminder: “I think the reminder is key” (P05). They outlined that they had forgotten about changes in their internet usage they meant to make and would therefore have appreciated a reminder. One participant further explained that the common GDPR pop-ups often serve as a reminder for them (P05). As it is “very easy to forget these things” (P05), a reminder would help make privacy-sensitive internet usage a “part of people’s routine” (P04) and make them “more conscious” (P05). Finally, some participants suggested an “aspect of sharing” in the tool in order to “get the message around” (P04).

Overall, these manifold ideas provide a first insight into which format, design and content users of future data literacy tools might appreciate. Moreover, future initiatives and resources should be careful to find the ‘narrow path’ between emphasising the severity of some of the impacts big data practices can have, yet not inducing a feeling of resignation towards data collection and online privacy. Finally, the value of low-threshold constructive advice on how to curb the ubiquitous collection of personal data online should not be underestimated.

Conclusion

To conclude, this paper argues that data literacy today needs to go beyond the mere skills to use digital media and the internet, or even the ability to use data or handle big data sets. Rather, what is required is an extended critical big data literacy that includes citizens’ awareness, understanding and critical reflection on big data practices and their risks and implications, as well as the ability to implement this knowledge for a more empowered internet usage. Citizens need an understanding of the structural and systemic changes that come with big data systems and the datafication of our societies. This paper discusses this concept, presents research findings about the kinds of critical big data literacy tools available and gives an insight into how student internet users view the short- and mid-term effects of these tools. The article aims to contribute to and advance debates about how data literacy efforts can be implemented and fostered.

To begin, the study provided a first, non-representative insight into the field of existing efforts to inform and educate citizens about big data practices. Nearly 40 such data literacy tools could be identified and a comparative analysis revealed a great variety in their national and production origins and in their formats. A typology revealed 15 distinct content and design approaches, including unusual and emerging formats such as a graphic novel about big data.

As a next step, three selected critical big data literacy tools were tested in a qualitative multi-methods study over three points in time. Overall, the study found a positive effect of the tools on the participants’ concern for privacy and their privacy-sensitive internet usage. Moreover, it revealed rich findings on the users’ perspectives on these tools. For example, the participants highly praised the tools’ interactivity, their appealing visualisations and their accessibility, and they disagreed about the ‘shock value’ of their content. Apart from this, they also clearly called for more education on the topic and more informational resources on big data, like the ones mentioned in this study. However, it is important to understand these findings, such as the popularity of interactive formats, within the context of this study’s specific sample. Previous research has identified generational, socioeconomic and cultural differences in digital skills (e.g., van Dijk, 2006, p. 223). Internet users’ skills, but also their personal interests and learning preferences, are likely to impact their responsiveness to data literacy tools. While this study’s method allowed for information-rich data on each participant’s attitudes, their internet usage and their perceptions of the three data literacy tools, it did not allow for generalising claims, and some findings may be specific to the sample’s demographic.

Finally, as this study included the opportunity to follow up with some of the participants after several months and discuss their perceptions of the applied tools in detail, the findings also offer valuable suggestions for future data literacy tools and for initiatives or training programmes that aim to teach about big data practices. The participants provided detailed and creative ideas for future tools: they should have an attention catcher followed by an interactive website or app that includes personalised content, real stories about data harm, appealing visualisations, easy-to-follow constructive advice in the form of a checklist, and options to easily recommend the tool to friends and set up a regular reminder. This was complemented by further research findings highlighting people’s limited ability to imagine negative consequences of data disclosure, the prevalence of concern about data security, and the critical issue of resignation towards privacy and data collection online. Here too, however, it is necessary to regard these findings in the context of the study’s narrow sample and keep in mind that different target groups will likely require different design approaches when teaching about big data. Such differences between target communities, and their relation to responsiveness to data literacy tools, would be interesting to explore in future studies.

Overall, this study contributed novel findings to various fields of scholarly research. First and foremost, this article presented the concept of critical big data literacy, building on and advancing research from the fields of critical data studies, media and digital literacy, and data literacy. Moreover, the study’s findings as presented in this article reaffirm people’s limited knowledge about big data practices, and confirm existing research which has shown that while many people may be aware that their data is being collected, they often have only a vague idea of the longer-term implications their data disclosure might have. In contrast to the common claim that internet users do not care about their data and feel like they have ‘nothing to hide’ (Solove, 2007; Marwick & Hargittai, 2018), the participants of this study were eager to learn more, highly appreciated the tools and this opportunity, and many seemed keen to protect their data. This desire and motivation to learn more about complex issues was unexpected. While participants may not always understand what they have to ‘hide’, they were clearly not indifferent towards the usage of their data, and they became more concerned as they learned more about big data practices. This also confirms the findings of Worledge & Bamford (2019), who demonstrated that internet users find big data practices less acceptable the more they learn about them. Furthermore, this study advances existing research about people’s resignation towards privacy by highlighting an example case and providing insights, through qualitative investigation, into some of the origins of this feeling and potential ways to prevent it.

Apart from these contributions to scholarly understanding, this study’s findings on how to implement critical big data literacy are, albeit based on a small and specific sample, also relevant for practitioners in different fields, and they provide first insights for policymakers. Literacy is an issue that can have deep impacts on citizenship, and in order for policymakers and citizens to make informed decisions, datafied societies need informed public debate about the use and implications of data science technologies. This article suggests that one way to enhance such debate is through critical big data literacy, and it outlines what such data literacy should entail, how to put it into practice, what to keep in mind when designing future data literacy resources, and what these could look like.

Acknowledgements

I would like to express my deep gratitude to Dr Joanna Redden, Data Justice Lab, Cardiff University, for her valuable and constructive advice and her continuous support. I would also like to thank the Center for Advanced Internet Studies (CAIS) for their generous support and a pleasant and productive working atmosphere.

References

boyd, d., & Crawford, K. (2012). Critical Questions For Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878

Bucher, T. (2017). The algorithmic imaginary: Exploring the ordinary affects of Facebook algorithms. Information, Communication & Society, 20(1), 30–44. https://doi.org/10.1080/1369118X.2016.1154086

Chellappa, R. K., & Sin, R. G. (2005). Personalization versus privacy: An empirical examination of the online consumer’s dilemma. Information Technology and Management, 6(2), 181–202. https://doi.org/10.1007/s10799-005-5879-y

Crusoe, D. (2016). Data Literacy defined pro populo: To read this article, please provide a little information. The Journal of Community Informatics, 12(3), 27–46. http://ww.w.ci-journal.net/index.php/ciej/article/view/1290

Dencik, L., & Cable, J. (2017). The Advent of Surveillance Realism: Public Opinion and Activist Responses to the Snowden Leaks. International Journal of Communication, 11, 763–781. https://ijoc.org/index.php/ijoc/article/view/5524/1939

D’Ignazio, C., & Bhargava, R. (2015, September 28). Approaches to Building Big Data Literacy [Paper presentation]. Bloomberg Data for Good Exchange Conference, New York. http://rahul-beta.connectionlab.org/wp-content/uploads/2011/11/Edu_DIgnazio_52.pdf

D’Ignazio, C., & Bhargava, R. (2018). Cultivating a Data Mindset in the Arts and Humanities. Public, 4(2). https://public.imaginingamerica.org/blog/article/cultivating-a-data-mindset-in-the-arts-and-humanities/

D’Ignazio, C. (2017). Creative Data Literacy: Bridging the Gap between the Data-Haves and Data-Have Nots. Information Design Journal, 23(1), 6–18. https://doi.org/10.1075/idj.23.1.03dig

Do Not Track (2015). About. Do Not Track. https://donottrack-doc.com/en/about/

Doteveryone. (2018). People, Power and Technology: The 2018 Digital Attitudes Report. https://attitudes.doteveryone.org.uk

van Dijk, J. A. G. M. (2006). Digital divide research, achievements and shortcomings. Poetics, 34(4–5), 221–235. https://doi.org/10.1016/j.poetic.2006.05.004

Draper, N. A. (2017). From Privacy Pragmatist to Privacy Resigned: Challenging Narratives of Rational Choice in Digital Privacy Debates. Policy & Internet, 9(2), 232–251. https://doi.org/10.1002/poi3.142

Draper, N. A., & Turow, J. (2019). The corporate cultivation of digital resignation. New Media & Society, 21(8). https://doi.org/10.1177/1461444819833331

Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin’s Press.

Fotopoulou, A. (in press). Conceptualising critical data literacies for civil society organisations: Agency, care, and social responsibility. Information, Communication & Society.

Garcia, A., Mirra, N., Morrell, E., Martinez, A., & Scorza, D. (2015). The Council of Youth Research: Critical Literacy and Civic Agency in the Digital Age. Reading & Writing Quarterly, 31(2), 151–167. https://doi.org/10.1080/10573569.2014.962203

Gray, J., Gerlitz, C., & Bounegru, L. (2018). Data infrastructure literacy. Big Data & Society, 5(2). https://doi.org/10.1177/2053951718786316

Grzymek, V., & Puntschuh, M. (2019). Was Europa über Algorithmen weiß und denkt. Ergebnisse einer repräsentativen Bevölkerungsumfrage. Bertelsmann Stiftung. https://doi.org/10.11586/2019006

Hammer, R. (2011). Critical Media Literacy as Engaged Pedagogy. E-Learning and Digital Media, 8(4), 357–363. https://doi.org/10.2304/elea.2011.8.4.357

Hansen, A., & Machin, D. (2013). Media and communication research methods. Palgrave Macmillan.

Hinrichsen, J., & Coombs, A. (2013). The five resources of critical digital literacy: A framework for curriculum integration. Research in Learning Technology, 21. https://doi.org/10.3402/rlt.v21.21334

Kennedy, H., & Moss, G. (2015). Known or knowing publics? Social media data mining and the question of public agency. Big Data & Society, 2(2). https://doi.org/10.1177/2053951715611145

Kitchin, R., & Lauriault, T. P. (2014). Towards critical data studies: Charting and unpacking data assemblages and their work [The Programmable City Working Paper No. 2]. Maynooth University. http://mural.maynoothuniversity.ie/5683/

Malhotra, N. K., Kim, S. S., & Agarwal, J. (2004). Internet Users’ Information Privacy Concerns (IUIPC): The Construct, the Scale, and a Causal Model. Information Systems Research, 15(4), 336–355. https://doi.org/10.1287/isre.1040.0032

Marwick, A., & Hargittai, E. (2018). Nothing to hide, nothing to lose? Incentives and disincentives to sharing information with institutions online. Information, Communication & Society, 22(12), 1–17. https://doi.org/10.1080/1369118X.2018.1450432

Mihailidis, P. (2018). Civic media literacies: Re-Imagining engagement for civic intentionality. Learning, Media and Technology, 43(2), 1–13. https://doi.org/10.1080/17439884.2018.1428623

Miller, C., Coldicutt, R., & Kitcher, H. (2018). People, Power and Technology: The 2018 Digital Understanding Report. doteveryone. http://understanding.doteveryone.org.uk/

Müller-Peters, H. (2019). Big Data: Chancen und Risiken aus Sicht der Bürger [Big Data: Chances and Risks from Citizens’ Perspective]. In S. Knorre, H. Müller-Peters, & F. Wagner (Eds.), Die Big-Data-Debatte (pp. 137–193). Springer. https://doi.org/10.1007/978-3-658-27258-6_3

Ofcom. (2019). Adults: Media use and attitudes report [Report]. https://www.ofcom.org.uk/__data/assets/pdf_file/0021/149124/adults-media-use-and-attitudes-report.pdf.

O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Allen Lane.

Pangrazio, L. (2016). Reconceptualising critical digital literacy. Discourse: Studies in the Cultural Politics of Education, 37(2), 163–174. https://doi.org/10.1080/01596306.2014.942836

Pangrazio, L., & Selwyn, N. (2019). ‘Personal data literacies’: A critical literacies approach to enhancing understandings of personal digital data. New Media & Society, 21(2), 419–437. https://doi.org/10.1177/1461444818799523

Philip, T. M., Schuler-Brown, S., & Way, W. (2013). A Framework for Learning About Big Data with Mobile Technologies for Democratic Participation: Possibilities, Limitations, and Unanticipated Obstacles. Technology, Knowledge and Learning, 18(3), 103–120. https://doi.org/10.1007/s10758-013-9202-4

Pötzsch, H. (2019). Critical Digital Literacy: Technology in Education Beyond Issues of User Competence and Labour-Market Qualifications. TripleC: Communication, Capitalism & Critique. Open Access Journal for a Global Sustainable Information Society, 17(2), 221–240. https://doi.org/10.31269/triplec.v17i2.1093

Pritchard, A. (2008). Ways of Learning. Learning Theories and Learning Styles in the Classroom (2nd ed.). Routledge. https://doi.org/10.4324/9780203887240

Pybus, J., Coté, M., & Blanke, T. (2015). Hacking the social life of Big Data. Big Data & Society, 2(2). https://doi.org/10.1177/2053951715616649

Redden, J., & Brand, J. (2017). Data Harm Record. Data Justice Lab, Cardiff University. https://datajusticelab.org/data-harm-record/

Sander, I. (2019). A Critically Commented Guide to Data Literacy Tools. https://doi.org/10.5281/zenodo.3241422

Sander, I. (2020). Critical Big Data Literacy Tools – Engaging Citizens and Promoting Empowered Internet Usage. ORCA. http://orca.cf.ac.uk/id/eprint/131843

Silent Circle. (n.d.). Silent Circle | Secure Enterprise Communication Solutions. https://www.silentcircle.com/

Smith, H. J., Milberg, S. J., & Burke, S. J. (1996). Information Privacy: Measuring Individuals’ Concerns about Organizational Practices. MIS Quarterly, 20(2), 167–196. https://doi.org/10.2307/249477

Solove, D. J. (2007). “I’ve Got Nothing to Hide“ and Other Misunderstandings of Privacy. San Diego Law Review, 44(4), 745–765. https://digital.sandiego.edu/sdlr/vol44/iss4/5/

Turow, J., Hennessy, M., & Draper, N. (2018). Persistent Misperceptions: Americans’ Misplaced Confidence in Privacy Policies, 2003–2015. Journal of Broadcasting & Electronic Media, 62(3), 461–478. https://doi.org/10.1080/08838151.2018.1451867

Turow, J., Hennessy, M., & Draper, N. A. (2015). The tradeoff fallacy: How marketers are misrepresenting American consumers and opening them up to exploitation [Report]. Annenberg School for Communication. https://www.asc.upenn.edu/sites/default/files/TradeoffFallacy_1.pdf

Vasileiou, K., Barnett, J., Thorpe, S., & Young, T. (2018). Characterising and Justifying Sample Size Sufficiency in Interview-Based Studies: Systematic Analysis of Qualitative Health Research over a 15-Year Period. BMC Medical Research Methodology, 18. https://doi.org/10.1186/s12874-018-0594-7

Worledge, M., & Bamford, M. (2019). Adtech: Market Research Report. Information Commissioner’s Office; Ofcom. https://www.ofcom.org.uk/__data/assets/pdf_file/0023/141683/ico-adtech-research.pdf

Yates, S., Carmi, E., Pawluczuk, A., Wessels, B., Lockley, E., & Gangneux, J. (2020). Understanding citizens data literacy: thinking, doing & participating with our data (Me & My Big Data Report 2020). Me and My Big Data project, University of Liverpool. https://www.liverpool.ac.uk/humanities-and-social-sciences/research/research-themes/centre-for-digital-humanities/projects/big-data/publications/

Zuboff, S. (2015). Big other: Surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 30(1), 75–89. https://doi.org/10.1057/jit.2015.5

Appendix: List of all identified data literacy tools

Acquisti, A. (2013, June). What will a future without secrets look like? [Video]. TED Conferences. https://www.ted.com/talks/alessandro_acquisti_why_privacy_matters

Asin, A. (2014, November 20). Big Data and the Hypocrisy of Privacy [Keynote, video]. Strata + Hadoop World Europe 2014, Barcelona. https://www.youtube.com/watch?v=oWwQfgpvlzI

Gaylor, B. (Director). (2015). Do Not Track. Upian; Arte; Office national du film du Canada; Bayerischer Rundfunk. https://donottrack-doc.com/en/

Disconnect. (2013, June 30). Unwanted tracking is not cool [Video]. YouTube. https://www.youtube.com/watch?v=UU2_0G1nnHY

The Economist. (2014, September 11). How internet advertisers read your mind [Video]. YouTube. https://www.youtube.com/watch?v=8KYugpMDXAE

Electronic Frontier Foundation. (n.d.). Surveillance Self-Defense. https://ssd.eff.org

Electronic Frontier Foundation. (n.d.). Surveillance Self-Defense Basics. https://ssd.eff.org/module-categories/basics

Electronic Frontier Foundation. (n.d.). Surveillance Self-Defense Tool-Guides. https://ssd.eff.org/module-categories/tool-guides

Fight for the Future. (n.d.). I feel naked. https://www.ifeelnaked.org

FreeNet Film (2012, December 15). Do you care about your privacy in the web? [Video]. YouTube. https://www.youtube.com/watch?v=jtGtIxgS7io

Greenwald, G. (2014, October 10). Why Privacy Matters [Video]. YouTube. https://www.youtube.com/watch?v=pcSlowAhvUk

The Guardian Project. (n.d.). The Guardian Project. https://guardianproject.info

Hill, A. (2014, September 15). A Day in the Life of a Data Mined Kid. Marketplace. https://www.marketplace.org/2014/09/15/education/learning-curve/day-life-data-mined-kid

Internet Society (2015, May 14). Four Reasons to Care About Your Digital Footprint [Video]. YouTube. https://www.youtube.com/watch?v=OA6aiFeMQZ0

Julie, Ozoux, P., Daniel, & Bouda, P. (n.d.). I have something to hide. https://ihavesomethingtohi.de

Keller, M. & Neufeld, J. (2014, October 30). Terms of Service. Understanding our role in the world of Big Data. http://projects.aljazeera.com/2014/terms-of-service/#1

La Quadrature du Net. (2014, February 11). Reclaim Our Privacy [Video]. YouTube. https://www.youtube.com/watch?v=TnDd5JmNFXE

La Quadrature du Net. (n.d.). Website – Section Privacy. https://www.laquadrature.net/en/Privacy

Mozilla. (n.d.). Learning.mozilla.org. https://web.archive.org/web/20200216134228/https://learning.mozilla.org/en-US/

Mozilla. (n.d.). SmartOn Privacy and Security. https://web.archive.org/web/20180510112403/https://www.mozilla.org/en-US/teach/smarton/

PBS Nova Labs – Cybersecurity Lab. (n.d.). A Cyber Privacy Parable [Video]. Public Broadcasting Service. http://www.pbs.org/wgbh/nova/labs/lab/cyber/2/1/

PBS Nova Labs – Cybersecurity Lab. (2014). Cyber Lab [Game]. Public Broadcasting Service. http://www.pbs.org/wgbh/nova/labs/lab/cyber/research#/newuser

PBS Nova Labs – Cybersecurity Lab (n.d.). Cybersecurity videos. Public Broadcasting Service. http://www.pbs.org/wgbh/nova/labs/videos/#cybersecurity

Privacy International. (n.d.). Privacy International Website. https://privacyinternational.org/

Privacy International. (n.d.). Invisible Manipulation: 10 ways our data is being used against us. https://www.privacyinternational.org/long-read/1064/invisible-manipulation-10-ways-our-data-being-used-against-us

Privacy International. (n.d.). What Is Data Protection? https://privacyinternational.org/explainer/41/101-data-protection

Silent Circle (2016, January 27). The Power of Privacy. https://www.youtube.com/watch?v=BvQ6I9xrEu0

Silent Circle (2015, January 27). #Privacy Project. https://www.youtube.com/watch?v=ZcjtEKNP05c

Smith, T. (n.d.). Big Data [Video]. TED-Ed. https://ed.ted.com/lessons/exploration-on-the-big-data-frontier-tim-smith

Tactical Technology Collective. (2017). A Data Day in London. https://tacticaltech.org/#/news/a-data-day

Tactical Technology Collective. (n.d.). Data Detox Kit. https://datadetoxkit.org/de/home

Tactical Technology Collective. (n.d.). Exposing the Invisible. https://exposingtheinvisible.org

Tactical Technology Collective. (n.d.). The Glass Room. https://theglassroom.org

Tactical Technology Collective. (n.d.). Me and My Shadow. https://myshadow.org

Tactical Technology Collective. (n.d.). Ononymous. https://ononymous.org

Tactical Technology Collective. (n.d.). Our Data Our Selves. https://ourdataourselves.tacticaltech.org

Tactical Technology Collective. (n.d.). School of Data. https://schoolofdata.org

Tactical Technology Collective. (n.d.). Security-in-a-box. https://securityinabox.org/en/

Footnotes

1. For a list of all identified tools and their URLs, see appendix.

2. More detailed findings on the effect the tools had on my participants’ internet usage and their privacy attitudes are illustrated in a previous publication (Sander, 2020).

Data citizenship: rethinking data literacy in the age of disinformation, misinformation, and malinformation

This paper is part of Digital inclusion and data literacy, a special issue of Internet Policy Review guest-edited by Elinor Carmi and Simeon J. Yates.

Introduction

Citizens' engagement with media and the ways in which they develop their agency have long been discussed through the lens of written, media and information or digital ‘literacy’. More recently, as algorithmic decision-making processes have become widespread, data literacy has joined this conversation (Gilster, 1997; Eshet, 2004; Bawden, 2008; Gummer and Mandinach, 2015). In this field of literacies, the emphasis has been on the need to include disadvantaged citizens in society’s everyday activities by improving and supporting specific literacy skills and knowledge. Research has focused on where a lack of key literacy skills intersects with inequalities across economic or social status, health and disability (physical and mental), racial and cultural position, or gender. This set of ‘required’ skills has become more complex as the technologies, services and devices people use have rapidly evolved.

In the UK context, the Facebook/Cambridge Analytica scandal in 2017 (Cadwalladr, 2017) revealed that people received disinformation content and advertisements based on their social media profiles and activity, designed to influence their decisions in the 2016 UK referendum on leaving the European Union and the 2016 US presidential election. These two cases made clear the extent to which citizens are unaware of the uses and abuses to which their data can be put. This lack of data literacy opens citizens up to risks and harms – personal, social, physical and financial – but also limits their ability to be proactive citizens in an increasingly datafied society. However, it is clear that this is part of a wider set of issues that need to be considered and that current definitions of ‘data literacy’ (for example, Pangrazio and Selwyn, 2019) do not address the issue of mis/dis/mal-information. These gaps in data literacies and their connection to disinformation, misinformation and malinformation are exactly what this paper focuses on.

When talking about information distortions, these similar concepts often get confused or conflated; yet they have important specific differences. According to the Council of Europe (Wardle and Derakhshan, 2017), there are three different definitions, determined in part by the ‘intention’ of those creating or distributing the information:

  1. Dis-information - information that is false and deliberately created to harm a person, social group, organisation or country.
  2. Mis-information - information that is false, but not created with the intention of causing harm.
  3. Mal-information - information that is based in reality, but is used to inflict harm on a person, organisation or country.

Though the term ‘fake news’ was coined to capture the use of dis- and mis-information in news reporting, it has now been deployed as a dis- and mis-information tactic by political actors in an attempt to discredit news reporting and reported facts they dislike. Despite different intentions, these strategies influence citizens’ opinions and actions both online and offline in various capacities. Recent examples can be seen in various types of misinformation around the Covid-19 pandemic, where people blamed 5G radiation for causing the disease and, consequently, telecoms “engineers are facing verbal and physical threats during the lockdown” (Waterson, 2020). These messages, then, can impact citizens' agency, freedom of choice and perceptions, especially when undertaking their everyday civic engagement with different authorities and fellow citizens.

With these changes to the media and digital landscape in mind, we want to examine how the field of digital and data literacy should address these harms and risks, particularly focusing on what kinds of skills, thinking and actions are needed in times of information distortion. In this paper we present the first phases of our project, which include mapping the media, data and digital literacy field in relation to ‘mis/dis/mal-information’ – building on longer-term discussions around media and literacies. Specifically, we focus on what digital and data literacy means in the age of information distortions and what scholars, activists and educators should take into account when developing literacy programmes. We have developed a framework of data citizenship (Yates et al., 2020) that builds on the understanding of data literacy presented here. We specifically attend to the challenges and gaps that previous literature and organisations did not include in their conceptualisations of literacy. In this paper we highlight three of these challenges and gaps that our framework attends to, and show how we approached them in our UK survey.

We explore the data literacy element of data citizenship and how we have applied this to the design of our nationally representative UK survey of citizens’ data literacies. We commissioned the survey company Critical to conduct in-home survey work, using a computer-assisted personal interviewing (CAPI) methodology. 125 sampling points were used to achieve a maximum of n = 1,500 interviews. These points were selected to be a representative cross-section of UK addresses. Quotas were set to reflect the UK internet-using population by age, gender, household socio-economic group, and urbanity. After analysing the survey we identified six types of digital technology users based on the activity they undertake online:

  1. Extensive political users (10% of users) – High probability of engaging in all forms of digital media use – including political action and communication
  2. Extensive users (20% of users) – High probability of engaging in all forms of digital media use – except political action and communication
  3. Social and media users (17% of users) – High likelihood of engaging with social media (social networking sites - SNS) and entertainment media (e.g., Netflix and YouTube)
  4. General users (no social media) (31% of users) – Lower likelihoods of engaging in most digital media forms, and no SNS use
  5. Limited users (22% of users) – Limited engagement with all forms of digital media
  6. Non-users – Currently non-internet users

We will return to these user types to show preliminary findings from the survey, how they inform our data citizenship framework, and what we think needs to be included in contemporary data literacy programmes. In the following sections we first trace the evolution of different types of literacies to understand the nuances behind them. We then move on to the types of challenges we face with dis-, mis- and mal-information. After this we discuss the politics and ideology behind literacy to frame the discussion of the three gaps that we identified in the two fields, and show how our survey and framework tackle them. We therefore make several interventions in this paper:

  1. Consider the longer history of debates and research regarding citizen literacies.
  2. Draw connections and point to the gaps between ideas of literacy and dis-/mis-/mal-information.
  3. Develop a definition of data literacy that accommodates contemporary media and communication developments.

Understanding contemporary media and digital literacies

In this section we provide brief definitions of the different types of literacy. As with any definitions in the social sciences and humanities, we want to emphasise that they are by no means universal or unanimously agreed upon. Literacy, in the context of reading and writing text, has been the focus of educational, social and cultural research for several centuries. Yet, as Brian Street (2005) noted, the term literacy in regard to written materials has multiple meanings – many contested – that have invoked particular foci for research and issues in society. Before the arrival of the web or broader digital media, as Ruth Finnegan (1989) pointed out, perceptions of written literacy tended to favour specific modes (e.g., reading literature) and to forget that reading, writing and print are just some of many communication technologies. As a result, there is a need to understand literacy as the skills and competencies in using multiple media via communication technologies and not just the ‘written’ word. We will return to the ideological component of literacies later in the paper.

Over the last 50 years we have seen arguments for information literacy, media literacy, digital literacy and data literacy. According to Christina Doyle (1994), information literacy is “the ability to access, evaluate and use information from a variety of sources”. According to Patricia Aufderheide (1992), a media-literate person “can decode, evaluate, analyse and produce both print and electronic media. The fundamental objective of media literacy is a critical autonomy relationship to all media”. According to Paul Gilster (1997), digital literacy is “the ability to understand and use information in multiple formats from a wide variety of sources when it is presented via computers”. And finally, the most recent iteration of literacy is data literacy. According to Luci Pangrazio and Neil Selwyn (2019), data literacy is about the way “individuals might better engage with and make use of the ‘personal data’ generated by their own digital practices” (p. 420). Pangrazio and Selwyn propose ‘personal data literacies’ that focus on five domains: 1) data identification, 2) data understandings, 3) data reflexivity, 4) data uses, and 5) data tactics.

Clearly these ideas overlap significantly, especially when viewed from the perspective of today, where nearly all media are digitally mediated. Yet each relates to thinking in the specific technological, political and social context and time in which it was developed. Each type of literacy emerged at a time when policymakers made assumptions about media and how different groups of people should use them and for what purposes. ‘Media literacy’ reflected the growth of networked mass and audio-visual media, from newspapers, radio and television through to video tape, cable and satellite. ‘Information literacy’ developed in the early 1970s around libraries, their privatisation and the use of computers in schools and education. Definitions and literature around digital literacy became the focus when the internet grew more widespread in the late 1990s and early 2000s, as academics and organisations tried to make sense of it and of how people should engage with it. Data literacy has arisen as researchers and policymakers sought to understand the implications of our datafied society.

In many ways, all these definitions still apply to the datafied society, as our media have been transferred to digital/data formats. In this way, we could say that digital/data literacy encompasses these older forms of media and information seeking but expands on them as more services and everyday life activities are conducted through digital platforms. What is different with data literacy will be elaborated below through three key points: moving beyond the individual to networked literacies, developing critical thinking about the online ecosystem and, finally, providing literacies which empower people to become active citizens. As we emphasised with previous literacies, it is not that these three gaps were unimportant in the past; however, as we will show below, with algorithmic processes engineering our societies and accelerating inequalities, the need to address them has increased. As Sonja Špiranec et al. argue, “data literacy is also being considered as a critical concept with the purpose of promoting social justice and the public good, understanding power relations and power asymmetries as well as reducing social, economic, political and other types of inequalities” (Špiranec et al., 2019). In short, data literacy today means understanding, and being able to challenge, object to and protest against, the contemporary power asymmetries manifested in datafied societies. In the next section we want to understand the current climate of dis-/mis-/mal-information and what new challenges it brings to literacy.

Disinformation and the changing media landscape

The topic of dis-/mis-/mal-information is not new. The fields of psychology and political science (Loftus and Hoffman, 1989; Bittman, 1990; Kates, 1998) have examined these issues with their own emphases, mainly around the manipulation of opinions, memories and emotions in relation to political parties and elections. Media and communication scholars have also paid attention to disinformation strategies, mainly around wars between countries and propaganda (e.g., Kux, 1985; Snyder, 1997). More recently there has been an increased focus on dis-/mis-/mal-information in connection to its use within social and digital media contexts. As Bennett and Livingston argue: “[c]ompared to the mass media era, the current age displays a kaleidoscopic mediascape of television networks, newspapers and magazines (both online and print), YouTube, WikiLeaks, and LiveLeak content, Astroturf think tanks, radical websites spreading disinformation using journalistic formats, Twitter and Facebook among other social media, troll factories, bots, and 4chan discussion threads, among others” (Bennett and Livingston, 2018, p. 129). Dis-, mis- and mal-information proliferate in the online attention economy, where sensational and click-bait content attracts more clicks and hence more profit. This has led to concerns about, and demands for, citizens having the understanding and skills to interpret and respond in a meaningful and educated way to dis-/mis-/mal-information strategies employed via digital media.

The use of digital dis-/mis-/mal-information methods has become widespread among various types of entities, from governments to companies to individuals. As Samantha Bradshaw and Phil Howard argue, different actors use “different tools and techniques to manipulate public opinion, such as political bots, content creation, targeted advertisements, fake personas, and trolling or harassment” (Bradshaw and Howard, 2017, p. 28). These multiple types of digital manipulation happen more easily on platforms, which means that people need new types of knowledge and skills, as well as the ability to apply them in each case.

In recent elections across the world – from the UK and the USA (Allcott and Gentzkow, 2017; Faris et al., 2017; Guess et al., 2018) to India, France (Ferrara, 2017), Israel and Brazil – disinformation has been used by governments, organisations and individuals to influence citizens’ political opinions and subsequent behaviours towards a particular goal. For example, in the 2018 Brazilian election citizens received millions of false messages and photos via WhatsApp over several months (Evangelista and Bruno, 2019). Similarly, the New York Times revealed that:

[t]here were doctored photos and videos edited out of context. There were stories exaggerating Mr. Bolsonaro’s heroism and spreading rumours about his rivals. There were conspiracy theories promoting the rumour that Mr. Bolsonaro, who was stabbed at a rally in September, had faked his own injuries as part of a preplanned stunt (Isaac and Roose, 2018).

While that particular news story focused on citizens’ use of the WhatsApp messaging platform, others reveal the contribution of algorithms that link similar content. For example, on the video streaming service YouTube, “users who watched one far-right channel would often be shown many more. The algorithm had united once-marginal channels — and then built an audience for them, the researchers concluded” (Fisher and Taub, 2019). Though it is hard to pinpoint a direct influence of such human or algorithmic interventions on subsequent action, it is clear that such strategies have become more visible and widespread since the rise of social media. There is therefore a pressing need to understand how the manipulation of social media content and algorithms to distort information can potentially influence citizens’ perceptions and behaviours around various topics, including politics, health and economics.

This prevalence of disinformation in digital media contexts, argue Alice Marwick and Becca Lewis (2017), arises in part from broadcast media’s current vulnerability and weakness. As they argue, mainstream broadcast media has been targeted via social media because of:

… low public trust in media; a proclivity for sensationalism; lack of resources for fact-checking and investigative reporting; and corporate consolidation resulting in the replacement of local publications with hegemonic media brands (Marwick and Lewis, 2017, p. 42).

Therefore, with the increased power of platforms like Facebook, Google, Amazon and Microsoft, the proliferation of mobile telephone use, and a weakening media landscape (Pierson, 2017), citizens face new challenges in the way they engage with digital and social media platforms. In the next section we will explore how different regions have approached these challenges.

Different responses to information distortions

In light of the challenges associated with citizens’ digital and data literacy education, the European Union (EU) has invested in research around internet safety, digital well-being and digital skills aimed at developing the critical awareness of citizens (European Commission, 2019). In January 2018, the EU developed “The Digital Education Action Plan”. This plan emphasises:

… the risks disinformation poses for educators and students and the urgent need to develop digital skills and competences of all learners, in both formal and non-formal education. The Digital Competence Framework for Citizens, developed by the Commission, sets out the wide mix of skills needed by all learners, from information and data literacy, to digital content creation, to online safety and well-being (European Commission, 2018a, p. 12).

One of the pillars of the European Commission’s Action Plan against Disinformation (European Commission, 2018b) is raising awareness and improving societal resilience. As the Commission argues, public awareness is vital for societal resilience to dis-/mis-/mal-information, and this mainly involves improving citizens’ media/digital/data literacies with a particular focus on identifying and ‘combating fake news’.

While it is difficult to know the kinds of effects dis-/mis-/mal-information messages produce, governments have started to see them as something they have to address and have called for research into potential interventions. In Belgium, for example, the government established an expert group of journalists and scholars in 2018 to try to find a solution, while also launching a media literacy campaign to inform people about misinformation. In Canada the government has launched a Digital Charter titled ‘Trust in a digital world’ to defend freedom of expression and protect against disinformation aimed at undermining democracy, and has also proposed to invest funding in projects to raise public awareness and digital literacy, especially in relation to dis- and mis-information. Nigeria developed a media literacy campaign in 2018, said to include a collaboration between digital and traditional media together with the National Orientation Agency, to provide Nigerians with the appropriate education to fight dis- and mis-information (for a detailed account of how different countries have developed anti-misinformation actions, see Poynter’s project).

The UK’s Department for Digital, Culture, Media and Sport (DCMS), for example, published a report in February 2019 on ‘Disinformation and “fake news”’, in which it highlights the importance of digital and data literacy, arguing that:

It is hard to differentiate on social media between content that is true, that is misleading, or that is false, especially when those messages are targeted at an individual level. Children and adults need to be equipped with the necessary information and critical analysis to understand content on social media, to work out what is accurate and trustworthy, and what is not. Furthermore, people need to be aware of the rights that they have over their own personal data, and what they should do when they want their data removed (DCMS, 2019).

DCMS has in fact proposed to include digital literacy as one of four education pillars (along with reading, writing and maths), though this has not yet been taken up as an action by the Department for Education. The report also notes that UK citizens need to know about their opportunities to complain and protest about misleading digital campaigns by addressing them to relevant UK regulators such as the communications regulator Ofcom, the Advertising Standards Authority (ASA), the Information Commissioner’s Office (ICO) and the Electoral Commission.

The DCMS report also highlights one of the persistent approaches taken when seeking to address dis-/mis-/mal-information online – the application of technological solutions. For example, one of the suggestions in the report is introducing ‘friction’ into algorithmic design to slow down citizens’ engagement on platforms and, by doing so, allow them to think about what they write and share on social media. That is, platforms should develop computational ‘obstacles’ that make processes of sharing slower and more deliberate. Another technical design solution the DCMS offers involves developing online tools that distinguish between quality content and disinformation sources.
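
To make the idea of such ‘friction’ more concrete, the following is a minimal, hypothetical sketch – not a proposal taken from the DCMS report itself – of how a platform might place an obstacle before resharing. The function name reshare_with_friction, the post fields and the two-second pause are illustrative assumptions rather than any real platform’s mechanism.

```python
import time

def reshare_with_friction(post, has_opened_link, confirm, pause_seconds=2):
    """Apply simple 'friction' checks before a reshare goes through.

    Returns True if the post is reshared, False if the user backs out.
    All names and behaviours here are illustrative assumptions only.
    """
    if post.get("contains_link") and not has_opened_link:
        # Obstacle 1: ask the user to reconsider sharing content they have not read.
        if not confirm("You haven't opened this article. Share it anyway?"):
            return False
    # Obstacle 2: a short enforced pause to make sharing slower and more deliberate.
    time.sleep(pause_seconds)
    return True

# Usage example with a stubbed confirmation prompt that declines the share.
post = {"id": 42, "contains_link": True}
shared = reshare_with_friction(post, has_opened_link=False, confirm=lambda msg: False)
print("Shared" if shared else "Share cancelled")
```

The sketch only shows where a prompt or delay could sit in the sharing flow; whether such friction actually changes behaviour remains an empirical question.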

However, the challenges that dis-/mis-/mal-information poses are far more complex. In fact, it is not only dis-/mis-/mal-information ‘messages’ that deceive citizens but also what are known as ‘dark patterns’. These are interface features designed to mislead, potentially deceive and ‘nudge’ citizens into a particular behaviour – for example, accepting default settings that ‘consent’ to their data being shared with third-party companies. The Norwegian Consumer Council demonstrated in its research, Deceived by Design (Forbrukerrådet, 2018), how social media platforms use ‘dark patterns’ to manipulate citizens’ behaviour. Its report analyses platform compliance with the General Data Protection Regulation (GDPR), examining the privacy settings of “Facebook, Google and Windows 10”, and shows “how default settings and dark patterns, techniques and features of interface design meant to manipulate users, are used to nudge users towards privacy intrusive options” (Forbrukerrådet, 2018, p. 3).

Some of these dark patterns include privacy-intrusive default settings, misleading wording, giving users an illusion of control, making it hard to find privacy-friendly choices, and take-it-or-leave-it choices. Similarly, Nouwens et al. (2020) have recently shown that many websites use ‘consent management platforms’ (CMPs) that retain ‘dark patterns’ while only superficially ‘conforming’ with the EU's General Data Protection Regulation. Nouwens et al. found that only 12% of the CMPs they examined met the minimum EU regulatory criteria. The majority retained interface designs that are likely to mislead users, or to make it difficult to reject options under which citizens’ data are provided to third-party companies.
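
As a purely illustrative aside, the dark patterns listed above can be made concrete with a small sketch that encodes a consent dialog’s properties and flags the patterns named in the preceding paragraphs. The ConsentDialog fields and the click-count heuristic are our own hypothetical simplifications, not the coding schemes used by Forbrukerrådet or Nouwens et al.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ConsentDialog:
    purposes_preselected: bool   # optional data uses ticked by default
    has_reject_option: bool      # can the user refuse optional data sharing at all?
    reject_all_clicks: int       # clicks needed to refuse all optional uses
    accept_all_clicks: int       # clicks needed to accept everything

def dark_pattern_flags(dialog: ConsentDialog) -> List[str]:
    """Flag the dark patterns described above for a given (toy) consent dialog."""
    flags = []
    if dialog.purposes_preselected:
        flags.append("privacy-intrusive defaults: purposes pre-selected")
    if not dialog.has_reject_option:
        flags.append("take-it-or-leave-it choice: no way to refuse")
    elif dialog.reject_all_clicks > dialog.accept_all_clicks:
        flags.append("privacy-friendly choice is harder to reach than accepting")
    return flags

# Usage example: accepting takes one click, rejecting takes four, and purposes are pre-ticked.
dialog = ConsentDialog(purposes_preselected=True, has_reject_option=True,
                       reject_all_clicks=4, accept_all_clicks=1)
for flag in dark_pattern_flags(dialog):
    print(flag)
```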

In most of these debates we see how academics or governments provide technological solutions or expect citizens to take action to force companies to act ethically or legally. Though we support such initiatives, we note that they put the responsibility on individuals to be trained and to have appropriate skills. Additionally, citizens are expected to act in ways that support quality assurance for platforms and their regulation, potentially removing the onus from regulatory bodies, institutions or the state. Similar to the current situation in which governments suggest technological solutions to Covid-19 (such as ‘contact tracing apps’), we argue that these are social problems that are entrenched in structural inequalities and therefore cannot be ‘solved’ by technological means alone.

As these various studies and policy interventions show, citizens have to engage not only with potentially dis-/mis-/mal-information and content (like ‘fake news’) but also with deliberately misleading or complex interface design that prevents control over privacy, content delivery and content sharing. Hence, the types of digital and data literacy that citizens need today are complex. They involve not only being able to read and verify news and content, but also understanding the technical and media economics of digital platforms, how they are funded, what their different features and affordances mean and how they function, how to change their privacy and content settings and, importantly, what individual and collective rights citizens hold. Digital and data literacy therefore have a strong political, civic and ideological aspect, which we explore in the next section.

Understanding literacy and the ideology behind it

Many scholars have pointed out the ideological aspects of ideas of literacy (Street, 2005; Collard et al., 2017). Though such scholars do not deny the value of literacy and the skills that are attained through its development, they specifically point out (Street, 1984) that there is a strong ideological component to most definitions of literacy and literacy education programmes. In school, “essay writing” literacy skills are valued over others; in the workplace, formal writing may be preferred over casual forms. In contemporary society older literate forms (books) may still hold greater value than Facebook posts. Importantly, Street warns us that many views of “literacy” carry this ideological baggage with them – presenting specific literacy practices as though they are ‘neutral or universal’ or implying that they are better than or preferable to others:

The 'autonomous' model of literacy works from the assumption that literacy in itself - autonomously - will have effects on other social and cognitive practices … The model, I argue, disguises the cultural and ideological assumptions that underpin it and that can then be presented as though they are neutral and universal (Street, 2005, p. 13)

In this way, literacy skills and competencies become a political ground. For example, in the case of digital and data literacy, training citizens to become more productive workers or consumers is valued more highly, while skills for wellbeing, entertainment or activism are seen as unimportant and do not appear in government policy and educational curricula. The idea that literacy has a strong ideological component was also one of the founding arguments of cultural studies. Richard Hoggart’s (1957) exploration (for all its faults) of what we would now term the “literacy practices” of working-class citizens sought to grasp the fundamental impacts of close to universal literacy (reading and writing) in the context of a society with rapidly changing print and broadcast media.

These moral distinctions around the economic values of digital and data literacy have also been examined by Payal Arora (2019). In her study of digital literacy in the Global South and Asia, Arora argues that most people use their literacy skills to access films or, more controversially, pornographic sites. According to Arora, such informal uses of literacy are seen as ‘less important’ or ‘less productive’ by governments and NGOs. We would therefore note that any definition of literacy – written, media, digital or data – needs to be cognisant of these issues. Such definitions often mix practical skills, broader social or critical reading skills, and ideas about cognitive impacts and effects.

Social values in regard to types of ‘text’ and types of ‘skills’ are integrated into education and skills programmes. As a result, we cannot simply view digital and data literacy through a lens of basic skills. The literacies that citizens need as they begin to conduct most of their lives through data services and platforms are complex and varied. Having a strong understanding of how lives and practices have become centralised and interconnected by platforms and digital systems is key to how contemporary literacy is developed. If media, work, and health used to be conducted in different locations and through different instruments, now they are centralised through our computers and our mobile phones. Perceptions and understandings of digital and data literacy will be shaped by the social and political contexts within which citizens operate and live. In the next section we elaborate on our project and how we integrated these thoughts into our data literacy framework and survey.

Data literacy and information distortions

In this section we focus on the idea of data literacy and how we have linked this to the notion of data citizenship – a theoretical framework developed by the Me & My Big Data team (Pawluczuk et al., 2020). We highlight three main ways in which our conception of data literacy differs from previous literacies and how we built these into our nationally representative survey of UK citizens, which was fielded in summer 2019. The data citizenship framework was crafted following a broad literature analysis and the analysis of secondary survey data (Yates et al., in press). Data citizenship is a framework that outlines the importance of citizens having critical and active agency at a time when society’s datafication and algorithmically-driven decision-making have become normalised. As digital data have become a core element of our cultural, social, political, and economic worlds, data citizenship aims to provide a framework that explores the links between data, power and context. Through data citizenship, citizens are encouraged and supported to carry out individual and collective critical inquiry so that they can participate in their communities in meaningful and proactive ways. The framework consists of three areas:

  • Data thinking - Citizens’ critical understanding of data (for example, understanding data collection and data economy).
  • Data doing - Citizens’ everyday engagements with data (for example, deleting data and using data in an ethical way).
  • Data participation - Citizens’ proactive engagement with data and their networks of literacy (for example, taking proactive steps to protect individual and collective privacy and wellbeing in the data society as well as helping others with their data literacy).

We argue that inequalities in regard to digital systems and media are better understood in terms of types of users and their correspondence to other key social variables – rather than solely in terms of individual skills and access. To explore how data literacies and issues of dis-/mis-/mal-information intersect, we highlight three avenues where data citizenship provides deeper insight into citizens’ engagement with digital media:

  1. going beyond the individual to focus on the networked
  2. understanding critical aspects of media
  3. developing proactive skills rather than passive engagement

We highlight where our conceptions differ from past ideas of digital and data literacy. In addition, we explain how we addressed these issues in our UK citizens survey and provide some initial findings to help illustrate our framework.

Networked and contextual literacies

One of the main differences between previous ideas of data and digital literacy and our model of data literacy is the need to focus on literacies beyond the individual. As we noted earlier about the DCMS approach, policy often focuses on the measurement and development of skills of the individual in isolation. For example, Helsper and van Deursen (2015) point to the challenge of self-reports, which usually result in people over- or under-rating their engagement with digital systems. They emphasise the need to go beyond the techno-determinist approach and to focus on the social aspects of digital skills, adapting skills to individuals and their local contexts. Overall, though, people’s skills are often measured individually and usually detached from their everyday use of multiple digital services and devices.

To tackle some of these methodological challenges, Helsper and colleagues later developed a measurement instrument called the Internet Skills Scale (ISS) (van Deursen, Helsper, & Eynon, 2016). Using the ISS, they measured whether people know which information they should not share and when. However, in the case studies from the UK and the Netherlands that they outline, the measurement of skills is conducted individually and online, taken out of people’s everyday context and without a focus on critical engagement with the internet. In addition, since they try to make the measurement broad and applicable to multiple internet platforms and websites, the scale misses the nuanced practices that people engage in depending on the platform (people behave differently on WhatsApp, Facebook and Twitter).

Similarly, the digital rights NGO Mozilla’s Internet Health Report 2019 notes that the challenges for digital literacy go beyond skills. As the report points out, most people do not understand how internet technologies work and the implications of using them in their everyday lives. As it emphasises:

… basic Web literacy skills are important. But they don’t necessarily prepare us to identify and address the big questions and serious challenges like bias, harassment and concentration of power in our connected world (Mozilla Foundation, 2019, p. 86).

Yet Mozilla advocates in this report for a universal ‘Web literacy’ that will support educators and activists in diverse communities. Though we support and advocate for more proactive citizens through literacy (as we discuss below), it is not quite clear what is included in this universal literacy and at whom it is aimed. As we have already argued above, the notion that a ‘universal’, one-size-fits-all solution can be developed to tackle literacy problems goes against most of the contemporary literature on the issue (including our report’s findings – see Yates et al., 2020), which highlights that people from different backgrounds need different literacy programmes.

Others have highlighted the need to tailor skills development to key social groups. Agencies such as the International Telecommunication Union (ITU) adopt van Deursen et al.’s (2017) skills framework and adapt it, focusing on: 1) operational skills; 2) content creation skills; 3) information management skills; 4) social skills. The key finding from the ITU’s latest report is that as activities get more complex, fewer people undertake them. The organisation is quite broad and general when it comes to what it actually means by ‘negative outcomes’ of a lack of skills. However, it emphasises the need for specially tailored digital skills training and learning formats for specific disadvantaged groups, such as the unemployed, the lower educated, the elderly, disabled people, illiterate people, migrants and families in precarious conditions. While it is important to tailor literacy programmes to disadvantaged groups, these education programmes are usually aimed at getting individuals to integrate as efficient workers and/or consumers.

Although such measures can be useful, especially in identifying socio-demographic variations in reported skills, they potentially fall foul of Street’s (2005) concern that literacies, in this case digital and data literacies, are reduced to universal skills that are ‘autonomous’ and automatically lead to certain cognitive or social outcomes. As a result, whichever skills measure you select may carry with it ideological assumptions about which skills are best or most important, and hence about the type of citizen you want to have at the end. In order to take into consideration the different skills and competencies that different groups of people value and need, according to their life course and their communities, our project design will follow up the survey work with in-depth citizens’ workshops. We want to understand how data literacies can assist people in multiple avenues of their lives, beyond work and consumerism, and to highlight the collective potential of literacies for their communities.

But when we talk about thinking beyond the individual we are also talking about the need to expand the responsibility of training literacy beyond individual people. As Monica Bulger and Patrick Davison (2018) argue, media literacy is often broadly defined as a set of individual skills which promotes a critical engagement with different media messages. One of the shortcomings is that media literacy training mainly focuses on individual responsibility rather than questioning the role of the community, state, institutions or technology companies. As a result, media literacy interventions suffer:

… from issues plaguing education generally; primarily, the longitudinal nature of media literacy creates difficulty in evaluating the success of particular training initiatives. Across education, a diversity of goals leads to incoherent expectations of outcomes, making decisions about what is measured, how, and why very important (Bulger and Davison, 2018, p. 16).

As Bulger and Davison point out, education is not about one short-term programme, but a longitudinal project. This is especially true considering the fast-changing nature of media. Therefore, in the current landscape of short-term literacy and education programmes it is difficult to assess whether they work in the long run. It is also difficult to know who is accountable for the way these programmes are implemented. In addition, literacy programmes are usually aimed at young people, who, unlike adults, are attached to institutions (like schools). If the responsibility to be literate lies with individuals, then we can expect socio-economic inequalities to influence their access to, and resources for, such education programmes.

In our research we wanted to zoom in not only on the practices of sharing things on social media, but also on how people engage in relation to other people and in multiple environments. Instead of portraying people as using digital systems and apps as isolated individuals, we wanted to situate them in their social networks – their families, friends, communities, neighbourhoods and other networks. We call this ‘networks of literacy’, meaning how, where and with which media people engage with others to gain understanding, skills and competencies in a way that fits them. We see these networks operating across all aspects of our data citizenship model, but we view them as most evident in the way people engage with data (what we call data doing) and especially in the way they proactively create new things and collaborations with data (what we call data participation).

We asked a set of questions focused on respondents’ digital media participation and their interactions or relations with others, for example:

  • ‘Have you ever used internet search during a conversation with your friends or family to verify information that you discuss? (“let’s Google this…”)’
  • ‘Have you ever encouraged/taught others how to stay safe online (e.g. by showing them the privacy settings of software tools such as virus checkers)?’
  • ‘Have you ever encouraged others to fact-check? (e.g. by conducting other searches or using other media)’
  • ‘Have you ever helped others to protect their personal data online?’

What our data show is that these practices differ according to user type: while our extensive political users participate the most in different settings, those from lower educational and socio-economic backgrounds are less likely to encourage people to stay safe, to fact-check or to help others protect themselves online. These insights can help us understand who the proxy points are – the people most likely to be approached for assistance – and how we can make interventions in those social spaces. Nevertheless, in order to get a better understanding of the way people use specific technologies and how they interact with others, we will, in subsequent phases of our research project, meet citizen groups to discuss with them the everyday ‘data day’ that they have. With this approach we also want to avoid developing literacies from above rather than together with people and in ways that make sense in their lives.

Critical understanding of the digital ecosystem

Another important aspect that is unique to our view of data literacy is critical understanding and thinking about the digital ecosystem. Of course, scholars were advocating for critical thinking as part of all forms of literacy before the datafication of our everyday lives. For example, for Tibor Koltay (2011), media literacy describes being able to access media and to critically understand, produce, and negotiate meanings in a culture of images, words and sounds. Koltay argues that there are five levels of media literacy:

  1. Actively using media while feeling comfortable with all types of media.
  2. Having a critical approach to quality and accuracy of content.
  3. Using media creatively.
  4. Understanding media economy.
  5. Awareness of copyright issues.

Here we can see that being ‘critical’ towards content is important, but it is not so clear what ‘critical’ specifically means. It clearly includes what citizens need to do in order to verify and counter problematic content, such as dis-/mis-/mal-information, but it does not include being critical towards platforms’ interface design. However, we argue that understanding the digital economy, including how algorithms work and who funds social media platforms, is a key issue in regard to data literacies. In the context of the broader question of understanding the media economy of digital systems, the UK telecoms regulator Ofcom (2019) has conducted annual surveys of the current state of UK adults’ (16+) media use. Recent Ofcom data (Ofcom, 2019) show that the proportion of citizens who can correctly identify the main source of income for broadcast TV drops from 80% for the BBC (public service) to 53% for subscription services and lower still for YouTube (44%). These figures have remained fairly stable since 2005. In addition, and unchanged from 2017, six in ten internet users recognise that, in terms of personalised advertising, some people might see different adverts to the ones they see. As Ofcom emphasises, there are differences by socio-economic status when it comes to awareness of funding and online advertising. This points to the challenge that citizens, especially marginalised ones, face in understanding media economies and how these shape what they engage with on platforms and how. Therefore, critically understanding media economics and ecosystems is important for knowing how social media are sponsored and how that may affect the way algorithms order content as well as tempo-spatial relations between people (Carmi, 2020a).

Similarly, a report by the UK NGO doteveryone (Miller, Coldicutt, & Kitcher, 2018) argues that there are measurements of digital skills (such as those noted above) but not of deeper understanding of digital media and systems. Therefore, doteveryone has developed a model that defines what digital understanding means for citizens in practical ways that they can recognise in their own lives. It splits the types of understanding into the main roles people take in their communities: individual, consumer, worker and member of society. The model shows how people can move from basic awareness to deeper questioning of the implications of technologies in each part of their lives as they need it. As members of society, citizens need to understand how to use the internet to become a part of the public sphere, which doteveryone argues includes being aware of the role of the internet in civic and political life, thinking critically about the trustworthiness of information, knowing about filter bubbles and their impact, and being aware of their legal rights online (Miller, Coldicutt, & Kitcher, 2018).

Following these approaches to critical thinking, we wanted to expand on this dimension further in our data citizenship model and call it data thinking. Examples of types of questions that we asked in our survey under this domain were:

  • Which, if any, of the following information do you believe that a company like Google, Amazon or Facebook collects about its users? (financial situation, health and wellbeing, their friends and family, location, what they do on social media etc.);
  • In your opinion which, if any, of these reasons apply as to why companies like Google, Amazon or Facebook might collect information about users? (targeted advertising, selling users’ data to other companies, tailor prices for products and services, personalize their experience when using a website/app etc.);
  • Thinking generally, when you find factual information online, perhaps on search engines like Google, do you ever think about whether the information you find is truthful?

Interestingly, we found that people (in all of our user types) do not want to be tracked over time. They think that platforms like Facebook and Google do not make it easy to change privacy settings, and they do not want to share their data with these companies in exchange for a free service. However, when asked if they think it is acceptable for these companies to personalise their experience through apps and websites, there is around 50% agreement across the user groups. This is a clear indication that people do not understand the online ecosystem: how these companies produce profiles and segmentation and then trade their data through different brokers to provide their ‘free’ services (Carmi, 2020b). Educating people on how these ecosystems work can help them better understand these connections and object to these practices (Worledge & Bamford, 2019). But to be able to properly oppose these asymmetric power structures people need to be proactive, which leads us to the next big gap in literacies.

Proactive citizens and not passive consumers

There has been a distinct shift in the debate around media literacy in the context of digital systems, with a focus on two issues: skills and a more “mechanical or technical” understanding of the media economy of digital systems. There may be an obvious reason for this new focus: digital media are much more interactive and technically varied than traditional broadcast media and require such things as media production (posting on social media) and security skills (such as changing privacy settings) – skills that are proactive digital and data literacy practices. However, echoing again the ideological aspects of literacy we discussed above, neither policy nor education programmes have been developed around citizens’ proactive skills to protest, object, unionise and take other collective action on civic issues.

A good example of how digital skills education programmes were aimed at keeping citizens passive rather than proactive is the European Union’s Safer Internet Programmes, which ran from 1999 until 2013. As Elinor Carmi (2020b) illustrates in her analysis of these programmes, citizens were taught to report harmful content and to avoid actions that could harm the protection of reputation and intellectual property. However, teaching citizens how the internet works, how to encrypt their communication or how to use more privacy-friendly services was never part of these programmes. Nor were citizens ever taught about the laws that they can use to object, protest or negotiate things on the internet. As Carmi argues, these:

[E]ducational programs have helped to cement and institutionalize EU citizens’ roles as consumers and products in the online market territory. Although framed as ‘safety’ education for people, the material that EU citizens were taught was mainly about maintaining the safety of all the organizations that create, manage, and control the internet: governments, copyright holders (of various types of content), ISPs, publishers, digital advertisers, browsers and others (Carmi, 2020b, p. 163).

This is a clear example of how digital and data literacy programmes have an ideological component. Just as with written literacy, key skills and competencies are often tied to economic need (Street, 1984) or to the valuing of new media forms and their social status (Hoggart, 1957). Therefore, literacies with a critical element need to address skills and thinking that can provide citizens with tools to shape, object to and protest against their datafied realities.

To reflect the idea of proactive and potentially critical digital literacy, we sought a better understanding of how citizens participate. In our data citizenship model we call these activities data participation. This dimension focuses on how citizens participate, and especially on the connections between practices which integrate online and offline activities and on how these inform each other. Data participation therefore helps us to examine the collective and interconnected nature of the data society. Through data participation citizens can seek opportunities to exercise their rights and to contribute to and shape their collective data experiences. Our survey therefore asked how often respondents had engaged in examples of data participation. These might include actively contributing to online forums, using open data for the benefit of one’s community, helping others to set up a secure password, engaging in privacy or dis-/mis-/mal-information debates, showing people how to fact-check things online or creating an online campaign around a specific cause.

Although our extensive political users did show more proactive practices with their data in relation to others, none of our user types show evidence of deep engagement with data as part of their personal and civic lives. We will focus further on these particular practices in the focus groups, but what this indicates is that people mostly do not know about proactive options, do not know how to actually use them, or choose not to engage in more proactive activities around civic action.

Conclusion - where next?

In this paper we examined what ideas of literacy mean in the age of dis-/mis-/mal-information – especially critical digital and data literacy. We did this by building on the work of our Me and My Big Data project to examine two fields which have been discussing these issues in parallel but not always together:

• Digital and data literacy

• Dis-/mis-/mal-information

Our project literature review and secondary data analysis work helped us to identify what we view as several key gaps. From this we built a model of data citizenship and of data literacy (Pawluczuk et al., 2020). We then used this model to develop a nationally representative survey of UK citizens. Our main focus in this paper was to further explore our idea of data literacy in the context of dis-/mis-/mal-information so as to:

  • Examine the ideological aspects (after Street, 1984) of viewing digital and data literacy policy and theory in terms of individual skills.
  • Highlight the networked and contextual nature of citizens’ everyday digital and data literacies.
  • Highlight the need for citizens’ digital and data literacies to include a critical understanding of the economy and ‘ecologies’ of digital platforms.
  • Emphasise the need for proactive participation of citizens in these digital platform economies and ecologies – especially the potential for critical participation.

We have also indicated how we sought to translate these data citizenship and data literacy models into a survey tool. In addition, we highlighted how different user types have different literacy practices and understandings and hence should have different education programmes tailored to them. We believe that exposing such challenges can help other researchers who are trying to understand what to ask, how to measure it and how to phrase it. We hope this paper is a starting point for bridging fields of study which have become critically dependent on one another over the last few years.

Literacies – written, media and datafied – keep on changing, so unpicking the new from the old and the changing from the consistent is key at any historical point. With the rise of persistently high levels of dis-/mis-/mal-information online, we argue that there needs to be an emphasis on evaluation and critical understanding of media, its design, and especially its political economy. Importantly, the history of literacy and media literacy research makes clear that any understandings and interventions need to address different social contexts, especially marginalised ones such as lower socio-economic status, disabled, elderly and racialised communities. We also need to take on Bulger and Davison’s (2018) point that this is not just about individuals but that other social, government and industry institutions should be a part of this process. This means that such education programmes should run on an ongoing basis to tackle emerging media changes and ensure that people from various backgrounds have appropriate access and resources to gain such literacies. Importantly, a focus on individual skills and technologies narrows the understanding of digital literacy and hence the development of proactive activities which suit people’s everyday needs.

Nevertheless, and as we mentioned above, we believe that surveys of skills and practices can be highly informative, but they are not enough on their own. To get a better understanding of how people engage with data and what it means for their lives and communities, there is a need for qualitative methods such as citizen workshops. It is crucial to understand how different groups of citizens engage with content and how they learn to use and understand different devices and applications. In particular, we believe it is important to provide people with the tools and knowledge to engage in more critical understanding and action so that they can fully enjoy their agency and civic participation. What we want to ask citizens is how they establish networks of literacy, support systems and systems of trust with their peers and communities to develop their digital skills. A key component of this is to explore how and to what extent people’s networks can help (or hinder) them in developing the critical data literacy skills they need to evaluate and counter digital dis-/mis-/mal-information and content.

We especially want to expand the understanding of digital literacy beyond the ‘digital’ realm and see it as a more holistic and networked experience that involves practices conducted in various spheres, some online and some offline, as these cannot be separated. We also want to emphasise that these insights and skills should be developed locally (instead of trying to ‘scale up’), since citizens in different places in the world have different considerations, backgrounds, understandings and resources. There should not be a one-set-of-skills-fits-all sort of approach. Nevertheless, scholars and practitioners would benefit from sharing their findings and insights and from building a central hub where these skills can be shared, which NGOs, governments and journalists could also access.

Acknowledgement

We would like to thank the academic editor Francesca Musiani and the reviewers for their valuable input.

References

Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of economic perspectives, 31(2), 211–236. https://doi.org/10.1257/jep.31.2.211

Arora, P. (2019). The next billion users: Digital life beyond the West. Harvard University Press.

Aufderheide, P. (1993). Media Literacy: A Report of the National Leadership Conference on Media Literacy. Aspen Institute, Communications and Society Program. https://files.eric.ed.gov/fulltext/ED365294.pdf

Bakir, V., & McStay, A. (2018). Fake news and the economy of emotions: Problems, causes, solutions. Digital journalism, 6(2), 154–175. https://doi.org/10.1080/21670811.2017.1345645

Barberá, P., Tucker, J. A., Guess, A., Vaccari, C., Siegel, A., Sanovich, S., Stukal, D., & Nyhan, B. (2018). Social media, political polarization, and political disinformation: A review of the scientific literature. Hewlett Foundation. http://eprints.lse.ac.uk/id/eprint/87402

Bawden, D. (2001). Information and digital literacies: a review of concepts. Journal of documentation, 57(2), 218–259. https://doi.org/10.1108/eum0000000007083

Bawden, D. (2008). Origins and concepts of digital literacy. In C. Lankshear & M. Knobel (Eds.), Digital literacies: Concepts, policies and practices (pp. 17–32). Peter Lang.

BBC. (2020). Coronavirus and ibuprofen: Separating fact from fiction. BBC News. https://www.bbc.co.uk/news/51929628

Bennett, W. L., & Livingston, S. (2018). The disinformation order: Disruptive communication and the decline of democratic institutions. European Journal of Communication, 33(2), 122–139. https://doi.org/10.1177/0267323118760317

Bittman, L. (1990). The use of disinformation by democracies. International Journal of Intelligence and Counter Intelligence, 4(2), 243–261. https://doi.org/10.1080/08850609008435142

Bulger, M., & Davison, P. (2018). The promises, challenges and futures of media literacy. Data and Society Research Institute. https://datasociety.net/library/the-promises-challenges-and-futures-of-media-literacy/

Bradshaw, S., & Howard, P. N. (2018). The Global Organization of Social Media Disinformation Campaigns. Journal of International Affairs, 71(1.5), 23–32. https://www.jstor.org/stable/26508115

Cadwalladr, C. (2017). The great British Brexit robbery: how our democracy was hijacked. The Guardian. https://www.theguardian.com/technology/2017/may/07/the-great-british-brexit-robbery-hijacked-democracy

Carmi, E. (2020a). Rhythmedia: A Study of Facebook Immune System. Theory, Culture & Society. https://doi.org/10.1177/0263276420917466

Carmi, E. (2020b). Media Distortions: Understanding the Power Behind Spam, Noise and Other Deviant Media. Peter Lang.

Collard, A. S., De Smedt, T., Dufrasne, M., Fastrez, P., Ligurgo, V., Patriarche, G., & Philippette, T. (2017). Digital media literacy in the workplace: a model combining compliance and inventivity. Italian Journal of Sociology of Education, 9(1), 122–154. https://doi.org/10.14658/pupj-ijse-2017-1-7

Nguyen, P., & Solomon, L. (2018). Consumer data and the digital economy - Emerging issues in data collection, use and sharing. Consumer Policy Research Centre. https://cprc.org.au/wp-content/uploads/Full_Data_Report_A4_FIN.pdf

Crawford, K. (2009). Following you: Disciplines of listening in social media. Continuum, 23(4), 525–535. https://doi.org/10.1080/10304310903003270

Davies, H. C. (2018). Learning to Google: Understanding classed and gendered practices when young people use the Internet for research. New Media & Society, 20(8), 2764–2780. https://doi.org/10.1177/1461444817732326

Digital, Culture, Media and Sport Committee. (2019). Disinformation and ‘fake news’: Final Report (Session 2017-19 Report No. 8). House of Commons.

de Montjoye, Y.-A., Hidalgo, C. A., Verleysen, M., & Blondel, V. D. (2013). Unique in the crowd: The privacy bounds of human mobility. Scientific Reports, 3. https://doi.org/10.1038/srep01376

de Montjoye, Y.-A., Radaelli, L., & Singh, V. K. (2015). Unique in the shopping mall: On the reidentifiability of credit card metadata. Science, 347(6221), 536–539. https://doi.org/10.1126/science.1256297

Doyle, C. S. (1994). Information literacy in an information society: A concept for the information age. Diane Publishing.

European Commission. (2018a). Tackling online disinformation: a European Approach (COM(2018) 236 final). https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52018DC0236

European Commission. (2018b). Action Plan against Disinformation (JOIN(2018) 36 final). https://ec.europa.eu/digital-single-market/en/news/action-plan-against-disinformation.

European Commission. (2019). Tackling online disinformation. https://ec.europa.eu/digital-single-market/en/tackling-online-disinformation

Eshet, Y. (2004). Digital literacy: A conceptual framework for survival skills in the digital era. Journal of educational multimedia and hypermedia, 13(1), 93–106.

Evangelista, R., and Bruno, F. (2019). WhatsApp and political instability in Brazil: targeted messages and political radicalization. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1434

Faris, R., Roberts, H., Etling, B., Bourassa, N., Zuckerman, E., & Benkler, Y. (2017). Partisanship, propaganda, and disinformation: Online media and the 2016 US presidential election. Berkman Klein Center for Internet & Society. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33759251

Ferrara, E. (2017). Disinformation and social bot operations in the run up to the 2017 French presidential election. First Monday, 22(8). https://doi.org/10.5210/fm.v22i8.8005

Finnegan, R. (1989). Communication and technology. Language & Communication, 9(2-3), 107–127. https://doi.org/10.1016/0271-5309(89)90013-X

Fisher, M., & Taub, A. (2019). How YouTube Radicalized Brazil. New York Times. https://www.nytimes.com/2019/08/11/world/americas/youtube-brazil.html

Forbrukerrådet (2018). Deceived by Design: How tech companies use dark patterns to discourage us from exercising our rights to privacy. https://fil.forbrukerradet.no/wp-content/uploads/2018/06/2018-06-27-deceived-by-design-final.pdf

Gil de Zúñiga, H., & Diehl, T. (2019). News finds me perception and democracy: Effects on political knowledge, political interest, and voting. New Media & Society, 21(6), 1253–1271. https://doi.org/10.1177/1461444818817548

Glister, P. (1997). Digital literacy. Wiley Computer Pub.

Guess, A., Nyhan, B., & Reifler, J. (2018). Selective exposure to misinformation: Evidence from the consumption of fake news during the 2016 US presidential campaign.

Gummer, E., & Mandinach, E. (2015). Building a Conceptual Framework for Data Literacy. Teachers College Record, 117(4).

Helsper, E. J., & van Deursen, A. J. A. M. (2015). Digital skills in Europe: Research and policy. In K. Andreasson (Ed.), Digital divides: The new challenges and opportunities of e-inclusion (pp. 125–148). CRC Press.

Hoggart, R. (1957). The uses of literacy: Changing patterns in English mass culture. Essential Books.

Kates, S. (1998). A qualitative exploration into voters' ethical perceptions of political advertising: Discourse, disinformation, and moral boundaries. Journal of Business Ethics, 17(16), 1871–1885. https://doi.org/10.1023/A:1005796113389

Kellner, D., & Share, J. (2005). Toward critical media literacy: Core concepts, debates, organizations, and policy. Discourse: Studies in the cultural politics of education, 26(3), 369–386. https://doi.org/10.1080/01596300500200169

Kumar, S., West, R., & Leskovec, J. (2016). Disinformation on the Web: Impact, Characteristics, and Detection of Wikipedia Hoaxes. Proceedings of the 25th International Conference on World Wide Web, 591–602. https://doi.org/10.1145/2872427.2883085

Kux, D. (1985). Soviet Active Measures and Disinformation: Overview and Assessment. Parameters, 15(4), 19–28.

Isaac, M., & Roose, M. (2018, October 19). Disinformation Spreads on WhatsApp Ahead of Brazilian Election. New York Times. https://www.nytimes.com/2018/10/19/technology/whatsapp-brazil-presidential-election.html

Loftus, E. F., & Hoffman, H. G. (1989). Misinformation and memory: The creation of new memories. Journal of Experimental Psychology: General, 118(1), 100–104. https://doi.org/10.1037/0096-3445.118.1.100

Ofcom. (2019). Adults: Media use and attitudes report [Report]. https://www.ofcom.org.uk/__data/assets/pdf_file/0021/149124/adults-media-use-and-attitudes-report.pdf.

Livingstone, S. (2004). Media literacy and the challenge of new information and communication technologies. The communication review, 7(1), 3–14. https://doi.org/10.1080/10714420490280152

Madden, M. (2017). Privacy, Security, and Digital Inequality: How Technology Experiences and Resources Vary by Socioeconomic Status, Race, and Ethnicity. Data & Society Research Institute. https://datasociety.net/library/privacy-security-and-digital-inequality/

Marchi, R. (2012). With Facebook, blogs, and fake news, teens reject journalistic “objectivity”. Journal of Communication Inquiry, 36(3), 246–262. https://doi.org/10.1177/0196859912458700

Marwick, A., & Lewis, R. (2017). Media manipulation and disinformation online. Data & Society Research Institute. https://datasociety.net/library/media-manipulation-and-disinfo-online/

Miller, C., Coldicutt, R., & Kitcher, H. (2018). People, Power and Technology: The 2018 Digital Understanding Report. Doteveryone. http://understanding.doteveryone.org.uk/

Morgan, S. (2018). Fake news, disinformation, manipulation and online tactics to undermine democracy. Journal of Cyber Policy3(1), 39–43. https://doi.org/10.1080/23738871.2018.1462395

Mozilla Foundation. (2019). Internet Health Report 2019. transcript verlag. https://internethealthreport.org/2019/

Nouwens, M., Liccardi, I., Veale. M., Karger, D., & Kagal, L. (2020). Dark Patterns after the GDPR: Scraping Consent Pop-ups and Demonstrating their Influence. CHI Conference on Human Factors in Computing Systems, Honolulu. https://arxiv.org/abs/2001.02479

Pangrazio, L., & Selwyn, N. (2019). ‘Personal data literacies’: A critical literacies approach to enhancing understandings of personal digital data. New Media & Society, 21(2), 419–437. https://doi.org/10.1177/1461444818799523

Pawluczuk, A., Yates, S., Carmi, E., Lockley, E., & Wessels, B. (In Press). Developing Citizen's Data Citizenship in the Age of Disinformation. In International Telecommunication Union, Digital Skills Insights 2020.

Pierson, D. (2017, July 9). Newspapers want to team up against the ‘inexorable threat’ from Google and Facebook. Will the government let them?. LATimes. https://www.latimes.com/business/la-fi-tn-newspapers-facebook-google-20170709-story.html

Vargo, C. J., Guo, L., & Amazeen, M. A. (2018). The agenda-setting power of fake news: A big data analysis of the online media landscape from 2014 to 2016. New Media & Society, 20(5), 2028–2049. https://doi.org/10.1177/1461444817712086

Snyder, A. A. (1997). Warriors of disinformation: American propaganda, Soviet lies, and the winning of the Cold War: an insider's account. Arcade Publishing.

Špiranec, S., Kos, D., & George, M. (2019). Searching for critical dimensions in data literacy. Information Research, 24(4). http://informationr.net/ir/24-4/colis/colis1922.html

Street, B. V. (1984). Literacy in theory and practice. Cambridge University Press.

Street, B. V. (2005). At last: Recent applications of new literacy studies in educational contexts. Research in the Teaching of English, 39(4), 417–423. https://www.jstor.org/stable/40171646

Wardle, C., & Derakhshan, H. (2017). Information disorder: Toward an interdisciplinary framework for research and policy making (Council of Europe Report DGI(2017)09). Council of Europe. https://rm.coe.int/information-disorder-toward-an-interdisciplinary-framework-for-researc/168076277c

van Deursen, A. J. A. M., Helsper, E. J., & Eynon, R. (2016). Development and validation of the Internet Skills Scale (ISS). Information, Communication & Society, 19(6), 804–823. https://doi.org/10.1080/1369118X.2015.1078834

Yates, S., Kirby, J., & Lockley, E. (2015). Digital media use: Differences and inequalities in relation to class and age. Sociological Research Online, 20(4), 71–91. https://doi.org/10.5153/sro.3751

Yates, S., Carmi, E., Pawluczuk, A., Wessels, B., Lockley, E., & Gangneux, J. (2020). Understanding citizens data literacy: thinking, doing & participating with our data (Me & My Big Data Report 2020). Me and My Big Data project, University of Liverpool. https://www.liverpool.ac.uk/humanities-and-social-sciences/research/research-themes/centre-for-digital-humanities/projects/big-data/publications/

Yates, S., Carmi, E., Lockley, E., Pawluczuk, A., French, T., & Vincent, S. (In Press). Who are the limited users of digital systems and media? An examination of UK evidence. First Monday.

Waterson, J. (2020, April 3). Broadband engineers threatened due to 5G coronavirus conspiracies. The Guardian. https://www.theguardian.com/technology/2020/apr/03/broadband-engineers-threatened-due-to-5g-coronavirus-conspiracies

Worledge, M., & Bamford, M. (2019). Adtech: Market Research Report. Information Commissioner’s Office; Ofcom. https://www.ofcom.org.uk/__data/assets/pdf_file/0023/141683/ico-adtech-research.pdf

Digital inclusion and well-being


This commentary is part of Digital inclusion and data literacy, a special issue of Internet Policy Review guest-edited by Elinor Carmi and Simeon J. Yates.

At the Carnegie UK Trust, a charitable foundation based in Scotland and operating across the UK and Ireland, we have been working for more than 100 years to improve well-being for individuals, community and society.

Our founding deed, set by the Scottish-American philanthropist Andrew Carnegie, gave our organisation the mandate to reinterpret our broad mission over the passage of time, to respond accordingly to the most pressing issues of the day. In 2020, it is inevitable that any intervention on policy or practice designed to improve well-being should include a consideration of the role and effect of digital technology.

The impact of technology on all aspects of our individual lives, society and the economy during the past thirty years is well-documented. The benefits for many people have been substantial, but there are also significant challenges. The current COVID-19 crisis is bringing both the upsides and the downsides to the fore.

Through our ‘Digital Futures’ programme at Carnegie we can identify a number of clear themes on the relationship between digital inclusion and different aspects of well-being.

  1. Digital inclusion is changing. As digital becomes ever more central to how we live our daily lives, the range of benefits that we might wish to maximise from our engagement with technology, and the range of risks that we need to be aware of and mitigate, grows rapidly. It is very unlikely that any one of us is making use of digital to its full potential, while also doing all that we possibly can to minimise the associated risks of digital engagement. In the past, there has been a perceived dichotomy between those who are ‘digitally included’ and those who are not. Today, it makes much more sense to think of digital inclusion as a spectrum – or a series of spectrums – upon which we each have our own individual place. We may be skilled in some areas, less so in others. To paraphrase a well-known quote from Joe Kraus, the founder of Excite, we might think of digital inclusion as previously consisting of dozens of markets of millions of citizens; whereas now, it is millions of markets of dozens.
  2. The key question for public policy and practice is to determine when and where it should intervene in this arena. To do this, we need to ask, who could benefit the most from being more digitally included; and who risks experiencing the greatest harm from the nature of, or lack of, digital engagement that they currently have?
  3. Within this context, it is important to give due attention to groups who many may not realise have digital inclusion needs. At Carnegie, our #NotWithoutMe project has worked over a number of years with organisations across the UK that are supporting at-risk children and young people to develop their digital skills. There has often been a tendency to regard those in these age groups as ‘digital natives’, for whom digital skills are somehow inherent. This is far from the case, and indeed the consequences of not being able to maximise the benefits of technology – and of being exposed to the harms that it can bring – can be particularly significant for young people. Two key points of learning from this work are that even our self-perceptions of our levels of digital understanding can be misleading; and that a sustained improvement in digital inclusion in any cohort is only likely to be achieved if the digital skills of support networks are also invested in.
  4. More broadly, it is imperative that we recognise that there remains a clear social justice dimension in access to and use of digital technology. There is substantial evidence which demonstrates that those most likely to be disadvantaged digitally are also more likely to be disadvantaged according to a range of social or economic measures. What does this digital disadvantage look like? It may encompass lack of connectivity; no access to an appropriate device, or not enough devices; reliance on less flexible or more expensive payment models; limited or narrow digital engagement; or greater risk of exposure to different types of harm online. In this context, there is a significant risk that technology is deepening existing inequalities in society. This contrasts with a commonly held public perception about the internet age, in which digital technology has often been regarded as a route to breaking down traditional hierarchies, barriers and divides. The evidence of the digital divide has perhaps never been so stark as it is right now: the lockdowns in place to tackle COVID-19 have exposed the importance of digital inclusion and the need to urgently tackle significant digital inequalities.
  5. There are a range of well-honed approaches to supporting improved digital inclusion. Networks are often vitally important – people learning from and supporting each other. The current crisis presents new challenges to this mode of learning. Focusing on personal hooks, identity and connections is another well-established approach. Recognising that barriers to improved digital inclusion, and the vulnerabilities that people experience, are multi-layered is essential.
  6. The divide between the ‘digital world’ and the ‘offline world’ is increasingly blurry, with many people simultaneously engaging with a wide range of activities and services online and offline. It is important that any discussion of digital inclusion avoids false dichotomies and supports action that enables people to effectively navigate their digital and physical lives so that each enhances the other and improves well-being. Once again, the COVID-19 crisis has made this consideration an even more pressing, live issue that will need serious consideration and action over the coming weeks and months.
  7. Advancing digital inclusion requires, more than ever, a focus on much more than individual access, skills, confidence and motivation – as critical as these factors are. Much of the growth of the digital sphere during the past 30 years has been driven by private enterprise. Until recently, it has felt like the burden for working out how to engage with digital markets and platforms in a safe and effective way has predominantly fallen on individuals. In recent years however, there has been a growing recognition of some of the challenges and risks associated with the way in which these systems have developed; and an understanding that a much wider range of public policy interventions are likely to be required to ensure that digital can deliver positive well-being outcomes for all citizens. It has also become increasingly apparent that the asymmetric power dynamic between large, global providers and individual citizens makes it difficult to organise a common user interest. In this context, coordinated public policy action at a system level, to ensure that digital inclusion really does deliver well-being benefits, is particularly important.
  8. The significance of public services to supporting well-being is well-understood. Digital technology has presented new opportunities to reimagine the way in which these services are designed and delivered, to become faster, more convenient, more flexible and more responsive. Services which are predominantly transactional in nature have been more advanced in their roll out, but highly effective, responsive, relational digital public services have – unsurprisingly – been slower to emerge. The future of digital inclusion may increasingly look at our ability to design, deliver and engage with such relational services. Again, the COVID-19 crisis has highlighted the value of these type of services – and perhaps may lead to an acceleration of progress towards this type of digital public service development.
  9. Another issue which has received much attention during the current crisis is the question of data and how this is used, stored and shared for public good. While a key focus has obviously been on the role that data might play in helping to tackle COVID-19, much of the discussion that underpins this debate is our common understanding of how data is captured and processed. If data is to become even more important in the years to come, it is essential that improved understanding is at the heart of digital inclusion.
  10. If digital inclusion is to help deliver truly transformative well-being benefits then we need high quality, robust evidence to support these outcomes. This type of evidence is not necessarily straightforward to capture. This may be partly because it is often still too soon for the longitudinal benefits – or downsides – of digital services to be assessed, but also because isolating the positive effects of any digital intervention from other factors is not easy. However, if digital inclusion is to prove its worth in the COVID-19 and post-COVID-19 world, then it will be essential to give proper attention to measuring and understanding its value, and to iterating and adapting interventions based on this evidence.
  11. Finally, when reflecting on the relationship between digital inclusion and well-being, we must be cognisant of the fact that change in the digital sphere is rapid and constant. The issues where we need to take action to maximise benefit and mitigate risk can emerge quickly and require policy and practice to pivot accordingly. Our understanding of digital inclusion has developed considerably over time and we should expect this pace to quicken further in the years ahead.

Digital technology had already become fundamental to our individual, community and societal well-being, long before the COVID-19 crisis. The crisis, and the immediate and long-term response to it will extend this even further and more rapidly. Action on digital inclusion - so that everyone can enjoy equally the advantages that technology brings and be protected from harm that it can facilitate - has arguably never been more important and urgent.

Back up: can users sue platforms to reinstate deleted content?


Introduction

Platforms and their terms of service have a decisive impact on freedom of expression and communication online (Suzor, 2018). The private power of platforms is unprecedented and sits uneasily with the primary responsibility and ultimate obligation of states to protect human rights and fundamental freedoms in the digital environment. But states do not only have the negative obligation to refrain from violating the right to freedom of expression and other human rights in the digital environment; they also have the positive obligation to protect human rights. Companies, as the (international) law and practice of social responsibility of transnational corporations (Ruggie, 2008; 2011) demonstrates, have a responsibility not to violate human rights and to offer redress mechanisms when they do. This paper asks whether and in what way this obligation extends especially to social networks and to the reinstatement of user comments that may have been wrongfully deleted. Put concisely: under what circumstances should platforms be forced by courts to reinstate content? We will address this question by looking at Germany and the United States, two jurisdictions that deal with the issue of ‘must carry’ in very different ways.1

Analysing a selection of US and German court cases on the question of reinstatement of accounts and republication of deleted content, we will draw out the differences in constitutional and statutory law and show why they explain some of the divergences. In comparative case studies of US and German courts we will address the following questions: Can users sue platforms to have deleted posts and videos reinstated? Do they have a right to a Facebook or Twitter account? Do platforms have corresponding duties to treat users equally in furnishing these services as long as users do not violate the terms of service or as long as users do not violate local law? We will also point to a larger issue, namely the differences in the treatment of states and private companies as threats to and/or guarantors of fundamental rights between the jurisdictions under review. We will finally show how public and private judicial and quasi-judicial approaches towards reinstatement can interact (Kadri & Klonick, 2019).

Today, a quickly growing share of communication takes place online. Platforms – privately owned communication spaces – have become systemically important for public discourse, itself a key element of a free and democratic society (Hölig & Hasebrink, 2019). The internet has heavily influenced our communicative practices (Kettemann, 2018) and will continue to do so. As the European Court of Human Rights noted in 2015, the internet is ‘one of the principal means by which individuals exercise their right to freedom to receive and impart information and ideas, providing [...] essential tools for participation in activities and discussions concerning political issues and issues of general interest’ (Cengiz v. Turkey, 2015). It plays ‘a particularly important role with respect to the right to freedom of expression’ (Council of Europe, CM/Rec(2018)2, 2018). Due to technological innovation, social media platforms are now able to de facto (and de jure) regulate speech in real time at any time. The platforms do not only set the rules for communication and judge on their application but also moderate, curate, rate and edit content according to their rules. Speaking from a constitutional perspective, they combine the tasks of all three separate powers of states – law-making, adjudication and execution – plus the role of the press (Kadri & Klonick, 2019). As one author put it, ‘platforms [...] engage in intensive legislation, administration of justice and punishment, and develop eclectic governing and legitimation apparatuses consisting of algorithms, proletarian judicial labor and quasi-constitutional governing documents’ (Schwarz, 2020, p. 117).

Most of the research conducted on this issue to date focuses on the situation in the US (Keller, 2019b; Goldmann, 2017). These analyses, however, seem to accept and appreciate the dual systems of remedy (Kadri & Klonick, 2019). In fact, they consider an obligation on platforms to carry all legal speech a potential threat to free speech and to the economic interests of the platforms (Keller, 2019b). We will show that the key to understanding ‘must carry’ is to put a qualifying asterisk to the public/private distinction in law. We will also show that ‘must carry’ obligations need to be understood in the context of the impact platforms have as gatekeepers for discourses, when a growing number of societally relevant debates take place online. Recognising this, platforms, we submit, have to implement a transparent and consistent process of balancing the interests at stake. As their quasi-judicial functions grow, they have to become more judicial.

After a brief analysis of the challenges of regulating online speech between state duties and private obligations, the jurisprudence of US and German courts will be presented. On this basis we proceed with a critical assessment of the horizontal or third-party effects of human and fundamental rights on private contracts and draw conclusions.

Private and public freedom of expression governance

In times of digitality, online communicative spaces have enriched and partially replaced public offline spaces, e.g. town squares, as communicative settings where discourse is relevant for democratic decision-making. This is a challenge for states that continue to have the primary responsibility and ultimate obligation to protect human rights and fundamental freedoms, online just as offline. All regulatory frameworks they introduce, including self- or co-regulatory approaches, have to include effective oversight mechanisms over the companies controlling the private communication spaces and be accompanied by appropriate redress opportunities. However, the normativity inherent in the primary responsibility of states to protect human rights is at odds with the facticity of online communicative practices that are prima facie regulated by the rules of intermediaries through their terms of service, their Hausrecht.

The private sector assumes a distinct role that reveals the specificity of the internet: the vast majority of communicative spaces on the internet are privately held and owned. These intermediaries, including social media companies, today have become important normative actors (Zakon, 2018). They have established largely autonomous legal orders (Kettemann & Schulz, 2020), even if they still form part of the normative order of the internet (Kettemann, in press). Network effects and mergers have led to the domination of the market by a relatively small number of key intermediaries.

Social media companies set the rules for the private public online spaces they control. Some do it via Community Standards (Facebook, 2020; Kettemann & Schulz, 2020), others via their terms of service, while in some jurisdictions2 judges have applied the concept of indirect third-party effect of fundamental rights to online spaces. Social media companies remain – for the foreseeable future – the primary or at least prima facie norm-setters regarding online communicative spaces. Understanding the theory and practice of the private norm-setting process is thus essential. Intermediaries set the rules for communication and thereby define what they understand as ‘desirable communication’. TikTok offers an example of how far this can go. In order – nominally – to avoid cyberbullying, TikTok would flag individuals with specific features such as ‘facial disfigurement, autism, Down syndrome’, and ‘[d]isabled people or people with some facial problems such as birthmark, slight squint (…)‘ as vulnerable and would limit their videos from being shown to a wider audience or even block them from appearing in other users’ feeds (Botella, 2019).

Pushing – from the perspective of TikTok – information and communications that are mainstream and monetisable translates into more growth in today's immaterial production environment (Han, 2014, p. 19) – at least in the short run. This creates ‘a cultural system as well as a political system’ (Balkin, 2004, p. 4). The commodity in this system is not just the user, but the content, produced and used (prod-used) by ‘user culture’ (Klonick, 2018, p. 1630; Balkin, 2004, p. 5). This user culture is shaped by the specific rules of the digital platform. These sets of rules have matured and now (frequently) include a specific set of values (Facebook, 2019; Twitter, 2019).

In light of the persuasive power of the UN Guiding Principles on Business and Human Rights and the ‘Protect, Respect and Remedy’ Framework (Ruggie, 2011), intermediaries have started to pledge commitment to human rights-inspired values and principles that have certain self-constitutionalising functions. Facebook’s Oversight Board, for example, will have substantial leeway in framing selected norms that apply to online speech on Facebook’s platform (Facebook, 2019). Facebook has undertaken to implement the Board’s decision ‘to the extent that requests are technically and operationally feasible and consistent with a reasonable allocation of Facebook’s resources’ (Facebook, 2019). Next to authenticity, safety, privacy and dignity, Facebook thus favours voice as the paramount value and states:

The goal of our Community Standards is to create a place for expression and give people voice. Building community and bringing the world closer together depends on people’s ability to share diverse views, experiences, ideas and information. We want people to be able to talk openly about the issues that matter to them, even if some may disagree or find them objectionable. In some cases, we allow content which would otherwise go against our Community Standards – if it is newsworthy and in the public interest. We do this only after weighing the public interest value against the risk of harm, and we look to international human rights standards to make these judgments (Bickert, 2019).

This reliance on ‘newsworthiness’ or ‘public interest’ as criteria to allow content that would otherwise be deleted echoes similar policies at Twitter, which defines the importance of public interest for its network as follows (Twitter, 2019):

Serving the public conversation includes providing the ability for anyone to talk about what matters to them; this can be especially important when engaging with government officials and political figures. By nature of their positions these leaders have outsized influence and sometimes say things that could be considered controversial or invite debate and discussion. A critical function of our service is providing a place where people can openly and publicly respond to their leaders and hold them accountable. With this in mind, there are certain cases where it may be in the public’s interest to have access to certain Tweets, even if they would otherwise be in violation of our rules. (...). We’ll also take steps to make sure the Tweet is not algorithmically elevated on our service, to strike the right balance between enabling free expression, fostering accountability, and reducing the potential harm caused by these Tweets.

However, private (platform) companies still have an overriding interest in creating a hospitable communication environment that fosters and attracts advertisements and business activity (Klonick, 2018, p. 1615). As Hill puts it: ‘social media companies, their transnational nature, and the transnational, risk-averse nature of their advertising stakeholders has created an emphasis on brand safety in media content governance’ (Hill, 2019, p. 2). They depend on the ‘prod-users’ to generate and share information and personal data. On platforms, ‘users are not customers, (…) users are “value creators”’ (Schwarz, 2019, p. 121). Platforms, by offering users social ‘connectedness’ (Van Dijck, 2013, p. 13), turn social interaction and attention into data which are captured and sold to advertisers (Schwarz, 2019, p. 121).

In that way the motivation to safeguard the right of free speech differs significantly from the conviction shared by liberal democratic societies. For a liberal democratic state order the right of free speech is ‘absolutely essential (...), for it alone makes possible the constant intellectual confrontation, the clash of opinions, which is its vital element (…)’. In a certain sense it is considered to be the basis of any freedom or, as the German Federal Constitutional Court (BVerfG) put it: ‘the matrix, the indispensable condition of nearly every other form of freedom (Cardozo)’ (Lüth, 1958). Clashing opinions by definition include negative communication, disruption and content that might not be attractive for advertisers’ brand safety (Hill, 2019, pp. 10, 12).

In an environment that promotes and protects speech only to the degree that speech is still good for business (Citron & Norton, 2011, p. 1454), clashes of opinions will not always be desired and protected in the way they would be in liberal democratic societies. For platforms, ultimately, whether to protect voice – meaning ‘desirable communication’ in the view of the social network services – remains a business decision. By favouring this kind of communication, they have changed the social conditions of regulating quasi-public speech (Balkin, 2004, p. 26).

The statements of Facebook and Twitter cited above matter. They show that social networking services begin to see that merely evaluating content on the basis of their terms of service (and deleting content if it falls foul of a private norm) might lead to unjustified (or unjustifiable) decisions. Take the example of the Napalm girl incident: clearly a picture of an unclothed child is a violation of Facebook’s Community Standards on child nudity. But the picture of a specific unclothed child, namely Phan Thị Kim Phúc, has a special place in history. Deleting it carries a different message, even though this set of values and the commitment to the Ruggie Principles as a ‘social licence to operate’ (Ruggie, 2008) reinforce an international trend to commit platforms to constitutional and human rights principles. Providing access to content – and providing content (including ads) creators with access to customers’ attention – remains the essence of the platforms. Especially in light of potential liability risks, substantiated for example by the fines companies can incur under the German Network Enforcement Act (NetzDG) (Kettemann, 2019) or the EU Code of conduct against illegal hate speech online, which was adopted by Facebook, Microsoft, Twitter and YouTube as early as 2016, platforms will promote ‘desirable communication’ on the platform and moderate content accordingly. Minimising the risk of being held liable for (potentially) illegal content is one of the strong drivers for platforms when it comes to how they draft their rules and how fast they remove content. This preference for speed carries the risk that removal comes at the expense of a thorough assessment of the legality of the content and thus of the rule of law (Coche, 2018, p. 11). In this regard, the ruling of the Court of Justice of the European Union (CJEU) in Glawischnig-Piesczek v. Facebook Ireland Limited is relevant. The CJEU ruled that EU law does not preclude national courts from ordering social network services to seek, identify and delete comments identical to illegal comments, as well as equivalent comments from the same user – globally (Glawischnig v. Facebook, 2019). Since the CJEU chose to follow Advocate General Szpunar’s Advisory Opinion (AG Opinion Glawischnig v. Facebook, 2019) and ruled that the E-Commerce Directive does not preclude a court of a member state from ‘ordering a host provider to remove information which it stores, the content of which is identical to the content of information which was previously declared to be unlawful, or to block access to that information, irrespective of who requested the storage of that information’ (Glawischnig v. Facebook, 2019, para. 53), negative implications for free speech are not unlikely. Legal speech might be caught like ‘dolphins in the [tuna] net’ (Keller, 2019a).

There is some content that companies want, some content that companies put up with, and some content they a) wish to delete or b) legally have to delete. The question now arises how – in the US and the German jurisdiction – courts have dealt with arguments that content which platforms want to delete or have deleted should be reinstated as long as it is not illegal. The choice of these two jurisdictions is not a coincidence; rather, it allows us to approach the issue from two very different angles. On the one hand, the United States understands freedom of speech as freedom from interference by the state. The idea that there is a marketplace of ideas can be regarded as the foundational theoretical basis and rationale for its freedom of speech doctrine (European Parliament Study, 2019, p. 40). In Germany, on the other hand, freedom of expression is regarded as essential for a free and democratic state order (Lüth, 1958) and also needs to be guaranteed by the state (Saunders, 2017, pp. 11, 14).

Through the lens of the ‘must carry’ approach we will now take a closer look at the situation in the United States and Germany and show how ‘must carry’ is sometimes the only way to guarantee effective protection of speech online. We chose the US as the home of the currently leading social networking sites and the jurisdiction with many judgments regarding freedom of speech in private communication spaces. We selected Germany because of its history of strong regulation of network sites, through for instance the Network Enforcement Act, and the courts’ willingness to consider the application of fundamental rights to platforms. Comparing the US and the German approach to the reinstatement of content allows us to highlight the differences.

The United States: private spaces under private rules

In the US, courts have regularly sided with social networks that have blocked user accounts or deleted tweets (Mezey v. Twitter, Cox v. Twitter, Kimbrell v. Twitter). In the 2018 Twitter v. San Francisco case, for instance, the California Court of Appeal confirmed that a service provider’s decision to restrict or make available certain material is expressly covered by section 230 Communications Decency Act (CDA), the clause shielding internet service providers from liability (Twitter v. San Francisco). The court presupposes the existence of ‘must carry’ claims (Keller, 2019b) but shields platforms from them on the basis of section 230 Communications Decency Act (CDA, 1996) and the Digital Millennium Copyright Act (DMCA, 1998), both of which intend to limit the take-down of legal speech (Keller, 2019b). In light of the potential misuses of Sec. 230 by 'bad samaritans' (Citron & Wittes, 2017, p. 409), scholars have developed nuanced approaches for the law's reform (Citron & Franks, 2020, pp. 20-25).

The purpose of this grant of immunity was both to encourage platforms to be ‘Good Samaritans’ and take an active role in removing offensive content, and also to avoid free speech problems of collateral censorship (Zeran v. America Online Inc.). The courts rejected the claims with reference to section 230 CDA in the majority of cases, for example in Mezey v. Twitter Inc., Twitter Inc. v. The Superior Court ex rel Taylor, Williby v. Zuckerberg, Fyk v. Facebook Inc., Murphy v. Twitter, Inc. and Brittain v. Twitter Inc. Arguments brought outside these two statutory regimes were also rejected in court. To date, there has been no successful ‘must carry’ claim against platforms in the US (Keller, 2019b), in contrast to cases against individuals exercising state functions and controlling subspaces within the platforms, such as the comment section under a tweet (e.g., Knight First Amendment Institute v. Trump).

But US jurisprudence has insights to offer into the relationship of private property and public communication goals. Historically, it was booksellers, broadcasters or editors that would put limits on content or speech. According to the United States Supreme Court (SCOTUS), strict liability on their part would lead booksellers ‘to restrict the public’s access to forms of the printed word, which the State could not constitutionally suppress directly’ (Smith v. California; Keller, 2018, p. 17). This argument, too, was therefore rejected in order to protect free speech. In Johnson v. Twitter Inc., the California Superior Court refused to consider Twitter akin to a ‘private shopping mall’ (Pruneyard v. Robins) that was ‘obligated to tolerate protesters’ (Johnson v. Twitter). In Prager v. Google, the Northern California District Court refused to see YouTube as a state actor under the ‘public function’ test, arguing that providing a video sharing platform fulfils neither an exclusive nor a traditional function of the state. The court did not see YouTube as a ‘company town’ (Marsh v. Alabama) either. A claim relying on the ‘company town’ rule, established in Marsh v. Alabama in 1946, would today only succeed if it was brought against a private entity that owns all the property and controls all the functions of an entire (virtual) town (Prager v. Google).

Economic dominance — or dominance in the ‘attention marketplace’ — was not considered to be enough to justify must carry obligations and override the platforms’ own speech rights (First Amendment to the United States Constitution), because the courts do not consider major platforms to control ‘critical pathway[s] of communication’ in the way the cable companies in Turner v. FCC did.

In Manhattan Community Access Corporation (MNN) v. Halleck, the SCOTUS had the chance to weigh in again on the tension between cable operators’ and cable programmers’ First Amendment rights – and, by implication, on the viability of must carry claims for internet platforms. However, in June 2019 the court only ruled on the status of MNN (non-state actor) rather than on whether the actions directly affected free speech. Only the dissenting opinion of Justice Sotomayor in MNN v. Halleck argued that MNN ‘stepped into the City's shoes and thus qualifies as a state actor, subject to the First Amendment like any other.’ Justice Sotomayor also argued that since New York City laws require that public access channels be open to all, MNN assumed responsibility under this law when it took over the public access channels. It did not matter whether the city or a private company runs this public forum, since the city mandated that the channels be open to all.

In fact, US courts have repeatedly held that the platform versus publisher dichotomy is irrelevant in the context of section 230 CDA (Chukwurah v. Google). There is established case law on the notion that immunity under section 230 CDA protects platforms against a variety of claims, most recently confirmed in FAN v. Facebook, Sikhs v. Facebook and Chukwurah v. Google. This includes claims for breach of contract and the implied covenant of good faith (FAN v. Facebook). Courts in the US have continuously rejected the notion that platforms are public fora (Prager v. Google; Ebeid v. Facebook; Buza v. Yahoo! Inc.; Langdon v. Google). In May 2020, in Freedom Watch, Inc., et al v. Google Inc., et al, the U.S. Court of Appeals for the D.C. Circuit, referring to the 2019 SCOTUS decision in MNN v. Halleck, confirmed that ‘the First Amendment prohibits only governmental abridgment of speech (...).’ The judges rejected the argument brought forward by Freedom Watch and held that ‘a private entity who provides a forum for speech is not transformed by that fact alone into a state actor.’ (Freedom Watch, Inc., et al v. Google Inc., 2020, p. 2). Only if a social media account, for instance a Twitter account, is used by a public official ‘as a channel for communicating and interacting with the public about his administration’ and ‘to conduct official business and to interact with the public’ (Knight First Amendment Institute v. Trump) can the interactive space on that account be regarded as a public forum. However, this does not make Twitter itself a public forum. Only a part of Twitter, namely the account which, in the Knight First Amendment Institute case, Donald Trump ‘upon assuming office, repeatedly used (...) as an official vehicle for governance’ with ‘interactive features accessible to the public without limitation’ (ibid.), can be considered a ‘public forum’, with the clear consequence that an exclusion from that space (by blocking users or deleting posts) constitutes unconstitutional viewpoint discrimination (ibid., p. 23).

Even though there have been more decisions in similar settings (Morris & Sarapin, 2020, p. 11) supporting this line of argumentation, Pruneyard v. Robins remains an exception and the closest a US case has come to a third-party effect of fundamental rights. In that case the SCOTUS confirmed the Californian Supreme Court's decision and thereby the plaintiffs’ rights under the California Constitution to enter a Silicon Valley shopping mall to distribute leaflets. Plaintiffs suing today’s platforms argue that the platforms fulfill the public forum function at least as much as shopping malls ever did and, in consequence, must tolerate unwanted speech. In Pruneyard v. Robins, SCOTUS held that a shopping mall owner’s own autonomy and communication power were not undermined by leafleteers’ presence on its premises (Pruneyard v. Robins). In Hurley, it held that to ‘require private citizens who organize a parade to include among the marchers a group imparting a message the organizers do not wish to convey [...] violates the First Amendment’ (Hurley v. Irish Am. GLIB Ass., 1995, p. 559). In the US, therefore, what is taken down, stays down. The situation in Germany is different.

Public law in private spaces: German jurisprudence

Since 2018 German civil courts have decided a number of ‘put-back’ cases, arising from deletions by social media companies (especially Facebook and Twitter), in favour of the plaintiffs. The judgments of the civil courts are taken against the background of a specific understanding of the public sphere shaped by Germany’s highest court for constitutional questions, including the protection of fundamental rights: the German Federal Constitutional Court (BVerfG). After taking a closer look at the BVerfG’s past decisions on private gatekeepers, we will examine how this understanding has now been transferred into the digital sphere with the preliminary decision delivered by the BVerfG in the case ‘Der III. Weg’ in 2019.

In one of its landmark decisions, Fraport in 2011, the BVerfG considered that, depending on the ‘guaranteed scope [of the fundamental right] (Gewährleistungsinhalt) and the case’, the ‘indirect fundamental rights obligation of private parties (…) can come close or even be close to a fundamental rights obligation of the state’ if the private actor has ‘already taken over the provision of the framework conditions of public communication (…)’ (Fraport, 2011, para. 59). This is a nuancing of the doctrinal concept of indirect third-party effect of fundamental rights (mittelbare Drittwirkung der Grundrechte), which was developed in 1958 (Lüth, 1958). However, since more than 50% of the shares of Fraport AG were held by public shareholders, the BVerfG found that in this case fundamental rights applied directly. It was left open to what extent the indirect third-party effect of fundamental rights applied ‘to materially private companies that open up public services and thus create places of general communication with regard to freedom of assembly or freedom of expression’ (Fraport, 2011, para. 59). In its Bierdosen-Flashmob decision of 2015, the BVerfG confirmed this reasoning. Three years later, in Stadionverbot, the BVerfG applied the doctrine of indirect third-party effect of fundamental rights (mittelbare Drittwirkung) and found that, according to the principle of equal treatment (Art. 3 Basic Law (GG)), a ban on (suspected) hooligans and other potentially violent soccer fans must ‘not [be] imposed arbitrarily but must be based on an objective reason (...) [and] is associated with procedural requirements (…).’ From this the BVerfG concluded that individuals should not be excluded ‘without objective reason’ and not without ‘compliance with procedural requirements’. Otherwise the principle of equal treatment would be violated. However, what the ruling in Stadionverbot does not tell us is whether these requirements also apply to the protection of Art. 5 (1) (1) Basic Law (GG) and the public sphere in a digital environment or on platforms.

On 22 May 2019, the BVerfG was concerned with a put-back claim for the first time. Since it found that the main proceedings were neither manifestly well-founded nor manifestly unfounded, the BVerfG performed a genuine weighing of the disadvantages for the parties involved. In a preliminary injunction decision, it found that the consequences that would occur if the interim injunction was not issued but the main proceedings later proved successful would outweigh the disadvantages that would arise if the interim injunction was issued but the main proceedings proved to be unfounded. In Der III. Weg the BVerfG ordered Facebook to allow a right-wing party to access its Facebook page and resume posting (Der III. Weg, 2019).

Even though the BVerfG only ordered Facebook to temporarily re-grant the right-wing party Der III. Weg access to its Facebook page and allow it to resume posting (the preliminary injunction decision expires after six months, according to § 32 (6) BVerfGG), we can draw some insight from its decision. The BVerfG argued that, excluded from using its Facebook page, the right-wing party was ‘denied an essential opportunity to disseminate its political messages and actively engage in discourse with users of the social network,’ which would ‘significantly impede’ its visibility, especially during the run-up to the European elections (Der III. Weg, 2019).

The circumstances of the case and the reasoning of the BVerfG were very similar to those of the Tribunale di Roma in CasaPound v. Facebook, where another right-wing party had had its account suspended by Facebook. The Tribunale di Roma granted a precautionary measure against the suspension and found that Facebook has reached a level of systemic relevance regarding political participation under Art. 49 of the Italian Constitution. CasaPound’s right to political participation was potentially subject to irreparable damage pending ordinary proceedings (Golia & Behring, 2020).

The BVerfG emphasised inter alia that Facebook has ‘significant market power’ within Germany and that fundamental rights can be effective in disputes between private parties by means of the doctrine of indirect third-party effect of fundamental rights. Therefore, Art. 3 (1) Basic Law (GG) (‘All persons shall be equal before the law’) may have to be interpreted in ‘specific cases’ to force powerful private actors to respect equality of treatment provisions with regard to private contracts (see Maunz, Dürig, & Herdegen, 2019, Art. 1 (3) para. 64). The fact that in Der III. Weg the BVerfG argued that Facebook will have to adhere to the principle of equal treatment with regard to its interaction with its users in the same way the state has to adhere to this principle does not mean that this holds true in regard to other fundamental rights, in particular Art. 5 (1) (1) Basic Law (GG). As the BVerfG clarified in its Fraport decision, the scope of the indirect third-party effect of fundamental rights always depends on the ‘guaranteed scope [of the fundamental right] and the circumstances of the case’ (Fraport, 2011). This suggests that not only Art. 5 Basic Law (GG) but all relevant fundamental rights in question need to be considered and balanced in order to determine whether community standards can justify the deletion of a specific statement, even though it would be protected under Art. 5 Basic Law (GG).

With the introduction of the Act to Improve Enforcement of the Law in Social Networks (Network Enforcement Act (NetzDG)) in 2018 the issue received a lot of attention (Schulz, 2018; Kettemann, 2018, 2019; Heldt, 2019; Wagner, 2020; Peukert, 2018; Löber & Roßnagel, 2019; Bassini, 2019). Users have become more sensitive to the fact that content which was (in most cases) permissible under statutory law was taken down by platforms. Since 2018 we have seen the first put-back claims decided by civil courts. Most of the cases concerned statements which constituted or were deemed to constitute hate speech according to the platform’s definition of hate speech (Facebook, 2020). Only in rare cases is the solution clear-cut: only if the statement clearly violates the law, for instance § 130 German Criminal Code (StGB),3 will a put-back claim clearly fail.

This is not the case where the statements that have been taken down do not violate any laws – and might be protected under Art. 5 (1) (1) Basic Law (GG) (Maunz, Dürig, & Grabenwarter, 2019, Art. 5 (1) (2) para. 108) – but merely go against the terms of service or community standards of the platform. Art. 5 (1) (1) Basic Law (GG) protects the right of every person to freely express and disseminate their opinions without hindrance. ‘There shall be no censorship’, the Basic Law (GG) confirms. Still, limitations to freedom of expression do exist ‘in the provisions of general law, in provisions for the protection of young persons and in the right to personal honor’ (Maunz, Dürig, & Grabenwarter, 2019, Art. 5 (1) (2) para. 121, 190, 195). In these cases, there are different ways to argue and the courts – in the end – have the obligation to balance constitutional values.

With reference to Art. 5 (1) (1) Basic Law (GG), German courts found that Facebook functions as a ‘public marketplace’ for information and opinion-sharing,4 and therefore had to ensure – via the doctrine of indirect third-party effect of fundamental rights – that “zulässige Meinungsäußerungen” (admissible opinions = legal opinions) are not deleted.5 German courts concluded that platforms have a ‘substantial indirect meaningful duty’6 to protect the rights under Art. 5 (1) (1) Basic Law (GG). They argued that Facebook had developed a ‘quasi-monopoly’7 and that it is a private company offering a ‘public communicative space’. Therefore platforms8 would generally9 not be allowed to remove ‘admissible expressions of opinion’ and the community standards would not be allowed to exclude such content.10

Such restrictions on the terms of service, however, could only be explained by a direct and state-like duty to guarantee Art. 5 (1) Basic Law (GG), which many courts of instance have rejected so far.11 Their argument is convincing because the indirect binding of private parties by fundamental rights is not about minimising interference restricting freedom, but about balancing fundamental rights.12 That is, balancing the legitimate interests of the intermediary in setting its own communication standards – and ruling over its own private space – against the interests (and concomitant communication rights) of the affected user and of other users and their right to information (Spindler, 2019, p. 8, para. 22).

It is in line with that reasoning that a contract between a user and Facebook constitutes a contract ‘sui generis’13 and that Facebook’s Declaration of Rights and Duties forms part of the terms of service (Allgemeine Geschäftsbedingungen, AGBs). These were considered to be (partially) invalid insofar as they substantially disadvantage the user contrary to good faith (§ 307 German Civil Code (BGB)). The court found that the provision on the deletion of content and accounts in the terms of service could not survive the ‘disadvantage test’, since the provision restricted the reviewability of any decision to delete.

To put it concisely: social networks can prohibit hate speech that does not yet amount to criminally punishable content pursuant to § 1 (3) NetzDG, but only as long as deletion is not performed arbitrarily and users are not barred from the service without recourse. A private company, the court continued, that ‘takes over from the state the framework of public communication to such a degree’ must also have the ‘concomitant duties the state as a provider of essential services used to have’ (‘Aufgaben der Daseinsvorsorge’). Intermediaries have a right to police their platforms (‘virtuelles Hausrecht’14) and must have the right to delete uploaded content in order to avoid liability (Kettemann, 2019). But opinions that are protected under Art. 5 (1) (1) Basic Law (GG) enjoy a higher level of protection (from deletion by a private actor) than other forms of expression. The generic terms of the BGB allow for and demand an interpretation that ensures that constitutional guarantees are observed in contractual relations and by private actors. Thus, the violation of the terms of service does not always suffice to justify the deletion of a statement if it is protected under Art. 5 (1) (1) Basic Law (GG), thus restricting the rights of Facebook under Artt. 2, 12, 14 Basic Law (GG) (Maunz, Dürig, & Grabenwarter, 2019, Art. 5 (1) (2) para. 106, 143).

Conclusion: integrating public values into private contracts

The comparative analysis of German and US case law shows that there is not just ‘one’ answer to the question of whether social network services incur ‘must carry’ obligations. It depends, rather, on the jurisdiction. In the US and in Germany, social network services may restrict content on their platforms via terms of service. However, depending on the importance of a communication made (user side) and the ‘significant market power’ (intermediary side), social network services in Germany face restrictions, via the concept of indirect third-party effect of fundamental rights, in limiting access to the platform by suspending users or cancelling profile access contracts.

This may include restrictions regarding the design of terms of service (§§ 307, 305c BGB15), the interpretation of the terms of service in light of the Basic Law and the obligations companies have to take into account under §§ 241 (2) and 242 BGB (good faith). There might even be grounds to argue for an exclusion of the ordinary right of termination for particularly important networks, due to the adverse effects of exclusion from the platform on the fundamental rights of individual users; considering the self-defined general or issue-specific role of the platform (Twitter, 2020), such a network might even have an obligation to contract (§ 242 BGB). Whether the indirect third-party effect of fundamental rights that has been accepted for Art. 3 Basic Law (GG) is transferrable to Art. 5 (1) (1) Basic Law (GG) is not clear yet. The German line of cases following the adoption of the Network Enforcement Act confirms that certain intermediaries – namely those with a key role for public communication – have duties towards private users under fundamental rights law, in particular a duty to respect the equality principle.

On the other hand, US law and jurisprudence are in general reluctant to recognise fundamental rights-based duties for private intermediaries. As Knight v. Trump and other cases thoroughly analysed by Morris and Sarapin show, only parts of privately owned online communication spaces can be regarded as public fora, and only if they were used by government officials as such (Morris & Sarapin, 2020). In an offline setting, such as in Pruneyard v. Robins, the US Supreme Court, which has traditionally been very reluctant to apply fundamental rights obligations to private actors, acknowledged that under certain circumstances, and only if a private entity fulfills a public-forum function, it must tolerate unwanted speech. This jurisprudence has not impacted intermediaries and ‘must carry’ cases until now, mainly because Pruneyard v. Robins was decided against the backdrop of the Californian Constitution (Keller, 2019b). Rather, the courts continue to reject the argument that platforms are generally subject to constitutional speech guarantees (e.g., FAN v. Facebook). This is also because US law is very sensitive to interferences with free speech by the government, which becomes clear when looking at the First Amendment argument invoked by private companies against ‘must carry’ claims. The freedom from interference by the government goes further and protects private companies from being forced to restore speech that they do not want to host on their platforms (negative free speech). The understanding of speech is much broader than what German jurisprudence would comfortably interpret as falling under Art. 5 (1) (1) Basic Law (GG). However, this leaves citizens less protected against interferences with their right to free speech by private actors. This is unfortunate, as private actors have become important providers of online communicative spaces.

In order to meet the fundamental rights guarantees (applied horizontally), content-related standards need to be (and by now usually are) published, enshrined in terms of service that meet fundamental rights standards, formulated as general rules that are applied non-arbitrarily, and allow for effective recourse against deletions and suspensions, as foreseen, for example, by the Council of Europe Recommendation on Intermediaries (Council of Europe, CM/Rec(2018)2, 2018). With Facebook’s introduction of revised values and a charter for an Oversight Board, content governance is progressively ‘constitutionalised’. But the scope of review is limited to ‘single-user content’ that was taken down; it does not deal with ranking decisions or ‘shadow banning’ (Menegus, 2019) and, most importantly, does not include the possibility of reviewing Facebook’s algorithms (Douek, 2019).

We expect other platforms to watch this development closely and potentially follow suit. The step of implementing values can be considered a reaction by Facebook to the demand of a number of scholars (Kadri & Klonick, 2019) for ‘constitution-building’ within the platform (Kettemann, in press). What platforms are trying to do is to implement self-regulation first, before governments force them to implement regulation which might be difficult to enforce or bad for business. Further, the values and the Oversight Board can be a vehicle for Facebook to add legitimacy to its actions and to outsource controversy while achieving a higher level of actual compliance with its policies (Douek, 2019). In that way, platforms’ anticipatory normative action spares governments the need to enact (and enforce) actual laws – and at the same time makes it more difficult for affected users to challenge takedowns in courts (Bassini, 2019, p. 186), especially in the US (Keller, 2019b). This is why the horizontal application of fundamental rights is so important as a concept.

We argue that insofar as platforms serve as (quasi-)public fora for communication, this influences the ‘normative order’ in which they operate (Kettemann, in press). The German approach to this question offers elements worth considering. The reasoning of the BVerfG’s Stadionverbot decision, transferred into the digital sphere, can already be regarded as a ‘must carry’ obligation on the internet regarding an access-related dimension of online content. It is very likely that, after the preliminary injunction decision in ‘Der III. Weg’, the BVerfG will extend its reasoning on the indirect third-party effect of fundamental rights on platforms from the principle of equal treatment to Art. 5 (1) (1) Basic Law (GG). This is appropriate and will facilitate a transparent process of balancing the fundamental rights in conflict.

The reasoning behind the third-party effect of fundamental rights is not confined to one or two European jurisdictions. The Court of Justice of the European Union confirmed the indirect third-party obligation of fundamental rights when assessing how search engines have to balance fundamental rights in the context of de-referencing decisions:

(…) It is thus for the operator of a search engine to assess, (…) in the light of all the circumstances of the case, (…) if (…) the inclusion of the link in question is strictly necessary for reconciling the data subject’s rights to privacy and protection of personal data with the freedom of information of potentially interested internet users (…) (CNIL, 2019).

Drittwirkung by another name – the horizontal application of fundamental rights – is thus a common theme of CJEU jurisprudence as well.16 But even acknowledging that platforms have a ‘must carry’ obligation does not mean they ‘have to carry’ all content. As the CJEU confirms, they can still restrict content in specific cases after balancing the fundamental rights at stake.

This holistic approach to the normative order of online speech is less concerned with the public versus private ownership of the communicative space and instead focuses on the function of online speech. We conclude that this approach makes much sense in times of divergence of online actors and redistribution of responsibilities for governing the public sphere. It is thus time to – figuratively – back up and consider the potential impact of the horizontal application of human rights on the normative order of private-public interaction on the internet as a whole, including governance by algorithms and governance by affordance, which influences the way speech is communicated and received. ‘Must carry’ cases and put-back attempts draw our attention – with much potential gain – to clashes between private and public orders, between public law and private law.

References

Balkin, J. M. (2004). Digital Speech and Democratic Culture: A Theory of Freedom of Expression for the Information Society. New York University Law Review, 79(1), 1–58. https://www.nyulawreview.org/wp-content/uploads/2018/08/NYULawReview-79-1-Balkin.pdf

Bassini, M. (2019). Fundamental rights and private enforcement in the digital age. European Law Journal, 25(2), 182–197. https://doi.org/10.1111/eulj.12310

Botella, E. (2019). TikTok Admits It Suppressed Videos by Disabled, Queer and Fat Creators. Slate. https://slate.com/technology/2019/12/tiktok-disabled-users-videos-suppressed.html

Bickert, M. (2019, September 12). Updating the Values That Inform Our Community Standards [Press release]. https://newsroom.fb.com/news/2019/09/updating-the-values-that-inform-our-community-standards

Coche, E. (2018). Privatised enforcement and the right to freedom of expression in a world confronted with terrorist propaganda online. Internet Policy Review, 7(4).https://doi.org/10.14763/2018.4.1382

Council of Europe, Committee of Ministers to Member States. (2018, March 7). Recommendation CM/Rec(2018)2, On the roles and responsibilities of internet intermediaries. Council of Europe. https://search.coe.int/cm/Pages/result_details.aspx?ObjectID=0900001680790e14

Citron, D. K., & Norton, H. (2011). Intermediaries and Hate Speech: Fostering Digital Citizenship for Our Information Age. Boston University Law Review, 91. 1435–1484. https://scholar.law.colorado.edu/articles/178/

Citron, D. K., & Wittes, B. (2017). The Internet Will Not Break: Denying Bad Samaritans § 230 Immunity. Fordham Law Review, 86(2), 401–423. https://ir.lawnet.fordham.edu/flr/vol86/iss2/3

Citron, D. K., & Franks, M. A. (2020). The Internet as a Speech Machine and Other Myths Confounding Section 230 Reform (Public Law & Legal Theory Paper No. 20-8). Boston University School of Law. https://scholarship.law.bu.edu/cgi/viewcontent.cgi?article=1833&context=faculty_scholarship

Douek, E. (2019). ‘Facebook’s “Oversight Board”: Move fast with stable infrastructure and humility’. North Carolina Journal of Law & Technology, 21(1). http://ncjolt.org/wp-content/uploads/2019/10/DouekIssue1_Final_.pdf

European Parliament. (2019). Freedom of expression, a comparative law perspective. The United States (Comparative Law Library Unit Study No. PE 642.246). European Parliamentary Research Service. https://www.europarl.europa.eu/RegData/etudes/STUD/2019/642246/EPRS_STU(2019)642246_EN.pdf

European Union Code of Conduct against illegal hate speech online. (2016). https://ec.europa.eu/info/policies/justice-and-fundamental-rights/combatting-discrimination/racism-and-xenophobia/eu-code-conduct-countering-illegal-hate-speech-online_en

Facebook. (2019, September 17). Establishing Structure and Governance for an Independent Oversight Board [Press release]. https://newsroom.fb.com/news/2019/09/oversight-board-structure

Facebook. (2020). Community Standards. https://m.facebook.com/communitystandards

Facebook. (2020) Give people the power to build community and bring the world closer together. https://www.facebook.com/pg/facebook/about

Goldman, E. (2017). The Ten Most Important Section 230 Rulings. Tulane Journal of Technology and Intellectual Property, 20. https://journals.tulane.edu/TIP/article/view/2676/

Golia, A., Jr., & Behring, R. (2020, February 18). Private (Transnational) Power without Authority: Online fascist propaganda and political participation in CasaPound v. Facebook.Verfassungsblog. https://doi.org/10.17176/20200218-164225-0

Han, B.-C. (2014). Krise der Freiheit. In Psychopolitik. Neoliberalismus und die neuen Machttechniken (pp. 9–24). Fischer Verlag.

Heldt, A. P. (2019). Reading between the lines and the numbers: an analysis of the first NetzDG reports. Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1398

Hill, S. (2019). Empire and the megamachine: comparing two controversies over social media content. Internet Policy Review, 8(1). https://doi.org/10.14763/2019.1.1393

Hölig, S., & Hasebrink, U. (2019). Reuters Institute Digital News Report 2019: Ergebnisse für Deutschland (HBI Working Paper No. 47). Leibniz-Institut für Medienforschung, Hans-Bredow-Institut. https://www.hans-bredow-institut.de/uploads/media/default/cms/media/os943xm_AP47_RDNR19_Deutschland.pdf

Kadri, T., & Klonick, K. (2019). Facebook v. Sullivan: Public Figures and Newsworthiness in Online Speech (Legal Studies Research Paper No. 19-0020). St. John’s University School of Law. https://doi.org/10.2139/ssrn.3332530

Keller, D. (2019a). Dolphins in the Net: Internet Content Filters and the Advocate General’s Glawischnig-Piesczek v. Facebook Ireland Opinion. Stanford Center for Internet and Society. https://cyberlaw.stanford.edu/files/Dolphins-in-the-Net-AG-Analysis.pdf

Keller, D. (2019b). Who do you sue? State and platform hybrid power over online speech (Aegis Series Paper No. 1902). Hoover Institution. https://www.hoover.org/research/who-do-you-sue

Keller, D. (2018). Internet Platforms observations on speech, danger, and money (Aegis series paper No. 1807). Hoover Institution. https://www.hoover.org/research/internet-platforms-observations-speech-danger-and-money

Kettemann, M. C. (2019). Stellungnahme als Sachverständiger für die öffentliche Anhörung zum Netzwerkdurchsetzungsgesetz auf Einladung des Ausschusses für Recht und Verbraucherschutz des Deutschen Bundestags. https://www.hans-bredow-institut.de/uploads/media/default/cms/media/up8o1iq_NetzDG-Stellungnahme-Kettemann190515.pdf

Kettemann, M. C., & Schulz, W. (2020). Setting Rules for 2.7 Billion. A (First) Look into Facebook’s Norm-Making System: Results of a Pilot Study (Working Papers: Works in Progress No. 1). Hans-Bredow-Institut. https://leibniz-hbi.de/uploads/media/Publikationen/cms/media/5pz9hwo_AP_WiP001InsideFacebook.pdf

Kettemann, M. C. (In press). The Normative Order of the Internet. Oxford University Press.

Klonick, K. (2018). The New Governors: The People, Rules, and Processes Governing Online Speech. Harvard Law Review, 131(6), 1598–1670. https://harvardlawreview.org/2018/04/the-new-governors-the-people-rules-and-processes-governing-online-speech

Löber, L. I., & Roßnagel, A. (2019). Das Netzwerkdurchsetzungsgesetz in der Umsetzung Bilanz nach den ersten Transparenzberichten. MMR, 22(2), 71–75.

Maunz, T., Dürig, G., Grabenwarter, C., & Herdegen, M. (2019). Kommentar zum Grundgesetz (GG) Teil B, (89th ed). C.H. Beck.

Menegus, B. (2019). Facebook patents shadowbanning. Gizmodo. https://gizmodo.com/facebook-patents-shadowbanning-1836411346

Morris, P. L. & Sarapin, S. H. (2020). You can’t block me: When social media spaces are public forums. First Amendment Studies. https://doi.org/10.1080/21689725.2020.1742760

Peukert, A. (2018). Gewährleistung der Meinungs- und Informationsfreiheit in sozialen Netzwerken - Vorschlag für eine Ergänzung des NetzDG um sog. Put-back-Verfahren. MMR, 21(9), 572–578.

Ruggie, J. (2008, April 7). Human Rights Council. Protect, Respect and Remedy: a Framework for Business and Human Right. Report of the Special Representative of the Secretary-General on the issue of human rights and transnational corporations and other business enterprises [Report No. A/HRC/8/5]. United Nations. https://www.business-humanrights.org/sites/default/files/reports-and-materials/Ruggie-report-7-Apr-2008.pdf

Ruggie, J. (2011, March 21) United Nations Guiding Principles on Business and Human Rights: Implementing the United Nations ‘Protect, Respect and Remedy’ Framework. Report of the Special Representative of the Secretary-General on the issue of human rights and transnational corporations and other business enterprises (UN Doc. A/HRC/17/31). United Nations.

Saunders, K.W. (2017). Free Expression and Democracy. A comparative analysis. Cambridge University Press. https://doi.org/10.1017/9781316771129

Schulz, W. (2018). Regulating Intermediaries to Protect Privacy Online – the Case of the German NetzDG (HIIG Discussion Paper Series No. 2018-01). Alexander von Humboldt Institute for Internet and Society. https://www.hiig.de/wp-content/uploads/2018/07/SSRN-id3216572.pdf

Schwarz, O. (2019). Facebook Rules: Structures of governance in digital capitalism and the control of generalized social capital. Theory, Culture & Society, 36(4). https://doi.org/10.1177/0263276419826249

Spindler, G. (2019). Löschung und Sperrung von Inhalten aufgrund von Teilnahmebedingungen sozialer Netzwerke. Computer und Recht, 35(4), 238–247. https://doi.org/10.9785/cr-2019-350411

Suzor, N. (2018). Digital Constitutionalism: Using the Rule of Law to Evaluate the Legitimacy of Governance by Platforms. Social Media + Society, 4(3). https://doi.org/10.1177/2056305118787812

Twitter (2019, June 27). Defining Public Interest on Twitter [Blog post]. https://blog.twitter.com/en_us/topics/company/2019/publicinterest.html

Twitter (2020). We believe in free expression and think every voice has the power to impact the world. Retrieved from https://about.twitter.com/en_us/values.html

van Dijck, J. (2013). The Culture of Connectivity: A Critical History of Social Media. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199970773.001.0001

Wagner, G. (2020). Haftung von Plattformen für Rechtsverletzungen (Teil 1). GRUR, 122(4), 329–337.

Zakon, R. (2017). Hobbes’ Internet Timeline 10.2. https://www.zakon.org/robert/internet/timeline

Cases

Bierdosen-Flashmob (2015). Decision of BVerfG - 1 BvQ 25/15. BVerfGE 139, 378.

Brittain v. Twitter, Inc. (2018). Northern California District Court, CV-18-01714-PHX-DGC.

Buza v. Yahoo!, Inc. (2011). WL 5041174, 1.

CasaPound v. Facebook, Inc. (2019). R. G. 59264/2019.

Cengiz and Others v. Turkey. (2015). Applications nos. 48226/10 and 14027/11.

Chukwurah v. Google, Inc. (2020). No. 8:2019cv00782.

Cox v. Twitter, Inc. (2019). 2:18-2573-DCN-BM (D.S.C.).

Cyber Promotions, Inc. v. America Online, 948 F. Supp. 436 (E.D. Pa. 1996).

Der III. Weg. (2019). 1 BvQ 42/19. NJW 2019,1935.

Ebeid v. Facebook, Inc. (2019). WL 2059662, 6.

FAN v. Facebook, Inc. (2019). Case No. 18-CV-07041-LHK.

Fraport (2011). Decision of BVerfG - 1 BvR 699/06. BVerfGE 128, 226 – 278.

Freedom Watch, Inc., Individually and on behalf of those similarly situated and Laura Loomer, individually and on behalf of those similarly situated Palm Beach, Florida v. Google Inc., et al. (2020) Appeal from the United States District Court for the District of Columbia (No. 1:18-cv-02030)

Fyk v. Facebook, Inc. (2019). Northern California District Court, C 18-05159 JSW.

GC and Others v. Commission nationale de l'informatique et des libertés (CNIL). (2019). Case no. C-136/17. Digital reports, ECLI:EU:C:2019:773.

Glawischnig-Piesczek v. Facebook Ireland Limited. (2019). Case no. C-18/18, ECLI:EU:C:2019:821.

John J. Hurley and South Boston Allied War Veterans Council v. Irish-American Gay, Lesbian and Bisexual Group of Boston et al. (1995). 515 U.S. 557.

Johnson v. Twitter Inc. (2018). California Superior Court, 18CECG00078.

Kimbrell v. Twitter Inc. Northern California District Court, 18-cv-04144-PJH.

Knight First Amendment Inst. at Columbia Univ. v. Trump. (2017). No. 1:17-cv-5205 (S.D.N.Y.),No. 18-1691 (2d Cir.).

Langdon v. Google, Inc. (2007). 474 F. Supp. 2d 622, 632.

Manhattan Community Access Corporation v. Halleck (2019). No. 17-702, reviewing 882 F. 3d 300 (2d Cir. 2018).

Marsh v. Alabama. (1946). 326 U.S. 501.

Mezey v. Twitter Inc. (2018). Florida Southern District Court, 1:18-CV-21069.

Murphy v. Twitter Inc. (2019). San Francisco Superior Court, CGC-19-573712.

Prager University v. Google. (2018). WL 1471939, 8.

Pruneyard Shopping Center v. Robins. (1980). 447 U.S. 74.

Smith v. California, (1959). 361 U.S. 147.

Stadionverbot (2018). Decision of BVerfG - 1 BvR 3080/09; BVerfGE 148, 267 - 290.

Turner Broad. Sys. v. FCC. (1994). 512 U.S. 622, 629 at 657.

Turner Broadcasting System, Inc. v. FCC. (1997). 520 U.S. 180.

Twitter Inc. v. The Superior Court for the City and County of San Francisco, (2018). California Court of Appeal, A154973.

Williby v. Zuckerberg. (2019). Northern California District Court, 18-cv-06295-JD.

Zeran v. America Online, Inc. (1997). 129 F.3d 327, 330 (4th Cir. 1997).

Footnotes

1. ‘Must carry’ from a US perspective originated from a set of rules instituted by the Federal Communications Commission (FCC) in 1965. Originally, ‘must carry’ was a claim brought forward with the aim of ‘preserv[ing] a multiplicity of broadcasters’ (Turner II v. FCC, 1997), which obliged cable television networks to carry particular (local) programmes. In the context of communications law, ‘must carry’ is not alien to the European and in particular the German legal order. It is part of the statutory broadcasting obligations which apply under the German Interstate Broadcasting Treaty (RStV) to so-called platform providers, in particular cable network operators. But the context in which ‘must carry’ arguments are put forward has expanded in recent years. Speakers on online platforms in the US have tried to make use of the idea behind ‘must carry’ in a different context. They have tried to force privately owned (social media) platforms to carry (i.e. publish) their speech. In their view ‘a private entity becomes a state actor through its operation’ of the private property as ‘a public forum for speech’ (Cyber Promotions v. America Online, 1996). The approach to obligations of private digital platforms regarding speech to be carried differs significantly between the US and Germany.

2. E.g. Der III. Weg (2019) 1 BvQ 42/19. NJW 2019, 1935; Regional Court Berlin (LG Berlin) (2018) 31 O 21/18; Regional Court Offenburg (LG Offenburg) (2018) 2 O 310/18; Higher Regional Court Munich (OLG München) (2018) 18 W 1294/18; District Court Tübingen (AG Tübingen) (2018) 3 C 26/18; Regional Court Bamberg (LG Bamberg) (2018) 2 O 248/18.

3. OLG Stuttgart (2018) 4 W 63/18; ´Drecksvolk´ (2018), 1 OLG 21 Ss 772/17.

4. Higher Regional Court Frankfurt/Main (OLG Frankfurt/Main) (2017) 16 U 255/16 at 28.

5. Similarly, Higher Regional Court Munich (OLG München) (2018) 18 W 858/18 (LG München I).

6. Higher Regional Court Stuttgart (OLG Stuttgart) (2018) 4 W 63/18 at 73.

7. Higher Regional Court Dresden (OLG Dresden) (2018) 4 W 577/18.

8. Higher Regional Court Berlin (KG Berlin) (2019) 10 W 172/18 at 17.

9. Higher Regional Court Munich (OLG München) (2018) 18 W 1955/18 at 19 et seq. – possible exception for subforums.

10. Higher Regional Court Munich (OLG München) (2018) 18 W 858/18 at 30; 18 W 1873/18 at 21; 18 W 1383/18 at 20 et seq.; 18 W 1294/18 at 28; Regional Court Karlsruhe (LG Karlsruhe) (2018) 11 O 54/18 at 12; Regional Court Frankfurt/Main (LG Frankfurt/Main) (2018) 2-03 O 182/18 at 16; Regional Court Bamberg (LG Bamberg) (2018) 2 O 248/18 at 86.

11. Higher Regional Court Dresden (OLG Dresden) (2018) 4 W 577/18 at 19 et seq.; Higher Regional Court Karlsruhe (OLG Karlsruhe) (2019) 6 W 81/18 at 51 et seq.; Higher Regional Court Karlsruhe (OLG Karlsruhe) (2018) 15 W 86/18 at 21; Higher Regional Court Stuttgart (OLG Stuttgart) (2018) 4 W 63/18 at 71; Regional Court Offenburg (LG Offenburg) (2019) 2 O 329/18 at 80; Regional Court Bremen (LG Bremen) (2019) O 1618/18 at 59; Regional Court Heidelberg (LG Heidelberg) (2018) 1 O 71/18 at 38. 


12. Higher Regional Court Karlsruhe (OLG Karlsruhe) (2019) 6 W 81/18 at 52. 


13. Higher Regional Court Munich (OLG München) (2018) 18 W 1294/18 (LG München II).

14. Regional Court Bonn (LG Bonn) (1999) 10 O 457/99.

15. Higher Regional Court Dresden (OLG Dresden) (2018) 4 W 577/18 (LG Görlitz).

16. The CJEU is interpreting the relevant provisions ‘in the light’ of the fundamental rights without talking about a third-party effect or ‘must carry’, for example Alemo-Herron v. Parkwood Leisure Ltd. (2013) Case no. C-426/11 at 29-30, Digital reports, ECLI:EU:C:2013:521; Google Spain SL and Google Inc. v. Agencia Española de Protección de Datos (AEPD) and Mario Costeja González (2014) Case no. C-131/12 at 68 and 74, Digital reports, ECLI:EU:C:2014:317; Y.S. v. Minister voor Immigratie, Integratie en Asiel (2014) Case no. C-141/12 at 54; Opinion of Advocate General Poiares Maduro delivered on International Transport Workers' Federation, Finnish Seamen's Union v. Viking Line ABP, OÜ Viking Line Eesti (2007) Case no. C-438/05, at 39, Digital reports, ECLI:EU:C:2007:772; Opinion of Advocate General Trstenjak on Dominguez v. Centre informatique du Centre Ouest Atlantique, Préfet de la région Centre (2012) Case no. C-282/10 at 83.


What if Facebook goes down? Ethical and legal considerations for the demise of big tech


Introduction

Facebook1 has, in large parts of the world, become the de facto online platform for communication and social interaction. In 2017, the main platform reached the milestone of two billion monthly active users (Facebook, 2017), and global user growth since then has continued, reaching 2.6 billion in April 2020 (Facebook, 2020). Moreover, in many countries Facebook has become an essential infrastructure for maintaining social relations (Fife et al., 2013), commerce (Aguilar, 2015) and political organisation (Howard and Hussain, 2013). However, recent changes in Facebook’s regulatory and user landscape stand to challenge its pre-eminent position, making its future demise, if not plausible, then at least less implausible over the long term.

Indeed, the closure of an online social network would not in itself be unprecedented. Over the last two decades, we have seen a number of social networks come and go — including Friendster, Yik Yak and, more recently, Google+ and Yahoo Groups. Others, such as MySpace, continue to languish in a state of decline. Although Facebook is arguably more resilient to the kind of user flight that brought down Friendster (Garcia et al., 2013; Seki and Nakamura, 2016; York and Turcotte, 2015) and MySpace (boyd, 2013), it is not immune to it. These precedents are important for understanding Facebook’s possible decline. Critically, they demonstrate that the closure of Facebook’s main platform does not depend on the exit of all users; Friendster, Google+ and others continued to have users when they were sold or shut down.

Furthermore, as we examine below, any user flight that precedes Facebook’s closure would probably be geographically asymmetrical, meaning that the platform remains a critical infrastructure in some (less profitable) regions, whilst becoming less critical in others. For example, whilst Friendster started to lose users rapidly in North America, its user numbers were simultaneously growing, exponentially, in South East Asia. It was eventually sold to a Filipino internet company and remained active as a popular social networking and gaming platform until 2015.2 The closure of Yahoo! GeoCities, the web hosting service, was similarly asymmetrical: although most sites were closed in 2009, the Japanese site (which was managed by a separate subsidiary) remained open until 2019.3 It is also important to note that, in several of these cases, a key reason for user flight was the greater popularity of another social network platform: namely, MySpace (Piskorski and Knoop, 2006) and Facebook (Torkjazi et al., 2009). Young, white demographics, in particular, fled MySpace to join Facebook (boyd, 2013).

These precedents suggest that changing user demographics and preferences, and competition from other social networks such as Snapchat or a new platform (discussed further below), could be key drivers of Facebook’s decline. However, given Facebook’s pre-eminence as the world’s largest social networking platform, the ethical, legal and social repercussions of its closure would be far graver than in these precedents. The demise of a global online communication platform such as Facebook could have catastrophic social and economic consequences for the innumerable communities that rely on the platform on a daily basis (Kovach, 2018), as well as for the users whose personal data Facebook collects and stores.

Despite the high stakes involved in Facebook’s demise, there is little research or public discourse addressing the legal and ethical consequences of such a scenario. The aim of this article is therefore to foster dialogue on the subject. Pursuing this goal, the article provides an overview of the main ethical and legal concerns that would arise from Facebook’s demise and sets out an agenda for future research in this area. First, we identify the headwinds buffeting Facebook, and outline the most plausible scenarios in which the company — specifically, its main platform — might close down. Second, we identify four key ethical stakeholders in Facebook’s demise based on the types of harm to which they are susceptible. We further examine how various scenarios might lead to these harms, and whether existing legal frameworks are adequate to mitigate them. Finally, we provide a set of recommendations for future research and policy intervention.

It should be noted that the legal and ethical considerations discussed in this article are by no means limited to the demise of Facebook, social media, or even “Big Tech”. In particular, to the extent that most sectors in today’s economy are already, or will soon become, data-driven and data-rich, these considerations, many of which relate to the handling of Facebook’s user data, are ultimately relevant to the failure or closure of any company handling large volumes of personal data. Likewise, as human interaction becomes increasingly mediated by social networks and Big Tech platforms, the legal and ethical considerations that we address are also relevant to the potential demise of other major platforms, such as Google or Twitter. However, focusing on the demise of Facebook — one of the most data-rich social networks in today’s economy — offers a fertile case study for the analysis of these critical legal and ethical questions.

Why and how could Facebook close down?

This article necessarily adopts a long-term perspective, responding to issues that could significantly harm society in the long run if we do not begin to address them today. As outlined in the introduction, Facebook is currently in robust health: aggregate user growth on the main platform is increasing, and it continues to be highly profitable, with annual revenue and income increasing year-over-year (Facebook, 2017; 2018). As such, it is unlikely that Facebook would shut down anytime soon. However, as anticipated, the rapidly changing socio-economic and regulatory landscape in which Facebook operates could lead to a reversal in its priorities and fortunes over the long term.

Facebook faces two major headwinds. First, the platform is coming under increasing pressure from regulators across the world (Gorwa, 2019). In particular, tighter data privacy regulation in various jurisdictions (notably, the EU General Data Protection Regulation [GDPR]4 and the California Consumer Privacy Act [CCPA])5 could severely inhibit the company’s ability to collect and analyse user data. This in turn could significantly reduce the value of the Facebook platform to advertisers, who are drawn to its granular, data-driven insights about user behaviour and thus higher ad-to-sales conversion rates through targeted advertising. This would undermine Facebook’s existing business model, whereby advertising generates over 98.5% of Facebook’s revenue (Facebook, 2018), the vast majority of which is generated on its main platform. More boldly, regulators in several countries are attempting to break up the company on antitrust grounds (Facebook, 2020, p. 64), which could lead, inter alia, to the reversal of its acquisitions of Instagram and WhatsApp — key assets, the loss of which could adversely affect Facebook’s future growth prospects.

Second, the longevity of the main Facebook platform is under threat from shifting social and social media trends. Regarding the latter, social media usage is gradually moving away from public, web-based platforms in favour of mobile-based messaging apps, particularly within younger demographics. Indeed, in more saturated markets, such as the US and Canada, Facebook’s penetration rate has declined (Facebook, 2020, pp. 31-33), particularly amongst teenagers, who tend to favour mobile-only apps such as Snapchat, Instagram and TikTok (Piper Sandler, 2020). Although Facebook and Instagram still have the largest share of the market in terms of time spent on social media, this has declined since 2015 in favour of Snapchat (Furman, 2019, p. 26). They also face growing competition from international players such as WeChat, with over 1 billion users (Tencent, 2019), as well as social media apps with strong political leanings, such as Parler, which are growing in popularity.6

A sustained movement of active users away from the main Facebook platform would inevitably impact the preferences of advertisers, who rely on active users to generate engagement for their clients. More broadly, Facebook’s business model is under threat from a growing social and political movement against the company’s perceived failure to remove misinformation and hateful content from its platform. The advertiser boycott in the wake of the Black Lives Matter protests highlights the commercial risks to Facebook of failing to respond adequately to the social justice concerns of its users and customers.7 As we have seen in the context of both Facebook and precedents such as Friendster, due to reverse network effects, any such exodus of users and/or advertisers can occur suddenly and escalate rapidly (Garcia et al., 2013; Seki and Nakamura, 2016; Cannarella and Spechler, 2014).

Collectively, these socio-technical and regulatory developments may force Facebook to shift its strategic priorities away from being a public networking platform (and monetising user data through advertising on the platform), to a company focused on private, ephemeral messaging, monetised through commerce and payment transactions. Indeed, recent statements from Facebook point in this direction:

I believe the future of communication will increasingly shift to private, encrypted services where people can be confident what they say to each other stays secure and their messages and content won't stick around forever. This is the future I hope we will help bring about.

We plan to build this the way we've developed WhatsApp: focus on the most fundamental and private use case -- messaging -- make it as secure as possible, and then build more ways for people to interact on top of that. (Zuckerberg, 2019)

Of course, it does not automatically follow that Facebook would shut down its main platform, particularly if it still has sufficient active users remaining on it, and it bears little cost in keeping it open. On the other hand, closure becomes more likely once a sufficient number of active users and advertisers (but, importantly, not necessarily all) have also left the platform, especially in its most profitable regions. In this latter scenario, it is conceivable that Facebook would consider shutting down the main platform’s developer API (Application Programming Interface — the interface between Facebook and client software) instead of leaving it open and vulnerable to a security breach. Indeed, it was in similar circumstances that Google recently closed the consumer version of its social network Google+ (Thacker, 2018).
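
For readers unfamiliar with what closing a ‘developer API’ entails in practice, the following minimal sketch (in Python) shows the kind of Graph API request that third-party client software issues against the platform; the API version, requested fields and access token are illustrative placeholders rather than a tested integration. Were the API switched off, every application built on calls like this would begin failing at the request step.

import requests

# Minimal sketch of a third-party client reading a user's basic profile
# via Facebook's Graph API. Version, fields and token are placeholders.
GRAPH_URL = "https://graph.facebook.com/v12.0/me"
ACCESS_TOKEN = "EAAB...placeholder"  # hypothetical user access token

def fetch_profile() -> dict:
    response = requests.get(
        GRAPH_URL,
        params={"fields": "id,name", "access_token": ACCESS_TOKEN},
        timeout=10,
    )
    # If the developer API were shut down, this is where dependent
    # client software would start to break.
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    try:
        print(fetch_profile())
    except requests.RequestException as error:
        print(f"Graph API unavailable or request rejected: {error}")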

In a more extreme scenario, Facebook Inc. could fail altogether and enter into a legal process such as corporate bankruptcy (insolvency): either a reorganisation that seeks to rescue the company as a going concern, typically by restructuring and selling off some of its assets; or liquidation, in which the company is wound down and dissolved entirely. Such a scenario, however, should be regarded as highly unlikely for the foreseeable future. Although we highlight some of the legal and ethical considerations arising from a Facebook insolvency scenario, the non-insolvent discontinuation or closure of the main platform shall be our main focus henceforth. It should be noted that, as a technical matter, this closure could take various forms. For example, Facebook could close the platform but preserve users’ profiles; alternatively, it could close the platform and destroy, or sell, some or all of its user data. Whilst our focus is on the ethical and legal consequences of Facebook’s closure at the aggregate level, we address technical variations in the specific form that this closure could take to the extent that it impacts upon our analysis.

Key ethical stakeholders and potential harms

In this section, we identify four key ethical stakeholders who could be harmed8 by Facebook’s closure. These stakeholders are: dependent communities, in particular the socio-economic and media ecosystems that depend on Facebook to flourish; existing users, (active and passive) individuals, as well as groups, whose data are collected, analysed and monetised by Facebook, and stored on the company’s servers; non-users, particularly deceased users whose data continues to be stored and used by Facebook, and who will represent hundreds of millions of Facebook profiles in only a few decades; and future generations, who may have a scientific interest in the Facebook archive as a historical resource and cultural heritage.

We refer to these categories as ethical stakeholders, rather than user types, because our categorisation is based on the unique types of harm that each would face in a Facebook closure, not their way of using the platform. That is, the categorisation is a tool to conduct our ethical analysis, rather than corresponding to some already existing groups of users. A single individual may for instance have mutually conflicting interests in her capacity as an existing Facebook user, a member of a dependent community, and as a future non-user. Thus, treating her as a single unit, or part of a particular user group, would reduce the ethical complexity of the analysis. As such, the interests of the stakeholders are by no means entirely compatible with one another, and there will unquestionably be conflicts of interest between them.

Furthermore, for the purposes of the present discussion, we do not intend to rank the relative value of the various interests; there is no internal priority to our analysis, although this may become an important question for future research. We also stress that our list is by no means exhaustive. Our focus is on the most significant ethical stakeholders who have an interest in Facebook’s closure and would experience unique harms due to the closure of a company that is both a global repository of personal data, and the world’s main communication and social networking infrastructure. As such, we exclude traditional, economic stakeholders from the analysis — such as employees, directors, shareholders and creditors. While these groups certainly have stakes in Facebook’s potential closure, there is nothing that significantly distinguishes their interests in the closure of a company like Facebook from the closure of any other (multinational) corporation. This also means that we exclude stakeholders that could benefit from Facebook’s closure, such as commercial competitors, or governments struggling with Facebook’s influence on elections and other democratic processes. Likewise, we refrain from assessing the relative overall (un)desirability of Facebook’s closure.

Dependent communities

The first key ethical stakeholders are the ‘dependent communities’, that is, communities and industries that have developed around the Facebook platform and now (semi-)depend on its existence to flourish.9

Over the last decade, Facebook has become a critical economic engine and a key gateway to the internet as such (Digital Competition Expert Panel, 2019). The growing industry of digitally native content providers, from major news outlets such as Huffington Post and Buzzfeed, to small independent agencies, is sometimes entirely dependent on exposure through Facebook. For example, the most recent change in Facebook’s News Feed algorithm had devastating consequences for this part of the media industry — some news outlets allegedly lost over 50% of their traffic overnight (Nicholls et al., 2018, p. 15). If such a small change in its algorithms could lead to the economic disruption of an entire industry, the wholesale closure of the main Facebook platform would likely cause significant economic and societal damage on a global scale, particularly where it occurs rapidly and/or unexpectedly, such that news outlets and other dependent communities do not have sufficient time to migrate to other web platforms.

To be clear, our main concern here is not with the individual media outlets, but with communities that are dependent on a functioning Facebook-based media ecosystem. While the sudden closure of one, or even several, media outlets may not pose a threat to this ecosystem, a sudden breakdown of the entire ecosystem would have severe consequences. For instance, many of the content providers reliant on exposure through Facebook are located in developing countries, in which Facebook has become almost synonymous with the internet, acting as the primary source of news (Mirani, 2015), amongst other functions. Given the primacy of the internet to public discourse in today’s world, it goes without saying that, for these communities, Facebook effectively is the digital public sphere, and hence a central part of the public sphere overall. A notable example is Laos, a country which has been digitised so recently that its language (Lao) has not yet been properly indexed by Google (Kittikhoun, 2019). This lacuna is filled by Facebook, which has established itself not only as the main messaging service and social network in Laos, but effectively also as the web as such.

The launch of Facebook’s Free Basics platform, which provides free access to Facebook services in less developed countries, has further increased the number of communities that depend solely on Facebook. According to the Free Basics website,10 100 million people who would not otherwise have been connected are now using the services offered by the platform. As such, there are many areas and communities that now depend on Facebook in order to function and are thus susceptible to considerable harm were the platform to shut down. Note that this harm is not reducible to the individuals using Free Basics, but is a concern for the entire community, including members not using Facebook. As an illustrative example, consider the vital role played by Facebook and other social media platforms in disseminating information about the COVID-19 pandemic and keeping many communities connected during it. In a time of crisis, communities with a large dependency on a single platform become particularly vulnerable.

Of course, whether the closure of Facebook’s main platform harms these communities depends on the reasons for closure and the manner in which it closes down (sudden death vs slow decline). If closure is accompanied by the voluntary exodus of these communities, for example to a different part of the Facebook Inc. group (e.g., Messenger or Instagram), or a third-party social network, they would arguably incur limited social or economic costs. Furthermore, it is entirely possible to imagine a scenario in which the main Facebook platform is shut down because it is unprofitable to the company as a whole, or does not align with the company’s strategic priorities, yet remains systemically important for a number of dependent communities. These communities could still use and depend on the platform, yet may simply not be valuable or lucrative enough for Facebook Inc. to justify keeping it open. Indeed, many of the dependent communities that we have described are located in regions of the world that are the least profitable for the company (certainly under an advertising-driven revenue model).

The question arises how these dependent communities should be protected in the event of Facebook’s demise. Indeed, existing legal frameworks governing Facebook do not make special provision for its systemically important functions. As such, we propose that a new concept of ‘systemically important technological institutions’ (‘SITIs’) — drawing on the concept of ‘systemically important financial institutions’ (‘SIFIs’) — be given more serious consideration in managing the life and death of global communications platforms, such as Facebook, that provide a critical societal infrastructure. This proposal is examined further in the second part of this article.

Existing users

‘Existing users’ refers broadly to any living person or group of people who uses or has used the main Facebook platform, and continues to maintain a Facebook profile or page. That is, both daily and monthly active users, as well as users who are not actively using the platform but still have a profile where their information is stored (including ‘de-activated’ profiles). Invariably, there is an overlap between this set of stakeholders and ‘dependent communities’: the latter includes the former. Our main focus here is on ethical harms that arise at the level of the individual user, by virtue of their individual profiles or group pages, rather than the systemic and societal harms outlined above.

It is tempting to think that the harm to these users in the event of Facebook’s closure is limited to the loss of the value that they place on having access to Facebook’s services. However, this would be an incomplete conclusion. Everything a user does on the network is recorded and becomes part of Facebook’s data archive, which is where the true potential for harm lies. That is, the danger stems not only from losing access to the Facebook platform and the various services it offers, but from future harms that users (active and passive) are exposed to as they lose control over their personal data. Any violation of the trust that these users place in Facebook with respect to the use of their personal data threatens to compromise user privacy, dignity and self-identity (Floridi, 2011). Naturally, these threats also exist today. However, as long as the platform remains operational, users have a clear idea of who they can hold accountable for the processing of their data. Should the platform be forced to close, or worse still, sell off user data to a third party, this accountability will likely vanish.

The scope for harm to existing users upon Facebook’s closure depends on how Facebook continues to process user data. If the data are deleted (as occurred, for example, in the closure of Yahoo! Groups),11 users could lose access to information — particularly photos and conversations — that forms part of their identity, personal history and memory. Although Facebook does allow users to download much of their intentionally provided data to a hard drive — in the EU, implementing the right to data portability12 — this does not encompass users’ conversations and other forms of interactive data. For example, Facebook photos in which a user has been tagged, but which were uploaded by another user, are not portable, even though these photos arguably contain the first user’s personal data. Downloading data is also an impractical option for the hundreds of millions of users accessing the platform only via mobile devices (DataReportal, 2019) that lack adequate storage and processing capacity. Personal archiving is an increasingly constitutive part of a person’s sense of self, but, as noted by Acker and Brubaker (2014), there is a tension between how users conceive of their online personal archives, and the corporate, institutional reality of these archives.
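
To illustrate the gap between what the download tool returns and what a user might regard as ‘their’ archive, the short sketch below inventories a hypothetical ‘Download Your Information’ export. It assumes the export is a ZIP of JSON files grouped into category folders; real exports differ in layout, and material such as photos uploaded by others is, as noted above, simply absent.

import json
import zipfile
from collections import Counter
from pathlib import Path

def inventory_export(archive_path: str) -> Counter:
    """Count records per category in a hypothetical Facebook data export,
    assumed to be a ZIP of JSON files in folders such as posts/ or comments/."""
    counts: Counter = Counter()
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue
            category = Path(name).parts[0]  # top-level folder, e.g. "posts"
            try:
                data = json.loads(archive.read(name))
            except json.JSONDecodeError:
                continue  # skip malformed files rather than abort
            counts[category] += len(data) if isinstance(data, list) else 1
    return counts

if __name__ == "__main__":
    for category, n in inventory_export("facebook_export.zip").most_common():
        print(f"{category}: {n} records")
    # What such an inventory cannot show: photos the user is tagged in but
    # did not upload, and the other side of shared conversations.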

On the other hand, it is highly plausible that Facebook would instead want to retain these data to train its machine learning models and to provide insights on users of other Facebook products, such as Instagram and Messenger. In this scenario, the risk to existing users is that they lose control over how their information is used, or at least fail to understand how and where it is being processed (especially where these users are not active on other Facebook products, such as Instagram). Naturally, involuntary user profiling is a major concern with Facebook as it stands. The difference in the case of closure is that many users will likely not even be aware of the possibility of being profiled. If Facebook goes down, these users would no longer be able to view their data, leading many to believe that the data have in fact been destroyed. Yet, a hypothetical user may for instance create an Instagram profile in 2030 and still be profiled by her lingering Facebook data, despite Facebook (the main platform) being long gone by then. Or worse still, her old Facebook data may be used to profile other users who are demographically similar to her, without her (let alone their) informed consent or knowledge.
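
The indirect profiling risk described here can be made concrete with a toy sketch: feature vectors retained from profiles on the closed platform are used to assign a segment to a newly observed user by similarity. All features, values and labels below are invented for illustration and do not describe how Facebook actually models users.

import numpy as np

# Invented feature vectors retained from accounts on the closed platform
# (e.g. normalised age, region code, interest scores).
retained_profiles = np.array([
    [0.30, 0.80, 0.10, 0.70],
    [0.90, 0.20, 0.60, 0.10],
    [0.40, 0.70, 0.20, 0.80],
])
retained_segments = ["outdoors/pets", "gaming/tech", "outdoors/pets"]

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def infer_segment(new_user: np.ndarray) -> str:
    """Label a new user with the segment of the most similar retained profile."""
    scores = [cosine(new_user, profile) for profile in retained_profiles]
    return retained_segments[int(np.argmax(scores))]

# A hypothetical signup on another service in 2030, never a Facebook user,
# is nonetheless segmented via data left behind by demographically similar people.
print(infer_segment(np.array([0.35, 0.75, 0.15, 0.72])))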

Existing laws in the EU offer limited protection for users’ data in these scenarios. If Facebook intended to delete the data, under EU data protection law it would likely need to notify users and seek their consent for the further processing of their data,13 offering them the opportunity to retrieve their data before deletion (see the closure of Google+14 and Yahoo! Groups). On the other hand, if Facebook opted to retain and continue processing user data in order to provide the (other) services set out under its terms and conditions, it is unlikely that it would be legally required to obtain fresh consent from users — although, in reality, the company would likely still offer users the option to retrieve their data. Independently, users in the EU could also exercise their rights to data portability and erasure15 to retrieve or delete their data.

In practice, however, the enforcement and realisation of these rights is challenging. Given that user data are commingled across the Facebook group of companies, and moreover have ‘velocity’ — an individual user’s data will likely have been repurposed and reused multiple times, together with the data of other users — it is unlikely that all of the data relating to an individual user can or will be identified and permanently ‘returned’. Likewise, given that user data are commingled, objection by an individual user to the transfer of their data is unlikely to be effective — their data will still be transferred with the data of other users who consent to the transfer. As previously mentioned, the data portability function currently offered by Facebook is also limited in scope.

Notwithstanding these practical challenges, a broader problem with the existing legal framework governing user data is that it is almost entirely focused on the rights of individual users. It offers little recognition or protection for the rights of groups — for example, Facebook groups formed around sports, travel, music or other shared interests — and thus limited protection against group-level ethical harm within the Facebook platform (i.e., when the ethical patient is a multi-agent-system, not necessarily reducible to its individual parts [Floridi, 2012; Simon, 1995]).

This problem is further exacerbated by so-called ‘ad hoc groups’ (i.e., groups that are formed only algorithmically [Mittelstadt, 2017]), which may not necessarily correspond to any organic communities. For example, ‘dog owners living in Wales aged 38–40 that exercise regularly’ (Mittelstadt, 2017, p. 477) is a hypothetical, algorithmically formed group. Whereas many organically formed groups are already acknowledged by privacy and discrimination laws, or at least have the organisational means to defend their interests (e.g., people with a certain disability, sexual orientation etc.), ad hoc algorithmic groups often lack organisational means of resistance.
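
Mittelstadt’s ‘ad hoc group’ can be made concrete with a few lines of code: the group below exists only as the output of a query over invented profile and behavioural records, with no roster, shared identity or organisational form through which its members could contest how they are treated.

from dataclasses import dataclass

@dataclass
class UserRecord:
    user_id: int
    age: int
    region: str
    owns_dog: bool
    weekly_exercise_sessions: int

# Invented records standing in for profile and behavioural data.
users = [
    UserRecord(1, 39, "Wales", True, 4),
    UserRecord(2, 52, "Wales", True, 1),
    UserRecord(3, 38, "Scotland", True, 5),
    UserRecord(4, 40, "Wales", True, 3),
]

# The ad hoc group: dog owners in Wales aged 38-40 who exercise regularly.
# Nobody ever joined it; it exists only as the result of this filter.
ad_hoc_group = [
    u for u in users
    if u.region == "Wales"
    and 38 <= u.age <= 40
    and u.owns_dog
    and u.weekly_exercise_sessions >= 3
]

print([u.user_id for u in ad_hoc_group])  # -> [1, 4]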

Non-users

The third key ethical stakeholders are those who never, or no longer, use Facebook, yet are still susceptible to harms resulting from its demise. This category includes a range of disparate sub-groups, including individuals who do not have an account, but whose data Facebook nevertheless collects and tracks from apps or websites that embed its services (Hern, 2018). Facebook uses these data, inter alia, to target the individual with ads encouraging them to join the platform (Baser, 2018). Similarly, the non-user category includes individuals who may be tracked by proxy, for example by analysing data from their relatives or close network (more on this below). A third sub-group is minors who may feature in photos and other types of data uploaded to Facebook by their parents (so-called “sharenting”).

The most significant category of non-users, however, consists of deceased users, i.e., those who have used the platform in the past but have since passed away. Although this may currently seem a rather niche concern, the deceased user group is expected to grow rapidly over the next couple of decades. As shown by Öhman and Watson (2019), Facebook will soon host hundreds of millions of deceased profiles on its servers.16 This sub-group is of special interest since, unlike living non-users who generally enjoy at least some legal rights to privacy and data protection (as outlined above), the deceased do not qualify for protection under existing data protection laws.17 The lack of protection for deceased data subjects is a pressing concern even without Facebook closing.18 Facebook does not have any legal obligation to seek their consent (nor that of their representatives) before deleting, or otherwise further processing, users’ data after death (although Denmark, Spain and Italy are exceptions).19 Moreover, even if Facebook tried to seek the consent of their representatives, it would have a difficult time given that users do not always appoint a ‘legacy contact’ to represent them posthumously.
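
A deliberately crude projection helps convey the scale involved. The figures below are invented and the calculation is far simpler than Öhman and Watson’s demographic model, but it shows why, even at a modest mortality rate, the cumulative number of deceased profiles climbs into the hundreds of millions within a few decades.

# Toy projection of cumulative deceased profiles; all figures are assumptions,
# not Öhman and Watson's (2019) model.
active_users = 2.6e9            # starting user base
annual_mortality_rate = 0.004   # assumed crude death rate among users
annual_net_growth = 0.01        # assumed net growth of the living user base

deceased_total = 0.0
for year in range(2021, 2061):
    deaths = active_users * annual_mortality_rate
    deceased_total += deaths
    active_users = (active_users - deaths) * (1 + annual_net_growth)
    if year % 10 == 0:
        print(f"{year}: ~{deceased_total / 1e6:.0f} million deceased profiles")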

The closure of the platform, however, opens an entirely new level of ethical harm, particularly in the (unlikely but not impossible) case of bankruptcy or insolvency. Such a scenario would likely force Facebook to sell off its assets to the highest bidder. However, unlike the sale or transfer of data of living users, which under the GDPR and EU insolvency law requires users’ informed consent, there is no corresponding protection for the sale of deceased users’ data in insolvency, such as requiring the consent of their next of kin.20 Moreover, there are no limitations on who could purchase these data and for what purposes. For example, a deceased person’s adversaries could acquire their Facebook data in order to compromise their privacy or tarnish their reputation posthumously. Incidents of this kind have already been reported on Twitter, where the profiles of deceased celebrities have been hacked and used to spread propaganda.21 The profiles of deceased users may also remain commercially valuable and attractive to third party purchasers — for instance, by providing insights on living associates of the deceased, such as their friends and relatives. As in genealogy — where one individual’s DNA also contains information about their children, siblings and parents — one person’s data may similarly be used to predict another’s behaviour or dispositions (see Creet [2019] on the relationship between genealogy websites and big pharma).

In sum, the demise of a platform with Facebook’s global and societal significance is not only a concern for those who use, or have used it directly, but also for individuals who are indirectly affected by its omnipresence in society.

Future generations

It is also important to consider indirect harms arising from Facebook’s potential closure due to missed opportunities. The most important stakeholders to consider in this respect are future generations, which, much like deceased users, are seldom directly protected in law. By ‘future generations’ we refer mainly to future historians and sociologists studying the origins and dynamics of digital society, but also to the general public and their ability to access their shared digital cultural heritage.

It is widely accepted that the open web holds great cultural and historical value (Rosenzweig, 2003), and thus several organisations — perhaps most notably the Internet Archive’s Wayback Machine22 — as well as researchers (Brügger and Schroeder, 2017) are working to preserve it. Personal data, however, have received less attention. Although (most) individual user data may be relatively inconsequential for historical, scientific and cultural purposes, the aggregate Facebook data archive amounts to a digital artefact of considerable significance. The personal digital heritage of each Facebook user is, or will become, part of our shared cultural digital heritage (Cameron and Kenderdine, 2007). As Varnado writes:

Many people save various things in digital format, and if they fail to alert others of and provide access to those things, certain memories and stories of their lives could be lost forever. This is a loss not only for a descendant’s legacy and successors but also for society as a whole. […] This is especially true of social networking accounts, which may be the principal—and eventually only—source for future generations to learn about their predecessors (Varnado, 2014, p. 744)

Not only is Facebook becoming a significant digital cultural artefact, it is arguably the first such artefact to have truly global proportions. Indeed, Facebook is by far the largest archive of human behaviour in history. As such, it can legitimately be said to hold what Appiah (2006) calls ‘cosmopolitan value’ — that is, something that is significant enough to be part of the narrative of our species. Given its global reach, and thus its interest to all of humankind (present and future), this record can even be thought of as a form of future public good (Waters, 2002, p. 83), without which we risk falling into a ‘digital dark age’ (Kuny, 1998; Smit et al., 2011) — a state of ignorance of our digital past.

The concentration of digital cultural heritage in a single (privately controlled and corporate) platform is in and of itself problematic, especially in view of the risk of Facebook monopolising private and collective history (Öhman and Watson, 2019). These socio-political concerns are magnified in the context of the platform’s demise. For such a scenario poses a threat not only to the control or appraisal of digital cultural heritage, but also to its very existence — by fragmenting the archive, thus destroying its global significance, and/or by destroying it entirely due to a lack of commercial or other interest in preserving it.

These risks are most acute in an insolvency scenario, where, as discussed above, the data are more likely to be deleted or sold to third parties, including by being split up among a number of different data controllers. Although such an outcome may be viewed as a positive development in terms of decentralising Facebook’s power (Öhman and Watson, 2019), it also risks dividing and therefore diluting the global heritage and cosmopolitan value held within the platform. Worse still would be a scenario in which cosmopolitan value is destroyed due to a lack of, or divergent, commercial interests in purchasing Facebook’s data archives, or indeed the inability to put a price on these data due to the absence of agreed upon accounting rules over a company’s (big) data assets (Lyford-Smith, 2017). The recent auction of Cambridge Analytica’s assets in administration, where the highest bid received for the company’s business and intellectual property rights (assumed to include the personal data of Facebook users) was a mere £1, is a sobering illustration of these challenges.23 

However, our concerns are not limited to an insolvency scenario. In the more plausible scenario of Facebook closing the shutters on one of its products, such as the main platform website and app, the archive assembled by the product would no longer be accessible as such to either the public or future generations, even though the data and insights would likely continue to exist and be utilised within the Facebook Inc. group of companies (inter alia, to provide insights on users of other products such as Instagram and Messenger).

Recommendations

The stakeholders presented above, and the harms to which they are exposed, occupy the ethical landscape in which legal and policy measures to manage Facebook’s closure must be shaped. Although it is premature to propose definitive solutions, in this section we offer four broad recommendations for future policy and research in this area. These recommendations are by no means intended to be coherent solutions to “the” problem of big tech closure, but rather are posed as a starting point for further debate.

Develop a regulatory framework for Systemically Important Technological Institutions.

As examined earlier, many societies around the world have become ever-more dependent on digital communication and commerce through Big Tech platforms such as Facebook and would be harmed by their (disorderly) demise. Consider, for instance, the implications of a sudden breakdown of these platforms in times of crisis like the COVID-19 pandemic. As such, there are compelling reasons to regulate these platforms as systemically important institutions. By way of analogy to the SIFI concept — that is, domestic or global financial institutions and financial market infrastructures whose failure is anticipated to have adverse consequences for the rest of the financial system and the wider economy (FSB, 2014) — we thus propose that a new concept of systemically important technological institution, or ‘SITI’, be given more serious consideration. 

The regulatory framework for SITIs should draw on existing approaches to regulating SIFIs, critical national infrastructures and public utilities, respectively. In the insolvency context, drawing upon best practices for SIFI resolution, the SITI regime could include measures to fast-track insolvency proceedings in order to facilitate the orderly wind-down or reorganisation of a failing SITI in a way that minimises disruption to the (essential) services that it provides, thus mitigating harm to dependent communities. This might include resolution powers vested in a regulatory body authorised to supervise SITIs (this could be an existing body, such as the national competition or consumer protection/trade agency, or a newly established ‘Tech’ regulator) — including the power to mandate a SITI, such as Facebook, to continue to provide ‘essential services’ to dependent communities — for example, access to user groups or messaging apps — or else facilitate the transfer of these services to an alternative provider. 

In this way, SITIs would be subject to public obligations similar to those imposed on regulated public utilities, such as water and electricity companies — as “private companies that control infrastructural goods” (Rahman, 2018) — in order to prevent harm to dependent communities.24 Likewise, the SITI regime should include obligations for failure planning (by way of analogy to ‘resolution and recovery planning’ under the SIFI regime). In the EU, this regime should also build on the regulatory framework for ‘essential services’, specifically essential ‘digital service providers’, under the EU NIS (Network and Information Systems) Directive,25 which focuses on managing and mitigating cyber security risks to critical national infrastructures.

Whilst the fine print of the SITI regulatory regime requires further deliberation — indeed, the analogy with SIFIs and public utilities has evident limitations — we hope this article will help incite discussions to that end.

Strengthen the legal mechanisms for users to control their own data in cases of platform insolvency or closure.

Existing data protection laws are insufficient to protect Facebook users from the ethical harms that could arise from the handling of their data in the event of the platform’s closure. As we have highlighted, the nature of ‘Big Data’ is such that even if users object to the deletion or sale of their data, and request their return, Facebook would be unable as a practical matter to fully satisfy that request. As a result, users face ethical harm where their data is used against their will, in ways that could undermine their privacy, dignity and self-identity.

This calls for new data protection mechanisms that give Facebook users better control over their data. Potential solutions include creating new regulatory obligations for data controllers to segregate user data, in particular as between different Facebook subsidiaries (e.g., the main platform and Instagram), where data are currently commingled.26 This would allow users to more effectively retrieve their data were Facebook to shut down and could offer a more effective way of protecting the interests of ad hoc ‘algorithmic’ groups (Mittelstadt, 2017). However, to the extent that segregating data in this way undermines the economies of scale that facilitate Big Data analysis, it could have the unintended effect of reducing the benefits that users gain from the Facebook platform, inter alia through personalised recommendations. 

Additionally, or alternatively, further consideration should be given to the concept of ‘data trusts’, as a bottom-up form of data governance and control by users (Delacroix & Lawrence, 2019). Under a data trust structure, Facebook would act as a trustee for user data, holding them on trust for the user(s) — as the settlor(s) and beneficiary(ies) of the trust — and managing and sharing the data in accordance with their instructions. Moreover, a plurality of trusts can be developed, for example, designed around specified groups of aggregated data (in order to leverage the economies of scope and scale of large, combined data sets). As a trustee, Facebook would be subject to a fiduciary duty to only use the data in ways that serve the best interests of the user (see further Balkin, 2016). As such, a data trust structure could provide a stronger legal mechanism for safeguarding the wishes of users with respect to their data as compared to the existing standard of ‘informed consent’. Another possible solution involves decentralising the ownership and control of user data, for example using distributed ledger technology.27 

Strengthen legal protection for the data and privacy of deceased users.

Although the interests of non-users as a group need to be given serious consideration, we highlight the privacy of deceased users as an area in particular need of protection. We recommend that more countries follow the lead of Denmark in implementing legislation that, at least to some degree, protects the profiles of deceased users from being arbitrarily sold, mined and disseminated in the case of Facebook’s closure.28 Such legislation could follow several different models. Perhaps the most intuitive option is to simply enshrine the privacy rights of deceased users in data protection law, such as (in the EU) the GDPR. This can either be designed as a personal (but time-limited) right (as in Denmark), or a right bestowed upon next of kin (as in Spain and Italy). It could also be shaped by extending copyright law protection (Harbinja, 2017) or take place within what Harbinja (2013, p. 20) calls a ‘human rights-based regime’ (see also Bergtora Sandvik, 2020), i.e., as a universal and inviolable right. Alternatively, it could be achieved by designating companies such as Facebook as ‘information fiduciaries’ (Balkin, 2016), pursuant to which they have a duty of care to act in the best interests of users with respect to their data, including posthumously.

The risk of ethical harm to deceased users or customers in the event of corporate demise is not limited to the closure of Facebook, or Big Tech (platforms). Although Facebook will likely be the single largest holder of deceased profiles in the 21st century, other social networks (LinkedIn, WeChat, YouTube etc.) are also likely to host hundreds of millions of deceased profiles within only a few decades. And as more sectors of the economy become digitised, any company holding customer data will eventually hold a large volume of data relating to deceased subjects. As such, developing more robust legal protection for the data privacy rights of the deceased is important for mitigating the ethical harms due to corporate demise, broadly defined. 

However, for obvious reasons, deceased data subjects have little political influence, and are thus unlikely to become a top priority to policy makers. Moreover, any legislative measures to protect their privacy are likely to be adopted at national or regional levels first, although the problem inevitably remains global in nature. A satisfactory legislative response may therefore take significant time and political effort to develop. Facebook should therefore be encouraged to specify how they intend to handle deceased users’ data upon closure in their terms of service, and in particular commit not to sell those data to a third party where this would not be in the best interests of said users. While this private approach may not have the same effectiveness and general applicability as national or regional legislation protecting deceased user data, it would provide an important first step.

Create stronger incentives for Facebook to share insights and preserve historically significant data for future generations.

Future generations cannot directly safeguard their interests and thus it is incumbent on us to do so. Given the societal, historical and cultural interest in preserving, or at least averting the complete destruction of Facebook’s cultural heritage, stronger incentives need to be created for Facebook to take responsibility and begin acknowledging the global historical value of its data archives.

A promising strategy would be to protect Facebook’s archive as a site of digital global heritage, drawing inspiration from the protection of physical sites of global cultural heritage, such as through UNESCO World Heritage protected status.29 Pursuant to Article 6.1 of the Convention Concerning the Protection of World Cultural and Natural Heritage (UNESCO, 1972), state parties acknowledge that, while respecting the sovereignty of the state territory, their national heritage may also constitute world heritage, which falls within the interests and duties of the ‘international community’ to preserve. Meanwhile, Article 4 stipulates that:

Each State Party to this Convention recognizes that the duty of ensuring the identification, protection, conservation, presentation and transmission to future generations of the cultural and natural heritage […] situated on its territory, belongs primarily to that State. It will do all it can to this end, to the utmost of its own resources and, where appropriate, with any international assistance and co-operation, in particular, financial, artistic, scientific and technical, which it may be able to obtain. (UNESCO, 1972, Art. 4)

A digital version of this label may similarly entail acknowledgement by data controllers of, and a pledge to preserve, the cosmopolitan value of their data archive, while allowing them to continue using the archive. However, in contrast to physical sites and material artefacts, which fall under the control of sovereign states, the most significant digital artefacts in today’s world are under the control of Big Tech companies, like Facebook. As such, there is reason to consider a new international agreement between corporate entities, in which they pledge to protect and conserve the global cultural heritage on their platforms.30

However, bestowing the label of global digital heritage does not resolve the question of access to this heritage. Unlike Twitter, which in 2010 attempted to donate its entire archive to the Library of Congress,31 Facebook’s archive arguably contains more sensitive, personal information about its users. Moreover, these data offer the company more of a competitive advantage compared to Twitter (the latter’s user accounts are public, in contrast to Facebook, where many of the profiles are visible only to friends of the user). These considerations could reduce Facebook’s readiness to grant public access to its archives. Nevertheless, safeguarding the existence of Facebook’s records and its historical significance remains an important first step in making it accessible to future generations.

It goes without saying that the interests of future generations will at times conflict with the interests of the other three ethical stakeholders we have identified. As Mazzone (2012, p. 1660) points out, ‘the societal interest in preserving postings to social networking sites for future historical study can be in tension with the privacy interests of individual users.’ Indeed, Facebook’s data are proprietary, and any interventions must respect its rights in the data as well as the privacy rights of users. Yet, the mere fact that there are conflicts of interests and complexities does not mean that the interests of future generations ought to be neglected altogether.

Conclusion

For the foreseeable future, Facebook’s demise remains a low-probability but high-impact event. However, mapping out the legal and ethical landscape for such an eventuality, as we have done in this article, allows society to better manage the fallout should this scenario materialise. Moreover, our analysis helps to shed light on lower-impact but higher-probability scenarios. Companies regularly fail and disappear — increasingly taking with them troves of customer-user data that receive only limited protection and attention under existing law. The legal and ethical harms that we have identified in this article, many of which flow from the use of data following Facebook’s closure, are thus equally relevant to the closure of other companies, albeit on a smaller scale. Regardless of which data-rich company is the next to go, we must make sure that an adequate governance framework is in place to minimise the systemic and individual damage. Our hope is that this article will help kickstart a debate and further research on these important issues.

Acknowledgements

We are deeply grateful to Luciano Floridi, David Watson, Josh Cowls, Robert Gorwa, Tim R Samples, and Horst Eidenmüller for valuable feedback and input. We would also like to add a special thanks to reviewers James Meese and Steph Hill, and editors Frédéric Dubois and Kris Erickson for encouraging us to further improve this manuscript.

References

Acker, A., & Brubaker, J. R. (2014). Death, memorialization, and social media: A platform perspective for personal archives. Archivaria, 77, 2–23. https://archivaria.ca/index.php/archivaria/article/view/13469

Aguilar, A. (2015). The global economic impact of Facebook: Helping to unlock new opportunities [Report]. Deloitte. https://www2.deloitte.com/uk/en/pages/technology-media-and-telecommunications/articles/the-global-economic-impact-of-facebook.html

Aplin, T., Bentley, L., Johnson, P., & Malynicz, S. (2012). Gurry on breach of confidence: The protection of confidential information. Oxford University Press.

Appiah, K. A. (2006). Cosmopolitanism: Ethics in a world of strangers. Penguin.

Balkin, J. (2016). Information fiduciaries and the first amendment. UC Davis Law Review, 49(4), 1183–1234. https://lawreview.law.ucdavis.edu/issues/49/4/Lecture/49-4_Balkin.pdf

Baser, D. (2018, April 16). Hard questions: What data does Facebook collect when I’m not using Facebook, and why? [Blog post]. Facebook Newsroom. https://newsroom.fb.com/news/2018/04/data-off-facebook/

Bergtora Sandvik, K. (2020). Digital dead body management (DDBM): Time to think it through. Journal of Human Rights Practice, uaa002. https://doi.org/10.1093/jhuman/huaa002

boyd, d. (2013). White flight in networked publics? How race and class shaped American teen engagement with MySpace and Facebook. In L. Nakamura & P. Chow-White (Eds.), Race after the internet.

Cadwalladr, C., & Graham-Harrison, E. (2018, March 17). Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. The Guardian. https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election

Cannarella, J., & Spechler, J. (2014). Epidemiological modelling of online social network dynamics. arXiv. https://arxiv.org/pdf/1401.4208.pdf

Competition & Markets Authority. (2020). Online Platforms and Digital Advertising (Market Study) [Final report]. Competition & Markets Authority. https://assets.publishing.service.gov.uk/media/5efc57ed3a6f4023d242ed56/Final_report_1_July_2020_.pdf

Creet, J. (2019). Data mining the deceased: Ancestry and the business of family [Documentary]. https://juliacreet.vhx.tv/

DataReportal. (2019). Global digital overview. https://datareportal.com/?utm_source=Statista&utm_medium=Data_Citation_Hyperlink&utm_campaign=Data_Partners&utm_content=Statista_Data_Citation

Delacroix, S., & Lawrence, N. D. (2019). Disturbing the ‘One size fits all’ approach to data governance: Bottom-up. International Data Privacy Law, 9(4), 236–252. https://doi.org/10.1093/idpl/ipz014

Di Cosmo, R., & Zacchiroli, S. (2017). Software heritage: Why and how to preserve software source code. iPRES 2017 – 14th international conference on digital preservation. 1–10.

Cameron, F., & Kenderdine, S. (Eds.). (2007). Theorizing digital cultural heritage: A critical discourse. MIT Press.

Facebook. (2017). Form 10-K annual report for the fiscal period ended December 31, 2017.

Facebook. (2018). Form 10-K annual report for the fiscal period ended December 31, 2018.

Facebook. (2019, June 18). Coming in 2020: Calibra [Blog post]. Facebook Newsroom. https://about.fb.com/news/2019/06/coming-in-2020-calibra/

Facebook. (2020). Form 10-Q quarterly report for the quarterly period ended March 31, 2020.

Federal Trade Commission. (2019, July 24). FTC Imposes $5 Billion Penalty and Sweeping New Privacy Restrictions on Facebook [Press Release]. News & Events. https://www.ftc.gov/news-events/press-releases/2019/07/ftc-imposes-5-billion-penalty-sweeping-new-privacy-restrictions

Financial Stability Board. (2014). Key attributes of effective resolution regimes for financial institutions. https://www.fsb.org/wp-content/uploads/r_141015.pdf

Floridi, L. (2011). The informational nature of personal identity. Minds and Machines, 21(4), 549–566. https://doi.org/10.1007/s11023-011-9259-6

Floridi, L. (2012). Distributed morality in an information society. Science and Engineering Ethics, 19(3), 727–743. https://doi.org/10.1007/s11948-012-9413-4

Furman, J. (2019). Unlocking digital competition [Report]. Digital Competition Expert Panel. https://www.gov.uk/government/publications/unlocking-digital-competition-report-of-the-digital-competition-expert-panel

Garcia, D., Mavrodiev, P., & Schweitzer, F. (2013). Social resilience in online communities: The autopsy of Friendster. Proceedings of the First ACM Conference on Online Social Networks (COSN ’13). https://doi.org/10.1145/2512938.2512946.

Gorwa, R. (2019). What is platform governance? Information, Communication & Society, 22(6), 854–871. https://doi.org/10.1080/1369118X.2019.1573914

Harbinja, E. (2013). Does the EU data protection regime protect post-mortem privacy and what could be the potential alternatives? Scripted, 10(1). https://doi.org/10.2966/scrip.100113.19

Harbinja, E. (2014). Virtual worlds—A legal post-mortem account. Scripted, 11(3). https://doi.org/10.2966/scrip.110314.273

Harbinja, E. (2017). Post-mortem privacy 2.0: Theory, law, and technology. International Review of Law, Computers & Technology, 31(1), 26–42. https://doi.org/10.1080/13600869.2017.1275116

Howard, P. N., & Hussain, M. M. (2013). Democracy’s fourth wave? Digital media and the arab spring. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199936953.001.0001

Information Commissioner’s Office. (2019, October). Statement on an agreement reached between Facebook and the ICO [Statement]. News and Events. https://ico.org.uk/about-the-ico/news-and-events/news-and-blogs/2019/10/statement-on-an-agreement-reached-between-facebook-and-the-ico

Kittikhoun, A. (2019). Mapping the extent of Facebook’s role in the online media landscape of Laos [Master’s dissertation.]. University of Oxford, Oxford Internet Institute.

Kuny, T. (1998). A digital dark ages? Challenges in the preservation of electronic information. International Preservation News, 17(May), 8–13.

Lyford-Smith, D. (2017). Data as an asset. ICAEW. https://www.icaew.com/technical/technology/data/data-analytics-and-big-data/data-analytics-articles/data-as-an-asset

Marcus, D. (2020, May). Welcome to Novi [Blog post]. Facebook Newsroom. https://about.fb.com/news/2020/05/welcome-to-novi/

Mazzone, J. (2012). Facebook’s afterlife. North Carolina Law Review, 90(5), 1643–1685.

Mirani, L. (2015). Millions of Facebook users have no idea they’re using the internet. Quartz. https://qz.com/333313/milliions-of-facebook-users-have-no-idea-theyre-using-the-internet/

MIT Technology Review. (2013). An autopsy of a dead social network. https://www.technologyreview.com/s/511846/an-autopsy-of-a-dead-social-network/

Mittelstadt, B. (2017). From Individual to Group Privacy in Big Data Analytics. Philos. Technol, 30, 475–494. https://doi.org/10.1007/s13347-017-0253-7

Brügger, N., & Schroeder, R. (Eds.). (2017). The web as history: Using web archives to understand the past and the present. UCL Press.

Öhman, C., & Floridi, L. (2018). An ethical framework for the digital afterlife industry. Nature Human Behaviour. https://doi.org/10.1038/s41562-018-0335-2

Öhman, C. J., & Watson, D. (2019). Are the dead taking over Facebook? A Big Data approach to the future of death online. Big Data & Society, 6(1), 205395171984254. https://doi.org/10.1177/2053951719842540

Open Data Institute. (2018, July 10). What is a Data Trust? [Blog post]. Knowledge & opinion blog. https://theodi.org/article/what-is-a-data-trust/#1527168424801-0db7e063-ed2a62d2-2d92

Piper Sandler. (2020). Taking stock with teens, spring 2020 survey. Piper Sandler. http://www.pipersandler.com/3col.aspx?id=5956

Piskorski, M. J., & Knoop, C.-I. (2006). Friendster (A) [Case Study]. Harvard Business Review.

Rahman, K. S. (2018). The new utilities: Private power, social infrastructure, and the revival of the public utility concept. Cardozo Law Review, 39(5), 1621–1689. http://cardozolawreview.com/wp-content/uploads/2018/07/RAHMAN.39.5.2.pdf

Rosenzweig, R. (2003). Scarcity or abundance? Preserving the past in a digital era. The American Historical Review, 108(3), 735–762. https://doi.org/10.1086/ahr/108.3.735

Scarre, G. (2013). Privacy and the dead. Philosophy in the Contemporary World, 19(1), 1–16.

Seki, K., & Nakamura, M. (2016). The collapse of the Friendster network started from the center of the core. 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 477–484. https://doi.org/10.1109/ASONAM.2016.7752278

Simon, T. W. (1995). Group harm. Journal of Social Philosophy, 26(3), 123–138. https://doi.org/10.1111/j.1467-9833.1995.tb00089.x

Smit, E., Hoeven, J., & Giaretta, D. (2011). Avoiding a digital dark age for data: Why publishers should care about digital preservation. Learned Publishing, 24(1), 35–49. https://doi.org/10.1087/20110107

Stokes, P. (2015). Deletion as second death: The moral status of digital remains. Ethics and Information Technology, 17(4), 1–12. https://doi.org/10.1007/s10676-015-9379-4

Taylor, J. S. (2005). The myth of posthumous harm. American Philosophical Quarterly, 42(4), 311–322. https://www.jstor.org/stable/20010214

Tencent. (2019). Q2 earnings release and interim results for the period ended June 30, 2019.

Thacker, D. (2018, December 10). Expediting Changes to Google+ [Blog post]. Google. https://blog.google/technology/safety-security/expediting-changes-google-plus/

Torkjazi, M., Rejaie, R., & Willinger, W. (2009). Hot today, gone tomorrow: On the migration of MySpace users. Proceedings of the 2nd ACM Workshop on Online Social Networks - WOSN ’09, 43. https://doi.org/10.1145/1592665.1592676

U. K. Government. (2019). Online harms [White Paper]. U.K. Government, Department for Digital, Culture, Media & Sport; Home Department. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/793360/Online_Harms_White_Paper.pdf

UNESCO. (1972). Convention concerning the protection of the world cultural and natural heritage. Adopted by the General Conference at its seventeenth session, Paris, 16 November 1972.

Varnado, A. S. S. (2014). Your digital footprint left behind at death: An illustration of technology leaving the law behind. Louisiana Law Review, 74(3), 719–775. https://digitalcommons.law.lsu.edu/lalrev/vol74/iss3/7

Warren, E. (2019). Here’s How We Can Break Up Big Tech [Medium Post]. Team Warren. https://medium.com/@teamwarren/heres-how-we-can-break-up-big-tech-9ad9e0da324c

Waters, D. (2002). Good archives make good scholars: Reflections on recent steps toward the archiving of digital information. In The state of digital preservation: An international perspective (pp. 78–95). Council on Library and Information Resources. https://www.clir.org/pubs/reports/pub107/waters/

York, C., & Turcotte, J. (2015). Vacationing from facebook: Adoption, temporary discontinuance, and readoption of an innovation. Communication Research Reports, 32(1), 54–62. https://doi.org/10.1080/08824096.2014.989975

Zuckerberg, M. (2019, March 6). A privacy-focused vision for social networking [Post]. https://www.facebook.com/notes/mark-zuckerberg/a-privacy-focused-vision-for-social-networking/10156700570096634/

Footnotes

1. Unless otherwise stated, references to ‘Facebook’ are to the main platform (comprising News Feed, Groups and Pages, inter alia, both on the mobile app as well as the website), and do not include the wider group of companies that comprise Facebook Inc, namely WhatsApp, Messenger, Instagram, Oculus (Facebook, 2018), and Calibra (recently rebranded as Novi Financial) (Marcus, 2019; 2020).

2. See https://www.washingtonpost.com/news/the-intersect/wp/2015/02/12/8-throwback-sites-you-thought-died-in-2005-but-are-actually-still-around/

3. See https://qz.com/1408120/yahoo-japan-is-shutting-down-its-website-hosting-service-geocities/

4. Regulation (EU) 2016/679 < https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2016.119.01.0001.01.ENG>.

5. California Legislature Assembly Bill No. 375 <https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=201720180AB375>

6. See <https://www.politico.com/news/2020/07/06/trump-parler-rules-349434>

7. See < https://www.nytimes.com/2020/06/29/business/dealbook/facebook-boycott-ads.html>.

8. We adopt an inclusive definition of ethical harm (henceforth just ‘harm’) as any encroachment upon personal or collective and legitimate interests such as dignity, privacy, personal welfare, and freedom.  

9. Naturally, not all communities with a Facebook presence can be included in this category. For example, the lost marketing opportunities for large multinational corporations such as Coca Cola Inc., due to the sudden demise of Facebook, cannot be equated with the harm to a small-scale collective of sole traders in a remote area (e.g., a local craft or farmers’ market) whose only exposure to customers is through the platform. By ‘dependent communities’ we thus refer only to communities whose ability to flourish and survive may be threatened by Facebook’s sudden demise.

10. See https://info.internet.org/en/impact/

11. See https://help.yahoo.com/kb/understand-data-downloaded-yahoo-groups-sln35066.html

12. See Art 20 GDPR. 

13. See Art 4(2) GDPR (defining ‘processing’ to include, inter alia, ‘erasure or destruction’ of personal data).

14. See Google Help, (2019) ‘Shutting down Google+ for consumer (personal) accounts on April 2, 2019’ https://support.google.com/plus/answer/9195133?hl=en-GB. Facebook states in its data policy that ‘We store data until it is no longer necessary to provide our services and Facebook Products or until your account is deleted — whichever comes first’, which might suggest that users provide their consent to future deletion of their data when they first sign up to Facebook. However, it is unlikely that this clause substitutes for the requirement to obtain specific and unambiguous consent to data processing, for specific purposes — including deletion of data — under the GDPR (see Articles 4(11) and 6(1)(a)).

15. See Art 17 GDPR.

16. Facebook’s policy on deceased users has changed somewhat over the years, but the current approach is to allow next of kin to either memorialise or permanently delete the account of a confirmed deceased user (Facebook, n.d.). Users are also encouraged to select a ‘legacy contact’, that is, a second Facebook user who will act as a custodian in the event of their demise. Although these technical solutions have proven to be successful on an individual, short-term level, several long-term problems remain unsolved. In particular, what happens when the legacy contact themselves dies? For how long will it be economically viable to store hundreds of millions of deceased profiles on the servers?

17. However, note that the information of a deceased subject can continue to be protected by the right to privacy under Art 8 of the European Convention on Human Rights, and the common law of confidence with respect to confidential personal information (although the latter is unlikely to apply to data processing by Facebook) (see generally Aplin et al., 2012).

18. Several philosophers and legal scholars have recently argued for the concept of posthumous privacy to be recognised (see Scarre [2014, p. 1], Stokes [2015] and Öhman & Floridi [2018]). 

19. Recital 27 of the GDPR clearly states that ‘[t]his Regulation does not apply to the personal data of deceased persons’, however does at the same time allow member states to make additional provision for this purpose. Accordingly, a few European countries have included privacy rights for deceased data subjects in their implementing laws (for instance, Denmark, Spain and Italy — see https://www.twobirds.com/en/in-focus/general-data-protection-regulation/gdpr-tracker/deceased-persons.) However, aside from these limited cases, existing data protection for the deceased is alarmingly sparse across the world. 

20. Under EU insolvency law, any processing of personal data (for example, deletion, sale or transfer of the data to a third party purchaser) must comply with the GDPR (See Art 78 (Data Protection) of EU Regulation 2015/848 on Insolvency Proceedings (recast). However, see endnote 17 with regard to the right to privacy and confidentiality.

21. See https://www.alaraby.co.uk/english/indepth/2019/2/25/saudi-trolls-hacking-dead-peoples-twitter-to-spread-propaganda

22. See https://archive.org/web/

23. See Administrator’s Progress Report (2018) https://beta.companieshouse.gov.uk/company/09375920/filing-history. However, consumer data (for example, in the form of customer loyalty schemes) has been valued more highly in other corporate insolvencies (see for example, the Chapter 11 reorganisation of the Caesar’s Entertainment Group https://digital.hbs.edu/platform-digit/submission/caesars-entertainment-what-happens-in-vegas-ends-up-in-a-1billion-database/).

24. There is a broader call, from a competition (antitrust) policy perspective, to regulate Big Tech platforms as utilities on the basis that these platforms tend towards natural monopoly (see, e.g. Warren, 2019). Relatedly, the UK Competition and Markets Authority has recommended a new ‘pro-competition regulatory regime’ for digital platforms, such as Google and Facebook, that have ‘strategic market status’ (Furman, 2019; CMA, 2020). The measures proposed under this regime — such as facilitating interoperability between social media platforms— would also help to mitigate the potential harms to Facebook’s ethical stakeholders due to its closure.

25. Directive (EU) 2016/1148 of the European Parliament and of the Council of 6 July 2016 concerning measures for a high common level of security of network and information systems across the Union OJ L 194, 19.7.2016.

26. Facebook has stated that financial data collected by Calibra/Novi, the digital wallet for Libra cryptocurrency, will not be shared with Facebook or third parties without user consent (Facebook 2019b). The segregation of user data is the subject of a ruling by the German Competition Authority, however this was overturned on appeal by Facebook (and is now being appealed by the competition authority — the original decision is here: https://www.bundeskartellamt.de/SharedDocs/Meldung/EN/Pressemitteilungen/2019/07_02_2019_Facebook.html).

27. A related imperative is to clarify the financial accounting rules for the valuation of (Big) data assets, including in an insolvency context.

28. See s 2(5) of the Danish Data Protection Act 2018 <https://www.datatilsynet.dk/media/7753/danish-data-protection-act.pdf>

29. UNESCO has previously initiated a project to preserve source code (see Di Cosmo R and Zacchiroli, 2017).

30. This could be formal or informal, for example in the vein of the ‘Giving Pledge’ — a philanthropic initiative to encourage billionaires to give away the majority of their wealth in their lifetimes (see < https://givingpledge.org/>).

31. Although the initiative has ceased to operate as originally planned, it remains one of the best examples of large scale social media archiving (see https://www.npr.org/sections/thetwo-way/2017/12/26/573609499/library-of-congress-will-no-longer-archive-every-tweet). 

Too big to fail us? Platforms as systemically relevant


During the current COVID-19 crisis, we can see that we increasingly depend on digital platforms to satisfy our basic needs. Platforms like Google, Facebook, Uber, and Amazon not only provide central communication channels, server capacity, and information, but also offer mobility infrastructure, deliver food, and supply vital medicines. Platforms have become essential infrastructures and key players in our society. They are both systemically relevant and too big to fail.

Becoming essential infrastructures of connection has been good business for platform companies, as digital markets have turned into oligopolies that allow for rent extraction and self-sustaining growth – at low marginal cost. Meanwhile, for individuals, platform infrastructure has come to be taken for granted. Just as we stop paying attention to the affordances of roads or the electricity grid, we no longer wonder about browsers, internet exchange points, or app stores. Yet in crises, when infrastructures fail (Graham, 2009), we do notice them.

The same goes for professions such as nurses, truck drivers, or salespeople. They have become essential workers during this COVID-19 crisis, or, to be more precise, their centrality is now more visible. German public agencies have explicitly classified these workers as part of KRITIS, short for "critical infrastructure". Of course, some of these essential workers, like gig delivery couriers, are already part of platform infrastructures.

However, platforms are not only increasingly coordinating essential workers but are becoming systemically relevant themselves (Friederici, Meier, & Gümüsay, 2020). During the financial crisis of 2008, we learned that there are two central concerns when dealing with systemically relevant economic actors. One is how to keep systemically relevant platforms from collapsing (Öhman & Aggarwal, 2020). The second is how to prevent individual platforms from becoming "too big to fail" in the first place.

Of course, the idea that digital platforms need to be regulated is not new (Gorwa, 2019). At least since the "techlash", it has become something of a received wisdom that digital platforms are "big, anti-competitive, addictive, and destructive to democracy" (The Economist, 2018), as they engage in "harmful, extractive, and monopolistic business practices" (Nachtwey & Seidl, 2020, p. 2). But these legal debates – whether focusing on antitrust, labour protection, or tax issues – maintain a perspective that treats platforms as private entities separate from society and off limits for collective governance (see van Dijck, Poell, & de Waal, 2018; Kenney, Bearson, & Zysman, 2019).

We feel that the current crisis forces us to take platforms' far-reaching infrastructural character more seriously than the different prevailing national regulations have done until now. We argue that we can productively employ the idea of systemic relevance to determine platforms' policy relevance and appropriate responses. The pithy notion of banks becoming "too big to fail" led to Basel III, the most recent international framework for banking regulation. In the case of the financial sector, regulation was introduced to prevent systemic collapse with the help of a two-step procedure. First, Basel III defined which banks are systemically relevant at the global and national level, assessing five characteristics: size, cross-border activity, interconnectedness, substitutability, and complexity. Second, organisations meeting these criteria are more closely monitored and regulated. For example, they must hold additional capital buffers; as soon as the buffers fall below a certain level, automatic restrictions apply.

We can treat platforms similarly. COVID-19 is an example of a socio-economic shock triggered by a deadly infectious disease: ‘after’ this crisis will be ‘before’ the next crisis. The systemic centrality of platforms and their evolving role before, during, and after crisis should thus be taken into account in policy regulation. In particular, we can distinguish between pre-crisis and in-crisis regulation of systemically relevant platforms.

Pre-crisis regulation may ensure that private platforms do not become essential infrastructures to begin with, or that they are already regulated for public benefit as they become more essential. To prepare platform regulation for the next crisis, we need a Basel III for platforms. In contrast to banks, however, systemically relevant platforms would not be required to hold capital buffers but would have to commit to more transparency, for example by setting up API interfaces, disclosing algorithms, making filter decisions transparent, or sharing findings from the analysis of user data, including – and especially – with competitors.
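To make the two-step logic concrete, the short Python sketch below scores a hypothetical platform on five Basel-style characteristics and flags it for additional obligations once a threshold is crossed. The indicator names, values, equal weights and cut-off are purely illustrative assumptions made for this note; they are not part of Basel III or of any proposed platform regime.

    # Illustrative sketch only: indicators, weights and threshold are assumptions,
    # not the Basel III methodology or any proposed platform rule.

    WEIGHTS = {
        "size": 0.2,
        "cross_border_activity": 0.2,
        "interconnectedness": 0.2,
        "substitutability": 0.2,
        "complexity": 0.2,
    }
    THRESHOLD = 0.6  # assumed cut-off for "systemically relevant"

    def relevance_score(indicators):
        """Weighted sum of normalised indicator values in [0, 1]."""
        return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS)

    def classify(platform, indicators):
        score = relevance_score(indicators)
        if score >= THRESHOLD:
            # Step two: additional obligations (transparency, data sharing, audits).
            return f"{platform}: systemically relevant (score {score:.2f})"
        return f"{platform}: below threshold (score {score:.2f})"

    # Hypothetical example values for a fictitious platform.
    print(classify("ExamplePlatform", {
        "size": 0.9,
        "cross_border_activity": 0.8,
        "interconnectedness": 0.7,
        "substitutability": 0.9,
        "complexity": 0.6,
    }))

The point of the sketch is simply that classification (step one) and the attached obligations (step two) can be separated, mirroring the banking approach described above.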

In-crisis regulation can add stronger direct interventions in platform governance, making sure that platforms' centrality is not misused. For example, it would then be necessary to ensure that obvious misinformation, as in the case of COVID-19 and social media, can be effectively filtered, because it is a matter of life and death. Or, to use the example of Uber, supply and demand should no longer be matched according to purely market criteria but should also account for social or health needs, for example in emergencies, for trips to the doctor, or for persons with disabilities. Similarly, food delivery platforms can be subsidised and turned into universal service providers at city level, ensuring that elderly or poor people receive food during lockdowns.

Both before and during a crisis, then, platforms are not just too big to fail, but too big to fail us.

Note

All authors have equally contributed to this publication; authorship is in alphabetical order.

References

Friederici, N., Meier, P., & Gümüsay, A. A. (2020). An opportunity for inclusion? Digital platform innovation in times of crisis. Pioneers Post. https://www.pioneerspost.com/news-views/20200616/opportunity-inclusion-digital-platform-innovation-times-of-crisis

Graham, S. (Ed.). (2009). Disrupted cities: When infrastructure fails. Routledge.

Gorwa, R. (2019). The platform governance triangle: Conceptualising the informal regulation of online content. Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1407

Kenney, M., Bearson, D., & Zysman, J. (2019). The platform economy matures: Pervasive power, private regulation, and dependent entrepreneurs. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3497974

van Dijck, J., Poell, T., & de Waal, M. (2018). The platform society. Oxford University Press.

Nachtwey, O., & Seidl, T. (2020). The solutionist ethic and the spirit of digital capitalism. SocArXiv. https://doi.org/10.31235/osf.io/sgjzq

Magalhães, J. C., & Couldry, N. (2020). Tech giants are using this crisis to colonize the welfare system. Jacobin. https://www.jacobinmag.com/2020/04/tech-giants-coronavirus-pandemic-welfare-surveillance

Öhman, C., & Aggarwal, N. (2020). What if Facebook goes down? Ethical and legal considerations for the demise of big tech. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1488

The Economist. (2018, January 18). How to tame the tech titans. https://www.economist.com/leaders/2018/01/18/how-to-tame-the-tech-titans

Bank for International Settlements. (2017). Basel III: International regulatory framework for banks. https://www.bis.org/bcbs/basel3.htm

Transnational collective actions for cross-border data protection violations


This paper is part of Geopolitics, jurisdiction and surveillance, a special issue of Internet Policy Review guest-edited by Monique Mann and Angela Daly.

Introduction

The Cambridge Analytica/Facebook (hereinafter CA/FB) scandal revealed the level of surveillance we may be subject to during the time we spend online. The fact that a Facebook app was programmed to gather personal data from more than 87 million users’ profiles without their consent shows how crucial data gathering is for online platforms. The CA/FB case was shocking for two main reasons: first, because the collection of data was concealed within a quiz app able to access not only information in the profiles of the people who took the quiz but also information about their ‘friends’; and second, and most importantly, because Facebook CEO Mark Zuckerberg did not immediately notify the competent authorities of the unlawful data processing, even though he was aware that the data gathered had subsequently been sold to Cambridge Analytica. Instead, Facebook merely asked the profiling company to destroy the unlawfully obtained information, without verifying that it had done so (Messina, 2019). As a result, Cambridge Analytica used the data to profile users and target potential voters with personalised political messages during the 2016 US election (Cadwalladr, 2018; Granville, 2018).

We live in a data-driven economy where personal data can (consciously or not) be used as counter-performance for digital services. This process started, according to Zuboff (2019), in 2002 when society moved towards a form of ‘surveillance capitalism’, which is based on the instrumentalisation of human behaviour for the purposes of modification and monetisation.

This is reflected in the business model adopted by online platforms, which use data as fuel: without data on users’ activities no advertisement may be sold nor a new app developed. This is the reason why platform data gathering is increasing, both in terms of breadth and depth. According to Manokha (2018), user data gathered by Google include several types of information, such as: the user’s location (via smartphone use); search history across devices; apps and extensions used, including frequency of use and contacts; viewing history on other Google-owned services (such as YouTube); and bookmarks, emails and contacts when Google products are used. Moreover, when more devices are connected, each of them may provide additional data. The depth of data gathering flows from the ever-improving technical tools used by online platforms, which exploit algorithms and, more recently, artificial intelligence (AI) to process data and develop profiles and clusters of users.

However, this approach is not without limits and one of the most pervasive and detailed regulatory frameworks to safeguard users’ data is data protection. One of the pillars of this, both in the European Union and in the US, is user consent, which should be based on the data subject being aware of – and (in theory) understanding – the processing of their data and the potential consequences. 1

In the CA/FB case, data subjects were clearly not aware of the type and objectives of the processing, not having consented to a further use of the data gathered by the app. The social networking platform failed to perform the monitoring tasks allocated to the data processor in the case of a breach. From a legal perspective this was unlawful data processing which could be subject to judicial and administrative proceedings, which was exactly what happened in a few European countries, namely the UK, Italy and Germany.

In the UK, the Information Commissioner’s Office (ICO) extended the investigation it was already conducting into data analytics for political purposes to encompass the CA/FB scandal, eventually announcing its intention to fine Facebook for lack of transparency and security issues relating to the harvesting of data in contravention of the Data Protection Act 1998. Then, in October 2018 it fined Facebook £500,000 for breaching the UK’s data protection law. Discussing the numerous reasons for imposing the maximum fine, the ICO noted “the personal information of at least one million UK users was among the harvested data and consequently put at risk of further misuse”. In Italy, in April 2018 both the Italian Data Protection Authority (DPA) and Antitrust Authority started an investigation into what exactly happened with the data, both in terms of individual privacy and alleged unfair commercial practices. Eventually the investigations resulted in one of the highest fines by the Italian DPA, on the basis of Facebook’s economic status and the number of its users both worldwide and in Italy (Italian Data Protection Authority, 2019).

In all these cases, the administrative procedure was initiated ex officio by DPAs as a reaction to data breaches that occurred in relation to domestic users in each country, whereas no individual user was able (or willing) to bring a claim before national courts for the same data breaches. The harm caused to each individual was negligible, which is clearly a disincentive to starting a long and expensive judicial procedure that could result in a very limited award of damages. As a result, users were not able to recover any damages due to practical limitations affecting their right of access to justice.

While, at the societal level, the fines imposed by national DPAs may push online platforms to adopt better and stronger means of protecting personal data, under the threat of higher fines and stricter scrutiny of their conduct, they do not provide specific redress for each citizen who has suffered from the violation. Moreover, in the case of cross-border data processing in the EU, the intervention of DPAs is subject to a coordination mechanism which requires the identification of a lead supervisory authority that will guide the investigation activities of the other DPAs involved, pursuant to Article 56 GDPR (Article 29 Data Protection Working Party, 2017). Given that the identification of the lead supervisory authority is based on the main establishment of the data processor, there may be a risk of forum shopping towards countries where the enforcement of (joint) decisions is less vigorous (a phenomenon also seen regarding encryption measures - see Mann et al., 2020). The need for an intervention enhancing cooperation among data protection authorities was also affirmed by European Commissioner Věra Jourová (European Commission, 2020); however, so far no specific action has been taken in this direction.

The position of data subjects is still weaker vis-à-vis that of data processors, particularly in the case of big online companies, which may justify their activities on the ground that limiting access and use of data would have the effect of limiting the opportunities that large volumes of data may offer in terms of personalisation, cost reduction etc. In order to achieve an effective remedy when breaches occur, there is a need for alternative forms of enforcement such as collective redress which may empower data subjects vis-à-vis data processors - particularly in cases where a public outcry regarding data breaches does not result in such swift and immediate administrative proceedings before DPAs (Manokha, 2018; Messina, 2019). Collective remedies may thus provide for the effective protection of data subjects’ interests through what has been claimed as a need for the active empowerment of individuals (Malgieri and Custers, 2017).

Within this framework, some steps were taken by the EU legislator in drafting the General Data Protection Regulation (GDPR), 2 which introduced the possibility for data subjects to also exercise their rights through associations and non-profit organisations. According to Article 80, a data subject “shall have the right to mandate a not-for-profit body, organisation or association […] to lodge the complaint on his or her behalf, to exercise the rights referred to in Articles 77, 78 and 79 on his or her behalf, and to exercise the right to receive compensation referred to in Article 82 on his or her behalf where provided for by Member State law”.

Although there are several open issues regarding better solutions to implement this provision at the national level, it is interesting to note that an element that is crucial in the online environment is the possibility of coordinating actions across different EU member states when violations occur in several countries as a result of the conduct. This element is addressed in Article 81 GDPR, which provides a special rule on lis pendens in cases where the same data controller or processor is party to different proceedings in different EU member states, or the proceedings concern the same subject. 3 Indeed, Article 81(2) GDPR provides that “where proceedings concerning the same subject matter as regards processing of the same controller or processor are pending in a court in another Member State, any competent court other than the court first seized may suspend its proceedings”.

This provision paves the way for transnational collective actions, which in principle may achieve positive results for both parties:

  • for multinational companies that have seats in different EU countries - they will not have to be subject to proceedings across the EU for the same conduct but with different procedural rules;
  • for national associations and NGOs working on data protection issues, which will have the opportunity to strengthen their position vis-à-vis data processors as a result of a wider range and larger number of claimants. At the same time they will be able to collaborate and coordinate their actions across the EU, thus reducing the costs of multiple proceedings in different courts in EU member states.

This contribution will focus on the current framework provided for transnational collective actions. It will show the gaps emerging in legislation, the limits set by private international law rules on jurisdiction over such actions, and the consequences that these actions may have for the coordinating mechanisms between national courts and data protection authorities. In particular, Section 2 will provide an overview of the collective redress mechanism provided by the GDPR and in Section 3 the specific issues related to transnational collective claims will be addressed.

Collective remedies in the GDPR

Collective remedies or collective redress mechanisms include a large number of legal instruments aimed at resolving disputes by clustering multiple individuals within a single action or procedure. According to Hodges (2019), the collective enforcement mechanisms that can be identified are private collective litigation, the partie civile mechanism (a civil claim following on from a criminal prosecution), the involvement of public regulatory authorities (either through the power to order redress by starting a collective court claim or merely through the general enforcement authority) and Alternative Dispute Resolution (ADR), namely through the Consumer Ombudsman.

This contribution will only focus on the two possible options for the first mechanism, namely (a) the procedure granting a member of the affected group standing to bring an action on behalf of the group (a so-called class action or group action) and (b) the procedure granting a representative entity standing to bring an action on behalf of the group (a so-called representative action). In both cases, a group of claimants sharing the same interest starts the action, and a single representative or an association represents the entire group. Then, according to procedural rules, the representative (be it an individual or an association) is in charge of pursuing the action, while the other individual members do not play a role in the proceedings.

The objective of these types of actions can be simply compensatory, allocating the damages caused by the violation to each of the group members, or may be to achieve deterrent effects, in particular through injunctive relief preventing future violations (Hodges, 2019; Bosters, 2017; Trstenjak and Weingerl, 2014).

Although the EU legislator left to the member states the task of putting this provision into practice by introducing substantive and procedural rules applicable to collective redress (Casarosa, 2018; Pato, 2019), I note some important features emerging from the current legislative framework.

According to Article 80 GDPR, each member state should provide for three different types of action:

  • an opt-in collective action in which the interested parties have the right to instruct an authorised body to file a complaint on their behalf, the right to lodge a complaint with a supervisory authority (Article 77 GDPR), the right to an effective judicial remedy against a supervisory authority (Article 78 GDPR) and the right to an effective judicial remedy against a controller or a processor (Article 79 GDPR);
  • an opt-in collective action in which the interested parties have the right to instruct an authorised body to exercise the right to receive compensation, but only if the legislation of the member state so permits;
  • an opt-out collective action where the authorised entities are authorised to act on behalf of the data subjects without having obtained a mandate from those persons in the case of infringement of the rights of a data subject under the Regulation, as long as the member state provides for such a possibility. Claims for compensation are, however, excluded from this mechanism.

In the opt-in procedures, it is clear that data subjects will have to take positive steps to join the proceedings, affirming their rights and their will to be subject to the effects of the decision. In these scenarios, however, the GDPR does not preclude the possibility for member states to identify different phases of the judicial or administrative proceedings in which the opt-in may take place. The opt-out procedure instead implies that the group of claimants is not identified individually. However, the decision of the court will bind all those sharing the same interest. Data subjects who wish to remain outside the group have the possibility of opting out (Bosters, 2017).

Each member state is free to select whether all three actions will be available or only the first (and mandatory) one. This choice would also be based on the pre-existing national legislation applicable to collective redress, which in some member states already covers data protection. 4

Regarding the application of procedural rules in the case of collective actions, the GDPR is silent, leaving the national legislator full discretion. As for the applicable forum, guidelines emerge in Article 79(2) GDPR, which expressly provides that (individual) actions before the courts should be brought in the member state where the controller or the processor is established. Alternatively, such actions may be brought before the courts of the member state where the data subject is habitually resident. However, it is difficult to determine how habitual residence applies where several data subjects resident in different countries are involved in the claim, as no criterion of preference is provided. Moreover, in the case of associations or NGOs the criterion of habitual residence cannot apply (Casarosa, 2018). As a result, it will be up to the national legislators to identify the procedural rules applicable to this type of case: for instance, in Italy, the solution adopted allocates jurisdiction to the tribunal of the place where the controller or the processor is established, including in those cases where the claim is presented by an association (as provided by Article 10 of the amended Legislative Decree 151/2011).

Other doubts emerge, in particular regarding the effect of a decision declaring a violation of the data protection rules which may or may not also include an award of damages to the data subjects. In the event that the member state provides for an opt-out collective action, where an association or an NGO is authorised to act on behalf of the data subjects without any individual mandate, which effects will the decision of the judicial authority have vis-à-vis the data subjects that did not take part in the action? According to Article 80 GDPR, member states are free to include this procedure, but the article is silent on the third party effects of the decision. Would it be possible for a decision declaring a breach of data protection rules to be followed by so-called follow-on actions by individual data subjects to obtain any compensation for the damage suffered as a result of the violation? Similar doubts emerge in the case of opt-in collective claims. Where a mandate is provided by a limited number of data subjects, what would be the effect of a decision declaring that the conduct of the data controller does not infringe data protection rules? Can such a decision limit any subsequent claim pursued through individual proceedings? Or would it only be used in such proceedings by the defendant as proof of lack of wrongdoing?

Obviously, these elements may be decided at the national level following pre-existing procedural rules. However, given the EU’s recent attention to collective remedies in the consumer protection sector (European Commission, 2018), the rules applicable should be carefully identified. It is interesting to note that in the context of EU intervention in relation to collective actions, a much more effective approach has been adopted in the Proposal for a Directive of the European Parliament and of the Council on representative actions for the protection of the collective interests of consumers, and repealing Directive 2009/22/EC, COM(2018) 184 final. Also in this case the legislator has taken on board the problems that arose from cases of infringement in the member states, first and foremost the Dieselgate case (Garaci and Montinaro, 2019), in which the collective protection of users came up against the difficulty of recognising the effects of the decisions of individual member states’ competition authorities on collective actions relating to the same infringements. The Proposal for a Directive in fact addresses the cases where there is an interaction between administrative enforcement (for instance through DPA involvement) and judicial enforcement. Article 10 of the Proposal for a Directive states that final decisions (regardless of their objective and the deciding body) are considered evidence (freely assessable by the court) that establishes the existence or non-existence of an infringement, for the purposes of any other action for damages before national courts against the same controller or processor, on the same facts.

Given that there may well be cases where there is an overlap between the status of consumer and of data subject, the rules applicable to collective claims in the consumer and data protection frameworks should provide for an even level of judicial protection (Amaro et al., 2018; Casarosa, 2020). For example, a collective action based on a claim of unfair contractual clauses included in the so-called privacy policy attached as contractual content to the terms of service of several online platforms may be used for both injunctive and compensatory claims, but (according to the proposed Directive on collective claims for consumer protection) other consumers who are in the same contractual scenario are also allowed to use the decision as evidence for bringing equivalent claims for damages. The same cannot happen for collective actions for claims regarding the violation of data protection rules. Thus, a situation of unequal judicial protection could arise which is not justified by substantive differences (Casarosa, 2018).

The degree of complexity increases when looking at the possibility of transnational collective claims.

Transnational collective actions

Although collective claims are perceived as a tool to safeguard the interests of a plurality of claimants unable to pursue their interests through judicial proceedings, the existing legal framework applicable to collective actions at EU, and consequently at national level, seems to rely on the assumption that only national collective redress is conceivable (Amaro et al., 2018, p. 94). This assumption did not hold when the Schrems v Facebook case arrived at the Court of Justice of the EU (CJEU) in 2018.

The C-498/16 Schrems v Facebook case was the first example of the possible use of collective actions at the transnational level in the field of data protection. The case involved Maximilian Schrems, who presented a claim for alleged violation of data protection laws in his own country (Austria). The claim was not only in his name but also in the name of seven other claimants resident in other EU member states and in non-EU countries. These other claimants provided a mandate to Mr Schrems to act on their behalf, following the Austrian law allowing for different claims to be presented by one applicant against the same defendant. 5 The national court, however, had several doubts regarding the qualification of Mr Schrems as a consumer as he was involved in several academic and commercial activities, first as a privacy activist and then as the founder of a non-profit organisation, NOYB – European Center for Digital Rights. Mr Schrems’ qualification as a consumer also affected whether the protective provisions in the Brussels I Regulation were applicable. According to Article 18 Brussels I Regulation “a consumer may bring proceedings against the other party to a contract either in the courts of the Member State in which that party is domiciled or, regardless of the domicile of the other party, in the courts of the place where the consumer is domiciled”. The qualification as a consumer, then, would allow Mr Schrems to bring the claims ceded to him before the Viennese courts. The CJEU approached the application of the Brussels I Regulation with some caution (Amaro et al., 2018), since any extended interpretation of Article 18 of the Brussels I Regulation regulating the consumer forum would have the indirect effect of reducing legal certainty, as the representative of the consumer group may be allowed to select the forum from those available to the group (Blanc, 2017).

This case showed clearly that, given the ubiquitous collection and processing of personal data online, it is possible (if not common) that the same conduct occurring in different member states may result in a violation of the data protection framework affecting a large number of online users. In this situation, there are two possible options: one is the emergence of several national collective actions against the same defendant, each following the rules and procedures applicable at the national level. This was the path selected, for instance, by consumer associations in Belgium, Spain, Portugal and Italy, which followed a coordinated strategy: each association presented a national collective claim against Facebook in relation to the Cambridge Analytica/Facebook scandal (Consumers International, 2018). In this case, however, the decisions of courts at the national level may differ, and such decisions may not be used as an authoritative precedent in foreign countries. The alternative available is the transnational collective claim, which may avoid the fragmentation of the proceedings and of the decisions by collecting all the claims within a single procedure.

Although such a scenario is far from unrealistic in practice, the possibility of pursuing a transnational collective action faces several difficulties.

As mentioned above, Article 81 GDPR hints at cases where data processors may be sued in different countries for the same violation, alluding to the cross-border dimension of the violation. However, the Article does not specify whether it applies only to individual claims or also to collective claims. If an association presents a claim representing data subjects in different EU countries, what rules are applicable under the current EU legal framework?

The first issue is legal standing: can associations and NGOs which qualify to represent data subjects in national collective actions also present transnational claims? The GDPR does not provide any indication, but neither does it exclude this possibility. A comparison with the Injunctions Directive 2009/22/EC 6 shows that this element is not without importance: Recital 12 of the Directive provides that mutual recognition should apply in the case of associations and NGOs which have been admitted as qualified claimants at the national level. The provision thus prevents the requirements for qualification from being interpreted differently across countries, avoiding conflicting judgments on the admissibility or recognition of collective redress actions (Voet, 2017). Given that Article 80 GDPR already identifies the basic requirements for associations and NGOs, it would be reasonable to acknowledge that they should be applicable across the EU.

An additional element highlighted by ELI/Unidroit (2018) in a chapter dedicated to the model rules on collective redress is that information regarding collective actions across Europe would also be useful to avoid parallel proceedings and to enhance cooperation among European actors. According to Articles [X4bis] and [X29], national courts should provide a publicly accessible electronic register where all collective redress claims are recorded, so that potential ‘qualified claimants’, lawyers, group members etc. can gain knowledge of existing actions (a minimal sketch of what such an entry might contain follows below). When such collective claims have a cross-border effect, the model rules provide that the registry entries “shall be made available on the European e-justice platform”. However, it must be highlighted that the disclosure of the name of the defendant in such cases could have adverse effects on their position, in particular when liability is still to be decided, as clarified by Articles 35-36 of the European Commission Recommendation on common principles for injunctive and compensatory collective redress mechanisms. 7
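Purely by way of illustration, the following Python sketch defines a minimal data model for such a register entry. The field names, example values and the optional defendant field (reflecting the concern about disclosure before liability is decided) are assumptions made for this note; they do not reproduce the ELI/Unidroit model rules or any actual specification of the e-justice platform.

    # Minimal, illustrative sketch of a cross-border collective action register entry.
    # Field names are assumptions, not the ELI/Unidroit model rules or the
    # European e-justice platform's actual data model.
    from dataclasses import dataclass
    from datetime import date
    from typing import Optional

    @dataclass
    class CollectiveActionEntry:
        case_id: str                  # identifier in the national register
        member_state: str             # country of the court seized
        court: str
        qualified_claimant: str       # association or NGO bringing the action
        subject_matter: str           # e.g. a data protection violation
        cross_border: bool            # flag triggering publication at the EU level
        filed_on: date
        defendant: Optional[str] = None   # may be withheld while liability is undecided
        group_members: Optional[int] = None

    # Hypothetical example entry.
    entry = CollectiveActionEntry(
        case_id="IT-2019-0042",
        member_state="IT",
        court="Tribunale di Roma",
        qualified_claimant="Example Consumer Association",
        subject_matter="cross-border data protection violation",
        cross_border=True,
        filed_on=date(2019, 5, 20),
    )
    print(entry)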

Another important set of questions relates to the private international law (hereinafter PIL) rules applicable in the case of transnational collective claims. As is clearly acknowledged by the Report on Collective Redress (2018) (and previously by Voet, 2017, and Money-Kyrle, 2016), the current PIL framework is still unsatisfactory. The applicable EU Regulations, namely the Brussels I Regulation 8 and the Rome I 9 and Rome II 10 Regulations, are all drafted taking as the point of reference a conflict between an individual claimant and an individual defendant.

Only Article 4 of the Brussels I Regulation provides the possibility of multiple claims being consolidated, and in this case the general rule regarding the choice of jurisdiction designates the defendant’s domicile. Accordingly, any collective action for data protection infringement would be obliged to sue the data controller at its headquarters in any EU member state, for instance Ireland in the case of Facebook. This would create the possibility for data controllers to select safe havens in member states where collective redress mechanisms are not effectively regulated. Moreover, the special rule in Article 7 (2) Brussels I Regulation does not provide an effective solution as it affirms that, in the case of tort, delict or quasi-delict, the claimant may sue before the court of the place where the harmful event occurred. This permits cases of concurrent jurisdiction. Only if the defendant proves that the harmful event occurred in the place where the decisions regarding the data processing were taken, i.e. at the headquarters of the data processor, will there be no difference in the application of the general rule provided by Article 4 Brussels I.

In the case of concurrent jurisdiction, rules on lis pendens may apply, and as mentioned above Article 81 GDPR provides for a lex specialis vis-à-vis Articles 29-34 Brussels I Regulation. Article 81 GDPR provides that if the defendant (i.e. the data controller or processor) coincides in both proceedings or the claims address the same conduct, the court subsequently seized may suspend the action in order to await the outcome of the proceedings before the foreign authority. Moreover, the Article recognises the possibility for courts to decline jurisdiction at the request of one of the parties if "the court first seized has jurisdiction over the proposed actions and its law allows proceedings to be joined" (Article 81(3)). If the provision also applies to collective actions, then parallel proceedings may be avoided if the national procedural rules allow consolidation of actions.

Instead, in the case where procedural rules do not allow for consolidation of proceedings, it is important to consider the effects the decisions of the foreign court may have on the suspended proceedings. What is the value of a foreign decision in a parallel proceeding? On the one hand, a decision in a collective claim is automatically recognised in the other member states according to Article 36 of the Brussels I Regulation without any specific procedure. On the other hand, the decision may be used in the suspended proceeding as proof of the existence or non-existence of the violation, which can be evaluated by the judge. However, no specific guideline is provided by the EU legislator as regards the role of the decision.

As emerges from the analysis here, it seems clear that transnational collective claims in the data protection area cannot yet be used effectively. In particular, the provisions of the Brussels I Regulation dedicated to jurisdiction and lis pendens are not apt for addressing multi-party conflicts. Thus, a further step is needed from the EU bodies, namely an effort to coordinate the specificities of the GDPR enforcement system with amended private international law rules in order to provide an effective transnational collective action that can enhance the opportunities for data subjects to enforce their rights.

Conclusion

The GDPR was seen as a step forward in solving many of the challenges posed by the development of new technologies, and in particular it was presented as a tool to improve data subjects’ awareness and to empower them vis-à-vis data processors through consent mechanisms, avoiding hidden data processing. Reality then clashed with this positive image, as the CA/FB scandal arose just before the entry into force of the GDPR. The case showed that forms of surveillance over online users are ever more subtle and able to manipulate users’ choices not only over goods and services but also over political preferences, with significant implications for democratic processes. Given the data protection framework, if preventive measures do not achieve the result of protection, then data subjects should at least have access to remedial measures that can help them recover potential damages, and through collective action overcome the weaker position each individual user may have vis-à-vis data processors.

The GDPR framework has already taken a step forward in this direction by requiring member states to adopt national provisions for collective actions. However, given the cross-border nature of violations of data protection rules occurring online, the objective should be even more ambitious: to address the possibility of presenting transnational collective actions in which associations or NGOs may represent claimants from different EU countries. It is true that the current framework includes some common principles regarding the features that associations and NGOs should have in order to engage in collective actions before national courts, ensuring – in principle – equivalent criteria across the EU. However, the EU legislator could have explicitly provided, in addition, that the mutual recognition principle (applicable to other collective actions according to Directive 2009/22 on injunctions) also applies to any entity designated for such collective actions at the national level. Accordingly, lists of organisations qualified according to national criteria could be communicated to the Commission, and publication in such a list could be used as proof of legal capacity in other EU member states’ national jurisdictions. 11

Moreover, the system provided by the GDPR is based on the assumption that not only are qualified associations and NGOs aware of existing collective actions but also that data subjects are aware of breaches occurring at a cross-border level, are interested in joining such actions and provide their mandate to the relevant association or NGO. Unfortunately, such active engagement of data subjects is difficult to find in practice and the lack of centralised information mechanisms is an open issue in the development of transnational collective actions. The proposal by the ELI/UNIDROIT group regarding the creation of an electronic register of existing collective actions could be seen as a simple yet effective tool to improve the ability of qualified organisations to collaborate in the case of cross-border actions.

Finally, a revision of the EU legal framework regarding the private international law rules applicable to transnational collective claims and the effects that transnational decisions may have is required. If the process of modernisation of collective redress mechanisms – which started in 2013 with the Recommendation on common principles for injunctive and compensatory collective redress mechanisms – is not to end, increased attention should be dedicated by the EU legislator to ensuring EU citizens have effective access to transnational collective actions.

References

Amaro, R., Azar-Baud, M. J., Corneloup, S., Fauvarque-Cosson, B., & Jault-Seseke, F. (2018). Study on Collective Redress In the Member States of the European Union [Study]. European Parliament. http://www.europarl.europa.eu/RegData/etudes/STUD/2018/608829/IPOL_STU(2018)608829_EN.pdf

Article 29 Data Protection Working Party. (2017). Guidelines for identifying a controller or processor’s lead supervisory authority. 16/EN WP 244 rev.01. http://ec.europa.eu/newsroom/document.cfm?doc_id=44102

Biard, A. (2016). Class Action Developments in France [Country report]. Global Class Actions Exchange. http://globalclassactions.stanford.edu/content/class-action-developments-france

Blanc, N. (2017). Schrems v Facebook: Jurisdiction Over Consumer Contracts Before the CJEU. European Data Protection Law Review, 3(3), 413 – 417. https://doi.org/10.21552/edpl/2017/3/20

Bosters, T. (2017). Collective Redress and Private International Law in the EU. Asser Press.

Cadwalladr, C. (2018, March 18). ‘I made Steve Bannon’s psychological warfare tool’: Meet the data war whistleblower. The Guardian. https://www.theguardian.com/news/2018/mar/17/data-war-whistleblower-christopher-wylie-faceook-nix-bannon-trump

Casarosa, F. (forthcoming). Azioni collettive fra tutela dei dati personali e tutela dei consumatori: Nuovi strumenti alla prova dei fatti. In P. Iamiceli (Ed.), Diritti Fondamentali ed effettività della tutela. Uni Service.

Casarosa, F. (2018). La tutela aggregata dei dati personali nel Regolamento UE 2016/679—Una base per l’introduzione di rimedi collettivi? In M. A. & D. Poletti (Eds.), Regolare la tecnologia: Il Reg. UE 2016/679 e la protezione dei dati personali. Un dialogo fra Italia e Spagna. Pisa University Press.

Consumers International. (2018). Not your Puppets: An update on the Euroconsumers class action against Facebook [Blog post]. Consumers International. https://www.consumersinternational.org/news-resources/blog/posts/not-your-puppets-euroconsumers-interview/

Messina, D. (2018). Il Regolamento (EU) 2016/679 in materia di protezione dei dati personali alla luce della vicenda ‘Cambridge Analytica’. Federalismi.it, 20. https://www.federalismi.it/nv14/articolo-documento.cfm?Artid=37234

European Commission. (2018). Communication from the Commission to the European Parliament, the Council, and the European Economic and Social Committee: A ‘New Deal’ for consumers. European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52018DC0183

European Commission. (2020, January 27). Joint Statement by Vice-President Jourová and Commissioner Reynders ahead of Data Protection Day. European Commission, Press Corner. https://ec.europa.eu/commission/presscorner/detail/en/STATEMENT_20_120

Garaci, I., & Montinaro, R. (2019). Public and Private Law Enforcement in Italy of EU Consumer Legislation after Dieselgate. Journal of European Consumer and Market Law, 8(1), 29–34.

Granville, K. (2018, March 19). Facebook and Cambridge Analytica: What You Need to Know as Fallout Widens. The New York Times. https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html

Hodges, C. (2019). Collective Redress: The Need for New Technologies. Journal of Consumer Policy, 42, 59–90. https://doi.org/10.1007/s10603-018-9388-x

Information Commissioner’s Office. (2019). SCL Elections prosecuted for failing to comply with enforcement notice [Press release]. https://ico.org.uk/about-the-ico/news-and-events/news-and-blogs/2019/01/scl-elections-prosecuted-for-failing-to-comply-with-enforcement-notice

European Law Institute, & International Institute for the Unification of Private Law (UNIDROIT). (2018). Working group draft on collective redress. https://www.europeanlawinstitute.eu/fileadmin/user_upload/p_eli/Projects/Unidroit_Materials/Trier_2018/WG_Parties_-_Draft_on_Collective_Redress.pdf

Italian Data Protection Authority. (2019). Cambridge Analytica: Facebook fined 1 million Euro by the Italian Dpa [Press release]. https://www.garanteprivacy.it/web/guest/home/docweb/-/docweb-display/docweb/9121506

Jančiūtė, L. (2019). Data protection and the construction of collective redress in Europe: Exploring challenges and opportunities. International Data Privacy Law, 9(1), 2–11. https://doi.org/10.1093/idpl/ipy022

Malgieri, G., & Custers, B. (2017). Pricing privacy: The right to know the value of your personal data. Computer Law & Security Review. https://ssrn.com/abstract=3047257

Mann, M., Daly, A., & Molnar, A. (2020). Regulatory arbitrage and transnational surveillance: Australia’s extraterritorial assistance to access encrypted communications. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1499

Manokha, I. (2018). Surveillance: The DNA of Platform Capital – The Case of Cambridge Analytica Put into Perspective. Theory & Event, 21(4), 891–913. https://ora.ox.ac.uk/objects/uuid:15e74c10-225f-4bd7-b086-8e1fdb1b79e8/download_file?file_format=pdf&safe_filename=Manokha%252C%2BSurveillance%252C%2BAAM.pdf&type_of_work=Journal+article.

Money-Kyrle, R. (2016). Legal Standing in Collective Redress Actions for Breach of EU Rights: Facilitating or Frustrating Common Standards and Access to Justice? In B. Hess, M. Bergström, & E. Storskrubb (Eds.), EU Civil Justice. Current Issues and Future Outlook (pp. 223–254). Hart Publishing.

Nuyts, A. (2014). The Consolidation of Collective Claims Under Brussels I. In A. Nuyts & N. Hatzimihail (Eds.), Cross-Border Class Actions. The European Way (pp. 69–84). Verlag Dr. Otto Schmidt.

Pato, A. (2019). The Collective Private Enforcement of Data Protection Rights in the EU. In L. Cadiet, B. Hess, & M. R. Isidro (Eds.), MPI-IAPL Summer School. Nomos. https://doi.org/10.5771/9783748900351-129

Privacy International. (2016). The Global Surveillance Industry [Report]. Privacy International. https://privacyinternational.org/sites/default/files/2017-12/global_surveillance_0.pdf

Trstenjak, V., & Weingerl, P. (2014). Collective Actions in the European Union—American or European Model? Beijing Law Review, 5(3), 155–162. https://doi.org/10.4236/blr.2014.53015

Voet, S. (2017). ‘Where The Wild Things Are’: Reflections On the State and Future of European Collective Redress. In M. Loos & A. L. M. Keirse (Eds.), Waves in contract and liability law in three decades of ius commune. Intersentia. https://ssrn.com/abstract=2913010

Zuboff, S. (2019). The age of surveillance capitalism: The fight for a human future at the new frontier of power. Profile Books.

Footnotes

1. Note that the recently adopted Directive 2019/770 on certain aspects concerning contracts for the supply of digital content also acknowledges the fact that personal data are used as counter-performance for ‘free’ digital services or for discounts on online products and services: Recital 67 and Art 3 (1).

2. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC, OJ L 119, 4.5.2016.

3. The doctrine of lis pendens is the basis for suspending or staying legal proceedings in light of other pending proceedings that involve the same or very similar parties, issues or relief. It is aimed at avoiding situations in which two equally final and enforceable decisions exist within the same legal system.

4. For instance, French legislation extends the scope of application of collective claims to data protection in Article 43ter French Data Protection Act of 6 January 1978. See Biard (2016).

5. According to the Austrian model of group litigation the claim is admissible if the basis for the claims is essentially similar and the claims have to refer to the same factual or legal question (Amaro et al., 2018).

6. Directive 2009/22/EC of the European Parliament and of the Council of 23 April 2009 on injunctions for the protection of consumers' interests (Codified version), OJ L 110, 1.5.2009.

7. European Commission Recommendation on common principles for injunctive and compensatory collective redress mechanisms in the member states concerning violations of rights granted under Union Law (2013/396/EU, OJ L 201/60, 26.7.2013).

8. Regulation (EU) 1215/2012 of the European Parliament and of the Council of 12 December 2012 on jurisdiction and the recognition and enforcement of judgments in civil and commercial matters (recast), OJ L 351, 20.12.2012.

9. Regulation (EC) 593/2008 of the European Parliament and of the Council of 17 June 2008 on the law applicable to contractual obligations (Rome I), OJ L 177, 4.7.2008.

10. Regulation (EC) 864/2007 of the European Parliament and of the Council of 11 July 2007 on the law applicable to non-contractual obligations (Rome II), OJ L 199, 31.7.2007.

11. Similar rules have been adopted in the Proposed Directive on representative actions for consumer claims in particular recitals (11a), (11ea) and (11f).

Regulatory arbitrage and transnational surveillance: Australia’s extraterritorial assistance to access encrypted communications


This paper is part of Geopolitics, jurisdiction and surveillance, a special issue of Internet Policy Review guest-edited by Monique Mann and Angela Daly.

Introduction

Since the Snowden revelations in 2013 (see e.g., Lyon, 2014; Lyon, 2015) an ongoing policy issue has been the legitimate scope of surveillance, and the extent to which individuals and groups can assert their fundamental rights, including privacy. There has been a renewed focus on policies regarding access to encrypted communications, which are part of a longer history of the ‘cryptowars’ of the 1990s (see e.g., Koops, 1999). We examine these provisions in the Anglophone ‘Five Eyes’ (FVEY) 1 countries - Australia, Canada, New Zealand, the United Kingdom and the United States (US) - with a focus on those that attempt to regulate communications providers. The paper culminates with the first comparative analysis of recent developments in Australia. The Australian developments are novel in the breadth of entities to which they may apply and their extraterritorial reach: they attempt to regulate transnational actors, and may implicate Australian agencies in the enforcement - and potential circumvention - of foreign laws on behalf of foreign law enforcement agencies. This latter aspect represents a significant and troubling development in the context of FVEY encryption-related assistance provisions.

We explore this expansion of extraterritorial powers that extend the reach of all FVEY nations via Australia, by requesting or coercing assistance from transnational technology companies as “designated communications providers”, and allowing foreign law enforcement agencies to request their Australian counterparts to make such requests. Australia has unique domestic legal arrangements, which include an aggressive stance on mass surveillance (Molnar, 2017) and an absence of comprehensive constitutional or legislated fundamental rights at the federal level (Daly & Thomas, 2017; Mann et al., 2018); it has also recently enacted the Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018 (Cth) 2, the focus of this article. We demonstrate that Australia’s status as the ‘weak link’ in the FVEY alliance enables the introduction of laws less likely to be constitutionally or otherwise legally permissible elsewhere. We draw attention to the extraterritorial reach of the Australian provisions, which affords the possibility for other FVEY members to engage in regulatory arbitrage to exploit the weaker human rights protections and oversight measures in Australia.

Human rights and national security in Australia

Australia has a well-documented track record of ‘hyper legislation’ of national security measures (Roach, 2011), having passed more than 64 anti-terrorism laws since 9/11 that have been recognised as having serious potential to encroach on democratic rights and freedoms (Williams & Reynolds, 2017). Some of these laws have involved digital and information communications infrastructures and their operators, such as those facilitating Australian security and law enforcement agencies’ use of Computer Network Operations (Molnar, Parsons, & Zouave, 2017) and the introduction of mandatory data retention obligations on internet service providers (Suzor, Pappalardo, & McIntosh, 2017). Australia’s role as a leading proponent of stronger powers against encrypted communications is consistent with this history.

Yet, unlike any of the other FVEY members, Australia has no comprehensive enforceable human rights protection at the federal level (Daly & Thomas, 2017; Mann et al., 2018). 3 Australia does not have comprehensive constitutional rights (like the US and Canada), a legislated bill of rights (like NZ and the UK) nor recourse to regional human rights bodies (like the UK and its relationship with the European Convention on Human Rights) (Refer to Table 1).

Given this situation, we argue Australia is a ‘weak link’ among FVEY partners because its legal framework allows for a more vigorous approach to legislating for national security at the expense of human rights protections, including but not limited to, privacy (Williams & Reynolds, 2017; Mann et al., 2018). Australia’s status as a human rights ‘weak link’ affords the ‘legal possibility’ for measures which may be ‘legally impossible’ in other jurisdictions, including those of the other FVEY countries, given peculiar domestic and regional rights protections.

Encryption laws in the Five Eyes

FVEY governments have made frequent statements regarding their surveillance capabilities ‘going dark’ due to encryption, with consequences for their ability to prevent, detect and investigate serious crimes such as terrorism and the dissemination of child exploitation material (Comey, 2014). This is despite evidence that the extensive surveillance powers that these agencies maintain are mostly used for the investigation of drug offences (Wilson & Mann, 2017; Parsons & Molnar, 2017). Further, there is an absence of evidence that undermining encryption will improve law enforcement responses (Gill, Israel, & Parsons, 2018), coupled with disregard for the many legitimate uses of encryption (see e.g., Abelson et al., 2015), including the protection of fundamental rights (see e.g., Froomkin, 2015).

It is important to note, as per Koops and Kosta (2018), that communications may be encrypted by different actors at different points in the telecommunications process. Where encryption is applied, and by whom, affects which actors are able to decrypt communications, and accordingly where legal obligations to decrypt may lie or be actioned. For example, in some scenarios the service provider maintains the means of decrypting the communications, but this would not be the case where the software provider or end user has the means to decrypt (i.e., ‘at the ends’). More recently, the focus has shifted to communications providers offering encrypted services or facilitating a third party offering such services over their networks. These actors can be forced to decrypt communications either via ‘backdoors’ (i.e., deliberate weaknesses or vulnerabilities) built into the service, or via legal obligations to provide assistance. The latter scenario is not a technical backdoor per se, but could be conceptualised as a ‘legal’ means to acquire a ‘backdoor’ as the government agency will obtain covert access to the service and communications therein, thus having a similar outcome to a technical backdoor. It is these measures which are the focus of our analysis. We provide a brief overview of the legal situation in each FVEY country (Table 1), before turning to Australia as our main focus.
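To make the distinction concrete, the following minimal Python sketch (using the third-party cryptography library’s Fernet primitive, chosen here purely for illustration and not drawn from any of the laws or sources discussed) contrasts provider-applied encryption, where the provider holds the key and can comply with a decryption request, with end-to-end encryption, where only the endpoints hold the key and the provider relays ciphertext it cannot read. The shared endpoint key stands in for the key agreement a real end-to-end protocol would perform.

# Illustrative sketch only: why the point at which encryption is applied
# determines who is able to comply with a decryption order.
# Assumes: pip install cryptography
from cryptography.fernet import Fernet

# Case 1: provider-applied encryption. The provider generates and holds the key,
# so it can decrypt the customer's traffic on receipt of a lawful request.
provider_key = Fernet.generate_key()
provider = Fernet(provider_key)
ciphertext = provider.encrypt(b"message carried over the provider's network")
print(provider.decrypt(ciphertext))  # the provider can hand over plaintext

# Case 2: end-to-end encryption. The key exists only at the endpoints;
# the provider merely relays ciphertext and has nothing useful to disclose,
# which is why notices increasingly target capability rather than existing keys.
endpoint_key = Fernet.generate_key()  # shared only between sender and recipient
endpoint = Fernet(endpoint_key)
relayed = endpoint.encrypt(b"message the provider cannot read")
print(endpoint.decrypt(relayed))  # only the endpoints can recover the plaintext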

United States

The legal situation in the US to compel decryption depends, at least in part, on the actor targeted. The US has no specific legislation dealing with encryption although other laws on government investigatory and surveillance powers may be applicable (Gonzalez, 2019). Forcing an individual to decrypt data or communications has generally been considered incompatible with the Fifth Amendment to the US Constitution (i.e. the right against self-incrimination), although there is no authoritative Supreme Court decision on the issue (Gill, 2018). Furthermore, the US government may be impeded by arguments that encryption software constitutes ‘speech’ protected by the First Amendment and Fourth Amendment (Cook Barr, 2016; Gonzalez, 2019; see also Daly, 2017).

For communications providers, the US has a provision in the Communications Assistance for Law Enforcement Act (CALEA) § 1002 on Capability Requirements for telecommunications providers, which states that providers will not be required to decrypt or ensure that the government can decrypt communications encrypted by customers, unless the provider has provided the encryption used (see e.g., Koops & Kosta, 2018). 4

In an attempt to avoid the difficulty of forcing individuals to decrypt, and the CALEA requirements’ application only to telecommunications companies, attention has been turned to technology companies, including equipment providers. Litigation has been initiated against companies that refuse to provide assistance; the most notable being the FBI-Apple dispute concerning the locked iPhone of one of the San Bernardino shooters (Gonzalez, 2019). Ultimately the FBI were able to unlock the iPhone without Apple’s assistance, by relying on a technical solution from Cellebrite (Brewster, 2018), thereby engaging in a form of ‘lawful hacking’ (Gonzalez, 2019). Absent a superior court’s ruling, or legislative intervention, the legal position regarding compelled assistance remains uncertain (Abraha, 2019).

Canada

Canada does not have specific legislation that provides authorities the power to compel decryption. Canadian authorities have, however, imposed requirements on wireless communications providers through spectrum licensing conditions, in the form of the Solicitor General Enforcement Standards for Lawful Interception of Telecommunications (SGES) Standard 12, which obliges providers to decrypt any communications they have encrypted on receiving a lawful request, but excludes end-to-end encryption “that can be employed without the service provider’s knowledge” (Gill, Israel, & Parsons, 2018, p. 59; West & Forcese, 2020). It appears the requirements only apply to encryption applied by the operator itself, can involve a bulk rather than case-by-case decryption requirement, do not require the operator to develop “new capabilities to decrypt communications they do not otherwise have the ability to decrypt”, and do not prevent operators employing end-to-end encryption (Gill, Israel, & Parsons, 2018, p. 60; West & Forcese, 2020).

There are provisions of the Canadian Criminal Code which give operators immunity from civil and criminal liability if they cooperate with law enforcement ‘voluntarily’ by preserving or disclosing data to law enforcement, even without a warrant (Gill, Israel, & Parsons, 2018, p. 57). There are also production orders and assistance orders that can be issued under the Criminal Code to oblige third parties to assist law enforcement, and disclose documents and records which could, in theory, be used to target encrypted communications (Gill, Israel, & Parsons, 2018, pp. 62-63), but West and Forcese (2020, p. 13) cast doubt on this possibility. There are also practical limitations, including the fact that many digital platforms and service providers do not have a physical presence in Canada, and thus are effectively beyond the jurisdiction of Canadian authorities (West & Forcese, 2020). Here, Mutual Legal Assistance Treaties (MLATs) could be used, although their use is notoriously beset with delay, and may only be effective if the other jurisdiction has its own laws to oblige third parties to decrypt data or communications (West & Forcese, 2020).

The Canadian Charter of Rights and Freedoms has a number of sections relevant to how undermining encryption can interfere with democratic freedoms, namely sections 2 (freedom of expression), 7 (security of the person), 8 (right against unreasonable search and seizure), and the right to silence and protection from self-incrimination contained in sections 7, 11 and 14 (West & Forcese, 2020). Case law from Canadian courts suggests that individuals cannot be compelled to decrypt their own data (Gill, 2018, p. 451). The Charter implications of BlackBerry’s assistance to the Canadian police in the R v Mirarchi 5 case were never ruled on, as the case was dropped (Gill, Israel, & Parsons, 2018, p. 58).

In the absence of a legislative proposal before the Canadian Parliament, it is difficult to surmise how, and whether, anti-encryption powers would run up against human rights protections. Yet any concrete proposal would likely face scrutiny in the courts given the impacts on Canadians’ Charter-protected rights.

New Zealand

In New Zealand, provisions in the Telecommunications (Interception Capability and Security) Act 2013 (TICSA) require network operators to ensure that their networks can be technically subjected to lawful interception (Cooper, 2018). 6 Section 10(3) requires that public telecommunications network operators, on receipt of a lawful request, must decrypt encrypted communications carried by their networks, if the operator has provided the means of encryption. Subsection 10(4) states that an operator is not required to decrypt communications that have been encrypted using a publicly available product supplied by another entity, and the operator is not under any obligation to ensure that a surveillance agency has the ability to decrypt communications.

It appears these provisions may entail that an operator cannot provide end-to-end encryption on its services if its networks are to remain subject to lawful interception - that is, the operator must maintain the cryptographic key where encryption is managed centrally by the service provider (Global Partners Digital, n.d.) and engineer a ‘back door’ into the service (Cooper, 2018). However, the NGO NZ Council for Civil Liberties considers the impact of this provision to be largely theoretical, as most services are offshore and the provision does not apply extraterritorially (Beagle, 2017). Yet section 38 of TICSA allows the responsible minister to make “service providers” (discussed below) subject to provisions such as this on the same basis as “network operators”, which may give section 10 an extraterritorial reach (Keith, 2020).

There is a further provision in section 24 of TICSA that places both network operators and service providers (defined as anyone, whether in New Zealand or not, who provides a communications service to an end user in New Zealand) under obligations to provide ‘reasonable’ assistance to surveillance agencies with interception warrants or lawful interception authorities, including the decryption of communications, where they are the source of the encryption. Such companies do not have to decrypt encryption they have not provided nor “ensure that a surveillance agency has the ability to decrypt any telecommunication” (TICSA s 24(4)(b)). It is unclear what “reasonable assistance” entails, and how that would apply to third party app providers such as WhatsApp (to which section 24 would prima facie apply but not section 10 in the absence of a section 38 decision). It is also unclear how this provision would be enforced against offshore companies (Dizon et al., 2019, pp. 74-75).

There are further provisions in the Search and Surveillance Act 2012 which affect encryption. Section 130 includes a requirement that “the user, owner, or provider of a computer system […] offer reasonable assistance to law enforcement officers conducting a search and seizure including providing access information” which could be used to force an individual or business to decrypt data and communications (Dizon et al., 2019, p. 61). There is a lack of clarity as to how the privilege against self-incrimination operates (Dizon et al., 2019, pp. 62-63). There is also a lack of clarity about what “reasonable assistance” from companies, which will likely be third parties, and not able to avail themselves of the protection against self-incrimination, may entail (Dizon et al., 2019, pp. 65-66).

New Zealand has human rights protections enshrined in its Bill of Rights Act 1990, and section 21 contains the right to be secure against unreasonable searches and seizures. However, it “does not have higher law status and so can be overridden by contrary legislation…but there is at least some effort to avoid inconsistencies” (Keith, 2020). There is also the privilege against self-incrimination, “the strongest safeguard available in relation to encryption as it works to prevent a person from being punished for refusing to provide information that could lead to criminal liability” (Dizon et al., 2019, p. 7). There is no freestanding right to privacy in the New Zealand Bill of Rights, and so aspects of privacy must be found via other recognised rights (Butler, 2013), or may be protected via data protection legislation and New Zealand courts’ “relatively strong approach to unincorporated treaties, including human rights obligations” (Keith, 2020).

Despite being part of the FVEY communiques on encryption mentioned below, Keith (2020) views New Zealand’s domestic approach as more “cautious or ambivalent”, with “no proposal to follow legislation enacted by other Five Eyes countries”.

United Kingdom

The most significant law is the UK’s Investigatory Powers Act 2016 (henceforth IPA). 7 Section 253 allows a government minister, subject to approval by a 'Judicial Commissioner', to issue a ‘Technical Capability Notice’ (TCN) to any communications operator (which includes telecommunications companies, internet service providers, email providers, social media platforms, cloud providers and other ‘over-the-top’ services), whether UK-based or anywhere else in the world, imposing obligations on that provider. Such an obligation can include the operator having to remove “electronic protection applied by or on behalf of that operator to any communications or data”. The government minister must also consider technical practicalities such as whether it is ‘practicable’ to impose requirements on operators, and for the operators to comply. Section 254 provides that Judicial Commissioners conduct a necessity and proportionality test before approving a TCN. In practice, a provider receiving such a TCN would not be able to provide end-to-end encryption for its customers, and would have to ensure there is a method of decrypting communications. In other words, the provider must centrally manage encryption and maintain the decryption key (Smith, 2017a).

In November 2017, the UK Home Office released a Draft Communications Data Code of Practice for consultation, which clarified that a TCN would not require a telecommunications operator to remove encryption per se, but “it requires that operator to maintain the capability to remove encryption when subsequently served with a warrant, notice or authorisation” (UK Home Office, 2017, p. 75). Furthermore, it was reiterated that an obligation to remove encryption can only be imposed where “reasonably practicable” for the communications provider to comply with, and the obligation can only pertain to encryption that the communications provider has itself applied, or in circumstances when this has been done, for example, by a contractor on the provider’s behalf.

Later, in early 2018, after analysing responses to the Draft Code, the UK Home Office introduced draft administrative regulations to the UK Parliament, which were passed in March 2018. These regulations affirm the Home Office’s previous statements that TCNs require that operators “maintain the capacity” to disclose communications data on receipt of an authorisation or warrant, and such notices can only impose obligations on telecommunications providers to remove “electronic protection” applied by, or on behalf of, the provider “where reasonably practicable” (Ni Loideain, 2019, p. 186). This would seem to entail that encryption methods applied by the user are not covered by this provision (Smith, 2017b). However, Keenan (2019) argues that the regulations may “compel […] operators to facilitate the ‘disclosure’ of content by targeting authentication functions” which may have the effect of secretly delivering messages to law enforcement.

While some of the issues identified above with the UK’s TCNs may be clarified by these regulations, other issues remain. For example, the situation remains unclear for a provider wanting to offer end-to-end encryption to its customers without holding the means to decrypt them. Practical questions remain about how the provisions can be enforced against providers which may not be geographically based in the UK, such as technology companies and platforms which may or may not maintain offices in the UK. To date, there is also no public knowledge of whether any TCNs have been made, approved by Judicial Commissioners, and complied with by operators (Keenan, 2019).

In addition to TCNs, section 49 of the Regulation of Investigatory Powers Act (2000) (RIPA) allows law enforcement agencies in possession of a device to issue a notice to the device user or device manufacturer to compel them to unlock encrypted devices or networks (Keenan, 2019). The law enforcement officer must obtain permission from a judge on the grounds that it is “necessary in the interests of national security, for the purpose of preventing or detecting crime, or where it is in the interest of the economic well-being of the United Kingdom” (Keenan, 2019). Case law on section 49 notices in criminal matters has generally not found the provision’s use to force decryption to violate the privilege against self-incrimination, in sharp distinction to the US experience (Keenan, 2019).

It is unclear whether these provisions would withstand a challenge before the European Court of Human Rights on the basis of incompatibility with ECHR rights, especially Article 6 (right to a fair trial) and Article 8 (right to privacy).

Australia

In Australia the encryption debate commenced in June 2017 when then-Australian Prime Minister Turnbull (in)famously stated that “the laws of mathematics are very commendable, but the only law that applies in Australia is the law of Australia” (Pearce, 2017, para. 8). This remark, interpreted colloquially as a ‘war on maths’ (Pearce, 2017), gestured at an impending legislative proposal that would introduce provisions to weaken end-to-end encryption.

In August 2018, the Five Eyes Alliance met in a ‘Five Country Ministerial’ (FCM) and issued a communique that stated: “We agreed to the urgent need for law enforcement to gain targeted access to data, subject to strict safeguards, legal limitations, and respective domestic consultations” (Australian Government Department of Home Affairs, 2018, para. 18). The communique was accompanied by a Statement of Principles on Access to Evidence and Encryption, assented to by all FVEY governments (Australian Government Department of Home Affairs, 2018). The statement affirmed the important but non-absolute nature of privacy, and signalled a “pressing international concern” posed by law enforcement’s inability to access encrypted content. FVEY partners also agreed to abide by three principles in the statement: mutual responsibility; the paramount status of rule of law and due process; and freedom of choice for lawful access solutions. “Mutual responsibility” relates to industry stakeholders being responsible for providing access to communications data. The “freedom of choice” principle relates to FVEY members encouraging service providers to “voluntarily establish lawful access solutions to their products and services that they create or operate in our countries”, with the possibility of governments “pursu[ing] technological, enforcement, legislative or other measures to achieve lawful access solutions” if they “continue to encounter impediments to lawful access to information” (Australian Government Department of Home Affairs, 2018, paras. 34-35).

In the month following this meeting, the Australian government introduced what became the Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018 (Cth) (or ‘AA Act’), which was subsequently passed by the Australian Parliament in December 2018. The Act amends pre-existing surveillance legislation in Australia, including the Telecommunications Act 1997 (Cth) and the Telecommunications (Interception and Access) Act 1979 (Cth). It includes a series of problematic reforms that have extraterritorial reach beyond the Australian jurisdiction. 8

Specifically, three new mechanisms, which seem (at least at face value) to be inspired by the UK’s IPA, are introduced into the Telecommunications Act: Technical Assistance Requests (TARs), 9 Technical Assistance Notices (TANs) 10 and Technical Capability Notices (TCNs). 11 TARs can be issued by Australian security agencies 12 that may “ask the provider to do acts or things on a voluntary basis that are directed towards ensuring that the provider is capable of giving certain types of help.” 13 TARs escalate to TANs, which compel assistance and impose penalties for non-compliance. The Australian Attorney-General can also issue TCNs which “may require the provider to do acts or things directed towards ensuring that the provider is capable of giving certain types of help” or to actually do such acts and things.

While the language of the Australian TCNs is similar to that of the UK IPA, there is a much longer and more broadly worded list of “acts or things” that a provider can be asked to do on receipt of a TCN. 14 Although “systemic weaknesses” cannot be introduced as per section 317ZG, 15 there is still a significant potential impact on the security and privacy of encrypted communications. An important distinction between the Australian and UK TCNs is that the Australian notices are issued by the executive and are not subject to judicial oversight (Table 1).

The AA Act has extraterritorial reach beyond Australia in two main ways. The first is via obligations imposed on “designated communications providers” located outside Australia. “Designated communications providers” is defined extremely broadly to include, inter alia, carriers, carriage service providers, intermediaries and ancillary service providers, and any provider of an “electronic service” with any end-users in Australia, or of software likely to be used in connection with such a service that has any end-users in Australia. It includes any “constitutional corporation” 16 that manufactures, installs, maintains or supplies devices for use, or likely to be used, in Australia, or develops, supplies or updates software that is capable of being installed on a computer or device that is likely to be connected to a telecommunications network in Australia (Ford & Mann, 2019). Thus a very wide range of providers from Australia and overseas will fall within these definitions (McGarrity & Hardy, 2020). Failure to comply with notices may result in financial penalties for companies, yet it is not clear how such penalties may be enforced vis-à-vis companies which are not incorporated or located in Australia. Where a TAR is issued, the Act provides designated communications providers with civil immunity 9 from damages that may arise from the request (for example, rendering phones or devices useless), which may incentivise compliance prior to escalation to an enforceable TAN or TCN (Ford & Mann, 2019).

The second aspect of the AA Act’s extraterritorial reach is the provision of assistance by Australian law enforcement to their counterparts via the enforcement of foreign laws. The TARs, TANs, and TCNs all involve “assisting the enforcement of the criminal laws of a foreign country, so far as those laws relate to serious foreign offences”. 17 This is also reinforced by further amendments to the Mutual Assistance in Criminal Matters Act 1987 (Cth) that bypass MLAT processes, and provide a conduit to the extraterritorial application of Australia’s surveillance laws. That is, Australian law enforcement agencies are able to assist foreign governments through their requests for Australian assistance, including in the form of accessing encrypted communications and/or designing new ways to access encrypted communications (as per TCNs), for the enforcement of their own criminal laws. 18 This may operate as a loophole through which foreign law enforcement agencies circumvent their own legal system’s safeguards and capitalise on Australia’s lack of a federal human rights framework (Ford & Mann, 2019).

Table 1: Overview of anti-encryption measures in each FVEY country

United States
Relevant law/s: Communications Assistance for Law Enforcement Act § 1002.
Entities targeted: Applies only to “telecommunications companies”.
Statutory obligations imposed on target: Companies will not be required to decrypt or ensure that the government can decrypt communications encrypted by customers, unless the provider itself has provided the encryption used.
Human rights protections: US Constitution, notably the Fourth and Fifth Amendments; also the First Amendment in terms of cryptographic code as a possible form of protected free speech.
Approval mechanisms for encryption powers’ exercise: N/A.
Extraterritorial application: Does not apply extraterritorially.
Relevant court cases: Apple-FBI.

Canada
Relevant law/s: No specific legislation that provides authorities the power to compel decryption; narrow obligation in the Solicitor General Enforcement Standards for Lawful Interception of Telecommunications (SGES) Standard 12.
Entities targeted: Applies only to “wireless communication providers”.
Statutory obligations imposed on target: Providers must decrypt any communications they have encrypted themselves on receiving a lawful request. Appears not to apply to end-to-end encryption not applied by the provider.
Human rights protections: Canadian Charter of Rights and Freedoms: section 2 (freedom of expression), section 7 (security of the person), section 8 (right against unreasonable search and seizure), and the right to silence and protection from self-incrimination contained in sections 7, 11 and 14.
Approval mechanisms for encryption powers’ exercise: Minister of Public Safety (executive branch).
Extraterritorial application: Does not apply extraterritorially.
Relevant court cases: R v Mirarchi.

New Zealand
Relevant law/s: Telecommunications (Interception Capability and Security) Act 2013, sections 10 and 24.
Entities targeted: Section 10 applies to “network operators”; section 24 applies to “network operators” and “service providers”.
Statutory obligations imposed on target: Operators, on receipt of a lawful request to provide interception, must decrypt encrypted communications carried by their network if the operator has provided the means of encryption (s 10). Operators and providers must provide “reasonable” assistance to surveillance agencies with interception warrants or lawful interception authorities, including the decryption of communications when they have provided the encryption (s 24).
Human rights protections: Human Rights Act 1993.
Approval mechanisms for encryption powers’ exercise: Powers subject to interception warrants or other lawful interception authority; “indirect” judicial supervision (Keith, 2020).
Extraterritorial application: Section 10 does not apply extraterritorially unless a section 38 decision is made; section 24 applies to both NZ providers and foreign providers providing a service to any end-user in NZ.
Relevant court cases: None known.

United Kingdom
Relevant law/s: Investigatory Powers Act 2016, section 253.
Entities targeted: Any “communications operator” (which includes telecoms companies, internet service providers, email providers, social media platforms, cloud providers and other ‘over-the-top’ services).
Statutory obligations imposed on target: Operators are obliged to do certain things, which can include the removal of “electronic protection applied by or on behalf of that operator to any communications or data”. It is unclear whether a provider receiving a TCN would be able to provide end-to-end encryption for its customers.
Human rights protections: Human Rights Act 1998; European Convention on Human Rights.
Approval mechanisms for encryption powers’ exercise: Approval by Judicial Commissioner.
Extraterritorial application: Applies to both UK-based and foreign-based communications operators.
Relevant court cases: None known.

Australia
Relevant law/s: Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018 (Cth), section 317A.
Entities targeted: The definition of “designated communications provider” is set out in section 317C. It includes but is not limited to “a carrier or carriage service provider”, a person who “provides an electronic service that has one or more end-users in Australia”, or a person who “manufactures or supplies customer equipment for use, or likely to be used, in Australia”.
Statutory obligations imposed on target: Providers may be issued with Technical Assistance Requests (TARs), Technical Assistance Notices (TANs) and/or Technical Capability Notices (TCNs). TARs can be issued by Australian security agencies that may “ask the provider to do acts or things on a voluntary basis that are directed towards ensuring that the provider is capable of giving certain types of help.” TARs escalate to TANs, which compel assistance and impose penalties for non-compliance. The Australian Attorney-General can also issue TCNs which “may require the provider to do acts or things directed towards ensuring that the provider is capable of giving certain types of help” or to actually do such acts and things.
Human rights protections: No comprehensive protection at the federal level; no right to privacy in the Australian Constitution.
Approval mechanisms for encryption powers’ exercise: Approval by administrative or executive officer (TCNs are approved by the Attorney-General). If a warrant or authorisation was previously required for the activity, it is still required after these reforms.
Extraterritorial application: Applies to both Australian and foreign-based providers; providers can receive notices to assist with the enforcement of foreign criminal laws.
Relevant court cases: Not applicable.

Discussion

The recent legislative developments in Australia position it as a leading actor in the ongoing calls for a broader set of measures to weaken or undermine encryption. The AA Act introduces wide powers for Australian law enforcement and security agencies to request, or mandate assistance in, communications interception from a wide category of communications providers, internet and equipment companies, both in Australia and overseas, and permits foreign agencies to make requests to Australian agencies to use these powers in the enforcement of foreign laws. Compared to the other FVEY jurisdictions’ laws in Table 1, the AA Act’s provisions cover the broadest category of providers and companies, to do the broadest category of assistance acts, with the weakest oversight mechanisms and no protections for human rights.

Australia’s AA Act also gives these provisions the broadest and most significant extraterritorial reach of the FVEY equivalents. While New Zealand and the UK also extend their assistance obligations to foreign entities, Australia’s AA Act goes further by providing assistance to foreign law enforcement agencies. This is a highly worrying development since the AA Act facilitates the paradoxical enforcement of foreign criminal laws and circumvention of foreign human rights protections on behalf of foreign law enforcement agencies, through, inter alia, the coercion of transnational technology companies into designing new ways of undermining encryption at a global scale via Australian law in the form of TCNs.

The idea of jurisdiction shopping by FVEY law enforcement agencies may be applicable, whereby Australia has enacted powers that have extraterritorial consequence, and that could operate to serve the wider FVEY alliance, especially given the lack of judicial oversight of TCNs, and Australia’s weak human rights protections. Jurisdiction shopping concerns the strategic pursuance of legislative, policy and operational objectives in specific venues to achieve outcomes that may not be possible in other venues due to the local context. 19

The AA Act provisions expand legally permissible extraterritorial measures to obtain encrypted communications, and in theory, this enables FVEY partners to ‘jurisdiction shop’ to exploit the lack of human rights protections in Australia. This is not the first time Australia has been an attractive jurisdiction shopping destination. One previous example relates to Operation Artemis run by the Queensland Police where a website used for the dissemination of child exploitation material was relocated to Australian servers so that police could engage in a controlled operation and commit criminal offences (including the dissemination of child exploitation material) without criminal penalty (Høydal, Stangvik, & Hansen, 2017; McInnes, 2017). 20

Australia emerges as a strategic forum for FVEY partners to implement new laws and powers with extraterritorial reach, as unlike other FVEY members, Australia has no meaningful human rights protections that would prevent gross invasions arising from measures that undermine encryption, coupled with weak oversight mechanisms (McGarrity & Hardy, 2020). These considerations also relate to the pre-existing use of ‘regulatory arbitrage’ by FVEY members, which involves information being legally accessed and intercepted in one of the FVEY countries with weaker human rights protection, then being transferred and used in other FVEY countries with more restrictive legal frameworks (Citron & Pasquale, 2010). This situation may allow for authorisation for extraterritorial data gathering to, in effect, be funnelled through the ‘weak link’ of Australia. Thus, the AA Act presents an opportunity for FVEY partners to engage in further regulatory arbitrage by jurisdiction shopping their requests to access encrypted communications and to mandate designated communications providers (i.e. transnational technology companies) design and develop new ways to access encrypted communications via Australia.

However, it is difficult to ascertain the extent to which the FVEY partners are indeed exploiting the Australian ‘weak link’, for two reasons. First, the FVEY alliance operates in a highly secretive manner. Second, the AA Act severely restricts transparency, via the introduction of secrecy provisions and enhanced penalties for unauthorised disclosure, and an absence of judicial authorisation of the exercise of the powers (Table 1). There is very limited ex-post aggregated public reporting of the exercise of the powers. One of these few mechanisms is the Australian Department of Home Affairs annual report on the operation of the Telecommunications (Interception and Access) Act 1979 (Cth). The 2018-2019 report stated that seven TARs were issued, five to the Australian Federal Police and two to the New South Wales Police. Cybercrime and telecommunications offences were the two most common categories of crimes for which the TARs were issued, with the notable absence of any terrorism offences - the main rationale supporting the introduction of the powers. In the Australian Senate Estimates process in late 2019, it was revealed that the TAR powers had been used on a total of 25 occasions up to November 2019 (Sadler, 2020a). 21 The fact that only TARs have been issued may indicate that designated communications providers are complying with requests in the first instance, and thus there is no need to escalate to enforceable notices.

One possible, and as yet unresolved, countervailing development to the AA Act in the FVEY countries concerns the US introduction of the Clarifying Lawful Overseas Use of Data (CLOUD) Act, which aims to facilitate US and foreign law enforcement access to data held by US-based communications providers in criminal investigations, bypassing MLAT procedures (Abraha, 2019; see also Gstrein, 2020, this issue; Vazquez Maymir, 2020, this issue). Bilateral negotiations regarding mechanisms for accessing (via US technology companies) and sharing e-evidence under the CLOUD Act between the US and Australia are underway, and there have been some early questions and debates (Bogle, 2019; Hendry, 2020) as to whether Australia will comply with CLOUD requirements. Specifically, the CLOUD Act allows “foreign partners that have robust protections for privacy and civil liberties to enter into executive agreements with the United States to use their own legal authorities to access electronic evidence” (Department of Justice, n.d.). CLOUD agreements between the US and foreign governments should not include any obligations forcing communications providers to maintain data decryption capabilities nor should they include any obligation preventing providers from decrypting data. 22 It is uncertain whether Australia would comply with CLOUD requirements given its aforementioned weak human rights framework, and the absence of judicial oversight for the authorisation of the anti-encryption powers.

These concerns seem to have motivated the current Australian opposition party, Labor, to introduce a private member’s bill into the Australian Parliament in late 2019 to ‘fix’ some aspects of the AA Act, despite its bipartisan support for the law’s passage at the end of 2018. Notable fixes sought include the introduction of enhanced safeguards, including judicial oversight and clarification that TARs, TANs, and TCNs cannot be used to force providers to build systemic weaknesses and vulnerabilities in their systems, including implementing or building a new decryption capability. At the time of writing, the Australian Parliament is considering the bill, although it is unlikely it will be passed given the government has indicated it will vote down Labor’s proposed amendments (Sadler, 2020b).

Conclusion

Laws to restrict encryption occur in the context of regulatory arbitrage (Citron & Pasquale, 2010). This paper has analysed new powers that allow Australian law enforcement and security agencies to request or mandate assistance in accessing encrypted communications, and that permit foreign agencies to make requests to Australian agencies to use these powers in the enforcement of foreign laws, taking advantage of a situation where there is less oversight and fewer human rights or constitutional protections. The AA Act presents new opportunities for FVEY partners to leverage access to (encrypted) communications via Australia’s ‘legal backdoors’, which may undermine protections that might otherwise exist within local legal frameworks. This represents a troubling international development for privacy and information security.

Acknowledgements

The authors would like to acknowledge Dr Kayleigh Murphy for her excellent research assistance and the Computer Security and Industrial Cryptography (COSIC) Research Group at KU Leuven, the Law Science Technology Society (LSTS) Research Group at Vrije Universiteit Brussel, and the Department of Journalism in Maria Curie-Skłodowska University (Lublin, Poland) for the opportunity to present and receive feedback on this research. Finally, we thank Tamir Israel, Martin Kretschmer, Balázs Bodó, and Frédéric Dubois for their comprehensive peer-review comments and editorial review.

References

Abelson, H., Anderson, R., Bellovin, S. M., Benaloh, J., Blaze, M., Diffie, W., Gilmore, J., Green, M., Landau, S., Neumann, P. G., Rivest, R. L., Schiller, J. I., Schneier, B., Specter, M. A., & Weitzner, D. J. (2015). Keys under doormats: Mandating insecurity by requiring government access to all data and communications. Journal of Cybersecurity, 1(1), 69–79. https://doi.org/10.1093/cybsec/tyv009

Abraha, H. H. (2019). How Compatible is the US ‘CLOUD’ Act’ with Cloud Computing? A Brief Analysis. International Data Privacy Law, 9(3), 207–215. https://doi.org/10.1093/idpl/ipz009

Australian Constitution. https://www.aph.gov.au/about_parliament/senate/powers_practice_n_procedures/constitution

Australian Government Department of Home Affairs. (2018). Five country ministerial 2018. Australian Government Department of Home Affairs. https://www.homeaffairs.gov.au/about-us/our-portfolios/national-security/security-coordination/five-country-ministerial-2018

Australian Government, Department of Home Affairs. (2019). Telecommunications (Interception and Access) Act 1979: Annual Report 2018-19 [Report]. Australian Government, Department of Home Affairs. https://parlinfo.aph.gov.au/parlInfo/download/publications/tabledpapers/c424e8ec-ce9a-4dc1-a53e-4047e8dc4797/upload_pdf/TIA%20Act%20Annual%20Report%202018-19%20%7BTabled%7D.pdf;fileType=application%2Fpdf#search=%22publications/tabledpapers/c424e8ec-ce9a-4dc1-a53e-4047e8dc4797%22

Beagle, T. (2017, July 2). Why we support effective encryption [Blog post]. NZ Council for Civil Liberties. https://nzccl.org.nz/content/why-we-support-effective-encryption

Bell, S. (2013, November 25). Court rebukes CSIS for secretly asking international allies to spy on Canadian suspects travelling abroad. The National Post. https://nationalpost.com/news/canada/court-rebukes-csis-for-secretly-asking-international-allies-to-spy-on-canadian-terror-suspects

Bogle, A. (2019, October 31). Police want Faster Data from the US, but Australia’s Encryption Laws Could Scuttle the Deal. ABC News. https://www.abc.net.au/news/science/2019-10-31/australias-encryption-laws-could-scuttle-cloud-act-us-data-swap/11652618

Brewster, T. (2018, February 26). The Feds Can Now (Probably) Unlock Every iPhone Model in Existence. Forbes. https://www.forbes.com/sites/thomasbrewster/2018/02/26/government-can-access-any-apple-iphone-cellebrite/#76a735e8667a

Butler, P. (2013). The Case for a Right to Privacy in the New Zealand Bill of Rights Act. New Zealand Journal of Public & International Law, 11(1), 213–255.

Citron, D. K., & Pasquale, F. (2010). Network Accountability for the Domestic Intelligence Apparatus. Hastings Law Journal, 62, 1441–1494. https://digitalcommons.law.umaryland.edu/fac_pubs/991/

Comey, J. B. (2014). Going Dark: Are Technology, Privacy, and Public Safety on a Collision Course? Federal Bureau of Investigation. https://www.fbi.gov/news/speeches/going-dark-are-technology-privacy-and-public-safety-on-a-collision-course

Constitution Act, 1982. https://laws-lois.justice.gc.ca/eng/const/page-15.html

Cook Barr, A. (2016). Guardians of Your Galaxy S7: Encryption Backdoors and the First Amendment. Minnesota Law Review, 101(1), 301–339. https://minnesotalawreview.org/article/note-guardians-of-your-galaxy-s7-encryption-backdoors-and-the-first-amendment/

Cooper, S. (2018). An Analysis of New Zealand Intelligence and Security Agency Powers to Intercept Private Communications: Necessary and Proportionate? Te Mata Koi: Auckland University Law Review, 24, 92–120.

Daly, A. (2017). Covering up: American and European legal approaches to public facial anonymity after SAS v. France. In T. Timan, B. C. Newell, & B.-J. Koops (Eds.), Privacy in Public Space: Conceptual and Regulatory Challenges (pp. 164–183). Edward Elgar.

Daly, A., & Thomas, J. (2017). Australian internet policy. Internet Policy Review, 6(1). https://doi.org/10.14763/2017.1.457

Department of Justice. (n.d.). Frequently Asked Questions. https://www.justice.gov/dag/page/file/1153466/download

Dizon, M., Ko, R., Rumbles, W., Gonzalez, P., McHugh, P., & Meehan, A. (2019). A Matter of Security, Privacy and Trust: A study of the principles and values of encryption in New Zealand [Report]. New Zealand Law Foundation and University of Waikato.

Ford, D., & Mann, M. (2019). International Implications of the Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018. Australian Privacy Foundation. https://privacy.org.au/wp-content/uploads/2019/06/APF_AAAct_FINAL_040619.pdf

Froomkin, D. (2015). U.N. Report Asserts Encryption as a Human Right in the Digital Age. The Intercept. https://theintercept.com/2015/05/28/u-n-report-asserts-encryption-human-right-digital-age/

Gill, L. (2018). Law, Metaphor and the Encrypted Machine. Osgoode Hall Law Journal, 55(2), 440–477. https://doi.org/10.2139/ssrn.2933269

Gill, L., Israel, T., & Parsons, C. (2018). Shining a Light on the Encryption Debate: A Canadian Fieldguide [Report]. Citizen Lab; The Canadian Internet Policy & Public Interest Clinic. https://citizenlab.ca/2018/05/shining-light-on-encryption-debate-canadian-field-guide/

Global Partners Digital. (n.d.). World Map of Encryption Law and Policies. https://www.gp-digital.org/world-map-of-encryption/

Gonzalez, O. (2019). Cracks in the Armor: Legal Approaches to Encryption. Journal of Law, Technology & Policy, 2019(1), 1–46. http://illinoisjltp.com/journal/wp-content/uploads/2019/05/Gonzalez.pdf

Gstrein, O. (2020). Mapping power and jurisdiction on the internet through the lens of government-led surveillance. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1497

Hendry, J. (2020, January 14). Home Affairs Rejects Claims Anti-Encryption Laws Conflict with US CLOUD Act. IT News. https://www.itnews.com.au/news/home-affairs-rejects-claims-anti-encryption-laws-conflict-with-us-cloud-act-536339

Holyoke, T., Brown, H., & Henig, J. (2012). Shopping in the Political Arena: Strategic State and Local Venue Selection by Advocates. State and Local Government Review, 44(1), 9–20. https://doi.org/10.1177/0160323X11428620

Høydal, H. F., Stangvik, E. O., & Hansen, N. R. (2017, October 7). Breaking the Dark Net: Why the Police Share Abuse Pics to Save Children. VG. https://www.vg.no/spesial/2017/undercover-darkweb/?lang=en

European Convention on Human Rights. (2010). https://www.echr.coe.int/Documents/Convention_ENG.pdf

Investigatory Powers Act 2016 (UK), Pub. L. No. 2016 c. 25 (2016). http://www.legislation.gov.uk/ukpga/2016/25/contents/enacted

Investigatory Powers (Technical Capability) Regulations 2018 (UK). http://www.legislation.gov.uk/ukdsi/2018/9780111163610/contents

Keenan, B. (2019). State access to encrypted data in the United Kingdom: The ‘transparent’ approach. Common Law World Review. https://doi.org/10.1177/1473779519892641

Keith, B. (2020). Official access to encrypted communications in New Zealand: Not more powers but more principle? Common Law World Review. https://doi.org/10.1177/1473779520908293

Telecommunications Amendment (Repairing Assistance and Access) Bill 2019 (introduced by Senator Kristina Keneally). https://parlinfo.aph.gov.au/parlInfo/download/legislation/bills/s1247_first-senate/toc_pdf/19S1920.pdf;fileType=application%2Fpdf

Koops, B.-J. (1999). The Crypto Controversy: A Key Conflict in the Information Society. Kluwer Law International.

Koops, B.-J., & Kosta, E. (2018). Looking for some light through the lens of “cryptowar” history: Policy options for law enforcement authorities against “going dark”. Computer Law & Security Review, 34(4), 890–900. https://doi.org/10.1016/j.clsr.2018.06.003

Ley, A. (2016). Vested Interests, Venue Shopping and Policy Stability: The Long Road to Improving Air Quality in Oregon’s Willamette Valley. Review of Policy Research, 33(5), 506–525. https://doi.org/10.1111/ropr.12190

Lyon, D. (2014). Surveillance, Snowden, and Big Data: Capacities, Consequences, Critique. Big Data & Society, 1(2). https://doi.org/10.1177/2053951714541861

Lyon, D. (2015). Surveillance After Snowden. Polity Press.

Mann, M., & Daly, A. (2019). (Big) data and the north-in-south: Australia’s informational imperialism and digital colonialism. Television and New Media, 20(4), 379–395. https://doi.org/10.1177/1527476418806091

Mann, M., Daly, A., Wilson, M., & Suzor, N. (2018). The Limits of (Digital) Constitutionalism: Exploring the Privacy-Security (Im)Balance in Australia. International Communication Gazette, 80(4), 369–384. https://doi.org/10.1177/1748048518757141

McGarrity, N., & Hardy, K. (2020). Digital surveillance and access to encrypted communications in Australia. Common Law World Review. https://doi.org/10.1177/1473779520902478

McInnes, W. (2017, October 8). Queensland Police Take Over World’s Largest Child Porn Forum in Sting Operation. Brisbane Times. https://www.brisbanetimes.com.au/national/queensland/queensland-police-behind-worlds-largest-child-porn-forum-20171007-gywcps.html

Molnar, A. (2017). Technology, Law, and the Formation of (il)Liberal Democracy? Surveillance & Society, 15(3/4), 381–388. https://doi.org/10.24908/ss.v15i3/4.6645

Molnar, A., Parsons, C., & Zouave, E. (2017). Computer network operations and ‘rule-with-law’ in Australia. Internet Policy Review, 6(1). https://doi.org/10.14763/2017.1.453

Murphy, H., & Kellow, A. (2013). Forum Shopping in Global Governance: Understanding States, Business and NGOs in Multiple Arenas. Global Policy, 4(2), 139–149. https://doi.org/10.1111/j.1758-5899.2012.00195.x

Mutual Assistance in Criminal Matters Act 1987 Compilation No. 35, (2016). https://www.legislation.gov.au/Details/C2016C00952

Nagel, P. (2006). Policy Games and Venue-Shopping: Working the Stakeholder Interface to Broker Policy Change in Rehabilitation Services. Australian Journal of Public Administration, 65(4), 3–16. https://doi.org/10.1111/j.1467-8500.2006.00500a.x

New Zealand Bill of Rights Act 1990. http://www.legislation.govt.nz/act/public/1990/0109/latest/DLM224792.html

Ni Loideain, N. (2019). A Bridge Too Far? The Investigatory Powers Act 2016 and Human Rights Law. In L. Edwards (Ed.), Law, Policy and the Internet (2nd ed., pp. 165–192). Hart.

Parsons, C. A., & Molnar, A. (2017). Horizontal Accountability and Signals Intelligence: Lesson Drawing from Annual Electronic Surveillance Reports. SSRN. http://dx.doi.org/10.2139/ssrn.3047272

Pearce, R. (2017, July 27). Australia’s War on Maths Blessed with Gong at Pwnie Awards. ComputerWorld. https://www.computerworld.com.au/article/625351/australia-war-maths-blessed-gong-pwnie-awards/

Pfefferkorn, R. (2020, January 30). The EARN IT Act: How to Ban End-to-End Encryption Without Actually Banning It [Blog post]. The Center for Internet and Society. https://cyberlaw.stanford.edu/blog/2020/01/earn-it-act-how-ban-end-end-encryption-without-actually-banning-it

Pralle, S. (2003). Venue Shopping, Political Strategy, and Policy Change: The Internationalization of Canadian Forest Advocacy. Journal of Public Policy, 23(3), 233–260. https://doi.org/10.1017/S0143814X03003118

Regulation of Investigatory Powers Act 2000, Pub. L. No. 2000 c. 23 (2000). http://www.legislation.gov.uk/ukpga/2000/23/contents

Roach, K. (2011). The 9/11 Effect: Comparative Counter-Terrorism. Cambridge University Press.

Sadler, D. (2020a, February 3). Encryption laws not used to fight terrorism [Blog post]. InnovationAus. https://www.innovationaus.com/encryption-laws-not-used-to-fight-terrorism/

Sadler, D. (2020b, February 14). No encryption fix until at least October [Blog post]. InnovationAus. https://www.innovationaus.com/no-encryption-fix-until-at-least-october/

Search and Surveillance Act, (2012). http://www.legislation.govt.nz/act/public/2012/0024/latest/DLM2136536.html

Smith, G. (2017, May 8). Back doors, black boxes and #IPAct technical capability regulations [Blog post]. Graham Smith’s Blog on Law, IT, the Internet and Online Media. http://www.cyberleagle.com/2017/05/back-doors-black-boxes-and-ipact.html

Smith, G. (2017, May 29). Squaring the circle of end to end encryption [Blog post]. Graham Smith’s Blog on Law, IT, the Internet and Online Media. https://www.cyberleagle.com/2017/05/squaring-circle-of-end-to-end-encryption.html

Solicitor General. (2008). Solicitor General’s Enforcement Standards for Lawful Interception of Telecommunications. https://perma.cc/NQB9-ZHPY

Suzor, N., Pappalardo, K., & McIntosh, N. (2017). The Passage of Australia’s Data Retention Regime: National Security, Human Rights, and Media Scrutiny. Internet Policy Review, 6(1). https://doi.org/10.14763/2017.1.454

Telecommunications Act, (1997). https://www.legislation.gov.au/Details/C2017C00179

Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018, Pub. L. No. 148 (2018). https://www.legislation.gov.au/Details/C2018A00148

Telecommunications (Interception Capability and Security) Act 2013 (NZ), (2013). http://www.legislation.govt.nz/act/public/2013/0091/22.0/DLM5177923.html

United Kingdom Home Office. (2017). Communications Data Draft Code of Practice. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/663675/November_2017_IPA_Consultation_-_Draft_Communications_Data_Code_of_Pract....pdf

US Telecommunications: Assistance Capability Requirements, 47 U.S.C. § 1002 (1994). https://www.law.cornell.edu/rio/citation/108_Stat._4280

Vazquez Maymir, S. (2020). Anchoring the Need to Revise Cross-Border Access to E-Evidence. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1495

West, L., & Forcese, C. (2020). Twisted into knots: Canada’s challenges in lawful access to encrypted communications. Common Law World Review. https://doi.org/10.1177/1473779519891597

Williams, G., & Reynolds, D. (2017). A charter of rights for Australia (4th ed.). NewSouth Press.

Wilson, M., & Mann, M. (2017, September 7). Police Want to Read Encrypted Messages, but They Already Have Significant Power to Access our Data. The Conversation. https://theconversation.com/police-want-to-read-encrypted-messages-but-they-already-have-significant-power-to-access-our-data-82891

Zaun, N., Roos, C., & Gülzau, F. (2016). Circumventing Deadlock Through Venue-shopping: Why there is more than just talk in US immigration politics in times of economic crisis. Journal of Ethnic and Migration Studies, 42(10), 1590–1609. https://doi.org/10.1080/1369183X.2016.1162356

Footnotes

1. The FVEY partnership is a comprehensive intelligence alliance formed after the Second World War, formalised under the UKUSA Agreement (see e.g., Mann & Daly, 2019).

2. Cth stands for Commonwealth, which means “federal” legislation, as distinct from state-level legislation.

3. At the state and territory level: Victoria, Queensland and the Australian Capital Territory have human rights laws; however, the surveillance powers examined in this article are subject to Commonwealth jurisdiction, rendering these state and territory based protections inapplicable. See: Charter of Human Rights and Responsibilities Act 2006 (Vic); Human Rights Act 2019 (QLD); Human Rights Act 2004 (ACT).

4. However, the draft EARN IT bill currently before the US Congress, if enacted, may impact negatively upon providers’ ability to offer end-to-end encrypted messaging. See Pfefferkorn (2020).

5. R v Mirarchi involved BlackBerry providing the Canadian police with a key which allowed them to decrypt one million BlackBerry messages (Gill, Israel, & Parsons, 2018, pp. 57-58). The legal basis and extent of BlackBerry’s assistance to the Canadian police was unclear from the ‘heavily redacted’ court records (West & Forcese, 2020).

6. For a full picture of New Zealand legal provisions which may affect encryption see Dizon et al. (2019).

7. For additional provisions in UK law which may be relevant to encryption see Keenan (2019).

8. The analysis presented here focuses on Schedule 1 of the AA Act. Schedule 2 of the AA Act introduces computer access warrants that allow law enforcement to covertly access and search devices, and to conceal the fact that devices have been accessed.

9. S 317G.

10. S 317L.

11. S 317T.

12. Namely ‘the Director‑General of Security, the Director‑General of the Australian Secret Intelligence Service, the Director‑General of the Australian Signals Directorate or the chief officer of an interception agency’.

13. Namely ‘ASIO, the Australian Secret Intelligence Service, the Australian Signals Directorate or an interception agency’.

14. For example, “removing one or more forms of electronic protection that are or were applied by, or on behalf of, the provider”, “installing, maintaining, testing or using software or equipment” and “facilitating or assisting access to… a facility, customer equipment, electronic services and software” are included in the list of ‘acts or things’ that a provider may be asked to do via these provisions. The complete list of ‘acts or things’ is listed in section 317E.

15. According to AA Act s 317B a systemic vulnerability means “a vulnerability that affects a whole class of technology, but does not include a vulnerability that is selectively introduced to one or more target technologies that are connected with a particular person” and a systemic weakness means “a weakness that affects a whole class of technology, but does not include a weakness that is selectively introduced to one or more target technologies that are connected with a particular person.”

16. A category which, according to paragraph 51(xx) of the Australian Constitution, comprises “foreign corporations, and trading or financial corporations formed within the limits of the Commonwealth”.

17. S 317A; Table 1.

18. AA Act s 15CC(1); Surveillance Devices Act 2004 (Cth) ss 27A(4) and (4)(a).

19. Analyses of policy venue shopping have been conducted in relation to a range of policy areas, inter alia, immigration, environmental, labour, intellectual property, and rehabilitation policies (see e.g., Ley, 2016; Holyoke, Brown, & Henig, 2012; Pralle, 2003; Zaun, Roos, & Gülzau, 2016; Nagel, 2006; Murphy & Kellow, 2013). According to Pralle (2003, p. 233) a central “component of any political strategy is finding a decision setting that offers the best prospects for reaching one’s policy goals, an activity referred to as venue shopping”. Further, Murphy and Kellow (2013, p. 139) argue that policy venue shopping may be a political strategy deployed at global levels where “entrepreneurial actors take advantage of ‘strategic inconsistencies’ in the characteristics of international policy arenas”.

20. A further example that demonstrates regulatory arbitrage between FVEY members from the perspective of Canada, brought to light in 2013, involved Canada’s domestic security intelligence service (CSIS) being found by the Federal Court to have ‘breached duty of candour’ by not disclosing its leveraging of FVEY networks when it applied for warrants during an international terrorism investigation involving two Canadian suspects (Bell, 2013).

21. It should be noted that, due to the overlapping time frames and the aggregated nature of reporting, the 25 occasions on which the powers were used may also include some of the 7 occasions reported in the most recent Home Affairs annual report.

22. CLOUD Act s 105 (b) (3). Note: The US Department of Justice claims the CLOUD Act is “encryption neutral” in that “neither does it prevent service providers from assisting in such decryption, or prevent countries from addressing decryption requirements in their own domestic laws.” (Department of Justice, n.d.)

Public and private just wars: Distributed cyber deterrence based on Vitoria and Grotius

This paper is part of Geopolitics, jurisdiction and surveillance, a special issue of Internet Policy Review guest-edited by Monique Mann and Angela Daly.

Introduction

There is a growing tendency to apply just war theory to cyberspace (e.g., Finlay, 2018; Smith, 2018; Sleat, 2017; Thumfart, 2017; Smotherman, 2016; Giesen, 2014; Solis, 2014; Taddeo, 2014). This corresponds to three developments at the intersection of international relations and digital studies.

First: just war theory outlines a way in which individual states can secure international peace. It is therefore compatible with the declining relevance of traditional institutions of global governance and the emergence of a multi-polar world order.

Second: since they are not directly connected to military losses, relatively cheap to execute and difficult to attribute, cyber warfare and cyber attacks encourage offensive strategies (Taddeo, 2018a, p. 324; Lin, 2012, p. 521). When peace cannot be guaranteed through supremacy of defense, it has to be secured through deterrence. And since “deterrence means dissuading someone from doing something by making them believe that the costs to them will exceed their expected benefit” (Nye, 2017, p. 45), it depends on credibility (McKenzie, 2017, p. 2). Credibility, in turn, is achieved by the formalisation and legitimisation of sanctions, such as those offered by just war theory.
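
This cost-benefit logic can be made explicit as a minimal expected-utility condition; the notation is an illustrative sketch and is not taken from the sources cited here:

\[ p \cdot c > b \]

Here, b stands for the attacker’s expected benefit from an operation, c for the cost a sanction would impose on the attacker, and p for the attacker’s perceived probability of being attributed and sanctioned. Deterrence holds only as long as the attacker believes p · c to exceed b; formalising and legitimising sanctions raises the perceived p, which is precisely what makes deterrence credible.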

Third: just war theory might appear archaic. And rightly so. After all, it is an ethical, so to speak proto-legal, concept, meant to compensate for the absence of reliable legal standards. Although there is broad consensus that international law and the UN Charter in general apply to cyberspace, there is still no specific international legal framework concerning cyber attacks.

This lack of a legal framework, or even of basic orientation, primarily concerns jus ad bellum, i.e. the question of when a state has the right to retaliate proportionately against cyber attacks.

The Tallinn Manual 2.0, issued by the NATO Cooperative Cyber Defence Centre of Excellence to clarify the application of international law to cyber attacks, is explicitly undecided on the matter. On the one hand, it generally states that “the right to employ force in self-defence extends beyond kinetic armed attacks to those that are perpetrated solely through cyber operations.” And it expresses a broad consensus by stating that a “cyber operation that seriously injures or kills a number of persons or that causes significant damage to, or destruction of, property would satisfy the scale and effects requirement” (Schmitt and NATO, 2017, p. 340).

On the other hand, the manual expresses significant uncertainty concerning cyber aggressions without direct physical consequences, such as the manipulation of elections, stock markets or deliberative processes in the public sphere. “The case of cyber operations that do not result in injury, death, damage, or destruction, but that otherwise have extensive negative effects, remains unsettled” (Schmitt and NATO, 2017, p. 340f.).

Likewise, in 2018, the US State Department called for a “fundamental rethinking” of “strategies for deterring malicious cyber activities” (State Department, 2018). But it, too, failed to clarify which attack would trigger what response, a remarkable silence on this issue, since one of the world’s oldest and still most powerful democracies had just become a victim of Russian cyber operations interfering with its public sphere (Matishak, 2018).

Beyond the jus ad bellum of states, cyber attacks are atypical from a Westphalian perspective because they directly involve private actors: not just as aggressors, i.e. as independent hacker groups or state-sponsored proxies, but also as their primary victims.

Concerning private actors, too, the right to reprisals, i.e. “hack-back” and “active cyber defense”, has been debated, for example in the context of the proposed Active Cyber Defense Certainty Act discussed in the US Congress. This is all the more urgent because in cyberspace “the top tech companies appear to be as powerful as States, and sometimes even more so, to prevent cyber attacks, attribute them and to respond to malicious acts” (Bannelier and Christakis, 2017, p. 10; see also Gstrein, 2020, in this issue). And Facebook and Google have already engaged in active cyber defense without having been prosecuted by the US government (Glosson, 2015, p. 17; Huang, 2014, p. 1234).

Nevertheless, the Tallinn Manual unambiguously reads: “Only States may take countermeasures. For example, an information technology firm may not act on its own initiative in responding to a harmful cyber operation” (Schmitt and NATO, 2017, p. 130).

This contribution represents a realistic approach to cyber deterrence inasmuch as it focuses on two essential features that are usually omitted in the debate about just cyberwar and can be considered elephants in the room: public wars and private wars.

First, public wars: whilst the philosophically intriguing question of the physicality of cyber attacks still dominates theory, there seems to be a broad consensus in practice that it is irrelevant as long as cyber attacks have effects that are physically violent, as expressed in the Tallinn Manual 2.0 quoted above. A clear expression of this virtually global consensus is that cyber attacks with physically violent effects are extremely rare, also beyond NATO states. Cyber aggressions aimed at manipulating deliberative processes in the public sphere, by contrast, are much more attractive and frequent, precisely because of their unclear legal status. They are therefore a much more urgent object for a theory of just cyberwar.

To address these questions, in section 1, I will go back to the origin of modern just war theory and international law, to Francisco de Vitoria, who developed a framework of international law that understands communication as the highest normative value. Contrary to contemporary misconceptions of just war theory (Taddeo, 2014, p. 7), Vitoria’s original doctrine did not focus on the physicality of the means or the effects of an attack as casus belli, but rather on the gravity of the violation of rights connected to an attack, especially in regard to communication, which makes his theory relevant to contemporary problems.

In this context, I will also discuss Vitoria’s early answer to the attribution problem in section 2: the conduct of coercion-free and transparent multi-stakeholder discourses that minimise the risk of false attribution, which resembles the contemporary demand to constitute an independent global institution responsible for attributing cyber attacks (Davis et al., 2017).

In section 3, I will show how Vitoria’s concept of ius communicationis, the “right to communicate”, allows for proportionate sanctions of cyber attacks directed at the public sphere.

The second focus of this contribution lies on the question of private wars. The debate concerning just cyberwars so far omits the role of private actors. Not so the classics. In his earliest writing on international law, Grotius coined the term bellum iustum privatum, private just war. How companies could justify hack-back and active cyber defense with reference to this concept will be discussed in section 4.

If one accepts the right of companies to actively defend themselves in cyberspace, then one must also pose the question of how individual citizens could legitimately and actively defend their human right to privacy against intrusive state actors and companies 1. From a realistic long-term perspective of deterrence, a media- and digitally literate, critical and, in extreme cases, actively disobedient civil society poses no threat; on the contrary, states should encourage such “civilian-based deterrence” (Thumfart, 2011; Sharp, 1985). Private and public cyber deterrence capacities together make up a system of distributed deterrence that is much more realistic, effective and secure than state-based deterrence alone and makes “the proverbial square peg” of cyber deterrence fit into the “round hole” of deterrence theory (Taddeo, 2018a, p. 325).

Section 1: What constitutes an attack? A constructivist approach beyond the cyberspace physicality problem

Vitoria’s lectures are broadly recognised as the first teachings on international law and the first modern account of just war theory. His just war doctrine immediately leads to the question of physicality. Echoing the classical doctrine of vim vi repellere licet, Vitoria says: “There can be no doubt about the rights of defensive war, since it is lawful to resist force with force” (Vitoria, 1991, p. 297).

This long-lived tradition is also echoed by Article 51 of the UN Charter, which claims an “inherent right of collective or individual self-defense” as part of natural law, to be understood as an exception to the positive prohibition of the use of force in Article 2(4) (Christakis and Bannelier, 2017, p. 36).

Before the digital age, it would have been unnecessary to stress that “force” (Latin: vis) is a physical term. The physicality of cyberspace, in turn, is a heavily debated issue. Insisting on the mere virtuality of cyberspace in order to exempt it from the physical scope of law in general has been a common trope in the debate from the beginning, leading to conservatives’ fear of “cyber anarchy” (Goldsmith, 1999).

However, cyberspace is, of course, based on physical infrastructure. It is therefore wrong to state that “cyber attacks are nonphysical” (Smith, 2018, p. 222) or that they mark a “shift toward the non-physical domain” (Taddeo, 2012, p. 106). It has, for example, been argued that data has a weight, however minimal (Ray, 2011; Robinson, 2018). And it has been acknowledged in criminal law that the obtaining of e-evidence from servers abroad constitutes a physical intrusion into a different jurisdiction. “Contrary to the current metaphor often used by Internet-based service providers, digital information is not actually stored in clouds; it resides on a computer or some other form of electronic media that has a physical location” (In re Warrant to Search Target Computer at Premises Unknown, 958 F. Supp. 2d 753, S.D. Tex, 2013).

Nevertheless, for a long time, the dominant interpretation of the UN Charter's prohibition of the use of "force" in Article 2(4) applied this prohibition only to kinetic, i.e. non-digital attacks (Waxman, 2011, p. 45). As Orend puts it: “The gold standard of casus belli is a kinetic physical attack” (Orend, 2013, p. 176).

However, it is clear that cyber attacks, for instance attacks on transportation infrastructure or on factories, can have physical consequences. Take the Stuxnet attack: it could be considered a kinetic attack. Although it was primarily executed by non-kinetic means, it had kinetic consequences (Jenkins, 2013, p. 70). Therefore, a new consensus emerged that no longer focuses on the physicality of the means, but, rather, on the physically violent effects of an attack.

The Tallinn Manual 2.0 seems to express this latest consensus in its rule 69: “A cyber operation constitutes a use of force when its scale and effects are comparable to non-cyber operations rising to the level of a use of force” (Schmitt and NATO, 2017, p. 330). This seems to leave open the issue of attacks that produce effects that are not physically violent. “The case of cyber operations that do not result in injury, death, damage, or destruction, but that otherwise have extensive negative effects, remains unsettled” (Ibid., p. 342).

In another passage, however, the Tallinn Manual indicates that its authors do not understand the term “injury” in a physical way. “The term ‘injury’ is not to be understood to require damage. Instead, simple breach of an international legal obligation suffices to make proportionate countermeasures available to the injured State” (Ibid., p. 127).

In addition to the criteria of the physicality or non-physicality of means or effects, in the Tallinn Manual it is the criterion of coerciveness that is decisive regarding whether an action can be met with countermeasures. “The fact that cyber operations result in no physical consequences does not detract from their characterisation as a prohibited intervention. By way of contrast, a cyber operation that does not seek any change of conduct lacks the requisite coercive element” (Ibid., p. 318).

This of course raises the question of how to determine the coerciveness of an action. What is coercive to a small nation hardly affects a powerful nation. Or is the criterion of coerciveness only fulfilled if an explicit threat has been formulated, something along the lines of “we did A to you, so that you do B”? The criterion of coerciveness will be dealt with again later on (sections 2 and 3).

As Russia’s various interventions around the US elections in 2016 have shown (Matishak, 2018), the lack of clarity regarding the criteria needed to justify countermeasures has led to a reality in which attacks on the public sphere remain unsanctioned. This is because they cannot be easily placed within these three categories: physically violent means, physically violent effects or coerciveness. Similar attacks targeting elections, the freedom of the press or financial markets are plausible and must seem attractive to aggressors. Deterrence-oriented just war theory must find an answer to this problem.

For this undertaking, it is helpful to take a step back to the origins of modern just war doctrine in Vitoria’s lectures. Beyond the physicalism characteristic of Roman law and its vim vi repellere licet discussed above, his theory focuses on the effects that actions have on people’s rights, not just the physical means or effects of an attack.

This needs to be uncovered with the help of the original Latin text of Vitoria’s lectures. In the standard English translation, he clarifies the jus ad bellum as self-defense with the following words: “The sole and only just cause for waging war is when harm has been inflicted” (Vitoria, 1991, p. 303).

The word “harm” seems to indicate physicality. In the Latin original, however, it says iniuria, which is far better translated as “injustice” than as “harm”, as is the case in the German translation, which uses the term “Unrecht”, i.e. “injustice” (Vitoria, 1997, p. 558). In fact, in a decisive paragraph discussed below, the English translation also uses the more adequate term “offence” for iniuria (section 3).

In short: what is decisive for the legitimacy of just war is not the gravity of the action or its effects, but the gravity of the violation of a subject’s natural rights. We will later see that, due to this definition, Vitoria’s doctrine allows for sanctions against actions that do not include physical violence (section 3).

This whole problem becomes clearer with some philosophical remarks. Of course, the is/ought distinction would have been completely alien to a natural rights lawyer such as Vitoria. However, in terms of contemporary philosophy, it makes sense to apply it.

The physicality problem in the just cyberwar debate might be overall the wrong question to ask. It is yet another is-ought fallacy that characterises many positions in the debate about cyberspace, which continue to deduce normative claims from a supposed “nature” of cyberspace, typically along the lines of the equation: non-physical = without legal consequences (De Hert and Thumfart, 2018, p. 5).

This necessity to separate normative and descriptive claims has been, in part, understood by Jenkins, who denies that the physicality or non-physicality of the means used for an attack makes a “morally relevant difference” (Jenkins, 2013, p. 74). Similarly, Sleat dismisses the debate around the physicality of means by pointing out that just war theories ultimately concern “specific human actions and their effects on a particular group of other human beings”, and that, therefore, they must ultimately address this “human question”, i.e. categories that are relevant to normative rather than descriptive discourses: the impact of aggressions “on our projects and purposes” (Sleat, 2017, p. 333).

In Vitoria, the ultimate irrelevance of the question of physicality to just war is grounded in a thick communitarian ontology and anthropology, which he inherited from Aristotle and Aquinas. Vitoria takes the Aristotelian dictum that humans are political animals quite literally, inasmuch as he insists that a human being outside of the political community is not human, but an animal (Vitoria, 1995, p. 125).

This communitarian anthropology also offers a radically minimal and, anachronistically speaking, constructivist solution to the metaphysical problem of the existence of physicality independent from human beings. Even if there was “wisdom without speech” (sapientia [...] sine sermone), Vitoria says, such truth would be of no value because it would be “unedifying and unsociable” (ingrata et insociabilis esset ipsa sapientia) (Vitoria, 1995, p. 123).

In this sense, the question of physicality is not decisive. Rather, the issue has to be whether the existence of an object is communicated or not. In the same way, contemporary scholars have concluded that the issue of just cyberwar calls for a “constructivist approach to cyber deterrence emphasizing intersubjective understandings” (Lupovici, 2016, p. 328). A constructivist approach is one that does not conceive of empirical reality as something given independently of human discourse, but rather as something constructed by a discourse. Such an approach is especially useful in order to address the problem of how to determine and assess coerciveness, which will be dealt with in sections 2 and 3.

A constructivist approach based on Vitoria’s original just war theory also solves the problem of the perception of an “ontological gap between just war theory and cyber conflicts” (Taddeo, 2018b, p. 350), which seems to stem from a two-fold misconception: the misconception of cyberspace as non-physical (Taddeo, 2012, p. 106) and the simplification of just war theory as being “centered on human beings, tangible objects, and kinetic conflicts causing physical damage and bloodshed” (Taddeo, 2018b, p. 350).

However, there are, of course, grave difficulties with such an intersubjective and constructivist approach. Due to its epistemological openness, such a broad understanding of casus belli could easily turn into a carte blanche for military aggression.

Section 2: The attribution problem: transparent discourse instead of carte blanche

Whether an intersubjective understanding of casus belli becomes a dangerous carte blanche for military aggression or an effective means of cyber deterrence largely depends on solving the attribution problem.

It is surely exaggerated and inconsistent to state that in cyberspace “attribution is at best problematic, if not impossible” and to deny the possibility of cyber deterrence on these grounds (Taddeo, 2018b). This can easily be shown by employing the domestic analogy from Walzer’s just war theory: it would be as if one argued for an end to law enforcement activities against cybercrime because it is hard to catch cybercriminals.

In general, however, it is indeed difficult to attribute cyber attacks, due to the involvement of non-state hacker groups acting as proxies or autonomously, various degrees of anonymity on the web, the widespread practice of staging false flag attacks (Skopik and Pahi, 2020), such as happened during the 2018 Olympics cyber attack (Greenberg, 2019b), and difficulties in obtaining extraterritorial e-evidence. These difficulties are not only responsible for the fact that, so far, there is no standard procedure for attributing cyber attacks, but also for the fact that such attacks are rarely officially attributed by and to states at all.

In order not to get carried away by the endless possibilities of theory, it is important to provide three examples of attribution in practice.

1. The distributed denial-of-service attack (DDoS-attack) on Estonia in 2007: In the context of a heated debate around the relocation of Russian war memorials, websites of Estonian media companies, banks and government agencies were flooded with superfluous requests by bots and individuals. This overload caused a collapse of systems.

In the realm of the grand but often empty gestures of symbolic politics, the incident led to the establishment of the NATO Cooperative Cyber Defence Centre of Excellence in the Estonian capital (McGuinness, 2017). This clearly invoked the idea of a new cold war in cyberspace. In reality, however, the seemingly clear front between the two blocs of yore was in fact a blurry line.

Since the criminal investigation of the attacks showed that the majority of the attackers could be located within Russian jurisdiction, the Estonian Public Prosecutor issued a formal investigation assistance request to Russia. Russia, in turn, after first promising assistance, eventually denied it by arguing that it was not covered by the Mutual Legal Assistance Treaty (Tikk and Kaska, 2010). It blamed the attack on the pro-Kremlin youth movement Nashi, who took responsibility and stressed the autonomy of its actions (Shachtman, 2009). So far, the only person who has been tried for the attack is the Estonian citizen Dmitri Galushkevich, part of the country’s large Russian ethnic minority. He was fined 17,500 Estonian kroons, which roughly amounts to US$ 1,600 (BBC, 2008).

2. North Korea’s attacks on Sony in 2014: Due to North Korea’s indignation about the then-upcoming comedy The Interview, which included a plot to assassinate Kim Jong-un, the North Korean hacker group Guardians of Peace hacked Sony’s computer network, demanded the withdrawal of the movie and leaked confidential information. This led, among other effects, to the resignation of a high-ranking Sony employee. Primarily as a reaction to cinema chains’ fear of becoming targets of cyber attacks themselves by getting involved, Sony decided to pull the theatrical premiere.

However, on 19 December, the FBI announced the attribution of the attack to North Korea, primarily because some of the tools used were similar to tools used before in attacks on South Korea. The very same day, then US president Obama announced a proportionate response to North Korea and insisted on the release of the movie. Sony subsequently released the movie online and in some cinemas without suffering further attacks. It is unclear whether the US response went further than imposing additional economic sanctions. In 2016, a private sector investigation made the evidence for the attribution available to the public (Novetta, 2016).

3. The Democratic National Committee (DNC) hacking on the eve of the 2016 elections: this was the first attack that was officially recognised as a confrontation between the two superpowers Russia and the US. Whilst the FBI contacted the DNC as early as September 2015 with a warning that its systems had been accessed by hackers linked to Russian intelligence, it took the DNC six months to hire a private security firm to protect its systems. The internal emails obtained in the attack were released by WikiLeaks from 22 July up to the election, causing the resignation of high-ranking DNC officials and influencing the US elections to an unclear extent.

Although Guccifer 2.0, the fictional persona crafted by Russian intelligence, caused the desired confusion, the early attribution to Russia was soon publicly confirmed by several cyber experts and private security firms and narrowed down to Russia’s secret service (Banks, 2017, p. 1488). Roughly a month before the election, the Department of Homeland Security (DHS) and the Office of the Director of National Intelligence (ODNI) published a joint statement that the Intelligence Community was confident that the Russian government was responsible for the attack and that it signified, indeed, an attempt to “interfere with the US election process” (DHS and ODNI, 2016). During that time, Obama reportedly exhorted Putin on the so-called red phone that “international law, including the law for armed conflict, applies to actions in cyberspace" (Arkin, 2016).

Nevertheless, it took until January 2017, two months after the elections, before the public was provided with access to a declassified version of a report that contained some of the evidence supporting the attribution to Russia (ODNI, 2017). Somewhat ironically, before the elections it was exactly the fear that the hack would strongly influence the elections that prevented the Obama administration from acting faster. Another reason was, of course, that it needed to be absolutely sure not to unnecessarily escalate the conflict between the two superpowers with an unjustified public attribution. Additionally, a great part of the evidence was obtained through intelligence activities and not considered suitable for the critical eyes of the public.

These examples show four essential features of the attribution of cyber attacks:

  1. A long lag time between attack and attribution makes deterrence difficult. As Banks points out concerning the DNC hacks, countermeasures should be immediate, because they “are designed to persuade the perpetrator to stop its unlawful actions, not as punishment or escalation”; too long a lag time between hack and attribution will therefore turn legal active self-defense into unilateral punishment illegal under international law (Banks, 2017, p. 1502).
  2. The more unclear the attribution, the more effective the attack. Difficulties in attribution are the worst effect of attacks on the public sphere. Whilst North Korea wanted the US to know who committed the attack, and its attack therefore had no grave effects, the case is different with the Russian DNC hacks and even Nashi’s otherwise extremely unsophisticated attacks on Estonia, which aimed directly at undermining trust in public deliberation and at instrumentalising existing ethnic conflicts within Estonia, respectively.
  3. The attribution problem is not, as it may seem, merely a preliminary problem for effective deterrence; since undermining trust in public discourse is the main target of many cyber attacks, solving the attribution problem is already part of an effective defense strategy.
  4. The lack of explicit coerciveness makes the attack more effective. In spite of having devastating effects, causing chaos and the erosion of trust in deliberative processes, attacks on the public sphere cannot be understood from a coercion-centred approach. On the contrary, such attacks are precisely characterised by their lack of an explicit threat structure. Explicit coerciveness causes less confusion and, by providing a clear concept of the enemy, can even reinforce trust in the attacked nation’s institutions and deliberative processes.

It is interesting that Vitoria, who lived in a time with fewer possibilities to verify the occurrence of an attack, already saw the problem that the existence of a cause for just war might not be evident. Concerning this problem, he seems to foreshadow Kant’s principle of publicity (Thumfart, 2013) by demanding that there must be a public, objective and balanced discussion free from coercion that includes representatives of the opposing nation and experts:

For the just war it is necessary to examine the justice and causes of just war with great care, and also to listen to the arguments of the opponents, if they are prepared to negotiate genuinely and fairly. One must consult reliable and wise men who can speak with freedom and without anger or hate or greed (Vitoria, 1991, p. 307).

One should not hold his historical sexism against Vitoria, who only speaks of “men” (viros) here. And of course, a realist might add, such a public discussion including the opponent would render a military reprisal practically impossible. But maybe that is exactly what Vitoria intended, since he repeatedly asserts that just war must be a last resort. And this surely must also be the starting point for any just war theory that is not war-mongering and hence reasonable.

On the other hand, public discourse concerning cyber attacks, with public access to the available evidence, seems to be a useful means to solve the dilemma of the attribution of cyber attacks, which consists in either carrying out a response too soon and based on a wrong attribution, or not responding at all because it is too late. In addition, an early public debate has the potential to prevent the erosion of trust in national deliberative processes, which is the strategic target of cyber attacks on the public sphere.

A practical problem here is that a primary means by which nation states can achieve accurate attribution is information collected through espionage. This may enable those states to have confidence in attributions, but they will be extremely reluctant to reveal all the evidence behind such attributions to the public in a timely manner. In the specific case of the DNC hack, the attribution to Russia was apparently in substantial part due to the Russian hacking operation itself being hacked by Dutch intelligence (Gallagher, 2018).

An independent, albeit not public, discussion of evidence of casus belli is supported by a RAND Corporation paper on the matter, which mentions several factors for establishing credible attribution: the inclusion of independent experts, the building up of a track record of accuracy and precision, and transparent methodology and review processes (Davis et al., 2017, p. 17).

Also, the creation of an independent institution for the attribution of cyber attacks, modelled after the International Atomic Energy Agency (IAEA), has been discussed (Smith, 2017; Healey et al., 2014). This could be an important addition to a transparent public debate about each attack within the affected nation, especially concerning attacks on less liberal societies, which may not provide the necessary conditions for transparent internal debates.

At this moment in history, which is characterised by the declining relevance of traditional institutions of global governance, it does not actually seem to be the case that “the UN Security Council has the necessary resources and the political and coercive power” to attribute and sanction cyber attacks, as argued by Taddeo (Taddeo, 2018a, p. 323). It is widely known that the UN Security Council has been paralysed in recent years (Lynch, 2020; United Nations, 2019). And, even if that were not the case, an institution for the attribution of cyber attacks that is as independent and transparent as possible would be preferable, because these attributions are complex and nation states’ incentives to abuse them for political reasons are significant.

Section 3: Ius Communicationis: attacks on the public sphere as casus belli

In the case of cyber attacks aiming at the manipulation of elections, as exemplified by the DNC attack of 2016, it must be noted that, due to the lack of a coercive element, the first problem of credible deterrence against such attacks is that the DNC hack was probably not illegal under international law (Banks, 2017, p. 1501). The predominant “realpolitik view of the intelligence/international law relationship” is that there are few constraints on intelligence activities abroad (Deeks, 2016, p. 601). And this also applies to cyber espionage (Schmitt and NATO, 2017, p. 323; Smith, 2018, p. 224).

Intelligence activities can, at best, be met with diplomatic retorsions, which are, in turn, unlikely to stop the respective behaviour.

Although, as developed in section 1, classical just war theory is primarily a means of defense against violent attacks, it also offers an adequate answer to attacks with non-violent effects, inasmuch as it focuses on the violation of rights due to an attack, rather than on its physical means or physical effects.

This also includes a focus shifted away from the criterion of coerciveness, since the violation of rights does not need to be combined with a threat in order to amount to a violation of the right to self-determination. Since, to Vitoria, the right to communicate freely is the highest norm of international law (Vitoria, 1991, p. 279), infringements of the right to communicate (ius communicationis) that lack explicit coerciveness can also constitute a casus belli.

In Vitoria’s understanding, communication has a broad range of implications that can be exemplified by the etymology of the word. Communication means communal activities between humans in general, of course speech, but also travel and trade, and, perhaps most significantly, enjoying free access to the commons, to air and sea.

All four, speech, travel, trade and access to the commons, are of great importance to his conception in terms of its political intentions, which consist in legitimising Spanish preaching, travel and trade in the Americas in a more coherent way than did his predecessors in the duda indiana debate, ranging from John Mair to Matías de Paz (Thumfart, 2012, pp. 76-117). To underline its importance independently of its dark pedigree: via its appropriation by Grotius, Vitoria’s ius communicationis gave rise to the still valid principle of the freedom of the seas (section 4).

Vitoria’s stressing of communication as the highest norm goes back to his Thomist-Aristotelian communitarian heritage that was discussed in section 1. Aquinas already wrote: “Civitas est quaedam communicatio. Unde contra rationem civitatis esset quod cives in nullo communicarent.” – “The State is a kind of communication, because a State in which citizens do not communicate is impossible” (Aquinas, 1971). With his ius communicationis, Vitoria extends this core principle of politics from the domestic to the international realm and strengthens it, inasmuch as violating it can be a cause of war.

This becomes clear when he denies any people the right to prohibit strangers from accessing its ports, since not only the air and the sea, but also the ports were traditionally regarded as commons. And, according to him, if one is hindered from accessing the commons, this, in turn, is a legitimate reason to wage just war: not because it presupposes violence, which hindrance usually does, but because it is a violation of one’s natural right. “If the barbarians deny the Spaniards what is theirs by the law of nations, they commit an offence (iniuriam) against them. Hence, if war is necessary to obtain their rights, they may lawfully go to war” (Vitoria, 1991, p. 282).

Just war is here a means of the restitution of one’s natural right to access the commons and not an act of defense against a physical attack. Vitoria uses the same word iniuria (i.e. “injustice”) as discussed in section 1, which is only obscured by the fact that the English translation uses the misleading term “harm” in the passage in section 1 and the more adequate term “offence” in the passage quoted here.

Vitoria’s emphasis on communication, i.e. speech, travel, trade and accessing the commons, stems from his Aristotelian and Thomist communitarian anthropology, according to which humans can only exist in political community with others. This communitarian anthropology runs so deep in Vitoria that a human being born without citizenship is a contradiction in terms to him (Vitoria, 1991, p. 281). Equally, losing community would mean losing one’s ontological status as a human being. That is why, to Vitoria, ius communicationis is that part of the law of nations which is natural law and cannot be abrogated. And denying this basic access to community is a grave violation of one’s natural right, which in turn can be answered with force (Thumfart, 2017, p. 207f.).

From this point of view, an attack on communication itself is, if anything, worse than a kinetic attack and not negligible at all. But, of course, this presupposes that one is willing to accept the broad range of implications of the term communication in Vitoria’s lectures.

In spite of some remnants of the ius communicationis in the Law of the Sea in contemporary international law, there is of course a consensus that one is not allowed to travel everywhere and to trade freely with everyone. Why should this be different regarding an attack on a nation’s public sphere, which could be described as its internal communication?

One could also draw conclusions from Vitoria’s ius communicationis that would justify a new kind of colonialism in the digital age. Analogous to understanding the individual right to access the digital commons as a human right (Thumfart, 2017, p. 207), denying a nation the right to access another nation’s digital public sphere could be considered an offence. This assumption, following the intellectual history of colonialism, could be understood as the root of the digital “open door policy” that characterised the “free internet” US foreign policy approach during the Obama administration (Thumfart, 2017, p. 214; Hanson, 2012). This approach used the demand to open up national informational spheres as a way to establish soft power, and, as it turned out, to globally collect data: a “digital imperialism” (Thumfart, 2017, p. 214; Wasik, 2015).

The contemporary post-Snowden context (De Hert and Thumfart, 2018, p. 6) is defined by a widespread re-closure of national digital spheres (Rosenberger, 2020). From this perspective, and from a perspective of international law, it makes sense to take Vitoria’s ius communicationis, the right to communicate, from the realm of external relations, where it is largely outdated, and apply it to the realm of internal self-determination instead. International law does not require states to maintain a democratic constitution. And, although the idea of an individual human right to access the digital sphere can be considered a sensible claim, following the doctrine of self-determination, it can be assumed that every nation has the right to keep the deliberative processes within its public sphere free from intrusions by other states.

This is supported by the International Court of Justice’s Nicaragua judgement (Ibid., p. 315), which the Tallinn Manual quotes as precedent for its rule 66. “A State may not intervene, including by cyber means, in the internal or external affairs of another State” (Schmitt and NATO, 2017, p. 312). In this way, the doctrine of self-determination would, for example, prohibit Russia’s DNC-hacks of 2016, inasmuch as they went beyond intelligence operations and affected the US’ ability to decide about its affairs.

Ohlin argues that this is only the case if it could be substantiated that Russia’s hack indeed affected the outcome of the elections (Ohlin, 2017, p. 1598). One could conclude, following Ohlin, that whilst the DNC hacks themselves did not represent a casus belli, the injection of the obtained materials into the pre-election debate did. However, Ohlin’s focus on the effects of cyber operations on the outcome of the elections does not necessarily address the right issue. The aim of such attacks is not so much to support a specific party as to undermine the credibility of national deliberative processes in general.

Concerning an application of these insights to a doctrine of just cyberwar, Smith’s distinction between two kinds of non-violent cyber attacks is helpful. Whilst cyber espionage or DDoS-attacks would normally not constitute a casus belli, they could do so if they directly targeted the agency of the other State, i.e. if they “fundamentally undermined the ability of the target State’s political and social institutions to deliberate” (Smith, 2018, p. 233ff.).

The Tallinn Manual considers an even wider understanding of possible causes of war, since it discusses the issue of frequency and poses the question of whether “a series of cyber incidents that individually fall below the threshold of an armed attack (…) constitute an armed attack when aggregated” (Schmitt and NATO, 2017, p. 342). So, albeit not constituting a casus belli by themselves, permanent surveillance such as that conducted by the NSA and its allied intelligence agencies, or permanent disinformation campaigns such as those conducted by the Russians during the 2016 presidential elections, could constitute a just cause of war or of proportionate countermeasures, especially if they create a lasting sense of mistrust in the public sphere, affecting a nation’s capacity to deliberate.

In this sense, the right to self-determination includes the prohibition of “violence against the state as an informational entity”, as Haataja writes in his application of Floridi’s informational ethics to cyber warfare (Haataja, 2019, p. 32). A similar, explicitly non-anthropocentric application of Floridi’s informational ethics to just war theory has been proposed by Taddeo (Taddeo, 2014, p. 7). Such explicitly non-anthropocentric approaches are compatible with a theory of just war based on Vitoria’s ius communicationis, which conceives of communication as a phenomenon in its own right that can be defended by proportionate sanctions.

However, the non-anthropocentric, information-centred approach is in danger of missing the distinctively anthropocentric nature of attacks on the public sphere. Although, for example, disinformation campaigns and DDoS-attacks actually have non-anthropocentric features, because they are partly conducted by “artificial agents” (Taddeo, 2012, p. 113) such as bots, these non-human agents ultimately target the agency and judgement of really existing humans.

What must be addressed is, ultimately, the psychological dimension of a people’s trust in a society’s deliberative processes and the impact of lost trust on the formulation of aims, values and purposes, which are distinctively anthropocentric issues (Sleat, 2017, p. 333).

In particular, this human dimension plays a big role when considering the means that states should use to internally counteract such attacks. These means should not be restricted to regulating or insulating the national informational sphere, e.g. through the prohibition of fake news and privacy laws, but need to include promoting the digital and media literacy of civil society, as included, for example, in the European Commission’s respective policy recommendations (European Commission, 2018). In addition to the legitimisation of proportional countermeasures against such attacks within just war ethics, the promotion of the digital and media literacy of civil society should be part of a system of distributed deterrence involving the private sector and individual citizens, which will be elaborated below.

Section 4: Private just wars. Economic espionage, hack back and the right to resist

This section deals with the second issue that the discourse on just cyberwar usually gets wrong. Just war ethicists in the wake of Walzer derive their theories from the Enlightenment model of conflicts between free and equal individuals (McMahan, 2007). This premise limits their applicability to Westphalian state-vs-state confrontations, which resemble conflicts between free and equal individuals, inasmuch as the Westphalian model largely imagines states to be as monolithic and impermeable as in-dividuals (literally: un-dividables).

However, within cyberspace, the borders of nation states have become permeable; the world of cyberwars is post-Westphalian. And the typical form of cyber aggression is not a state-vs-state confrontation, but consists in economic and industrial espionage below the level of international or even political conflicts, and is purely economically motivated. Given the great interest in cyberwar theory, it is indeed “rather puzzling” that the economic sphere has not been dealt with to the same extent as the political one (Magen, 2017, p. 4).

The first difficulty here is the asymmetry in these conflicts. The victims are usually private firms that keep quiet because admitting to having been hacked gives companies a bad reputation (Javers, 2013). In addition, companies usually want to avoid direct confrontation with states, since they want to keep future business opportunities open and have no incentives to engage in strongman politics.

This is for example the case concerning US companies that do not speak out about Chinese hacking. China has “engaged in cyber-enabled economic espionage and trillions of dollars of intellectual property theft”, claims the White House in its National Cyber Strategy from 2018 (White House, 2018). Correspondingly, Keith Alexander, the former US National Security Agency (NSA) director, and Dennis Blair, the former director of US National Intelligence, wrote in 2017 that “Chinese companies have stolen trade secrets from virtually every sector of the American economy” (Alexander and Blair, 2017). This American claim that China currently spearheads cyber economic espionage can also be supported by examining Turkey’s and the UAE’s business transactions with Chinese firms (Magen, 2017, p. 14).

However, the US has also acted as a perpetrator of economic espionage, and not just as a victim. An NSA spokesperson asserted that “the department does not engage in economic espionage in any domain, including cyber” (Greenwald, 2014). Nevertheless, the NSA was spying on economic targets such as the Brazilian oil giant Petrobras. Snowden also cited the German firm Siemens as one target (Kirschbaum, 2014). However, the frontlines in economic espionage are more complex than a simple transatlantic divide. Germany’s foreign intelligence agency, the Bundesnachrichtendienst (BND), has been accused of spying, in collaboration with the NSA, on the French firm Airbus (BBC, 2015).

In the case of economically powerful countries such as the US and Germany, the economic dependency of companies and the connected unwillingness to come out against the perpetrators play a big role. Also, economic and financial espionage can universally be legitimised with concerns regarding economic and financial security. For example, although denying the stealing of trade secrets, in 2013, the director of US National Intelligence admitted to collecting economic and financial information abroad in order to prevent economic and financial crises (Clapper, 2013).

Economic espionage surely is the elephant in the room when one talks about cyber deterrence. Whilst it constantly fosters conflicts between states that could escalate into trade wars and ultimately military conflicts, it is usually seen as a problem that concerns the private, not the public sector, an issue of criminal, not international law.

The involvement of private actors also has another aspect: in cyber security, public-private partnerships are the rule rather than the exception, and private entities have robust capacities to defend against and attribute attacks. “The top tech companies appear to be as powerful as States, and sometimes even more so, to prevent cyber attacks, attribute them and to respond to malicious acts” (Bannelier and Christakis, 2017, p. 10).

Especially concerning attribution, private entities have already demonstrated their skills: by publicly examining the attribution of the Sony hack within a group that included prominent cyber security firms such as Novetta, Kaspersky, Symantec and ThreatConnect (section 2), by verifying the attribution of the DNC hacks (section 2), and by uncovering the Russian false flag attack on the 2018 Olympic Games, which involved the firms Cisco, CrowdStrike, Intezer and Kaspersky (Greenberg, 2019b).

Additionally, the legalisation of “hack-back” or “active cyber defense” is being discussed, for example in the context of the Active Cyber Defense Certainty Act proposed to the US Congress. Such regulations would make it legal for companies to actively attack the computer networks of their attackers.

De facto, this practice has already been adopted by companies. In 2009, Google accessed foreign computers in order to gather information about a malware attack it had suffered (Glosson, 2015, p. 17). In 2011, Facebook employed active defense and took control of a hacker gang’s primary command-and-control server (Glosson, 2015, p. 17). During the attacks on Sony in 2014, the company allegedly fought back by launching “a counteroffensive that sought to impede the hackers’ distribution of its data” (Glosson, 2015, p. 3). The US government has not prosecuted any of these firms for undertaking active defense measures. In the language of just war theory, the legalisation of such practices amounts to the development of a practice of private just war.

The right to wage defensive wars is one of the most significant prerogatives of sovereign states. From a Westphalian point of view, a theory of private just war is therefore highly problematic. Concerning cyberspace, too, the Tallinn Manual states: “Only States may take countermeasures. For example, an information technology firm may not act on its own initiative in responding to a harmful cyber operation targeting it by styling its response as a countermeasure” (Schmitt and NATO, 2017, p. 130).

However, such a right of private actors to take countermeasures is formulated in classical just war theory, namely in Hugo Grotius’ very first work on international law, De jure praedae (1604/05), where he develops the notion of a bellum iustum privatum, a private just war. Although De jure praedae is not one of the works Grotius is best known for, it is essential to his thought, since it is here that he first develops his still valid principle of the Freedom of the Seas.

Grotius was originally commissioned by the Dutch East India Company to write a legal opinion in defense of the capture of a Portuguese ship by one of the company’s ships. In order to do so, Grotius appropriated the arguments of the Spaniard Vitoria. There is great irony in this, since the Netherlands were at war with Spain at the time, and the issues of trade and colonisation that Vitoria discussed were precisely major points of conflict in that war. To complete the irony, Grotius repurposes Vitoria’s argument of the ius communicationis, which had justified Iberian colonialism, to justify Dutch resistance against Portugal’s role as a hegemon on the world’s seas.

If Vitoria’s claim that the right to travel and trade could be defended by the force of arms was true, argues Grotius, then the Dutch could also defend their right to travel and trade freely against Portugal’s and Spain’s claim to a monopoly on trade on the world’s seas. The fact that it was a private entity that waged this just war, something still unthinkable to Vitoria, makes no difference to Grotius. Such “private just war”, bellum iustum privatum, he argues, was necessary because the confrontation took place in international waters, where no protection by a state was available.

“Almost all of the events that gave rise to this war took place upon the ocean; but we have maintained that no one can claim special jurisdiction over the ocean with respect to locality” (Grotius, 2006, XII).

As a result, the East India Company’s natural right to self-defense applies.

Nature withholds from no human being the right to carry on private wars; and therefore, no one will maintain that the East India Company is excluded from the exercise of that privilege, since whatever is right for single individuals is likewise right for a number of individuals acting as a group (Grotius, 2006, XII).

Grotius’s doctrine of private just war might seem puzzling to contemporaries, for whom war, of course, can only be waged by states. However, it has some contemporary applications. Due to the threat of piracy in international waters, for example, trade ships carry private armed guards to defend themselves against pirates, which also includes pre-emptive aggression such as the firing of warning shots (Dutton, 2017).

One might argue, as the often used and abused rhetorical figure of the “digital commons” suggests (e.g. Benkler, 2006), that cyberspace is, from the perspective of legal geography, comparable to an extension of the high seas through computer networks. From this perspective, one could also justify “hack-back” actions if states are demonstrably unable to guarantee the safety of private entities formally situated on their territory but de facto more at home in the vastness of cyberspace.

This analogy between maritime security and cyber security has also been developed in a paper by the Carnegie Endowment (Hofmann and Levite, 2017, pp. 23-31). Such an analogy might seem like an anachronistic recourse to Grotius’ age of privateers and pirates. And, of course, a doctrine of private just wars carries the danger of further weakening the already weakened rule of international law. In particular, states may become liable for the actions of non-state actors on their territory that cause harm to others, which would increase international legal conflicts.

From a realistic perspective, however, legitimising the proportionate use of cyber force by private actors in the case of attacks could significantly contribute to international stability, for seven reasons.

First, the “post-territoriality” of cyberspace (De Hert and Thumfart, 2018) raises difficult legal, ethical and political dilemmas regarding states’ protection of technology companies. Technology companies are usually not situated within a single state’s territory; their computers, networks and stored data are spread out across the globe. Which country should be responsible for defending cloud-based assets (Hofmann and Levite, 2017, p. 14)? This is a difficult question. For example, problems related to national responsibilities for protecting personal data stored in the cloud arose during the Microsoft Ireland case, which led to the Cloud Act in the US (De Hert and Thumfart, 2020). Since it defies the boundaries of territories, national legislation regarding post-territorial cloud assets has a tendency to overreach, which inherently leads to jurisdictional conflicts (Thumfart and De Hert, 2018). The deterring and stabilising effect of geographical borders has to be recreated “by redefining territory in a way that defies the original connection of the notion of territory to the land” (Hildebrandt, 2013, p. 222). Such a redefinition of territory should be more adequate to the needs of post-territorial technology companies, which, because attacks on them carry few consequences, represent a preferred target for cyber attacks. In cyberspace, more than ever, the borders of firms do not follow the borders of states. Therefore, a doctrine of just war focusing on the complex territoriality of private companies needs to be developed.

Second, a doctrine of just cyber war would be far too permissive if any attack on a private actor related to a state automatically led to a just war involving nation states. Just think of a world in which every attack on Microsoft, Facebook, Apple or Amazon automatically triggered a response by the US government, or, even worse, by the government of every territory these companies are in any way related to. A clear separation between the defensive rights of private and public entities would lower the risk of quick escalation of cyber conflicts into kinetic conflicts involving nation states, i.e. cross-domain conflict escalation.2

Third (following from arguments one and two), since a complete defense of post-territorial companies by states can neither be guaranteed nor is it desirable, companies can be expected “to fill gaps in the defensive coverage that governments provide”, also by means of active cyber defense (Hofmann and Levite, 2017, p. 14). This behaviour has been observed in maritime security, where private companies face a similar situation and have switched to active modes of self-defense. In spite of raising severe legal problems, this mode of conduct has been effective in reducing piracy (Hofmann and Levite, 2017, pp. 23-31).

Fourth, the practical employment of active cyber defense has already led to the discussion of the Active Cyber Defense Certainty Act in the US Congress, also known as the ‘hack-back bill’, which would legalise such measures. Since private active self-defense in cyberspace is a reality, it is necessary to regulate it, for example along the lines of the just war principles of jus ad bellum and jus in bello, such as proportionality and attribution, and jus post bellum, i.e. the state of affairs after the retribution. Following Vitoria’s conception of jus post bellum (Thumfart, 2017, p. 212), one could, for example, make sure that active cyber defense only serves to restore the attacked party’s rights and produces no financial gains or other advantages for the party undertaking active cyber defense measures, which seems especially important considering the profit-oriented motivations of private companies. A regulation that takes this de facto reality into account and incorporates these criteria will help to deter cyber attackers and limit the scope of possible responses.

Fifth, such a limited right to exercise proportionate hack-back will contribute to stability, inasmuch as it distributes the problem of cyber defense across more actors, some of which are, in fact, better equipped to guarantee cyber security than states. States, in turn, can profit from a robust private cyber security sector, especially since public-private partnerships are the rule rather than the exception in cyber security. Distributed cyber deterrence can be expected to be more effective than centralised cyber deterrence. Conversely, if states are entirely responsible for cyber security, companies have an incentive to save resources in that field, which will decrease security.

Sixth, when attributing and defending, private actors generally follow economic motivations, which are reliable in this respect and not distorted by the need for populist strongman politics that can easily lead to conflict escalation. In the field of attribution, it has already been demonstrated that private security firms can even act as a geopolitical counterweight to the states where they are headquartered.3 Kaspersky, for example, decided to regain the trust of its clients and counter the “cloud of suspicions against the Moscow-based security firm” by providing “evidence that actually bolstered the case against Russia” during the particularly complex false flag attack on the 2018 Olympics (Greenberg, 2019a, chapter 35).

Seventh, and this is crucial: if a limited right to exercise proportionate hack-back is granted to private entities, there seems to be no obstacle to also granting such a right to individual citizens for the same reasons. What if states are not only incapable of protecting their citizens’ rights, for instance the human right to privacy, but complicit in foreign agents’ violations of their citizens’ rights? Although, for example, the Snowden revelations led to an outcry all over Europe (De Hert and Thumfart, 2018, p. 6) and saved the GDPR (Rossi, 2018), national intelligence agencies have, like corporations, in fact collaborated with the NSA (Borger, 2013). If one takes the notion of distributed deterrence seriously, then all three levels of power, states, companies and individuals, need to be equipped with deterrence mechanisms that keep other actors in check.

It seems that a perpetual violation of the human right to privacy should trigger something like the “right to resist”, incorporated in Article 20 of the German Basic Law and reflected in the Second Amendment of the US Constitution, for cases in which a government acts unconstitutionally. Of course, such vigilante justice is not desirable and can only be a last resort. In particular, the illegal yet non-violent means of whistleblowing in order to protect human rights where states or companies do not comply with their duties seems an appropriate mechanism to control the abuse of state power and technological power in cyberspace.

It is strategically short-sighted to conceive of such whistleblowing activities and similar resistance exclusively in terms of a subversion of states’ defense capacities, as is the case in the US’s ongoing assessment of Assange, Manning and Snowden (Miller, 2020; Lee, 2020; Maloney, 2019). In the 1980s, political theorist Gene Sharp developed a concept of “civilian-based deterrence” that paradoxically relied on the anti-war movement to “make Europe unconquerable” (Thumfart, 2011; Sharp, 1983).

When it comes to cyber deterrence, nations all over the world have yet to learn that a critical and, in extreme cases, disobedient civil society poses no threat to their national cyber security, but can rather be its strongest line of defense. The same civil society forces that sometimes make life difficult for the executive branch of government can make it even harder for a state’s enemies to attack, to remain undetected and to escape retaliation.

Conclusion

This contribution aimed to counteract two shortcomings of just cyberwar theory: the insufficient discussion of non-violent cyber attacks directed at the public sphere and the omission of the role of non-state actors. It did so by connecting contemporary research to Vitoria’s and Grotius’ original conceptions of international law and just war.

In section 1, I discussed the problem that cyber attacks on the public sphere lack immediate physical violence, which makes it difficult to characterise them as a prohibited use of force under the UN Charter. I also discussed newer frameworks, such as the Tallinn Manual’s effect-based and coercion-centred approach to the assessment of cyber attacks. Vitoria’s conception of just war was shown to go beyond action-based, effect-based and coercion-centred approaches alike, inasmuch as it focuses on the violation of rights caused by an attack.

In section 2, I discussed the attribution problem in light of three well-known cyber attacks: Estonia 2007, Sony 2014 and DNC 2016. Referring to Vitoria’s conception of just war, the importance of a public, transparent discourse on attribution was highlighted in its crucial function for preserving the credibility of deliberative processes and building credible deterrence. I recommended establishing an independent, international institution for attributing cyber attacks. Four theses were developed: 1. A long lag time between attack and attribution makes deterrence difficult; 2. The more unclear the attribution, the more effective the attack; 3. The attribution problem is not a problem preliminary to effective deterrence; rather, mechanisms for solving it are part of effective deterrence; 4. The lack of explicit coerciveness makes an attack more effective.

In section 3, I demonstrated that Vitoria’s notion of just war legitimises sanctioning attacks on the public sphere even though they involve neither explicit coercion nor physical violence. This is based on an interpretation of his ius communicationis, i.e. “right to communicate”, that stresses the importance of deliberative processes for political self-determination. The promotion of the digital and media literacy of civil society was identified as part of a strategy of distributed deterrence involving private actors.

In section 4, I demonstrated the growing importance of private actors in the context of cyber attacks. Building on Grotius’ notion of bellum iustum privatum, i.e. “private just war”, a doctrine of active cyber defense for cases in which a state cannot or does not want to protect private actors’ rights was formulated. This also applies to individuals such as whistleblowers. I formulated a theory of distributed cyber deterrence that builds equally on public and private actors, the latter legitimately conducting proportionate reprisals, being involved in public-private partnerships and offering civilian-based deterrence. Seven lines of argument leading to this conclusion were developed along the following ideas: 1. the post-territoriality of cloud-based assets; 2. the need for a clear separation of private conflicts from international conflicts; 3. the analogy with the success of private deterrence in the fight against piracy on the high seas; 4. legislation such as the hack-back bill; 5. stability through distributed deterrence involving various actors; 6. the economic and non-belligerent motivations of private actors; 7. the individual right to resist as a guarantee of checks and balances of digital power.

Acknowledgements

I thank the staff of Internet Policy Review and the editors of this special issue. In particular, I would like to thank the peer-reviewer Robert Merkel, lecturer in Software Engineering at Monash University, who contributed one paragraph on secret services to this essay. I would also like to thank the peer-reviewer Samuli Haataja, faculty member at Griffith Law School, who contributed some in-depth knowledge regarding cyber attacks and international law.

References

Aquinas, T. (1971). Sancti Thomae de Aquino Sententia libri Politicorum. Corpus Thomisticum. http://www.corpusthomisticum.org/cpo.html

Arkin, W. M. (2016, December 19). What Obama Said to Putin on the Red Phone About the Election Hack. NBC News. https://perma.cc/5CKG-G5XC

Banks, W. (2017). State Responsibility and Attribution of Cyber Intrusions After Tallinn 2.0. Texas Law Review, 95(7), 1487–1513. https://texaslawreview.org/state-responsibility-attribution-cyber-intrusions-tallinn-2-0/

Bannelier, K., & Christakis, T. (2017). Cyber-Attacks – Prevention-Reactions: The Role of States and Private Actors. Les Cahiers de la Revue Défense Nationale.

BBC News. (2008, January 25). Estonia fines man for ‘cyber war’. BBC News. http://news.bbc.co.uk/2/hi/technology/7208511.stm

BBC News. (2015, May 1). Airbus to sue over US-German spying row. BBC News. https://www.bbc.com/news/world-europe-32542140

Benkler, Y. (2006). The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press.

Blair, D., & Alexander, K. (2017, August 15). China’s Intellectual Property Theft Must Stop. New York Times. https://www.nytimes.com/2017/08/15/opinion/china-us-intellectual-property-trump.html

Borger, J. (2013, November 1). GCHQ and European spy agencies worked together on mass surveillance. The Guardian. https://www.theguardian.com/uk-news/2013/nov/01/gchq-europe-spy-agencies-mass-surveillance-snowden

Clapper, J. R. (2013). Statement by Director of National Intelligence James R. Clapper on Allegations of Economic Espionage. Office of the Director of National Intelligence (ODNI). https://www.dni.gov/index.php/newsroom/press-releases/press-releases-2013/item/926-statement-by-director-of-national-intelligence-james-r-clapper-on-allegations-of-economic-espionage

Davis, J. S. I., Boudreaux, B., Welburn, J. W., Aguirre, J., Ogletree, C., McGovern, G., & Chase, M. S. (2017). Stateless Attribution. Toward International Accountability in Cyberspace [Report]. Rand Corporation. https://www.rand.org/pubs/research_reports/RR2081.html

De Hert, P., & Thumfart, J. (2018). The Microsoft Ireland case and the Cyberspace Sovereignty Trilemma. Post-Territorial technologies and companies question territorial state sovereignty and regulatory state monopolies (Working Paper No. 4/11). Brussels Privacy Hub. https://brusselsprivacyhub.eu/BPH-Working-Paper-VOL4-N11.pdf

De Hert, P., & Thumfart, J. (2020). The Microsoft Ireland case, the CLOUD Act and the cyberspace sovereignty trilemma. In International Trends in Legal Informatics (pp. 373–418). Editions Weblaw.

Dutton, Y. M. (2016, July 11). Fighting Maritime Piracy with Private Armed Guards [Blog post]. Oxford Research Group. https://www.oxfordresearchgroup.org.uk/blog/fighting-maritime-piracy-with-private-armed-guards

European Commission. (2018). Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions. Tackling online disinformation: A European Approach COM(2018) 236 final. European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52018DC0236&from=de

Finlay, C. (2018). Just War, Cyber War, and the Concept of Violence. Philosophy & Technology, 31, 357–377. https://doi.org/10.1007/s13347-017-0299-6

Fitton, O. (2016). Cyber Operations and Gray Zones: Challenges for NATO. Connections, 15(2), 109–119. https://doi.org/10.11610/Connections.15.2.08

Freedman, L. (2017). The Future of War: A History. Allen Lane.

Gallagher, S. (2018, January 26). Candid camera: Dutch hacked Russians hacking DNC, including security cameras. Ars Technica. https://arstechnica.com/information-technology/2018/01/dutch-intelligence-hacked-video-cameras-in-office-of-russians-who-hacked-dnc/

Giesen, K.-G. (2014). Justice in Cyberwar. Ethic@ — An International Journal of Moral Philosophy, 13(1), 27–49. https://doi.org/10.5007/1677-2954.2014v13n1p27

Glosson, A. D. (2015). Active Defense: An Overview of the Debate and a Way Forward [Working Paper]. Mercatus Center at George Mason University. https://www.mercatus.org/system/files/Glosson-Active-Defense.pdf

Goldsmith, J. L. (1999). Against Cyberanarchy (No. 40; Occasional Papers). University of Chicago Law School. https://chicagounbound.uchicago.edu/cgi/viewcontent.cgi?referer=https://www.google.com/&httpsredir=1&article=1001&context=occasional_papers

Greenberg, A. (2019a). Sandworm: A New Era of Cyberwar and the Hunt for the Kremlin’s Most Dangerous Hackers. Doubleday.

Greenberg, A. (2019b, October 17). The Untold Story of the 2018 Olympics Cyberattack, the Most Deceptive Hack in History. Wired. https://www.wired.com/story/untold-story-2018-olympics-destroyer-cyberattack/

Greenwald, G. (2014, September 5). The U.S. Government’s Secret Plans to Spy for American Corporations. The Intercept. https://theintercept.com/2014/09/05/us-governments-plans-use-economic-espionage-benefit-american-corporations/

Grotius, H. (2006). Commentary on the Law of Prize and Booty (M. J. van Ittersum, Ed.). Liberty Fund. https://oll.libertyfund.org/titles/grotius-commentary-on-the-law-of-prize-and-booty

Gstrein, O. (2020). Mapping power and jurisdiction on the internet through the lens of government-led surveillance. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1497

Haataja, S. (2019). Cyber Attacks and International Law on the Use of Force. Routledge. https://doi.org/10.4324/9781351057028

Hanson, F. (2012, March 29). Open Door Policy. Foreign Policy. https://foreignpolicy.com/2012/03/29/open-door-policy/

Healey, J., Mallery, J. C., Jordan, K. T., & Youd, N. V. (2014). Confidence-Building Measures in Cyberspace: A Multistakeholder Approach for Stability and Security [Report]. Atlantic Council. https://www.atlanticcouncil.org/in-depth-research-reports/report/confidence-building-measures-in-cyberspace-a-multistakeholder-approach-for-stability-and-security/

Hildebrandt, M. (2013). Extraterritorial jurisdiction to enforce in cyberspace?: Bodin, Schmitt, Grotius in cyberspace. University of Toronto Law Journal, 63(2), 196–224. https://doi.org/10.3138/utlj.1119

Hofmann, W., & Levite, A. (2017). Private Sector and Cyber Defense. Can Active Measures Help Stabilize Cyberspace? [Report]. Carnegie Endowment for International Peace. https://carnegieendowment.org/2017/06/14/private-sector-cyber-defense-can-active-measures-help-stabilize-cyberspace-pub-71236

Huang, S. (2014). Proposing a Self-Help Privilege for Victims of Cyber Attacks. The George Washington Law Review, 82(4), 1229–1266. http://www.gwlr.org/wp-content/uploads/2014/10/Huang_82_4.pdf

In re Warrant to Search Target Computer at Premises Unknown, 958 F. Supp. 2d 753 (S.D. Tex. 2013) (United States District Court, S.D. Texas, Houston Division 2013). https://casetext.com/case/in-re-search

Javers, E. (2013, February 25). Cyberattacks: Why Companies Keep Quiet. CNBC. https://www.cnbc.com/id/100491610

Jenkins, R. (2013). Is Stuxnet Physical? Does it Matter? Journal of Military Ethics, 12(1), 68–79. https://doi.org/10.1080/15027570.2013.782640

Joint Statement from the Department Of Homeland Security and Office of the Director of National Intelligence on Election Security. (2016). Department of Homeland Security. https://www.dhs.gov/news/2016/10/07/joint-statement-department-homeland-security-and-office-director-national

Kirschbaum, E. (2014). Snowden says NSA engages in industrial espionage: TV. Reuters. https://www.reuters.com/article/us-security-snowden-germany/snowden-says-nsa-engages-in-industrial-espionage-tv-idUSBREA0P0DE20140126

Lee, T. B. (2020, March 13). Chelsea Manning is out of jail after almost a year. Ars Technica. https://arstechnica.com/tech-policy/2020/03/chelsea-manning-is-out-of-jail-after-almost-a-year/

Lin, H. (2012). Cyber conflict and international humanitarian law. International Review of the Red Cross, 94(886). https://doi.org/10.1017/S1816383112000811

Lucas, G. (2017). Ethics and Cyber Warfare. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780190276522.001.0001

Lupovici, A. (2016). The “Attribution Problem” and the Social Construction of “Violence”: Taking Cyber Deterrence Literature a Step Forward. International Studies Perspectives, 17(3), 322–342. https://doi.org/10.1111/insp.12082

Lynch, C. (2020, March 27). U.N. Security Council Paralyzed as Contagion Rages. Foreign Policy. https://foreignpolicy.com/2020/03/27/un-security-council-unsc-coronavirus-pandemic/

Magen, S. (2017). Cybersecurity and Economic Espionage: The Case of Chinese Investments in the Middle East. Cyber, Intelligence, and Security, 1(3), 3–124. https://css.ethz.ch/content/dam/ethz/special-interest/gess/cis/center-for-securities-studies/resources/docs/INSS-Cyber,%20Intelligence,%20and%20Security,%20Volume%201,%20No.%203.pdf

Maloney, C. (2019, March 12). The top 4 reasons Edward Snowden deserves a fair trial. The Hill. https://thehill.com/blogs/congress-blog/politics/487229-the-top-4-reasons-edward-snowden-deserves-a-fair-trial

Mann, M., & Daly, A. (2019). (Big) data and the north-in-south: Australia’s informational imperialism and digital colonialism. Television and New Media, 20(4), 379–395. https://doi.org/10.1177/1527476418806091

Matishak, M. (2018, July 18). What we know about Russia’s election hacking. Politico. https://www.politico.com/story/2018/07/18/russia-election-hacking-trump-putin-698087

McGuinness, D. (2017, April 27). How a cyber attack transformed Estonia. BBC News. https://www.bbc.com/news/39655415

McKenzie, T. (2017). Is Cyber Deterrence Possible? Air University Press.

McMahan, J. (2007). The Sources and Status of Just War Principles. Journal of Military Ethics, 6(2), 91–106. https://doi.org/10.1080/15027570701381963

McMahan, J., & McKim, R. J. (1993). The Just War and The Gulf War. Canadian Journal of Philosophy, 23(4), 501–541. https://doi.org/10.1080/00455091.1993.10717333

Miller, M. (2020, June 24). Justice Department announces superseding indictment against Wikileaks’ Assange. The Hill. https://thehill.com/policy/cybersecurity/504434-justice-department-announces-superseding-indictment-against-wikileaks

Muller, L. P., & Stevens, T. (2017). Upholding the NATO cyber pledge: Cyber Deterrence and Resilience: Dilemmas in NATO defence and security politics (Research Report No. 5/2017; Policy Brief). Norwegian Institute for International Affairs (NUPI). http://www.jstor.org/stable/resrep08037

Novetta. (2016). Operation Blockbuster. Unraveling the Long Thread of the Sony Attack [Report]. https://operationblockbuster.com/wp-content/uploads/2016/02/Operation-Blockbuster-Report.pdf

Nye, J. S. (2017). Deterrence and Dissuasion in Cyberspace. International Security, 41(3), 44–71. https://doi.org/10.1162/ISEC_a_00266

Office of the Director of National Intelligence (ODNI). (2017). Background to “Assessing Russian Activities and Intentions in Recent US Elections”: The Analytic Process and Cyber Incident Attribution [Declassified report]. Office of the Director of National Intelligence (ODNI). https://www.dni.gov/files/documents/ICA_2017_01.pdf

Ohlin, J. D. (2017). Did Russian Cyber Interference in the 2016 Election Violate International Law? Texas Law Review, 95(7), 1579–1598. https://texaslawreview.org/russian-cyber-interference-2016-election-violate-international-law/

Open Data City. (2013). Stasi vs NSA. How much space would the filing cabinets of the Stasi and the NSA use up, if the NSA would print out their 5 Zettabytes? Stasi versus NSA. https://opendatacity.github.io/stasi-vs-nsa/english.html

Orend, B. (2013). The Morality of War. Broadview Press.

Ray, C. (2011, October 24). The Weight of Memory. New York Times. https://www.nytimes.com/2011/10/25/science/25qna.html

Robinson, I. (2018, April 20). How Much Does the Internet Weigh? Azo Quantum. https://www.azoquantum.com/Article.aspx?ArticleID=68

Roscini, M. (2010). World Wide Warfare – Jus ad bellum and the Use of Cyber Force. Max Planck Yearbook of United Nations Law, 14(1), 85–130. https://doi.org/10.1163/18757413-90000050

Rosenberger, L. (2020). Making Cyberspace Safe for Democracy. Foreign Affairs, May/June, 146–159.

Rossi, A. (2018). How the Snowden Revelations Saved the EU General Data Protection Regulation. The International Spectator, 53(4), 95–111. https://doi.org/10.1080/03932729.2018.1532705

Schmitt, M. N., & NATO Cooperative Cyber Defence Centre of Excellence (Eds.). (2017). Tallinn manual 2.0 on the international law applicable to cyber operations (Second edition). Cambridge University Press.

Shachtman, N. (2009, March 11). Kremlin Kids: We Launched the Estonian Cyber War. Wired. https://www.wired.com/2009/03/pro-kremlin-gro/

Skopik, F., & Pahi, T. (2020). Under false flag: Using technical artifacts for cyber attack attribution. Cybersecurity, 3. https://doi.org/10.1186/s42400-020-00048-4

Sleat, M. (2018). Just cyber war?: Casus belli, information ethics, and the human perspective. Review of International Studies, 44(2), 324–342. https://doi.org/10.1017/S026021051700047X

Smith, B. (2017). The need for a Digital Geneva Convention [Blog post]. Microsoft on the Issues. https://blogs.microsoft.com/on-the-issues/2017/02/14/need-digital-geneva-convention/#sm.0001hkfw5aob5evwum620jqwsabzv

Smith, P. T. (2018). Cyberattacks as Casus Belli: A Sovereignty-Based Account. Journal of Applied Philosophy, 35(2). https://doi.org/10.1111/japp.12169

Smotherman, J. W. (2016). Justified Physical Response to Cyber Attacks from Walzer’s Legalist Paradigm. Army War College Review, 2(3), 43–53. https://www.jstor.org/stable/resrep11938.5

Solis, G. D. (2014). Cyber warfare. Military Law Review, 219, 1–52. https://www.loc.gov/rr/frd/Military_Law/Military_Law_Review/pdf-files/219-spring-2014.pdf#page=9

State Department. (2018). Recommendations to the President on Deterring Adversaries and Better Protecting the American People from Cyber Threats. Office of the Coordinator for Cyber Issues.

Sullivan, L., & Schuknecht, C. (2019, April 12). As China Hacked, U.S. Businesses Turned A Blind Eye. NPR. https://www.npr.org/2019/04/12/711779130/as-china-hacked-u-s-businesses-turned-a-blind-eye?t=1560710524188&t=1566677585432

Taddeo, M. (2012). Information Warfare: A Philosophical Perspective. Philosophy & Technology, 25, 105–120. https://doi.org/10.1007/s13347-011-0040-9

Taddeo, M. (2014). Information Warfare: The Ontological and Regulatory Gap. Newsletter on Philosophy and Computers, 14(1), 13–20. https://www.researchgate.net/publication/267019306_Information_warfare_the_ontological_and_regulatory_gap

Taddeo, M. (2018a). Deterrence and norms to foster stability in cyberspace. Philosophy & Technology, 31, 323–329. https://doi.org/10.1007/s13347-018-0328-0

Taddeo, M. (2018b). The Limits of Deterrence Theory in Cyberspace. Philosophy and Technology, 31, 339–355. https://doi.org/10.1007/s13347-017-0290-2

The White House. (2018). National Cyber Strategy. https://www.whitehouse.gov/wp-content/uploads/2018/09/National-Cyber-Strategy.pdf

Thumfart, J. (2009). On Grotius’s Mare Liberum and Vitoria’s De Indis, Following Agamben and Schmitt. Grotiana, 30, 65–87. https://doi.org/10.1163/016738309X12537002674286

Thumfart, J. (2011, March 3). Gene Sharp. Der Demokrator. Die Zeit. https://www.zeit.de/2011/10/Gene-Sharp

Thumfart, J. (2012). Die Begründung der globalpolitischen Philosophie. Francisco de Vitorias Vorlesung über die Entdeckung Amerikas im ideengeschichtlichen Kontext. Kadmos.

Thumfart, J. (2013). Kolonialismus oder Kommunikation. Kants Auseinandersetzung mit Francisco de Vitorias ius communicationis. Proceedings of the XI. International Kant Congress in Pisa, 929–940. https://www.academia.edu/10342105/Kolonialismus_oder_Kommunikation._Kants_Auseinandersetzung_mit_Francisco_de_Vitorias_ius_communicationis

Thumfart, J. (2017). Francisco de Vitoria and the Nomos of the Code: The Digital Commons and Natural Law, Digital Communication as a Human Right, Just Cyber-Warfare. In At the Origins of Modernity (pp. 197–217). Springer. https://doi.org/10.1007/978-3-319-62998-8_11

Thumfart, J., & De Hert, P. (2018, June 4). Both the US’s Cloud Act and Europe’s GDPR Move Far Beyond Geography, but Will Not Solve Transatlantic Jurisdictional Conflicts. Just Security. https://www.justsecurity.org/57346/uss-cloud-act-europes-gdpr-move-geography-solve-transatlantic-jurisdictional-conflicts/

Tikk, E., & Kaska, K. (2010). Legal Cooperation to Investigate Cyber Incidents: Estonian Case Study and Lessons. Proceedings of the 9th European Conference on Information Warfare and Security, 288–294.

United Nations. (2019, January 10). Paralysis Constricts Security Council Action in 2018, as Divisions among Permanent Membership Fuel Escalation of Global Tensions. https://www.un.org/press/en/2019/sc13661.doc.htm

Vitoria, F. (1991). Political Writings (A. Pagden & J. Lawrence, Eds. & Trans.). Cambridge University Press.

Vitoria, F. (1995). Vorlesungen I. Völkerrecht, Politik, Kirche. (U. Horst, Ed.; Latin and German). Kohlhammer.

Vitoria, F. (1997). Vorlesungen II. Völkerrecht, Politik, Kirche. (U. Horst, Ed.; Latin and German). Kohlhammer.

Wasik, B. (2015, June 4). Welcome to the Age of Digital Imperialism. New York Times Magazine. https://www.nytimes.com/2015/06/07/magazine/welcome-to-the-age-of-digital-imperialism.html

Waxman, M. C. (2011). Cyber Attacks as ‘Force’ Under UN Charter Article 2(4). International Law Studies, 87, 43–57. https://scholarship.law.columbia.edu/faculty_scholarship/847/

Footnotes

1. Note from the editor: a previous version of this paper did not have "and companies" included. We added these two words on the wish of the author on 16 September 2020.

2. Note from the editor: a previous version of this paper did not have the words "kinetic" and "i.e. cross-domain conflict escalation" included in this sentence. We added these words on the wish of the author on 16 September 2020.

3. Note from the editor: a previous version of this paper used the expression "the private sector" instead of the more accurate "private security firms". We made the replacement on the wish of the author on 16 September 2020.

Going global: Comparing Chinese mobile applications’ data and user privacy governance at home and abroad

This paper is part of Geopolitics, jurisdiction and surveillance, a special issue of Internet Policy Review guest-edited by Monique Mann and Angela Daly.

In February 2019, the short video sharing and social mobile application TikTok was fined a record-setting penalty (US$ 5.7 million) by the US Federal Trade Commission for violating the Children’s Online Privacy Protection Act by failing to obtain parental consent and deliver parental notification. TikTok agreed to pay the fine (Federal Trade Commission, 2019). This settlement points to several significant developments. Owned by the Chinese internet company ByteDance, TikTok is popular worldwide, predominantly among young mobile phone users, while most commercially successful Chinese internet companies are still based in the Chinese market. Such global reach and commercial success make Chinese mobile applications pertinent sites of private governance on a global scale (see Cartwright, 2020, this issue). China-based mobile applications therefore need to comply with domestic statutory mechanisms as well as with privacy protection regimes and standards in the jurisdictions into which they expand, such as the extraterritorial application of Article 3 of the EU’s General Data Protection Regulation (GDPR).

To examine how globalising Chinese mobile apps respond to varying data and privacy governance standards when operating overseas, we compare the Chinese and overseas versions of four sets of China-based mobile applications: (1) Baidu mobile browser - a mobile browser with a built-in search engine owned and developed by the Chinese internet company Baidu; (2) Toutiao and TopBuzz - mobile news aggregators developed and owned by ByteDance; (3) Douyin and TikTok - mobile short video-sharing platforms developed and owned by ByteDance, with the former only available in Chinese app stores and the latter exclusively in international app stores; and (4) WeChat and Weixin - a social application developed and owned by the Chinese internet company Tencent. Together, these four mobile applications represent the global reach of flagship China-based mobile apps and a wide range of functions: search and information, news content, short video and social networking. They also represent a mix of more established (Baidu, Tencent) and up-and-coming (ByteDance) Chinese internet companies. Lastly, this sample demonstrates varying degrees of commercial success: all offer services globally, with the Baidu browser the least commercially successful and TikTok the most.

An earlier study shows that Chinese web services had a poor track record in privacy protection: back in 2006, before China had a national regime of online privacy protection in place, of 82 commercial websites in China, few posted a privacy disclosure and even fewer followed the four fair information principles of notice, choice, access and security (Kong, 2007). These four principles are meant to enhance self-regulation of the internet industry by providing consumers with notice, control, security measures, and the ability to view and contest the accuracy and completeness of data collected about them (Federal Trade Commission, 1998). In 2017, only 69.6 percent of the 500 most popular Chinese websites had disclosed their privacy policies (Feng, 2019). These findings suggest a significant gap between data protection requirements on paper and protection in practice (Feng, 2019). In a recent study, Fu (2019) finds an improvement in the poor privacy protection track record of the three biggest internet companies in China (Baidu, Alibaba, and Tencent, or BAT). Her study shows that BAT’s privacy policies are generally compliant with the Chinese personal information protection provisions but lack sufficient consideration of transborder data flows and of cases of change of ownership, such as mergers and acquisitions (Fu, 2019). Moreover, the privacy policies of BAT offer more notice than choice—the user is either forced to accept the privacy policy or must forgo the use of the web services (Fu, 2019, p. 207). Building on these findings, this paper asks: does the same app differ in data and privacy protection measures between international and Chinese versions? How are these differences registered in the app’s user interface design and privacy policies?

In the following analysis, we first outline the evolving framework of data and privacy protection that governs the design and operation of China-based mobile apps. The next section provides a background overview of the key functions, ownership and business strategies of the examined apps. The walkthrough of app user interface design then studies how a user experiences privacy and data protection features at various stages of app usage. Last, we present a comparison of the privacy policies and terms of service of the two versions of the same China-based apps to identify differences in data and privacy governance. We find that not only do different apps vary in data and privacy protection, but the international and Chinese versions of the same app also show discrepancies.

Governance ‘of’ globalising Chinese apps

Law and territory have always been at the centre of debates in the regulation and development of the internet (Goldsmith & Wu, 2006; Kalathil & Boas, 2003; Steinberg & Li, 2016). Among others, China has been a strong proponent of internet sovereignty in global debates about internet governance and digital norms. The 2010 white paper titled The Internet in China enshrines the concept of internet sovereignty in the governing principles of the Chinese internet. It states: “within Chinese territory the internet is under the jurisdiction of Chinese sovereignty” (State Council Information Office, 2010). The principle of internet sovereignty was later reiterated by the Cyberspace Administration of China (CAC), the top internet-governing body since 2013, to recognise that “each government has the right to manage its internet and has jurisdiction over information and communication infrastructure, resources and information and communication activities within their own borders” (CAC, 2016).

Under the banner of internet sovereignty, the protection of data and personal information in China takes a state-centric approach, which comes in the form of government regulations and government-led campaigns and initiatives. The appendix outlines key regulations, measures and drafting documents. Without an overarching framework for data protection, China’s data protection approach is characterised by a “cumulative effect” (de Hert & Papakonstantinou, 2015), composed of a multitude of sector-specific legal instruments promulgated in a piecemeal fashion. While earlier privacy and data protection measures were dispersed across various government agencies, laws and regulations, the first national standard for personal data and privacy protection was put forth only in 2013. The promulgation of the Cybersecurity Law in 2016 was a major step forward in the nation’s privacy and data protection efforts, despite the policy priority of national security over individual protection. Article 37 of the Cybersecurity Law stipulates that personal information and important data collected and produced by critical information infrastructure providers during their operations within the territory of the People’s Republic of China shall be stored within China. Many foreign companies have complied, either as a preemptive goodwill gesture or as a legal requirement, in order to access, compete and thrive in the Chinese market. For example, in 2018, Apple came under criticism for moving the iCloud data generated by users with a mainland Chinese account to the data management firm Guizhou-Cloud Big Data - a data storage company of the local government of Guizhou province (BBC, 2016). LinkedIn, Airbnb (Reuters, 2016) and Evernote (Jao, 2018) have stored mainland user data in China, even prior to the promulgation of the Cybersecurity Law. The Chinese government has asked transnational internet companies to form joint ventures with local companies to operate data storage and cloud computing businesses, such as Microsoft Azure’s cooperation with Century Internet and Amazon AWS’s cooperation with Sinnet Technology (Liu, 2019).

The Chinese state intervenes in a wide range of online activities, including, among other things, imposing data localisation requirements on domestic and foreign companies (McKune & Ahmed, 2018). The Chinese government attributes data localisation requirements to national security and the protection of personal information, on the basis that the transfer of personal and sensitive information overseas may undermine the security of data (Xu, 2015). Others point to the recurring themes of the ideological tradition of technological nationalism and independence underlying the Cyberspace Administration of China’s prioritisation of security over personal privacy and business secrets (Liu, 2019). Captured in President Xi’s dictum that “without cybersecurity comes no national security”, data and privacy protection is commonly framed as an issue of internet security (Gierow, 2014).

There is a growing demand for the protection of personal information among internet users and a growing number of government policies pertaining to the protection of personal information in China (Wang, 2011). Since 2016, the Chinese government has been playing an increasingly active role in enforcing a uniform set of rules and standardising the framework of privacy and data protection. As of July 2019, there were 16 national standards, 10 local standards and 29 industry standards in effect that provide guidelines on personal information protection. However, there is no uniform law or national authority to coordinate data protection in China. The right to privacy or the protection of personal information (the two are usually interchangeable in the Chinese context) often comes as an auxiliary article alongside the protection of other rights. Whereas jurisdictions such as the EU have set up Data Protection Authorities (DPAs) - independent public entities that supervise compliance with data protection regulations - in China the application and supervision of data protection have fallen to private companies and state actors respectively. User complaints about violations of data protection laws are mostly submitted to, and handled by, the private companies themselves rather than an independent agency. This marks the decisive difference underlying China’s and the EU’s approaches to personal data processing: in China, data protection is aimed exclusively at the individual as consumer, whereas in the EU the recipient of data protection is regarded as an individual or data subject, and the protection of personal data is both a fundamental right and conducive to the free movement of personal data within the Union, as stipulated in Article 1 of the General Data Protection Regulation (de Hert & Papakonstantinou, 2015).

The minimal pre-existing legal framework and the self-regulatory regime of privacy and data protection by Chinese internet platform companies have given rise to rampant poor privacy and data protection practices, even among the country’s largest and leading internet platforms. Different Chinese government ministries have also tackled the poor data and privacy practices of mobile apps and platforms in rounds of “campaign-style” (运动式监管) regulation—a top-down approach often employed by the Chinese government to provide solutions to emerging policy challenges (Xu, Tang, & Guttman, 2019). For instance, Alibaba’s payment service Alipay, its credit scoring system Sesame Credit, Baidu, Toutiao and Tencent have all shown poor track records of data and privacy protection and have come under government scrutiny (Reuters, 2018). Alipay was fined by the People’s Bank of China in 2018 for collecting users’ financial information outside the scope defined in the Cybersecurity Law (Xinhua, 2018). The Ministry of Industry and Information Technology publicly issued a warning to Baidu and ByteDance’s Toutiao for failing to properly notify users about which data they collect (Jing, 2018).

As China has experienced exponential mobile internet growth, mobile apps stand out as a salient regulatory target. In 2016, the Cyberspace Administration of China put forth the Administrative Rules on Information Services via Mobile Internet Applications, which distinguish the duties of mobile app stores from those of mobile apps. Mobile apps, in particular, bear six regulatory responsibilities: 1) enforce real-name registration and verify the identity of users through cell phone numbers or other personally identifiable information; 2) establish data protection mechanisms to obtain consent and disclose the collection and use of data; 3) establish fulsome information gatekeeping mechanisms to warn, limit or suspend accounts that post content violating laws or regulations; 4) safeguard privacy during app installation processes; 5) protect intellectual property; and 6) obtain and store user logs for sixty days.

As more China-based digital platforms join the ranks of the world’s largest companies by measures of user population, market capitalisation and revenues (Jia & Winseck, 2018), scholarly studies have started to grapple with the political implications of their expansion. Existing studies call attention to the distinctions between global and domestic versions of the same Chinese websites and mobile applications in information control and censorship activities, and their results show that Chinese mobile apps and websites are lax and inconsistent in content control when they go global (Ruan, Knockel, Ng, & Crete-Nishihata, 2016; Knockel, Ruan, Crete-Nishihata, & Deibert, 2018; Molloy & Smith, 2018). To ameliorate these dilemmas, some China-based platforms have designed different versions of their products that serve domestic and international users separately. Yet the data and privacy protection of Chinese mobile apps is under-studied, especially as they embark on a global journey. This is an ever more pressing issue as Chinese internet companies that have been successful at growing their international businesses, such as Tencent and ByteDance, simultaneously struggle to provide a seamless experience for international users while complying with data and content regulations at home.

Methods

We employ a mixed-method approach to investigate how globalising Chinese mobile apps differ in data and privacy governance between their Chinese versions and the international versions accessed through Canadian app stores. While Baidu Search, TikTok, WeChat and TopBuzz do not appear to have region-based features, the actual installation package may or may not differ depending on where a user is based and downloads the apps from. First, we conducted an overview of the tested mobile apps and their functions, looking at ownership, revenue and user population. Each app’s function and business model has a direct bearing on its data collection and usage. Second, to study how mobile apps structure and shape end users’ experience with regard to data and privacy protection, we deployed the walkthrough method (Light, Burgess, & Duguay, 2018). We tested both the Android and iOS versions of the same app. In the case of the China-based apps (i.e., Douyin and Toutiao), we downloaded the Android version from the corresponding official website of each service and the iOS version from the Chinese regional Apple App Store. For the international-facing apps (i.e., TikTok and TopBuzz), we downloaded the Android versions from the Canadian Google Play Store and the iOS versions from the Canadian Apple App Store. Baidu and WeChat do not offer separate versions for international and Chinese users; instead, the distinction is made when users register their account. After we downloaded each app, we systematically stepped through two stages in the usage of the apps: app entry and registration, and discontinuation of use. We conducted the walkthrough on multiple Android and Apple mobile devices in August 2019.

In addition, we conducted a content analysis of the privacy policies and terms of service of each mobile app. These documents demonstrate governance by mobile apps as well as the governance of mobile apps within certain jurisdictions. They are also key legal documents that set the conditions of users’ participation online and lay claim to the institutional power of the state (Stein, 2013). We examined a total of 15 privacy policies and terms of service in Chinese and English, retrieved in July 2019. The numbers of documents examined for each app are: Baidu (2), Weixin (2), WeChat (2), TopBuzz (2), TikTok (3), Douyin (2) and Toutiao (2). We then analysed the privacy policies and terms of service along five dimensions: data collection, usage, disclosure, transfer, and retention. For data collection, we looked for items that detailed the types of information collected, the app’s definition of personally identifiable information, and the possibility of opting out of the data collection process; for data usage, we looked for terms and conditions that delineated third-party use; for disclosure, we looked at whether the examined app would notify its users in the case of privacy policy updates, mergers and acquisitions, and data leaks; for data transfer and retention, we examined whether the app specified security measures such as encryption of user data, emergency measures in case of data leaks, terms and conditions of data transfer, and the specific location and duration of data retention.
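To illustrate how a coding sheet along these five dimensions could be organised, the following is a minimal sketch in Python. The field names and the example entry are hypothetical illustrations of such a rubric, not the authors’ actual coding instrument or findings.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical coding sheet for one policy document; one instance per
# privacy policy or terms-of-service text examined.
@dataclass
class PolicyCoding:
    app: str                       # e.g., "TikTok (international)"
    document: str                  # "privacy policy" or "terms of service"
    # Data collection
    data_types_listed: bool = False        # enumerates collected data types?
    defines_pii: bool = False              # defines personally identifiable information?
    opt_out_available: Optional[bool] = None
    # Data usage
    third_party_use_described: Optional[bool] = None
    # Disclosure
    notifies_policy_updates: Optional[bool] = None
    notifies_merger_or_acquisition: Optional[bool] = None
    notifies_data_breach: Optional[bool] = None
    # Transfer and retention
    encryption_mentioned: Optional[bool] = None
    transfer_conditions_specified: Optional[bool] = None
    retention_location: Optional[str] = None
    retention_duration: Optional[str] = None
    notes: str = ""

# Example entry; the values are placeholders, not results reported in the paper.
example = PolicyCoding(
    app="TikTok (international)",
    document="privacy policy",
    data_types_listed=True,
    third_party_use_described=True,
    retention_location="unspecified",
)
print(example)
```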

Research limitations

Due to network restrictions, our walkthrough is limited to the Canadian-facing versions of these China-based apps. For each mobile app we studied, its parent company offers only one version of an international-facing app and one version of a China-facing app on its official website. Yet, even though there is only one international-facing app for each of the products we analysed, it remains to be tested whether the app interface, including the app’s notification settings, differs when downloaded and/or launched in different jurisdictions. Moreover, our research is based on a close reading of the policy documents put together by mobile app companies. It does not indicate whether these companies actually comply with their policy documents in the operation of their services, nor does it address the pitfalls of the notice-and-consent regime (Martin, 2013). Existing research has already shown that, under the Android system, there are many instances of potential inconsistencies between what an app’s policy states and what the code of the app appears to do (Zimmeck et al., 2016).

Overview of apps

Baidu Search

The Baidu App is the flagship application developed by Baidu, one of China’s leading internet and platform companies. The Baidu App provides a search function but also feeds users highly personalised content based on data and metadata generated by users. Often regarded as the Chinese counterpart of Google, Baidu’s main businesses include online search, online advertising and artificial intelligence. In 2018, the daily active users of the Baidu App reached 161 million, a 24% jump from 2017. Although Baidu has embarked on many foreign ventures and expansion projects, according to its annual report, the domestic market still accounted for 98% of Baidu’s total revenue in 2016, 2017 and 2018 consecutively. Based on its revenue composition, Baidu’s business model is online advertising. The major shareholders of Baidu are its CEO Robin Yanhong Li (31.7%) and Baillie Gifford (5.2%), an investment management firm headquartered in Edinburgh, Scotland.

TikTok vs Douyin, TopBuzz vs Toutiao

TikTok, Douyin, TopBuzz and Toutiao are among the flagship mobile apps in ByteDance’s portfolio. ByteDance represents a new class of up-and-coming Chinese internet companies competing for global markets through diversification and the merger and acquisition of foreign apps. ByteDance acquired the US video app Flipagram in 2017 and the France-based News Republic in 2017, and invested in the India-based news aggregator Dailyhunt. TikTok, first created in 2016, was rebranded with ByteDance’s US$ 1 billion acquisition of Musical.ly in 2018. The Chinese version of TikTok, Douyin, was released in 2016 by ByteDance and became the leading short-video platform in the country. The Douyin app has several features that are particular to the Chinese market and its regulation. For example, the #PositiveEnergy hashtag was integrated into the app as an effort to align with the state’s political agenda of promoting Chinese patriotism and nationalism (Chen, Kaye, & Zeng, 2020). Douyin also differs from TikTok in its terms of service, which state that content undermining the regime, overthrowing the socialist system, inciting secessionism, or subverting the unification of the country is forbidden on the platform (Chen, Kaye, & Zeng, 2020; Kaye, Chen, & Zeng, 2020). Such regulation does not exist on TikTok. ByteDance’s Chinese news and information app Toutiao was launched in 2012, followed in 2015 by its English version TopBuzz for the international market.

Dubbed the “world’s most valuable startup” (Byford, 2018), ByteDance secured investment from Softbank and Sequoia Capital. ByteDance has made successful forays into North American, European and Southeast Asian markets, reaching 1 billion monthly active users globally in 2019 (Yang, 2019). It is one of the most successful and truly global China-based mobile app companies. The company focuses on using artificial intelligence (AI) and machine learning algorithms to source and push content to its users. To accelerate its global reach, ByteDance has recruited top-level managers from Microsoft and Facebook for AI and global strategy development.

Both apps and their overseas versions have attracted considerable legal and regulatory scrutiny. In 2017, Toutiao was accused by the Beijing Cyberspace and Informatisation Office of spreading pornographic and vulgar information. In the 2018 Sword Net Action, China’s National Copyright Administration summoned Douyin to better enforce copyright law and to put in place a complaint mechanism for reporting illegal content (Yang, 2018). Reaching millions of young users, TikTok was temporarily banned by an Indian court for “degrading culture and encourag[ing] pornography” and by Indonesia’s Ministry of Communication and Information Technology for spreading pornography, inappropriate content and blasphemy. TikTok attempted to resolve the bans by building data centres in India and hiring more content moderators (Sharma & Niharika, 2019).

WeChat/Weixin

WeChat, or Weixin, is China’s most popular mobile chat app and the fourth largest in the world. It is a paradigmatic example of the infrastructurisation of platforms, in which the app bundles and centralises many different functions, such as digital payment, group buying and taxi hailing, into one super-app (Plantin & de Seta, 2019). Owned by Tencent, one of China’s internet behemoths, WeChat has a user base of 1 billion, though Tencent has not updated the number of its international users since 2015 (Ji, 2015). WeChat’s success was built upon Tencent’s existing strengths in social networking.

Unlike ByteDance, which separates its domestic and international users by developing two different versions of its major products (i.e., the internationally-facing TikTok can only be downloaded from international app stores, whereas Douyin can only be downloaded from Chinese app stores and Apple’s China-region App Store), Tencent differentiates WeChat (international) and Weixin (domestic) users by the phone number a user originally signs up with. In practice, users download the same WeChat/Weixin app from either international or Chinese app stores. The app then determines whether the user is an international or a Chinese user during the account registration process. Apart from certain functionalities, such as Wallet, which are exclusive to Chinese users, the overall design of the app and the processes of account registration and deletion are the same for international and domestic users.

App walkthrough

We conducted app walkthroughs to examine and compare user experience in data and privacy protection during the app registration and account deletion process. Figure 1 compares the walkthrough results. 

Android-iOS difference

Registration processes for Baidu, Douyin, Toutiao and WeChat differ between the Android and iOS versions. The Android and iOS registration processes for TopBuzz and TikTok are similar and are therefore recorded in one timeline in Figure 1. In general, app registration on iOS devices comprises more steps than on Android, meaning that the apps need to request more function-specific authorisation from users. In the Android versions, access to certain types of data is granted by default when users install and use the app; users need to change authorisations within the app or in the device’s privacy settings. For example, TopBuzz and TikTok, both owned by ByteDance, enable push notifications by default without prompting for user consent. If users want to change this setting, they need to do so via their device’s privacy settings.

“Ask until consent”

All Chinese versions of the apps prompt a pop-up window displaying a summary of the privacy notice, whereas the Canadian versions do not. However, the pop-up does not give users the choice to continue using the app without ticking “I agree”: if a user does not agree, the app shows the notice again until consent is obtained to proceed to the next step. This reflects the failure of the notice and choice approach to privacy protection, in which users are left with no choice but to accept the terms or relinquish use of the app (Martin, 2013). For Douyin, TikTok, Toutiao, TopBuzz, and Baidu, users can still use limited app functions if they do not sign up for an account. However, according to their privacy policies, these apps still collect information during use, such as device information and location data. WeChat and Weixin, on the other hand, mandate the creation of an account to use the app’s services.

Real name registration

For all examined apps, users of the international versions can choose to register with either a cell phone number or an email address. For all domestic versions, however, a cell phone number is mandatory to sign up for the service. This is a key difference between the international and domestic versions, and its main reason is that Article 24 of China’s Cybersecurity Law requires internet companies to comply with the real name registration regulation. During account registration, all apps request access to behavioural data (location) and user data (contacts). The real name registration process mandated under Chinese law differs in intent and in practice from the identity policies of US-based internet companies and platforms. For example, Facebook, YouTube, the now-defunct Google+, Twitter and Snapchat have different policies about whether a user has the option of remaining anonymous or of creating an online persona that masks their identity to the public (DeNardis & Hackl, 2015, p. 764). The decisions made on the part of internet companies and digital platforms can jeopardise the online safety and anonymity of minority populations and have the potential to stifle freedom of expression. In the Chinese context, by contrast, real name registration is overseen and enforced by different levels of government for the purpose of governance and control, following the principle of “real identity on the backend and voluntary compliance on the front end”: apps, platforms, and websites must collect personally identifying information, while it is up to users to decide whether to adopt their real name as their screen name.

Account deletion

For all apps examined, users need to go through multiple steps to reach the account deletion option: WeChat 5 steps, Douyin 6 steps, TikTok 4 steps, TopBuzz 3 steps. The more steps it takes, the more complicated it is for users to de-register and delete the data and metadata generated on the app. All Chinese versions of the tested apps prompt an “account in secure state” notification in the process of account deletion. As a security measure, an account is considered to be in a secure state when it has not undergone suspicious changes, such as a password change or the unlinking of a mobile phone number, within a short period before the request; a secure state is a prerequisite for account removal. The domestic versions also have screening measures so that only accounts with a “clean history” can be deleted, meaning the account has not been blocked and has not engaged in any previous activities that violate laws and regulations. TikTok additionally offers a 30-day deactivation period before the account is deleted, and TopBuzz requires users to tick “agree” on privacy terms during account deletion. TopBuzz also offers a re-engagement option by soliciting the reasons why users delete their accounts.
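
For illustration, the deletion-step counts reported above can be tabulated and compared programmatically. The numbers are taken from the walkthrough; the small script itself is only a sketch and not part of the study's method.

```python
# Deletion-step counts observed in the walkthrough (see text above).
# The comparison logic below is illustrative, not part of the study.
deletion_steps = {"WeChat": 5, "Douyin": 6, "TikTok": 4, "TopBuzz": 3}

# Rank apps from the most to the least cumbersome de-registration process.
for app, steps in sorted(deletion_steps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{app}: {steps} steps to reach account deletion")
```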

Figure 1: Walkthrough analysis

Content analysis of privacy policies and terms of service

Table 1: Cross-border regulation

Company | Regions | Privacy policy application scope | Laws and jurisdictions referred to | Specific court that legal proceedings must go through
Baidu | – | Part of larger organization | Relevant Chinese laws and regulations | Beijing Haidian District People's Court
TopBuzz | EU | Part of larger organization | GDPR and EU | No
TopBuzz | Non-EU | Part of larger organization | US, California Civil Code, Japan, Brazil | Singapore International Arbitration Centre
Toutiao | – | For Toutiao | Relevant Chinese laws and regulations | Beijing Haidian District People's Court
Douyin | – | For Douyin | Relevant Chinese laws and regulations | Beijing Haidian District People's Court
TikTok | US | For TikTok | Yes | Unspecified
TikTok | EU | For TikTok | Yes | Unspecified
TikTok | Global | For TikTok | No | Unspecified
WeiXin | – | For Weixin | Relevant Chinese laws and regulations | Shenzhen Nanshan People's Court
WeChat | US | For WeChat | No | American Arbitration Association
WeChat | EU | For WeChat | – | The court of the user's place of residence or domicile
WeChat | Other | For WeChat | – | Hong Kong International Arbitration Centre

We retrieved and examined the privacy policies and terms of service of all apps as of July 2019. Baidu has only one set of policies covering both domestic and international users. WeChat/Weixin, TopBuzz/Toutiao and TikTok/Douyin have designated policies for domestic and international users respectively. TikTok’s privacy policies and terms of service are the most region-specific, with three distinct documents for US, EU, and global users (excluding the US and EU). TopBuzz distinguishes EU and non-EU users, with jurisdiction-specific items for users based in the US, Brazil, and Japan in the non-EU privacy policy. Most policies and terms of service refer to the privacy laws of the jurisdictions served, but WeChat’s and TikTok’s privacy policies for global users are vague, as they do not explicitly name laws and regulations and instead refer to them as “relevant laws and regulations”. Compared to the Canadian versions of the same apps, the Chinese versions provide clearer and more detailed information about the specific court in which disputes are to be resolved.

Table 2: Storage and transfer of user data

Company | Regions | Storage of data | Location of storage | Duration of storage | Data transfer
Baidu | – | Yes | PRC | Unspecified | Unspecified
TopBuzz | EU | Yes | Third-party servers in the US & Singapore (Amazon Web Services) | Browser behavior data stored for 90 days; varies according to jurisdiction | Yes
TopBuzz | Non-EU | Yes | US and Singapore | Unspecified | Yes
Toutiao | – | Yes | PRC | Unspecified | No
Douyin | – | Yes | PRC | Unspecified | Transfer with explicit consent
TikTok | US | Unspecified | Unspecified | Unspecified | Unspecified
TikTok | EU | Yes | Unspecified | Unspecified | Yes
TikTok | Global | Unspecified | Unspecified | Unspecified | Unspecified
WeiXin | – | Yes | PRC | Unspecified | Unspecified
WeChat | – | Yes | Canada, Hong Kong | – | Unspecified

In terms of data storage, as shown in Table 2, most international versions of the examined apps store user data in foreign jurisdictions. For example, WeChat’s international-facing privacy policy states that the personal information it collects from users will be transferred to, stored at, or processed in Ontario, Canada and Hong Kong. The company explains explicitly why it chose these two regions: “Ontario, Canada (which was found to have an adequate level of protection for Personal Information under Commission Decision 2002/2/EC of 20 December 2001); and Hong Kong (we rely on the European Commission’s model contracts for the transfer of personal data to third countries (i.e., the standard contractual clauses), pursuant to Decision 2001/497/EC (in the case of transfers to a controller) and Decision 2010/915/EC (in the case of transfers to a processor).” Only Baidu stores user data in mainland China regardless of the jurisdiction in which users reside. However, Baidu’s policies do not specify where, or for how long, transnational communications between users based in China and users based elsewhere will be stored, and they are particularly ambiguous about how long data will be stored in general. Governed by the GDPR, the privacy policies serving EU users are more comprehensive than the others in disclosing whether user data will be transferred.

All apps include mechanisms through which users can communicate concerns or file complaints about how the company may be retaining, processing, or disclosing their personal information. Almost all apps – with the exception of Baidu – provide an email address and a physical mailing address to which users can direct such communications. TikTok provides the name of an EU representative in its EU-specific privacy policy, though the contact email provided is the same as the one mentioned in TikTok’s other international privacy policies.

Table 3: Privacy disclosure

Company | Regions | Last policy update date | Access to older versions | Notification of update? | Complaint mechanism | Complaint venue
Baidu | – | No | No | No | Yes | Legal process through local court
TopBuzz | EU | No | Yes | Yes | – | No privacy officer listed
TopBuzz | Non-EU | No | No | Yes | Yes | No privacy officer listed
Toutiao | – | Yes | No | Yes | Yes | No privacy officer listed
Douyin | – | Yes | No | Yes | Yes | Email and physical mailing address
TikTok | US | Yes | No | Yes | Yes | No privacy officer listed
TikTok | EU | Yes | No | Yes | Yes | An EU representative is listed
TikTok | Global | Yes | No | Yes | Yes | Email and a mailing address
WeiXin | – | Yes | No | Yes | Yes | Contact email and location of Tencent Legal Department
WeChat | – | Yes | No | Yes | Yes | Contact email of Data Protection Officer and a physical address

Baidu only mentions that any disputes should be resolved via legal process through the local court, which makes it more difficult for users, especially international users, to resolve a dispute with the company. WeChat/Weixin is another interesting case: unlike ByteDance, which distinguishes its domestic and international users by providing them with two different versions of its apps, Tencent’s overseas and domestic users use the same app and receive different privacy policies and terms of service based on the phone number they signed up with. In addition, the company’s privacy policy and terms of service differentiate international and domestic users not only by their place of residence but also by their nationality. Tencent’s terms of service for international WeChat users state that if the user is “(a) a user of Weixin or WeChat in the People’s Republic of China; (b) a citizen of the People’s Republic of China using Weixin or WeChat anywhere in the world; or (c) a Chinese-incorporated company using Weixin or WeChat anywhere in the world,” he or she is subject to the China-based Weixin terms of service. However, neither WeChat nor Weixin explains in these documents how the app identifies someone as a Chinese citizen. In effect, even Weixin users residing overseas need to go through the complaint venue outlined in the Chinese version of the privacy policy rather than taking their complaint to the company’s overseas operations.

Our analysis of these apps’ data collection practices shows some general patterns in both the domestic and international versions. All apps mention the types of information they may collect, such as name, date of birth, biometrics, address, contacts, and location. However, none of the apps, except WeChat for international users, offers a clear definition or examples of what counts as personally identifiable information (PII). As for the disclosure of PII, all apps state that they will share necessary information with law enforcement agencies and government bodies. TikTok’s privacy policy for international users outside the US and EU appears to be the most permissive when it comes to sharing user information with third parties or company affiliates. All the other apps surveyed state that they will request users’ consent before sharing PII with any non-government entities. TikTok’s global privacy policy, by contrast, states that it will share user data, without separately asking for user consent, with “any member, subsidiary, parent, or affiliate of our corporate group”, with “law enforcement agencies, public authorities or other organizations if legally required to do so”, as well as with third parties.
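
One way to operationalise such a comparison is a simple keyword scan over the policy texts, flagging whether a document defines PII and whether it mentions sharing with affiliates or law enforcement. The snippet below is a hypothetical sketch of that approach; the keyword lists are illustrative, and no such script was used to produce the findings above, which rest on manual coding.

```python
# Hypothetical sketch: flag whether a privacy policy text defines PII and
# mentions sharing with corporate affiliates or law enforcement.
# Keyword lists are illustrative; any hits would still require manual coding.
import re

CHECKS = {
    "defines_pii": r"personally identifiable information|personal information means",
    "shares_with_affiliates": r"affiliate|member of our corporate group|subsidiary|parent company",
    "shares_with_law_enforcement": r"law enforcement|public authorit|legally required",
}

def scan_policy(text: str) -> dict:
    """Return a boolean flag per check based on case-insensitive keyword matches."""
    return {name: bool(re.search(pattern, text, re.IGNORECASE))
            for name, pattern in CHECKS.items()}

sample = ("We may share your information with any member, subsidiary, parent, "
          "or affiliate of our corporate group, and with law enforcement agencies "
          "if legally required to do so.")
print(scan_policy(sample))
```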

Conclusion

This study shows not only that different Chinese mobile apps vary in data and privacy protection, but also that the Chinese domestic and international versions of the same app differ in their data and privacy protection standards. More globally successful China-based mobile apps have better and more comprehensive data and privacy protection standards. In line with previous findings (Liu, 2019; Fazhi Wanbao, 2018), our research shows that Baidu has the least satisfactory data and privacy protection measures of the apps compared. ByteDance’s apps (TopBuzz/Toutiao and TikTok/Douyin) are more attentive to users from different geographical regions, designating jurisdiction-specific privacy policies and terms of service. In this case, a mobile app’s globalisation strategies and aspirations play an important part in the design and governance of its data and privacy protection. ByteDance is the most internationalised of the three companies, and its experience of dealing with fines from US, Indian and Indonesian law enforcement and regulatory authorities has pushed it to revamp its practices overseas. For instance, TikTok updated its privacy policy after the Federal Trade Commission’s fine in February 2019 (Alexander, 2019). Faced with probing from US lawmakers and a ban by the US Navy, TikTok released its first transparency report in December 2019, and the company is set to open a “Transparency Center” in its Los Angeles office in May 2020, where external experts will oversee its operations (Pappas, 2020). Tencent, with an expanding array of overseas users, was also among the first to comply with the GDPR: it updated its privacy policy to meet the GDPR’s requirements on 29 May 2018, days after the regulation became applicable.

For China-based internet companies that eye global markets, expanding beyond China means that they must provide a compelling experience for international users and comply with the laws and regulations of the jurisdictions in which they operate. In this regard, nation states and the ecosystems of internet regulation they design have a powerful impact on how private companies govern their platforms. Our analysis suggests that nation-based regulation of online spaces has at times spilled beyond national territory (e.g., Tencent’s WeChat/Weixin distinguishing domestic and international users based on their nationality). However, the effects of state regulation on transnational corporations are not monolithic. They vary depending on how integrated a platform is in a certain jurisdiction, where its main user base is, and what its globalisation strategies are. For example, ByteDance’s TikTok is more responsive to international criticism and public scrutiny than the other applications in this study, potentially because of the app’s highly globalised presence and revenue streams.

Secondly, this paper highlights that, in addition to app makers, other powerful actors and parties shape apps’ data and privacy protection practices. One set of such actors is mobile app store owners (e.g., Google Play and the Apple App Store). As the walkthrough analysis demonstrates, the app interface design and permission requests on Apple’s iOS do a better job of informing users about, and notifying them of, data access. The Android versions of the tested apps in some cases enable push notifications by default, so that it takes individual effort to find out how to opt out or withdraw consent. The examined apps are thus more liberal in requesting data from users on Android than on iOS. The gatekeeping function of the mobile app stores that host these apps and set standards for app design and privacy protection points to a more nuanced and layered conceptualisation of corporate power in understanding apps as situated digital objects. It further shows that, in a closely interconnected platform ecosystem, some platform companies are more powerful than others owing to their infrastructural reach in hosting content and providing cloud computing and data services (van Dijck, Nieborg, & Poell, 2019). Even though Tencent, ByteDance and Baidu are powerful digital companies in China, they still rely on the Google Play store and Apple’s App Store for the domestic and global distribution of their apps, and are therefore subject to the governance of these mobile app stores (see Cartwright, 2020, this issue). Another example is mini-programmes, the “sub-applications” hosted on WeChat, whose developers are subject to WeChat’s privacy policies and developer agreements. This shows that apps are always situated in, and should be studied together with, the complex mobile ecosystem and their regional context (Dieter et al., 2019). We should therefore consider the relational and layered interplay between different levels of corporate power in co-shaping the data and privacy practices of mobile apps.

As shown in the analysis, the international-facing version of a China-based mobile app provides relatively higher levels of data protection to app users in the European Union than its Chinese-facing version. This further highlights the central role of nation states and the importance of jurisdiction in the global expansion of Chinese mobile apps. As non-EU organisations, Chinese app makers are subject to the territorial scope of the GDPR (Article 3) when offering services to individuals in the EU. On the other hand, Chinese-facing apps have operationalised Chinese privacy regulations in their app design and privacy policies, complying with rules such as real name registration. Through the analysis of terms of service and privacy policies, this paper shows that China-based mobile apps are generally in compliance with laws and data protection frameworks across different jurisdictions. However, there is a lack of detailed explanation of data retention and storage when users are in transit: for example, when EU residents travel outside the EU, do they enjoy the same level of privacy protection as when residing in the EU? On average, for these four sets of China-based mobile apps, EU users are afforded greater transparency and control with regard to how their data are used, stored and disclosed than users in other jurisdictions. Under China’s privacy regulation regime, which is itself full of contradictions and inconsistencies (Lee, 2018; Feng, 2019), data and privacy protection is weak for domestic Chinese users. Certain features, such as the “secure state” requirement during account deletion in the domestic versions of Chinese mobile apps, also show the prioritisation of national security over the individual right to privacy as a key doctrine in China’s approach to data and privacy protection under the banner of internet sovereignty. This, however, is not unique to China, as national security and privacy protection are portrayed in many policy debates and policymaking processes as a zero-sum game (Mann, Daly, Wilson, & Suzor, 2018). The latest restrictions imposed by the Trump administration on TikTok and WeChat in the US, citing concerns over the apps’ data collection and data sharing policies (Yang & Lin, 2020), are just another example of the conundrum China-based apps face in the course of their global expansion and of the geopolitics centred on mobile and internet technologies. To be sure, data and privacy protection is one of the biggest challenges China-based apps will confront as they continue to expand overseas, and it will entail a steep learning curve and possibly the reorganisation of companies’ operations and governance structures.

References

Alexander, J. (2019, February 27). TikTok will pay $5.7 million over alleged children’s privacy law violations. The Verge. https://www.theverge.com/2019/2/27/18243312/tiktok-ftc-fine-musically-children-coppa-age-gate

Balebako, R., Marsh, A., Lin, J., Hong, J., & Cranor, L. F. (2014, February 23). The Privacy and Security Behaviors of Smartphone App Developers. Network and Distributed System Security Symposium. https://doi.org/10.14722/usec.2014.23006

BBC News. (2016, July 18). Apple iCloud: State Firm Hosts User Data in China. BBC News. https://www.bbc.com/news/technology-44870508

Byford, S. (2018, November 30). How China’s Bytedance Became the World’s Most Valuable Startup. The Verge. https://www.theverge.com/2018/11/30/18107732/bytedance-valuation-tiktok-china-startup

C.A.C. (2016, December 27). Guojia Wangluo Anquan Zhanlue [National cyberspace security strategy]. Xinhuanet. http://www.xinhuanet.com/politics/2016-12/27/c_1120196479.htm

Cartwright, M. (2020). Internationalising state power through the internet: Google, Huawei and geopolitical struggle. Internet Policy Review, 9(3). https://doi.org/10.14763/2020.3.1494

Chen, J. Y., & Qiu, J. L. (2019). Digital Utility: Datafication, Regulation, Labor, and Didi’s Platformization of Urban Transport in China. Chinese Journal of Communication, 12(3), 274–289. https://doi.org/10.1080/17544750.2019.1614964

Chen, X., Kaye, D. B., & Zeng, J. (2020). #PositiveEnergy Douyin: Constructing ‘Playful Patriotism’ in a Chinese Short-Video Application. Chinese Journal of Communication. https://doi.org/10.1080/17544750.2020.1761848

de Hert, P., & Papakonstantinou, V. (2015). The Data Protection Regime in China. [Report]. European Parliament. https://www.europarl.europa.eu/RegData/etudes/IDAN/2015/536472/IPOL_IDA(2015)536472_EN.pdf

Deibert, R., & Pauly, L. (2017). Cyber Westphalia and Beyond: Extraterritoriality and Mutual Entanglement in Cyberspace. Paper Prepared for the Annual Meeting of the International Studies Association.

DeNardis, L., & Hackl, A. M. (2015). Internet Governance by Social Media Platforms. Telecommunications Policy, 39(9), 761–770. https://doi.org/10.1016/j.telpol.2015.04.003

Dieter, M., Gerlitz, C., Helmond, A., Tkacz, N., van der Vlist, F., & Weltevrede, E. (2019). Multi-Situated App Studies: Methods and Propositions. Social Media + Society, 1–15.

van Dijck, J., Nieborg, D., & Poell, T. (2019). Reframing Platform Power. Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1414

Federal Trade Commission. (1998). Privacy Online: A Report to Congress [Report]. Federal Trade Commission. https://www.ftc.gov/sites/default/files/documents/reports/privacy-online-report-congress/priv-23a.pdf

Federal Trade Commission. (2013). Mobile Privacy Disclosures: Building Trust Through Transparency [Staff Report]. Federal Trade Commission. https://www.ftc.gov/reports/mobile-privacy-disclosures-building-trust-through-transparency-federal-trade-commission

Federal Trade Commission. (2019, February 27). Video Social Networking App Musical.ly Agrees to Settle FTC Allegations That it Violated Children’s Privacy Law [Press release]. Federal Trade Commission. https://www.ftc.gov/news-events/press-releases/2019/02/video-social-networking-app-musically-agrees-settle-ftc

Feng, Y. (2019). The Future of China’s Personal Data Protection Law: Challenges and Prospects. Asia Pacific Law Review, 27(1), 62–82. https://doi.org/10.1080/10192557.2019.1646015

Fernback, J., & Papacharissi, Z. (2007). Online Privacy as Legal Safeguard: The Relations Among Consumer, Online Portal and Privacy Policy. New Media & Society, 9(5), 715–734. https://doi.org/10.1177/1461444807080336

Flew, T., Martin, F., & Suzor, N. (2019). Internet Regulation as Media Policy: Rethinking the Question of Digital Communication Platform Governance. Journal of Digital Media & Policy, 10(1), 33–50. https://doi.org/10.1386/jdmp.10.1.33_1

Fu, T. (2019). China’s Personal Information Protection in a Data-Driven Economy: A Privacy Policy Study of Alibaba, Baidu and Tencent. Global Media and Communication, 15(2), 195–213. https://doi.org/10.1177/1742766519846644

Fuchs, C. (2012). The Political Economy of Privacy on Facebook. Television & New Media, 13(2), 139–159. https://doi.org/10.1177/1527476411415699

Gierow, H. J. (2014). Cyber Security in China: New Political Leadership Focuses on Boosting National Security (Report No. 20; China Monitor). merics. https://merics.org/en/report/cyber-security-china-new-political-leadership-focuses-boosting-national-security

Gillespie, T. (2018a). Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media. Yale University Press.

Gillespie, T. (2018b). Regulation Of and By Platforms. In J. Burgess, A. Marwick, & T. Poell (Eds.), The SAGE Handbook of Social Media (pp. 254–278). SAGE Publications. https://doi.org/10.4135/9781473984066.n15

Goldsmith, J., & Wu, T. (2006). Who Controls the Internet? Illusions of Borderless World. Oxford University Press.

Gorwa, R. (2019). What is platform governance? Information, Communication & Society, 22(6), 854–871. https://doi.org/10.1080/1369118X.2019.1573914

Greene, D., & Shilton, K. (2018). Platform Privacies: Governance, Collaboration, and the Different Meanings of “Privacy” in iOS and Android Development. New Media & Society, 20(4), 1640–1657. https://doi.org/10.1177/1461444817702397

Jao, N. (2018, February 8). Evernote Announces Plans to Migrate All Data in China to Tencent Cloud. Technode. https://technode.com/2018/02/08/evernote-will-migrate-data-china-tencent-cloud/

Jia, L., & Winseck, D. (2018). The Political Economy of Chinese Internet Companies: Financialization, Concentration, and Capitalization. International Communication Gazette, 80(1), 30–59. https://doi.org/10.1177/1748048517742783

Kalathil, S., & Boas, T. (2003). Open Networks, Closed Regimes: The Impact of the Internet on Authoritarian Rule. Carnegie Endowment for International Peace.

Kaye, B. V., Chen, X., & Zeng, J. (2020). The Co-evolution of Two Chinese Mobile Short Video Apps: Parallel Platformization of Douyin and TikTok. Mobile Media & Communication. https://doi.org/10.1177/2050157920952120

Knockel, J., Ruan, L., Crete-Nishihata, M., & Deibert, R. (2018). (Can’t) Picture This: An Analysis of Image Filtering on WeChat Moments [Report]. Citizen Lab. https://citizenlab.ca/2018/08/cant-picture-this-an-analysis-of-image-filtering-on-wechat-moments/

Kong, L. (2007). Online Privacy in China: A Survey on Information Practices of Chinese Websites. Chinese Journal of International Law, 6(1), 157–183. https://doi.org/10.1093/chinesejil/jml061

Lee, J.-A. (2018). Hacking into China’s Cybersecurity Law. Wake Forest Law Review, 53, 57–104. http://wakeforestlawreview.com/wp-content/uploads/2019/01/w05_Lee-crop.pdf

Light, B., Burgess, J., & Duguay, S. (2018). The Walkthrough Method: An Approach to the Study of Apps. New Media & Society, 20(3), 881–900. https://doi.org/10.1177/1461444816675438

Liu, J. (2019). China’s Data Localization. Chinese Journal of Communication, 13(1). https://doi.org/10.1080/17544750.2019.1649289

Logan, S. (2015). The Geopolitics of Tech: Baidu’s Vietnam. Internet Policy Observatory. http://globalnetpolicy.org/research/the-geopolitics-of-tech-baidus-vietnam/

Logan, S., Molloy, B., & Smith, G. (2018). Chinese Tech Abroad: Baidu in Thailand [Report]. Internet Policy Observatory. http://globalnetpolicy.org/research/chinese-tech-abroad-baidu-in-thailand/

Mann, M., Daly, A., Wilson, M., & Suzor, N. (2018). The Limits of (Digital) Constitutionalism: Exploring the Privacy-Security (Im)Balance in Australia. International Communication Gazette, 80(4), 369–384. https://doi.org/10.1177/1748048518757141

Martin, K. (2013). Transaction Costs, Privacy, and Trust: The Laudable Goals and Ultimate Failure of Notice and Choice to Respect Privacy Online. First Monday, 18(12). https://doi.org/10.5210/fm.v18i12.4838

McKune, S., & Ahmed, S. (2018). The Contestation and Shaping of Cyber Norms Through China’s Internet Sovereignty Agenda. International Journal of Communication, 12, 3835–3855. https://ijoc.org/index.php/ijoc/article/view/8540

Nissenbaum, H. (2011). A Contextual Approach to Privacy Online. Dædalus, 140(4), 32–48. https://doi.org/10.1162/DAED_a_00113

Pappas, V. (2020, March 11). TikTok to Launch Transparency Center for Moderation and Data Practices [Press release]. TikTok. https://newsroom.tiktok.com/en-us/tiktok-to-launch-transparency-center-for-moderation-and-data-practices

Plantin, J.-C., Lagoze, C., Edwards, P., & Sandvig, C. (2016). Infrastructure Studies Meet Platform Studies in the Age of Google and Facebook. New Media & Society, 20(1), 293–310. https://doi.org/10.1177/1461444816661553

Plantin, J.-C., & de Seta, G. (2019). WeChat as Infrastructure: The Techno-nationalist Shaping of Chinese Digital Platforms. Chinese Journal of Communication, 12(3). https://doi.org/10.1080/17544750.2019.1572633

Reuters. (2016, November 1). Airbnb Tells China Users Personal Data to be Stored Locally. Reuters. https://www.reuters.com/article/us-airbnb-china/airbnb-tells-china-users-personal-data-to-be-stored-locally-idUSKBN12W3V6

Reuters. (2018, January 12). China Chides Tech Firms Over Privacy Safeguards. Reuters. https://www.reuters.com/article/us-china-data-privacy/china-chides-tech-firms-over-privacy-safeguards-idUSKBN1F10F6

Ruan, L., Knockel, J., Ng, J., & Crete-Nishihata, M. (2016). One App, Two Systems: How WeChat Uses One Censorship Policy in China and Another Internationally (Research Report No. 84). Citizen Lab. https://citizenlab.ca/2016/11/wechat-china-censorship-one-app-two-systems/

Sharma, I., & Niharika, S. (2019, July 22). It Took a Ban and a Government Notice for ByteDance to Wake Up in India. Quartz India. https://qz.com/india/1671207/bytedance-to-soon-store-data-of-indian-tiktok-helo-users-locally/

State Council Information Office. (2010). The Internet in China. Information Office of the State Council of the People’s Republic of China. http://www.china.org.cn/government/whitepaper/node_7093508.htm

Stein, L. (2013). Policy and Participation on Social Media: The Cases of YouTube, Facebook and Wikipedia. Communication, Culture & Critique, 6(3), 353–371. https://doi.org/10.1111/cccr.12026

Steinberg, M., & Li, J. (2016). Introduction: Regional Platforms. Asiascape: Digital Asia, 4(3), 173–183. https://doi.org/10.1163/22142312-12340076

Fazhi Wanbao. (2018, January 6). Shouji Baidu App qinfanle women de naxie yinsi [Which aspects of our privacy does the mobile Baidu App infringe?]. 163. http://news.163.com/18/0106/17/D7G2O0T200018AOP.html

Wang, H. (2011). Protecting Privacy in China: A Research on China’s Privacy Standards and the Possibility of Establishing the Right to Privacy and the Information Privacy Protection Legislation in Modern China. Springer Science & Business Media. https://doi.org/10.1007/978-3-642-21750-0

Wang, W. Y., & Lobato, R. (2019). Chinese Video Streaming Services in the Context of Global Platform Studies. Chinese Journal of Communication, 12(3), 356–371. https://doi.org/10.1080/17544750.2019.1584119

West, S. M. (2019). Data Capitalism: Redefining the Logics of Surveillance and Privacy. Business & Society, 58(1), 20–41. https://doi.org/10.1177/0007650317718185

Xu, D., Tang, S., & Guttman, D. (2019). China’s Campaign-style Internet Finance Governance: Causes, Effects, and Lessons Learned for New Information-based Approaches to Governance. Computer Law & Security Review, 35, 3–14. https://doi.org/10.1016/j.clsr.2018.11.002

Xu, J. (2015). Evolving Legal Frameworks for Protecting the Right to Internet Privacy in China. In J. Lindsay, T. M. Cheung, & D. Reveron (Eds.), China and Cybersecurity: Espionage, Strategy, and Politics in the Digital Domain (pp. 242–259). Oxford Scholarship Online. https://doi.org/10.1093/acprof:oso/9780190201265.001.0001

Yang, J., & Lin, L. (2020). WeChat and Trump’s Executive Order: Questions and Answers. The Wall Street Journal. https://www.wsj.com/articles/wechat-and-trumps-executive-order-questions-and-answers-11596810744.

Yang, W. (2018, September 15). Online Streaming Platforms Urged to Follow Copyright Law. ChinaDaily. http://usa.chinadaily.com.cn/a/201809/15/WS5b9c7e90a31033b4f4656392.html

Yang, Y. (2019, June 21). TikTok Owner ByteDance Gathers 1 Billion Monthly Active Users Across its Apps. South China Morning Post. https://www.scmp.com/tech/start-ups/article/3015478/tiktok-owner-bytedance-gathers-one-billion-monthly-active-users

Zimmeck, S., Wang, Z., Zou, L., Iyengar, R., Liu, B., Schaub, F., & Reidenberg, J. (2016, September 28). Automated Analysis of Privacy Requirements for Mobile Apps. 2016 AAAI Fall Symposium Series. http://pages.cpsc.ucalgary.ca/~joel.reardon/mobile/privacy.pdf

Appendix

Current laws, regulations and drafting measures for data and privacy protection in China

Year | Title | Government ministries | Legal effect | Main takeaway
2009 | General Principles of The Civil Law | National People's Congress | Civil law | Lays the foundation for the protection of personal rights including personal information, but privacy protection comes as an auxiliary article
2010 | Tort Liabilities Law | Standing Committee of the National People's Congress | Civil law | –
2012 | Decision on Strengthening Online Personal Data Protection | Standing Committee of the National People's Congress | General framework | Specifies the protection of personal electronic information or online personal information for the first time
2013 | Regulation on Credit Reporting Industry | State Council | Regulation | Draws a boundary of what kinds of personal information can and cannot be collected by the credit reporting business
2013 | Telecommunication and Internet User Personal Data Protection Regulations | Ministry of Industry and Information Technology | Department regulation | Provides industry-specific regulations on personal information protection duties
2013 | Information Security Technology Guidelines for Personal Information Protection with Public and Commercial Services Information Systems | National Information Security Standardization Technical Committee; China Software Testing Center | Voluntary national standard | Specifies what "personal general information" (个人一般信息) and "personal sensitive information" (个人敏感信息) entail respectively; defines the concepts of "tacit consent" (默许同意) and "expressed consent" (明示同意) for the first time
2014 | Provisions of the Supreme People's Court on Several Issues concerning the Application of Law in the Trial of Cases involving Civil Disputes over Infringements upon Personal Rights and Interests through Information Networks | Supreme People's Court | General framework | Defines what is included in the protection of "personal information", with a specific focus on regulating online searches of personal information and online trolls
2015 | Criminal Law (9th Amendment) | Standing Committee of the National People's Congress | Criminal law | Criminalises the sale of any citizen's personal information in violation of relevant provisions; criminalises network service providers' failure to fulfil network security management duties
2016 | Administrative Rules on Information Services via Mobile Internet Applications | Cyberspace Administration of China | Administrative rules | Reiterates app stores' and internet app providers' responsibilities to comply with the real-name verification system and content regulations regarding national security and public order; mentions data collection principles (i.e., legal, justifiable, necessary, expressed consent)
2017 | Cybersecurity Law | Standing Committee of the National People's Congress | Law | Requires data localisation; provides definitions of "personal information"; defines data collection principles; currently the most authoritative law protecting personal information
2017 | Interpretation of the Supreme People's Court and the Supreme People's Procuratorate on Several Issues concerning the Application of Law in the Handling of Criminal Cases of Infringing on Citizens' Personal Information | Supreme People's Court | General framework | Defines "citizen personal information", what activities equate to "providing citizen personal information", and the legal consequences of illegally providing personal information
2017 | Information security technology: Guide for De-Identifying Personal Information | Standardization Administration of China | Drafting | Provides a guideline on the de-identification of personal information
2018 | Information security technology: Personal information security specification | Standardization Administration of China | Voluntary national standard (currently under revision) | Lays out granular guidelines for consent and how personal data should be collected, used, and shared
2018 | E-Commerce Law | Standing Committee of the National People's Congress | Law | Provides generally-worded personal information protection rules for e-commerce vendors and platforms
2019 | Measures for Data Security Management | Cyberspace Administration of China | Drafting | Proposes new requirements with a focus on the protection of "important data", defined as "data that, if leaked, may directly affect China's national security, economic security, social stability, or public health and security"
2019 | Information security technology: Basic specification for collecting personal information in mobile internet applications | Standardization Administration of China | Drafting | Provides guidelines on minimal information for an extensive list of applications ranging from navigation services to input software
2019 | Measures for Determining Illegal Information Collection by Apps | – | Drafting stage | –

Borderline speech: caught in a free speech limbo?


In late September 2020, a Facebook algorithm removed a picture of onions because of the 'overtly sexual manner' in which they were positioned. While this anecdote will probably make you smile because of its absurdity (Facebook later confirmed the algorithm was supposed to block ‘nudity’), the underlying systematic sanctioning of content that might potentially breach community standards won’t. Social media platforms remove or algorithmically downgrade content, or suspend or shadow-ban accounts, because of borderline speech. How open can communication be if it must not come close to exceeding normative limits? And who gets to define these limits? This text aims to give a few insights into borderline speech and why the concept behind it is highly problematic.

Definition and use

The debate about ‘borderline’ content on social media platforms concerns a categorisation of speech that does not fall under the legal limits to freedom of expression but is considered inappropriate in public debate. Generally, borderline means ‘being in an intermediate position or state: not fully classifiable as one thing or its opposite’.1 It suggests that something is very close to one thing while still being part of another, and probably not describable without both. While this combination might seem familiar in many social situations, the general principle that an action cannot be ‘fully categorised’ is quite unusual in jurisprudence. Under certain legal provisions, an action might be forbidden or even punishable. The law defines the limits of legality, for speech too. If an expression of opinion is not illegal but is categorised as ‘borderline’, it means it is very close to being illegal but is not. Perhaps an accessible comparison is the presumption of innocence: the defendant is innocent until proven guilty, but may already be considered guilty by society because they are a suspect.

When it comes to freedom of expression, laws are not the only norms that can have a speech-restricting effect: social norms and other private rules also define what should not be said (Noelle-Neumann, 1991, p. 91). The problem of harmful online content, and the struggle to contain it, might stem from the fact that social norms, which have an effect in the analogue world even when unexpressed, might not unfold in the same way in the digital sphere. Regarding borderline speech, we are confronted with two problems: (1) where to position this type of expression in a legal framework? (2) how restrictive may norms other than laws be for freedom of expression? According to the European Court of Human Rights, freedom of expression:

applies not only to “information” or “ideas” that are favorably received or regarded as inoffensive or as a matter of indifference, but also to those that offend, shock or disturb. Such are the demands of pluralism, tolerance and broadmindedness without which there is no “democratic society”. (ECtHR, 2015)

The definition of borderline speech shows how thin the line is between protecting a deliberative public sphere, even when the public debate is uncomfortable, on the one hand, and protecting a "civilised" public debate on the other.

Borderline content in a legal framework

If content is considered borderline but still legal, what level of protection does it deserve? Freedom of expression is a human right, indispensable to democracies and protected by national constitutions and international treaties. Generally, whatever falls within the scope of application is protected as long as it is not declared illegal, and what counts as ‘legal’ speech depends on the scope of protection. In Europe, most constitutions allow the legislator to draw limits on speech, for example through criminal law. Notwithstanding high standards and strict requirements set by constitutional provisos 2, legislators can intervene. In the US, the scope of protection of freedom of speech is very broad and the legislator is largely barred from regulating speech, since the First Amendment protects citizens from the coercive power of the state. Any law regulating speech will therefore be subject to strict scrutiny.

Social norms can play a role in the creation of legislation, including criminal law provisions that restrict freedom of expression, because they naturally go beyond codified legal norms. Ultimately, legal norms only reflect the legislator’s perceived need for action and the social context. Other rules also affect public discourse, such as social norms, i.e., behavioural norms that are not standardised but that will provoke a reaction from other members of society because they share a common notion of unwanted or even harmful speech, even if it is not forbidden by law. Often, those norms are implicit and culturally presupposed. They also manifest themselves in private legal relationships, because there social norms can be concretised, i.e., leave the realm of the implicit. Private parties can contribute to standardising social norms by defining them as binding rules in contractual relationships. By doing so, they can agree on stricter rules than certain laws would require, as with confidentiality clauses in employment agreements. If such private rules imposed by one party are too strict, the other party to the agreement might bring the speech-restricting clause to court, and courts might find it disproportionate or otherwise void.

In sum, communication can be subject to all three types of limitation, but when a restriction is cemented in law there is democratic consensus and legitimisation behind it. When it is a social norm, the societal consensus is probably debatable, but at least no formal sanction arises from infringement. The point is that when social media platforms define what can or cannot be said on their services, there is neither a democratic process nor a common social consensus (probably impossible in a global user network). Infringing community standards, however, will have repercussions for users.

Social media platform’s private ordering

Content moderation is mostly based on community standards, that is, private rules drafted and enforced by social media platforms (referred to as private ordering). Private ordering comes with private sanctions: unwanted content will be banned from the platform by removing it or blocking it for specific regions. “Recidivist” users might see their accounts suspended or blocked, or their content algorithmically downgraded or shadow-banned. But what are the boundaries of these private sanctions? Content moderation is an opaque process (Mozilla Insights, 2020; Laub, 2019). Even if transparency efforts have increased over the past two to three years due to public pressure, little is known about how and what content is removed. There is a clear lack of data with regard to the influence of national regulations on community standards and their interpretation. Hate speech, for example, is not a legal term in many jurisdictions, but it is used comprehensively to remove content (Kaye, 2018, p. 10). The interconnection between speech-restricting provisions in criminal law (e.g., libel, incitement to violence) and the broad scope of application of categories like hate speech is indiscernible. And it becomes even more blurry when content is deemed borderline because it is merely close to such an undefined category. Nevertheless, platforms hold on to the practice of removing content when it is classifiable as borderline, i.e., somewhat close to categories of unwanted content (Zuckerberg, 2018; Alexander, 2019; Constine, 2018a, 2019b; Sands, 2019).

Why call it a free speech limbo?

Borderline speech is in a legal limbo because it is neither legal nor illegal; it is not even clearly unwanted (per community standards) – it is just too close to unwanted content to be fully permissible. Hence, ‘borderline’ is a category used by social media platforms to remove content even though it does not manifestly violate laws or community standards. YouTube, for instance, defines it as “Content that comes close to — but doesn’t quite cross the line of — violating our Community Guidelines”, and removing that sort of speech is a priority (YouTube, 2019). For many reasons, this practice is highly objectionable. It relies on a term that is per se very vague and dependent on other definitions (generally not provided either). It is inherently indefinable and does not allow the formation of potentially helpful categories for classifying that type of speech. The expression itself is not unlawful, but the fact that it might come close to a certain category favours its removal. This approach might lead to over-removal of content, followed by chilling effects when users can simply no longer foresee whether their expression will be treated as inappropriate. From a user’s perspective, it is already challenging to understand community standards when they are too vague and, moreover, rely on social norms from another cultural and legal context.

Admittedly, the challenge posed by harmful content is a serious issue. Harmful content can be a real threat to communities inside and outside social media platforms and, in a broader sense, to democracy (Sunstein, 2017). These damaging effects can be even more serious when harmful content is algorithmically distributed and amplified. This leaves us with the classical dilemma of freedom of expression: how open can communication be if it comes at other people’s expense? If the influence of harmful content online is somehow interrelated with real-life dangers like the genocide in Myanmar (Mozur, 2018), the spread of the Covid-19 pandemic (Meade, 2020) or other tragic events of a racist or antisemitic nature (Newton, 2020), we need to take this into account when balancing freedom of expression with other fundamental rights.

The ban on nudity as an example of borderline content

It makes more sense to be over-precautious for some policy categories than for others. For instance, most platforms ban nudity, especially Facebook and its related platform Instagram. Banning any type of nudity without taking into account the type of picture and its context is a disproportionate limitation for many users who intend to express themselves. To learn more about how this policy is put into practice, I conducted a small series of semi-structured interviews with Instagram users who had all experienced content removal by the platform. Although not part of a representative study, their views might help shed light here. Nine out of eleven had their content removed due to a ‘violation of the “nudity ban”’, but only one found the removal ‘understandable’. All the pictures removed by Instagram showed a female body and most of them showed either bare skin or breasts. In four pictures viewers could see a nipple. The interviewees confirmed the chilling effect of content removal: they were more prudent after their content had been removed by Instagram, blurring potentially “problematic” body parts in pictures, always editing pictures so that nipples cannot be seen, and posting fewer nude pictures. Moreover, they felt wronged by the platform. One said, ‘The pictures don’t just show nudity, they have an artistic approach, mostly don’t even look like showing nudity’. The interviewees expressed their concern about women being treated unequally and discriminated against by platforms like Instagram on the grounds of the ban on nudity. And all this without violating any law.

At the intersection of law and communication

Enforcing the rules of public communication in the digital sphere raises many questions about how the rules of permissible speech are set. Some relate to the nature of countermeasures against harmful content. At the level of individual users, the European approach is to balance freedom of expression with other fundamental rights. In the case of borderline content, this could translate into platforms tempering their sanctions against borderline speech: it should only be flagged if the probability that it might cause damage is relatively high. Regarding private sanctions, may the platform go after the user, or does it have to limit enforcement to the content itself? As mentioned, the targeted content is neither unlawful nor does it manifestly infringe community standards. Platforms should therefore comply with due process standards for content that is not unlawful and provide mandatory explanatory statements when removing or algorithmically downgrading borderline content.

References

Alexander, J. (2019, December 3). YouTube claims its crackdown on borderline content is actually working. The Verge. https://www.theverge.com/2019/12/3/20992018/youtube-borderline-content-recommendation-algorithm-news-authoritative-sources

Constine, J. (2018a, November 15). Facebook will change algorithm to demote “borderline content” that almost violates policies. TechCrunch. https://social.techcrunch.com/2018/11/15/facebook-borderline-content/

Constine, J. (2019b, April 10). Instagram now demotes vaguely ‘inappropriate’ content. TechCrunch. http://social.techcrunch.com/2019/04/10/instagram-borderline/

Kaye, D. (2018). Report of the Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression (Human Rights Council A/HRC/38/35). United Nations General Assembly. https://www.ohchr.org/EN/Issues/FreedomOpinion/Pages/OpinionIndex.aspx

Laub, Z. (2019, June 7). Hate Speech on Social Media: Global Comparisons. Council on Foreign Relations. https://www.cfr.org/backgrounder/hate-speech-social-media-global-comparisons

Meade, A. (2020, October 13). Facebook greatest source of Covid-19 disinformation, journalists say. The Guardian. https://www.theguardian.com/technology/2020/oct/14/facebook-greatest-source-of-covid-19-disinformation-journalists-say

Mozilla Insights. (2020, May 4). When Content Moderation Hurts. Mozilla Foundation. https://foundation.mozilla.org/en/blog/when-content-moderation-hurts/

Mozur, P. (2018, October 15). A Genocide Incited on Facebook, With Posts From Myanmar’s Military. The New York Times. https://www.nytimes.com/2018/10/15/technology/myanmar-facebook-genocide.html

Newton, C. (2020, October 14). How real-world violence led Facebook to overturn its most controversial policy. The Verge. https://www.theverge.com/2020/10/14/21516088/facebook-holocaust-deniers-policy-qanon-anti-semitism

Noelle-Neumann, E. (1991). Öffentliche Meinung: Die Entdeckung der Schweigespirale [Public opinion: The discovery of the spiral of silence] (Expanded ed.). Ullstein.

Sands, M. (2019, June 9). YouTube’s “Borderline Content” Is A Hate Speech Quagmire. Forbes. https://www.forbes.com/sites/masonsands/2019/06/09/youtubes-borderline-content-is-a-hate-speech-quagmire/

Sunstein, C. R. (2017). #Republic: Divided democracy in the age of social media. Princeton University Press.

YouTube. (2019, December 3). The Four Rs of Responsibility, Part 2: Raising authoritative content and reducing borderline content and harmful misinformation. Blog. Youtube. https://blog.youtube/inside-youtube/the-four-rs-of-responsibility-raise-and-reduce/

Zuckerberg, M. (2018, November 15). A Blueprint for Content Governance and Enforcement. Facebook. https://www.facebook.com/notes/mark-zuckerberg/a-blueprint-for-content-governance-and-enforcement/10156443129621634/

Court ruling: ECtHR (2015) Perincek v. Switzerland (App. 27510/08), Para. 158



Towards platform observability


1. Introduction

Platforms are large-scale infrastructures specialised in facilitating interaction and exchange among independent actors. Whether understood economically as two- or multi-sided markets (Langley & Leyshon, 2017) or with an eye on online media as services that ‘host, organize, and circulate users’ shared content or social interactions’ (Gillespie, 2018, p. 18), platforms have not only become highly visible and valuable companies but also raise important social challenges. While intermediaries have in one form or another existed for millennia, contemporary platforms are relying on digital technologies in (at least) two fundamental ways. First, platforms ‘capture’ (Agre, 1994) activities by channelling them through designed functionalities, interfaces, and data structures. Uber, for example, matches riders with drivers in physical space, handles payment, and enforces ‘good behaviour’ through an extensive review system covering both parties. This infrastructural capture means that a wide variety of data can be generated from user activity, including transactions, clickstreams, textual expressions, and sensor data such as location or movement speed. Second, the available data and large numbers of users make algorithmic matching highly attractive: ranking, filtering, and recommending have become central techniques for facilitating the ‘right’ connections, whether between consumers and products, users and contents, or between people seeking interaction, friendship, or love.

Digital platforms host social exchange in ways that Lawrence Lessig (1999) summarised under the famous slogan ‘code is law’, which holds that technical means take part in regulating conduct and shaping outcomes. The combination of infrastructural capture and algorithmic matching results in forms of socio-technical ordering that make platforms particularly powerful. As Zuboff (2019, p. 15) discusses under the term surveillance capitalism, the tight integration of data collection and targeted ‘intervention’ has produced ‘a market form that is unimaginable outside the digital milieu’. The rising power of platforms poses the question of what kind of accountability is necessary to understand these processes and their consequences in more detail. Matching algorithms, in particular, represent ordering mechanisms that do not follow the same logic as traditional decision-making, leading to considerable uncertainty concerning their inner workings, performativities, and broader social effects.

So far, most regulatory approaches to tackling these questions seek to create accountability by ‘opening the black box’ of algorithmic decision-making. A recent EU regulation on fairness in platform-to-business relations, for example, proposes transparency as its principal means. 1 The public debate about the upcoming EU Digital Services Act indeed shows that calls for transparency of algorithmic power have gained support across parliamentary factions and stakeholder groups. 2 The ‘Filter Bubble Transparency Act’—a US legislative proposal that seeks to protect users from being ‘manipulated by algorithms driven by user-specific data’ - focuses more specifically on platforms as media, but again relies on transparency as guiding principle. 3 The German Medienstaatsvertrag (‘State Media Treaty’), which has recently been ratified by all state parliaments, explicitly requires platform operators to divulge criteria for ranking, recommendation, and personalisation ‘in a form that is easily perceivable, directly reachable, and permanently available’. 4 This widespread demand for disclosure and explanation articulates not only justified concerns about the opacity of platforms but also testifies to the glaring lack of information on their conduct and its social, political, and economic repercussions.

In this paper, we likewise take up the challenge posed by platform opacity from the angle of accountability but seek to probe the conceptual and practical limitations of these transparency-led approaches to platform regulation. Echoing the critical literature on transparency as a policy panacea (e.g., Etzioni, 2010; Ananny & Crawford, 2018), we propose the concept of observability as a more pragmatic way of thinking about the means and strategies necessary to hold platforms accountable. While transparency and observability are often used synonymously (e.g. August & Osrecki, 2019), we would like to highlight their semantic differences. Unlike transparency, which nominally describes a state that may exist or not, observability emphasises the conditions for the practice of observing in a given domain. These conditions may facilitate or hamper modes of observing and impact the capacity to generate external insights. Hence, while the image of the black box more or less skips the practicalities involved in opening it, the term observability intends to draw attention to and problematise the process dimension inherent to transparency as a regulatory tool.

While observability incorporates similar regulatory goals to transparency, it also deviates in important respects, most importantly by understanding accountability as a complex, dynamic ‘social relation’ (Bovens, 2007, p. 450), which is embedded in a specific material setting. The goal is not to exchange one concept for the other but to sharpen our view for the specificities of platform power. At the risk of stating the obvious, regulatory oversight needs to take into account the material quality of the objects under investigation. Inspecting the inner workings of a machine learning system differs in important ways from audits in accounting or the supervision of financial markets. Rather than nailing down ‘the algorithm’, understood as a singular decision mechanism, the concept of observability seeks to address the conditions, means, and processes of knowledge production about large-scale socio-technical systems. In the everyday life of platforms, complex technologies, business practices, and user appropriations are intersecting in often unexpected ways. These platform dynamics result in massive information asymmetries that affect stakeholder groups as well as societies at large. Regulatory proposals need to take a broader view to live up to these challenges.

Our argument proceeds in three steps. In the next section, we retrace some of the main problems and limitations of transparency, paying specific attention to technical complexity. The third section then discusses the main principles guiding the observability concept and provides concrete examples and directions for further discussion. We conclude by arguing for a policy approach to promoting observability, emphasising that institutional audacity and innovation are needed to tackle the challenges raised by digital platforms.

2. Limitations to transparency

Much of the debate around our insufficient understanding of platforms and their use of complex algorithmic techniques to modulate users’ experience has centred on the metaphor of a ‘black box’. Although Frank Pasquale, whose Black Box Society (2015) has popularised the term beyond academia, prefers the broader concept of intelligibility, the talk of black boxes is often accompanied by demands for transparency. The regulatory proposals mentioned above are largely organised around mechanisms such as explanations, disclosures, and—more rarely—audits 5 that would bring the inner workings of the machine to light and thereby establish some form of control. But these calls for transparency as a remedy against unchecked platform power encounter two sets of problems. First, the dominant understanding of transparency as information disclosure faces important limitations. Second, the object under scrutiny itself poses problems. Platforms are marked by opacity and complexity, which effectively challenges the idea of a black box whose lid can be lifted to look inside. This section discusses both of these issues in turn.

2.1. Accountability as mediated process

Transparency has a long tradition as a ‘light form’ (Etzioni, 2010) of regulation. It gained new popularity in the 1970s as a neoliberal governance method, promising better control of organisational behaviour through inspection (August & Osrecki, 2019). Transparency is seen as an essential means of oversight and of holding commercial and public entities to account: only if powerful organisations reveal relevant information about their actions are we able to assess their performance. This understanding of transparency implies a number of taken-for-granted assumptions, which link information disclosure to visibility, visibility to insight, and insight to effective regulatory judgement (Ananny & Crawford, 2018, p. 974). According to this view, transparency is able to reveal the truth by reflecting the internal reality of an organisation (Albu & Flyverbom, 2019, p. 9) and thereby creating ‘representations that are more intrinsically true than others’ (Ananny & Crawford, 2018, p. 975). Making the opaque and hidden visible creates truth, and truth enables control, which serves as a ‘disinfectant’ (Brandeis, 1913, p. 10) capable of eliminating malicious conduct. Transparency is considered crucial for the accountability of politics because seeing, just as in the physical world, is equated with knowing: ‘what is seen is largely what is happening’, as Ezrahi (1992, p. 366) summarises this view. These assumptions also inform current considerations on platform regulation.

However, recent research has shown that transparency does more, and different, things than shed light on what is hidden. The visibility of an entity and its procedures is not simply a disclosure of pre-existing facts, but a process that implies its own perspective. While transparency requirements expect ‘to align the behavior of the observed with the general interest of the observers’, empirical studies found that ‘transparency practices do not simply make organizations observable, but actively change them’ (August & Osrecki, 2019, p. 16). As Flyverbom (2016, p. 15) puts it, ‘transparency reconfigures - rather than reproduces - its objects and subjects’. The oversight devices used to generate visibility shape what we get to see (Ezrahi, 1992; Flyverbom, 2016), which calls into question the idea that accurate disclosure alone grants direct, unmediated access to reality.

From a social science perspective, transparency should not be regarded as a state or a ‘thing’ but as the practice ‘of deciding what to make present (i.e. public and transparent) and what to make absent’ (Rowland & Passoth, 2015, p. 140). Creating visibility and insights as part of regulatory oversight consists of specific procedures, which involve choices about what specifically should be exposed and how, what is relevant and what can be neglected, which elements should be shown to whom and, not least, how the visible aspects should be interpreted (Power, 1997). In their critique of transparency-led approaches to algorithmic accountability, Ananny & Crawford (2018) moreover argue that there is a distinct lack of sensitivity to fundamental power imbalances, strategic occlusions, and false binaries between secrecy and openness, as well as a broad adherence to neoliberal models of individual agency.

In light of these criticisms, it may not come as a surprise that regulatory transparency obligations often fall short of their goals and create significant side-effects instead. Among the most common unintended outcomes are bureaucratisation, generalised distrust, and various forms of ‘window dressing’ designed to hide what is supposed to be exposed to external review. Informal organisational practices emerge and coexist with official reports, accounts, and presentations (August & Osrecki, 2019, p. 21). While the critical literature on regulatory failures of transparency obligations is increasing, these insights have yet to have an impact on regulatory thinking. Most regulatory proposals resort to traditional ideas of external control through transparency and frame transparency as a straightforward process of disclosure. As a result, they are missing the mark on the complex and conflictual task of creating meaningful understanding that can serve as an effective check on platform power.

Taken together, a social science perspective on this key ideal of regulation suggests that making platforms accountable requires a critical engagement with the achievements and shortcomings of transparency. It needs to take on board efforts to combine different forms of evidence, and above all, to become attentive to the selective and mediated character of knowledge-building. Similar to the flawed logic of ‘notice and consent’ in the area of privacy protection, which holds that informing individuals on the purposes of data collection allows them to exercise their rights, a superficial understanding of transparency in the area of platform regulation risks producing ineffective results (see Obar, 2020; Yeung, 2017).

2.2. Opacity, complexity, fragmentation

A second set of complications for transparency concerns algorithms and platforms as the actual objects of scrutiny. Large-scale technical systems, in particular those incorporating complex algorithmic decision-making processes, pose severe challenges for assessing their inner workings and social effects. One obvious reason for this is indeed their opacity. As Burrell (2016, p. 2) argues, opacity may stem from secrecy practices, lack of expertise in reading code, and the increasing ‘mismatch between mathematical optimization in high-dimensionality characteristic of machine learning and the demands of human-scale reasoning’. The last point in particular introduces significant challenges to transparency understood as information disclosure or audit. Even if decision procedures behind automated matchmaking can sometimes still be meticulously specified, platforms nowadays mainly deploy statistical learning techniques. These techniques develop decision models inductively and ‘learn programs from data’ (Domingos, 2012, p. 81), based on an arrangement between data, feedback, and a given purpose (see Rieder, 2020).

In the canonical example of spam filtering, users label incoming emails as spam or not spam. Learning consists in associating each word in these messages with these two categories or ‘target variables’. Since every word contributes to the final decision to mark an incoming message as spam or not spam, the process cannot be easily traced back to singular factors. Too many variables come into play, and these algorithms are therefore not ‘legible’ in the same way as more tangible regulatory objects. With regard to regulatory oversight, this means that transparency in the sense of reconstructing the procedure of algorithmic decision making ‘is unlikely to lead to an informative outcome’, as Koene et al. (2019, p. II) conclude. Audits are unable to find out ‘what the algorithm knows because the algorithm knows only about inexpressible commonalities in millions of pieces of training data’ (Dourish, 2016, p. 7). There is a large gulf between the disclosure of ‘fundamental criteria’ mandated by regulatory proposals like the Medienstaatsvertrag and the technical complexities at hand.
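
To make this concrete, the following minimal sketch (ours, not any platform’s actual system) implements a word-based filter in the spirit of the canonical example: every word in a message shifts the score a little, which is precisely why the resulting decision cannot be traced back to a single factor.

```python
# Minimal sketch of a word-based spam filter: every word contributes a small
# weight to the final score, so no single factor explains the outcome.
from collections import Counter
import math

def train(messages):
    """messages: list of (text, label) with label in {"spam", "ham"}."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = {"spam": 0, "ham": 0}
    for text, label in messages:
        for word in text.lower().split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def spam_score(text, counts, totals):
    """Sum of per-word log-likelihood ratios; positive means 'more spam-like'."""
    score = 0.0
    for word in text.lower().split():
        p_spam = (counts["spam"][word] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][word] + 1) / (totals["ham"] + 2)
        score += math.log(p_spam / p_ham)   # every word shifts the score a little
    return score

counts, totals = train([
    ("win money now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow?", "ham"),
])
print(spam_score("win a cheap lunch", counts, totals))
```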

Even if regulators were given access to data centres and source code, the process of sense-making would not be straightforward. Reading the gist of an algorithm from complex code may run into difficulties, even if no machine learning is involved. As Dourish (2016) shows, the presence of different programming languages and execution environments adds further complications, and so do the many subsystems and modules that concrete programmes often draw on. Algorithmic decision procedures ‘may not happen all in one place’ (Dourish, 2016, p. 4) but can be distributed over many different locations in a large programme or computer network. In the case of online advertising, for example, the placement of a single ad may entail a whole cascade of real-time auctions, each drawing on different algorithms and data points, each adding something to the final outcome. The result is a continuously evolving metastable arrangement. Thus, time becomes a crucial analytical factor, causing considerable difficulties for the ‘snapshot logic’ underlying most audit proposals.
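
The cascading character of such auctions can be illustrated with a deliberately simplified sketch; the stages, bidders, and numbers below are placeholders, not any ad system’s actual mechanics.

```python
# Illustrative cascade of real-time auctions: one impression passes through
# several stages, each running its own second-price auction over different
# bidders, so the final outcome is distributed across the whole chain.
from typing import Dict, Tuple

def second_price_auction(bids: Dict[str, float]) -> Tuple[str, float]:
    """Winner pays the second-highest bid (or their own bid if unopposed)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, top = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else top
    return winner, price

# Stage 1: an exchange auctions the impression among demand-side platforms.
stage_one = {"dsp_a": 2.40, "dsp_b": 1.90, "dsp_c": 2.10}
dsp_winner, exchange_price = second_price_auction(stage_one)

# Stage 2: the winning DSP runs its own internal auction among advertisers,
# drawing on different data points (here just static bids for simplicity).
stage_two = {"advertiser_x": 2.80, "advertiser_y": 2.55}
ad_winner, dsp_price = second_price_auction(stage_two)

print(f"{ad_winner} wins via {dsp_winner}; exchange clears at {exchange_price}, DSP at {dsp_price}")
```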

For these reasons, algorithms turn out to be difficult to locate. In his ethnographic study of a recommender system, Seaver (2017) observes that even in small companies it can be a challenge for staff members to explain where exactly ‘the algorithm’ is. As Bogost (2015) quips, ‘[c]oncepts like “algorithm” have become sloppy shorthands, slang terms for the act of mistaking multipart complex systems for simple, singular ones’. What is referred to as ‘algorithm’, i.e. the actual matchmaking technique, may thus only be a small component in a much larger system that includes various other instances of ordering, ranging from data modelling to user-facing interfaces and functions that inform and define what users can see and do. YouTube, for example, not only fills its recommendation pipeline with a broad array of signals generated from the activities of billions of users but actually uses two different deep learning models for ‘candidate generation’ (the selection of hundreds of potential videos from the full corpus) and ‘ranking’ (the selection and ordering of actual recommendations from the candidate list) (see Covington et al., 2016). The fuzzy, dynamic, and distributed materiality of contemporary computing technologies and data sets means that algorithmic accountability is harder to put into practice than the call for transparency suggests. Regulatory proposals such as disclosures, audits, or certification procedures seeking to establish effective control over their functionality and effects assume properties that algorithmic systems may often not meet. Suffice it to say that technical complexity also facilitates the attempts at dissimulation and ‘window dressing’ mentioned above.
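
The two-stage pattern can itself be sketched in a few lines of code. The following illustration is ours and purely schematic: the scoring functions stand in for the deep learning models described by Covington et al. (2016), which operate on vastly richer signals.

```python
# Schematic two-stage recommendation: a cheap candidate-generation step
# narrows the full corpus, then a richer (here trivial) ranking step orders
# the shortlist. Data and scores are placeholders for illustration only.
from typing import Dict, List

def generate_candidates(user_history: List[str], corpus: Dict[str, set], k: int = 100) -> List[str]:
    """Stage 1: shortlist items by counting topic overlap with the watch history."""
    watched_topics = set(t for vid in user_history for t in corpus.get(vid, set()))
    scored = sorted(corpus, key=lambda vid: len(corpus[vid] & watched_topics), reverse=True)
    return [vid for vid in scored if vid not in user_history][:k]

def rank(candidates: List[str], engagement: Dict[str, float], n: int = 10) -> List[str]:
    """Stage 2: order the shortlist with a separate scoring model."""
    return sorted(candidates, key=lambda vid: engagement.get(vid, 0.0), reverse=True)[:n]

corpus = {"v1": {"music"}, "v2": {"music", "live"}, "v3": {"news"}, "v4": {"news", "politics"}}
engagement = {"v2": 0.9, "v3": 0.4, "v4": 0.7}
print(rank(generate_candidates(["v1"], corpus), engagement))
```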

Yet, as if this were not difficult enough, our understanding of platform accountability should extend beyond oversight of algorithms and platform conduct to be meaningful. The ordering power of platforms also encompasses shared or distributed accomplishments (see Suchman, 2007) to which platforms, users and content providers each contribute in specific ways. As Rahwan et al. (2019, p. 477) argue, machine behaviour ‘cannot be fully understood without the integrated study of algorithms and the social environments in which algorithms operate’. The actions of users, for example, provide the data that shape algorithmic models and decisions as part of machine learning systems. In the same vein, platform behaviour cannot be reduced to platform conduct, that is, to the policies and design decisions put in place by operators. It must include the evolving interactions between changing social practices and technical adjustments, which may, in turn, be countered by user appropriations. As use practices change, algorithmic decision models change as well. Platform companies are therefore neither fully in control of actual outcomes, nor fully aware of what is happening within their systems.

Finally, the effects of platforms can only be sufficiently addressed if we consider what is being ordered. For example, ranking principles considered beneficial in one cultural domain, e.g. music recommendation, may have troubling implications in another, e.g. the circulation of political content. Accountability thus has to consider what is made available on platforms and how ordering mechanisms interact with or shape the content and its visibility. This again requires a broader view than what algorithm audits or broad technical disclosures are able to provide.

Taken together, research on the properties of algorithms and algorithmic systems suggests that regulatory proposals such as ‘opening the black box’ through transparency, audit, or explainability requirements reflect an insufficient understanding of algorithms and the platform architectures they enable. Algorithms can neither be studied nor regulated as single, clear-cut, and stable entities. Rather, their behaviour and effects result from assemblage-like contexts whose components are not only spatially and functionally distributed but also subject to continuous change, which is partly driven by users or markets facilitated by platforms. Given the ephemeral character of algorithms on the one side and the enormous generative and performative power of algorithmic systems on the other, the question arises as to what concepts, strategies, and concrete tools might help us to comprehend their logics and to establish effective political oversight. Such an approach needs to take on board the critique of transparency as a regulatory tool and consider accountability as a continuous interaction and learning process rather than a periodic undertaking. It should recognise that the legibility of algorithmic systems significantly differs from that of other objects or areas of regulation; and it should take into account that any form of review is not only selective but also shapes the object under investigation. Thus, the debate on platform regulation needs to become reflexive with regard to the specific materiality of the regulatory field and the constitutive effects of studying it.

3. Principles of observability

This section seeks to flesh out an understanding of observability as a step toward tackling the problems platform accountability currently faces. While the term is regularly used in the literature on transparency (e.g., Bernstein, 2012; Albu & Flyverbom, 2019; August & Osrecki, 2019), we seek to calibrate it to our specific goals: the challenges raised by platforms as regulatory structures need to be addressed more broadly, beginning with the question of how we can assess what is happening within large-scale, transnational environments that heavily rely on technology as a mode of governance. Who gets treated how on large online platforms, how are connections between participants made and structured, what are the outcomes, and—crucially—who can or should be able to make such assessments? Rather than a binary between transparency and opacity, the question is how to foster the capacity to produce knowledge about platforms and ‘platform life’ in constructive ways. The increasingly technological nature of our societies requires not just penalties for law infringements, but a deeper and well-informed public conversation about the role of digital platforms. This includes attention to the larger impacts of the new kinds of ordering outlined above, as well as a sensitivity to the ideological uses of transparency, which may serve ‘as a tool to fight off the regulations opposed by various business groups and politicians from conservative parties’ (Etzioni, 2010, p. 2). We therefore position observability as an explicit means of, not an alternative to, regulation. As van Dijck et al. (2018, p. 158) underline, ‘[r]egulatory fixes require detailed insights into how technology and business models work, how intricate platform mechanisms are deployed in relation to user practices, and how they impact social activities’. Our concept of observability thus seeks to propose concrete actions for how to produce these insights. While some of the more concrete strategies we discuss may come out of self-regulation efforts, effective and robust observability clearly requires a regulatory framework and institutional support. In what follows, we outline three principles that inform the concrete conceptual and practical directions observability seeks to emphasise.

3.1. Expand the normative and analytical horizon

The first principle concerns the research perspective on platforms and argues that a broader focus is needed. This focus takes into consideration how digital platforms affect societies in general, ranging from everyday intimacy to economic and labour relations, cultural production, and democratic life. Given that platformisation transforms not only specific markets but ‘has started to uproot the infrastructural, organizational design of societies’ (van Dijck, 2020, p. 2), it seems crucial to develop knowledge capacities beyond critical algorithm studies and include platform conduct, behaviour, and effects across relevant social domains in our agendas. As Powles and Nissenbaum (2018) have recently argued for artificial intelligence systems, limiting our focus to the important yet narrow problems of fairness and biases means that ‘vast zones of contest and imagination are relinquished’, among them the question of whether the massive efforts in data collection underlying contemporary platform businesses are acceptable in the first place. The ability to say no and prohibit the deployment of certain technologies such as political micro-targeting of voters or face recognition requires robust empirical and normative evidence of their harm to democracies.

While investigations into misinformation and election tampering are important, there are other long-term challenges waiting to be addressed. Recent studies on surveillance capitalism (Zuboff, 2019), digital capitalism (Staab, 2019), informational capitalism (Cohen, 2019), the platform society (van Dijck et al., 2018), or the ‘dataist state’ (Fourcade & Gordon, 2020) aim to capture and make sense of the ongoing structural changes of societies and economies, including the power shifts these imply. EU commissioner Vestager recently evoked Michel Foucault’s notion of biopower when addressing novel data-based techniques of classifying, sorting, and governing (Stolton, 2019). While the term addresses a set of political technologies that emerged in the 19th century to manage the behaviour of populations by means of specific regimes of knowledge and power, digital platforms’ considerable reach and fine-grained ‘capture’ (Agre, 1994) of everyday activities invites comparison. The deep political and social repercussions these conceptual frames highlight require broader forms of social accountability (Bovens, 2007) than disclosures or audits are able to provide.

How can researchers, regulators, and civil society expand their capacity to study, reflect and act on these developments? The concept of observability starts from the recognition of a growing information asymmetry between platform companies, a few data brokers, and everyone else. The resulting data monopoly deprives society of a crucial resource for producing knowledge about itself. The expanding data sets on vast numbers of people and transactions bear the potential for privileged insights into societies’ texture, even if platforms tend to use them only for operational purposes.

AirBnB’s impact on urban development, Uber’s role in transforming transportation, Amazon’s sway over retail, or Facebook and Twitter’s outsized influence on the public sphere cannot be assessed without access to relevant information. It is symptomatic that companies refuse access to the data necessary for in-depth, independent studies and then use the lack of in-depth, independent studies as evidence for lack of harm. New modes of domination are unfolding as part of analytics-driven business models and the unprecedented information asymmetries they bring about. Powles and Nissenbaum (2019) therefore argue that we need ‘genuine accountability mechanisms, external to companies and accessible to populations’. An essential condition and experimental construction site for such accountability mechanisms would be the institutionalisation of reliable information interfaces between digital platforms and society—with a broad mandate to focus on the public interest.

We propose the concept of public interest as a normative reference for assessing platform behaviour and regulatory goals. However, public interest is neither well defined nor without alternatives. 6 We prefer public interest over the closely related common good because the former refers to an internationally established mandate in media regulation and could thus inform the formulation of specific requirements or ‘public interest obligations’ for platforms as well (Napoli, 2015, p.4). Furthermore, the concept speaks to our specific concern with matters of governance of platform life. The use of public interest spans different disciplinary and regulatory contexts, and it is open to flexible interpretation. Yet, the often-criticised vagueness of the concept has the advantage of accommodating the broad range of existing platforms. As a normative framework it can be used to critically assess the design of multiple-sided markets as much as the impact of digital intermediaries on the public sphere. Approaches to defining and operationalising public interest depend on the context. In economic theory, public interest is suspected of functioning as a ‘weapon’ for justifying regulatory intervention into markets for the purpose of enhancing social welfare (Morgan & Yeung, 2007). Correcting failing markets constitutes a minimalist interpretation of public interest, however. In politics, public interest is associated with more diverse social goals, among them social justice, non-discrimination, and access to social welfare; or more generally the redistribution of resources and the maintenance of public infrastructures. With regard to the public sphere and the media sector, public interest refers to protecting human rights such as freedom of information and freedom of expression, fostering cultural and political diversity, and not least sustaining the conditions for democratic will formation through high quality news production and dissemination (Napoli, 2015).

What these different understandings of public interest have in common is a focus on both procedural and substantive aspects. Obviously, public interest as a frame of reference for assessing and regulating digital platforms is not a given. Rather, the meaning and principles of public interest have to be constantly negotiated and reinterpreted. As van Dijck (2020, p. 3) reminds us, such battles over common interest do not take place in a vacuum; they are ‘historically anchored in institutions or sectors’ and ‘after extensive deliberation’ become codified in more or less formal norms. From a procedural point of view, public interest can also be defined as a practice, which has to meet standards of due process such as inclusiveness, transparency, fairness, and right to recourse (Mattli & Woods, 2009, p. 15). In terms of substance, the notion of public interest clearly privileges the collective common welfare over that of individuals or private commercial entities. In this respect, it entails a departure from the neoliberal focus on individual liberty toward collective freedoms. Thereby it also extends the space of policy options beyond ‘notice and consent’ to more far-reaching regulatory interventions (Yeung, 2017, p. 15). We see similar conceptual adjustments toward public interest in other areas such as the discourse on data protection. As Parsons (2015, p. 6) argues, it is necessary to recognise ‘the co-original nature of [...] private and public autonomy’ to understand that mass surveillance is not merely violating citizens’ individual rights, but ‘erodes the integrity of democratic processes and institutions’ (p. 1).

To conclude, the concept of observability emphasises the societal repercussions of platformisation and suggests public interest as a normative horizon for assessing and regulating them. It problematises the poor conditions for observing platform life and its effects, and suggests levelling out, in institutionalised ways, the information asymmetry between platforms and platform research. Thus, we think of observability as one possible ‘counter power’ in the sense of Helberger (2020, p. 9), who calls for establishing ‘entirely new forms of transparency’. First and foremost, observability therefore seeks to improve the informational conditions for studying the broader effects of platformisation. Over the next two sections, we discuss the modalities for such an approach.

3.2. Observe platform behaviour over time

Building on the arguments laid out in section two, the second principle of observability holds that the volatility of platforms requires continuous observation. While ex ante audits of technical mechanisms and ex post analysis of emblematic cases are certainly viable for more restricted systems, the dynamic and distributed nature of online platforms means that intermittent inspections or disclosures are insufficient, thwarted by the object’s transient character. Traditional forms of information sharing through transparency reports, legal inquiries, and regulated and structured disclosures, similar to those that exist for stock markets, can still be part of an observability framework, as can investigative reporting and whistleblowing. However, to tackle the specific challenges of digital platforms, more continuous forms of observation need to be envisaged.

When terms of service, technical design, or business practices change, the ‘rules of the game’ change as well, affecting platform participants in various ways. Projects like TOSBack 7 use browser plugins and volunteer work to track and observe changes in platforms’ terms of service continuously, that is, while they are happening and not after some complaint has been filed. These are then distilled into more readable forms to accommodate wider audiences. The joint Polisis 8 and PriBot 9 projects pursue similar goals, drawing on artificial intelligence to interpret privacy policies and deal with the limitations of volunteer work. Such efforts should be made easier: a recent proposal by Cornelius (2019) suggests making terms of service contracts available as machine-readable documents to facilitate ongoing observation and interpretation. Similar approaches can be imagined for other areas of platform conduct, including technical tweaks or changes in business practices.
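
A minimal sketch of such continuous observation might look as follows; the URL and polling interval are placeholders, and real projects like TOSBack rely on considerably more robust infrastructure.

```python
# Sketch of continuous observation of a terms-of-service document: fetch a
# policy page at intervals, keep a dated diff when it changes, and record a
# fingerprint of the current version. URL and interval are placeholders.
import difflib
import hashlib
import time
import urllib.request

POLICY_URL = "https://example.com/terms"   # placeholder, not a real policy page
CHECK_INTERVAL = 24 * 60 * 60              # once a day

def fetch(url):
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def observe(previous=None):
    while True:
        current = fetch(POLICY_URL)
        if previous is not None and current != previous:
            diff = difflib.unified_diff(previous.splitlines(), current.splitlines(),
                                        fromfile="before", tofile="after", lineterm="")
            stamp = time.strftime("%Y-%m-%d")
            with open(f"tos-diff-{stamp}.txt", "w", encoding="utf-8") as f:
                f.write("\n".join(diff))
            print(stamp, "change detected, fingerprint:",
                  hashlib.sha256(current.encode()).hexdigest()[:12])
        previous = current
        time.sleep(CHECK_INTERVAL)
```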

However, to account for the distributed and dynamic character of platform life, as it emerges from the interaction between policies, design choices, and use practices, continuous observation needs to reach beyond legal and technical specifications. Bringing the space of distributed outcomes into view is by no means easy, but the importance of doing so is increasingly clear. In their discussion of algorithms as policies, Hunt and McKelvey (2020, p. 330) indeed argue that the ‘outcomes of these policies are as inscrutable as their intentions - under our current system of platform governance, it is beyond our reach to know whether algorithmic regulation is discriminatory or radicalizing or otherwise undermines the values that guide public policy’. Here, observability does not alter the underlying normative concerns but asks how platform reality can be sufficiently understood to make it amenable to normative reasoning in the first place. As platforms suck the bulk of online exchange into their increasingly centralised infrastructures, we need the capacity to probe not merely how algorithms work, but how fundamental social institutions are being reshaped. Answering these questions requires studying technical and legal mechanisms, use practices, and circulating units such as messages together. Given that our first goal is to understand rather than to place blame, there is no need to untangle networks of distributed causation from the outset. Entanglement and the wide variety of relevant questions we may want to ask mean that observability thus favours continuous and broad access to knowledge generating facilities.

There are at least four practical approaches that align with what we are aiming at. First, platforms have occasionally entered into data access agreements with researchers, journalists, NGOs, and so forth. Facebook is a case in point. The company’s Data for Good 10 programme, which builds ‘privacy-preserving data products to help solve some of the world's biggest problems’, shares data with approved universities and civil society groups. The recently launched Social Science One initiative 11, a collaboration with the US Social Science Research Council, is supposed to grant selected researchers access to both data and funding to study ‘the impact of social media on elections and democracy’ (King & Persily, 2019, p. 1). While these initiatives are good starting points, they have been plagued by delays and restrictions. Scholars have rightly criticised the fact that the scope and modalities for access remain in the hands of platforms themselves (Hegelich, 2020; Suzor et al., 2019). The central question is thus how to structure agreements in ways that reduce the asymmetries between platforms and third parties. Without a legal framework, companies can not only start and stop such initiatives at will but are also able to control parameters coming into play, such as thematic scope, coverage, and granularity.

Accountability interfaces providing continuous access to relevant data constitute a second direction. Facebook’s Ad Library 12, for example, is an attempt to introduce carefully designed observability, here with regard to (political) advertisement. Despite the limitations of the existing setup (see Leerssen et al., 2019), machine-readable data access for purposes of accountability can enable third-party actors to ask their own questions and develop independent analytical perspectives. While tools like Google Trends 13 are not designed for accountability purposes, a broader understanding of the term could well include tools that shed light on emergent outcomes in aggregate terms. There are already working examples in other domains, as the German Market Transparency Unit for Fuels 14, a division of the Federal Cartel Office, shows. It requires gas stations to communicate current prices in real time so that they can be made available on the web and via third-party apps. 15 Well-designed data interfaces could both facilitate observability and alleviate some of the privacy problems other approaches have run into. One could even imagine sandbox-style execution environments that allow third parties to run limited code within platforms’ server environment, allowing for privacy-sensitive analytics where data never leaves the server.

Developer APIs are data interfaces made available without an explicit accountability purpose. These interfaces have been extensively repurposed to investigate the many social phenomena platforms host, ranging from political campaigning (e.g. Larsson, 2016) to crisis communication during disasters (e.g. Bruns & Burgess, 2014), as well as the technical mechanisms behind ranking and recommendation (e.g., Airoldi et al., 2016; Rieder et al., 2018). Depending on the platform, developer APIs provide data access through keyword searches, user samples, or other means. Twitter’s random sample endpoint 16, which delivers representative selections of all tweets in real time (Morstatter et al., 2014), is particularly interesting since it allows observing overall trends while reducing computational requirements. One of the many examples of exploiting a data interface beyond social media is David Kriesel’s project BahnMining 17, which uses the German railroad’s timetable API to analyse train delays and challenge the official figures released by Deutsche Bahn.
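
In practice, repurposing a developer API for observation often amounts to little more than polling an endpoint and logging a few fields over time. The sketch below is generic: the endpoint, parameters, and field names are hypothetical placeholders rather than any specific platform’s API.

```python
# Generic sketch of API-based observation: periodically request a public
# endpoint, extract a few fields, and append them to a local log for later
# aggregate analysis. Endpoint and field names are placeholders.
import csv
import json
import time
import urllib.request

API_URL = "https://api.example.com/v1/public/sample"   # placeholder endpoint
OUTFILE = "observations.csv"

def fetch_sample():
    with urllib.request.urlopen(API_URL) as resp:
        return json.load(resp)            # assumed to return a list of items

def append_observations(items):
    with open(OUTFILE, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for item in items:
            # keep only what is needed for aggregate analysis
            writer.writerow([time.time(), item.get("id"), item.get("topic"), item.get("views")])

if __name__ == "__main__":
    while True:
        try:
            append_observations(fetch_sample())
        except Exception as err:          # network errors should not stop long-running observation
            print("fetch failed:", err)
        time.sleep(15 * 60)               # sample every 15 minutes
```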

But the so-called ‘APIcalypse’ (Bruns, 2019) that followed the Facebook-Cambridge Analytica scandal has led to restrictions in data access, rendering independent research much more difficult. Even before Facebook-Cambridge Analytica, working with developer APIs regularly created issues of reliability and reproducibility of results, research ethics, and privacy considerations (see Puschmann, 2019). Generally, developer interfaces are not designed for structured investigations into the layers of personalisation and localisation that may impact what users actually see on their screens. YouTube’s ‘up next’ column is a case in point: while the API does make so-called ‘related videos’ available, it leaves out the personalised recommendations that constitute a second source for suggested videos. Research on YouTube’s recommender system, for example a study by PEW 18, is therefore necessarily incomplete. But the fact that developer APIs enable a wide variety of independent research on different topics means that in cases where privacy concerns can be mitigated, they are worth extending further. A structured conversation between platforms and research organisations about possible long-term arrangements is necessary, and independent regulatory institutions could play a central role here.

Finally, due to API limitations, researchers have been relying on scraping, a set of techniques that glean data from end-user interfaces. Search engines, price snipers, and a whole industry of information aggregators and sellers rely on scraped data, but there are many non-commercial examples as well. Projects like AlgoTransparency 19, run by former YouTube employee Guillaume Chaslot, regularly capture video recommendations from the web interface to trace what is being suggested to users. Roth et al. (2020) have recently used a similar approach to study whether YouTube indeed confines users to filter bubbles. Such high-profile questions call for empirical evidence, and since research results may change as quickly as systems evolve, continuous monitoring is crucial. While scraping does not demand active cooperation from the platforms under scrutiny, large-scale projects do require at least implicit acquiescence because websites can deploy a whole range of measures to thwart scraping.
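
A scraping-based observation pipeline can be sketched as follows; the page URL and class name are hypothetical, and any real deployment would need selectors tailored to the site in question as well as attention to its terms of use.

```python
# Sketch of interface scraping: load a results page, extract whichever
# elements hold recommended items, and store them with a timestamp so that
# repeated runs can be compared over time. URL and class name are hypothetical.
import csv
import datetime
import urllib.request
from html.parser import HTMLParser

PAGE_URL = "https://example.com/watch?v=placeholder"   # hypothetical page

class TitleCollector(HTMLParser):
    """Collects text inside elements whose class contains 'recommended-title'."""
    def __init__(self):
        super().__init__()
        self.inside = False
        self.titles = []
    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if "recommended-title" in classes:       # hypothetical class name
            self.inside = True
    def handle_endtag(self, tag):
        self.inside = False
    def handle_data(self, data):
        if self.inside and data.strip():
            self.titles.append(data.strip())

html = urllib.request.urlopen(PAGE_URL).read().decode("utf-8", errors="replace")
parser = TitleCollector()
parser.feed(html)

with open("recommendations.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for title in parser.titles:
        writer.writerow([datetime.datetime.utcnow().isoformat(), PAGE_URL, title])
```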

Although more precarious than API-based approaches, taking data directly from the user interface allows for the explicit study of personalisation and localisation. Data retrieved through scraping may also serve to verify or critique data obtained through the previously mentioned techniques. Not unlike the panels assembled by analytics companies like Nielsen for their online products 20, the most promising platform-centred crowd-sourcing projects ask volunteers to install custom-built browser plugins to ‘look over their shoulder’. The Datenspende project, a collaboration between several German state-level media authorities, the NGO AlgorithmWatch, the Technical University Kaiserslautern, and Spiegel Online, recruited 4,500 volunteers before the German parliamentary elections in 2017 to investigate what users actually see when they look for party and candidate names on Google Search and Google News. 21 The same approach was later used to scrutinise the SCHUFA 22, Germany’s leading credit bureau, and most recently Instagram 23.

There are many other areas where scraping has been productively used. The $heriff project 24, for example, also deployed browser plugins to investigate price discrimination practices on retail websites like Amazon (Iordanou et al., 2017). Even regulators have to resort to scraping: a recent study by the French Conseil Supérieur de l’Audiovisuel used the accounts of 39 employees and four fictitious users to study YouTube’s recommendation system. 25 The City of Amsterdam began scraping data from AirBnB as early as 2017 26, analysing consequences for the housing market and compliance by landlords with rules on short-term rentals. Given that sample quality, scale, and the dependence on platform acquiescence are significant disadvantages under current conditions, a legal framework regulating access to platform data would increase the practical viability of this approach. The current ambiguities risk creating chilling effects that discourage smaller research projects in particular. NYU’s Ad Observer 27, a tool that uses browser plugins and scraping to investigate ad targeting on Facebook to compensate for the limitations of the above-mentioned Ad Library, tells a cautionary tale. The researchers recently received a cease and desist letter from the company, putting the whole project in peril (Horwitz, 2020).

However, it should be stated that not all forms of access to platform data further the public interest. Across all four of these approaches we encounter serious privacy concerns. While there are areas where data access is unproblematic, others may require restricting access to certain groups, anonymising data, using aggregate statistics, or exploring innovative models such as sandbox environments. These are not trivial problems; they raise the need for innovative and experimental approaches supported by institutional oversight. From a legal perspective, a recent interpretation of the GDPR by the European Data Protection Supervisor 28 clarified that research in the public interest must have leeway if done in accordance with ethical best practices. Still, concrete measures will need to be the subject of broader conversations about the appropriate balance to strike, which may lead, in certain cases, to more restrictions rather than fewer.

3.3. Strengthen capacities for collaborative knowledge creation

In his analysis of accountability as a social relation, Bovens (2007, p. 453) argues that ‘transparency as such is not enough to qualify as a genuine form of accountability, because transparency does not necessarily involve scrutiny by a specific forum’. Given their deep and transversal impact, the question as to how knowledge about platforms is generated and how it circulates through society is crucial. In this section, we argue that effective accountability requires the participation of different actors and the generation of different forms of knowledge.

Our argument starts from the fact that platform companies have largely treated information about their systems, what users are posting or selling, and the kinds of dynamics that emerge from their interactions as private assets. They heavily invest in sophisticated analytics to provide insights and pathways for corporate action. Product development, optimisation, and detection and moderation of all kinds of illegal or ‘undesirable’ content have become important tasks that fully rely on evolving observational capabilities. While platforms would be able to facilitate knowledge creation beyond such operational concerns, the existing information asymmetries between those collecting and mining private data and society at large make this highly unlikely. Instead, platforms provide businesses and individual users with deliberately designed ‘market information regimes’ (Anand & Peterson, 2000) consisting of analytics products and services that provide information about the larger market and one’s own standing.

Creators on YouTube, for example, are now able to gauge how their videos are faring, how the choice of thumbnails affects viewer numbers, or how advertisers are bidding on keywords within the platform interface. But such interfaces are ‘socially and politically constructed and [...] hence fraught with biases and assumptions’ (Anand & Peterson, 2000, p. 270), privileging operational knowledge designed to boost performance over broader and more contextualised forms of insight. The narrow epistemological horizon of platform companies thus needs to be supplemented by inquiries that contextualise and question this business model. The problematic monopolisation of analytical capacities legitimises our demand for a more inclusive approach, which would open the locked-up data troves to qualified external actors. However, there simply is no one-size-fits-all approach able to cover all types of platforms, audiences, and concerns. Researchers, journalists, and activists are already engaged in ‘accountability work’, covering a range of questions and methods. Regulators add to this diversity: competition and antitrust inquiries require different forms of evidence than concerns regarding misinformation or radicalisation. We may therefore prefer to speak of ‘accountabilities’ in plural form.

There are many approaches coming from the technical disciplines that promise to enhance understanding. Emerging research fields like ‘explainable AI’ (e.g. Doran et al., 2017) seek to make primary ordering mechanisms more accountable, even if the question of what ‘explainable’ means when different audiences ask different questions remains open. Other strategies like the ‘glass box’ approach (Tubella & Dignum, 2019) focus on the monitoring of inputs and outputs to ‘evaluate the moral bounds’ of AI systems. A particularly rich example for image classification, from Google researchers, comes in the form of an ‘activation atlas’, which intends to communicate how a convolutional neural network ‘sees’. 29 But since platforms are much more than contained ordering mechanisms, the problem of how to make their complexity readable, how to narrate what can be gleaned from data (see Dourish, 2016), remains unsolved. However, researchers in the humanities and social sciences have long been interested in how to make sense of quantitative information. Work on ‘narrating numbers’ (Espeland, 2015), ‘narrating networks’ (Bounegru et al., 2017), or the substantial research on information visualisation (e.g. Drucker, 2014) can serve as models. But as Sloane & Moss (2019) argue in their critique of current approaches to AI, there is a broader ‘social science deficit’ and the one-sided focus on quantitative information is part of the problem. The marginalisation of qualitative methods, such as ethnographic work that tries to elucidate both the context within which platforms make decisions and the meaning actors ascribe to practices and their effects, limits knowledge production.

Journalists also have unique expertise when it comes to forms of knowledge generation and presentation. A recent example is the work by Karen Hao and Jonathan Stray 30 on the controversial COMPAS risk assessment tool, 31 which questions the very possibility of fair judgements by allowing users to ‘play’ with the parameters of a simplified model. Likewise, NGOs have long worked on compound forms of narration that combine different data sources and methods for purposes of accountability. Greenpeace’s Guide to Greener Electronics, which includes a grade for companies’ willingness to share information, or the Ranking Digital Rights 32 project are good examples of the translation of research into concrete political devices. Accountability, understood as an inherent element of democratic control, cannot be reduced to a forensic process that transposes ‘facts’ from obscurity into the light. It needs to be considered as an ongoing social achievement that requires different forms of sense-making, asking for contributions from different directions and epistemological sensitivities. Access to machine-readable data, our focus in the last section, has limitations, but also allows different actors to develop their own observation capacities, adapting their analytical methods to the questions they want to ask.

We are aware that increased understanding of platform life would prompt reactions and adaptations by different stakeholders gathering around platforms, including actors seeking to ‘game’ the system and even platform owners themselves. Making the constant negotiations between these actors more visible may, however, have the advantage that the process of establishing boundaries of acceptable behaviour could be conducted more explicitly. As Ziewitz (2019, p. 713) argues for the field of search engine optimisation (SEO), ‘the moral status of reactive practices is not given, but needs to be accomplished in practice’. Distributing this ‘ethical work’ over a wider array of actors could thus be a step toward some modest form of ‘cooperative responsibility’ (Helberger et al., 2018), even if fundamental power asymmetries remain.

Observability thus raises the complicated question of how data and analytical capacities should be made available, to whom, and for what purpose. This clearly goes beyond data access. As Kemper & Kolkman (2019) note, ‘no algorithmic accountability without a critical audience’, and the capacity for critique requires more than a critical attitude. For this reason, frameworks for data access should ‘go hand-in-hand with the broader cultivation of a robust and democratic civil society, which is adequately funded and guaranteed of its independence’ (Ausloos et al., 2020, p. 86). And Flyverbom (2015, p. 115) reminds us that transparency, understood as a transformative process, cannot succeed ‘without careful attention to the formats, processes of socialization, and other affordances of the technologies and environments in which they play out’. Monitoring platforms on a continuous basis may thus call for considerable resources if done well. Governmental institutions, possibly on a European level, could play a central role in managing data access, in making long-term funding available for research, and in coordinating the exchange between existing initiatives. But given the complexity of the task, regulators will also have to build ‘in-house’ expertise and observational capacity, backed by strong institutional support.

The capacity to make sense of large and complex socio-technical systems indeed relies on a number of material conditions, including access to data, technical expertise, computing power, and not least the capacity to connect data-analytical practices to social concerns. Such a capacity is typically produced as a collective effort, through public discourse. The quality of observability depends on such discourses exploring what kinds of knowledge allow concerned actors to arrive at genuinely meaningful interpretations.

4. Conclusion: toward platform observability

This article developed the concept of observability to problematise the assumptions and expectations that drive our demands for transparency of platform life. Observability is not meant to be a radical departure from the call for transparency. Rather, it draws practical conclusions from the discrepancy we noted between the complexity of the platform machinery and the traditional idea of shedding light on and seeing as a way of establishing external oversight. In a nutshell, we are suggesting observability as a pragmatic, knowledge-focused approach to accountability. Observability stresses technical and social complexities, including the distributed nature of platform behaviour. Moreover, it regards continuous and collaborative observation within a normative framework as a necessary condition for regulating the explosive growth of platform power. We see three main directions where further steps are needed to move closer to the practical realisation of these principles.

Regulating for observability means working toward structured information interfaces between platforms and society. 33 To account for quickly changing circumstances, these interfaces need to enable continuous observation. To allow for a broader set of questions to be asked, a broad range of data has to be covered. And to bring a wider variety of epistemological sensitivities into the fold, they need to be sufficiently flexible. What constitutes suitable and sufficient access will have to be decided on a per-platform basis, including the question of who should be able to have access in the first place. But the examples we briefly discussed in section 3.2—and the many others we left out—show that there is already much to build on. The main goal, here, is to develop existing approaches further and to make them more stable, transparent, and predictable. Twitter’s new API 34, which now explicitly singles out academic research use cases, is a good example of a step in the right direction, but these efforts are still voluntary and can be revoked at any time. Without binding legal frameworks, platforms can not only terminate such initiatives at will but also control relevant modalities such as thematic scope and depth of access. Realigning the structural information asymmetries between platforms and society thus requires curtailing the de facto ownership over data that platforms collect about their users.

Observability as part of regulation requires engaging with the specific properties of algorithmic systems and the co-produced nature of platform behaviour. The complex interactions between technical design, terms of service, and sometimes vast numbers of both users and ‘items’ mean that the concept of a singular algorithm steering the ordering processes at work in large-scale platforms is practically and conceptually insufficient. If techniques like machine learning are here to stay, regulatory approaches will have to adapt to conditions where the object of regulation is spread out, volatile, and elusive. The pressing questions are not restricted to how and what to regulate, but also encompass the issue of what platforms are doing in the first place. While normative concepts such as algorithmic fairness or diversity are laudable goals, their focus seems rather narrow considering the fundamental change of markets and the public sphere that platforms provoke. We therefore suggest the broader concept of public interest as a normative benchmark for assessing platform behaviour, a concept obviously in need of specification. But whatever set of norms or values is chosen as guiding principles, the question remains how to ‘apply’ them, that is, how to assess platform behaviour against public interest norms. Observation as a companion to regulation stresses the fact that we need to invest in our analytical capacities to undergird the regulatory response to the challenges platforms pose. Likewise, the existing approaches to studying platforms should be supplemented with specific rights to information. Together, these elements would constitute important steps towards a shared governance model (see Helberger et al., 2018), where power is distributed more equally between platforms and their constituencies.

Institutionalising processes of collective learning refers to the need to develop and maintain the skills that are required to observe platforms. A common characteristic of the data collecting projects mentioned above is their ephemeral, experimental, and somewhat amateurish nature. While this may sound harsh, it should be obvious that holding platforms to account requires ‘institution-building’, that is, the painstaking assembly of skills and competence in a form that transposes local experiments into more robust practices able to guarantee continuity and accumulation. While academic research fields have their own ways of assembling and preserving knowledge, the task of observing large-scale platforms implies highly specialised technical and logistical feats that few organisations are able to tackle. Material resources are only one part of the equation and the means to combat discontinuity and fragmentation are at least equally important. One form of institutional incorporation of observability would therefore be something akin to ‘centres of expertise’ tasked with building the capacity to produce relevant knowledge about platforms. Such centres could act as an ‘important bridge builder between those holding the data and those wishing to get access to that data’ (Ausloos et al., 2020, p. 83). Pushing further, a European Platform Observatory, 35 driven by a public interest mandate, equipped with adequate funding, and backed by strong regulatory support, could be a way forward for platform accountability.

Holding platforms to account is a complex task that faces many challenges. However, given their rising power, it is quickly becoming a necessity. The concept of observability spells out these challenges and suggests steps to tackle them, taking a pragmatic, knowledge-based approach. The goal, ultimately, is to establish observability as a ‘counter power’ to platforms’ outsized hold on contemporary societies.

Acknowledgements

This work was, in part, inspired by discussions we had as members of the European Commission’s Observatory on the Online Platform Economy. We would also like to thank Joris van Hoboken, Paddy Leerssen, and Thomas Poell for helpful comments and feedback.

References

Agre, P. E. (1994). Surveillance and Capture: Two Models of Privacy. The Information Society, 10(2), 101–127. https://doi.org/10.1080/01972243.1994.9960162

Albu, O. B., & Flyverbom, M. (2019). Organizational Transparency: Conceptualizations, Conditions, and Consequences. Business & Society, 58(2), 268–297. https://doi.org/10.1177/0007650316659851

Anand, N., & Peterson, R. A. (2000). When Market Information Constitutes Fields: Sensemaking of Markets in the Commercial Music Industry. Organization Science, 11(3), 270–284. https://doi.org/10.1287/orsc.11.3.270.12502

Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 973–989. https://doi.org/10.1177/1461444816676645

August, V., & Osrecki, F. (2019). Transparency Imperatives: Results and Frontiers of Social Science Research. In V. August & F. Osrecki (Eds.), Der Transparenz-Imperativ: Normen – Praktiken – Strukturen (pp. 1–34). Springer. https://doi.org/10.1007/978-3-658-22294-9

Bernstein, E. S. (2012). The Transparency Paradox: A Role for Privacy in Organizational Learning and Operational Control. Administrative Science Quarterly, 57(2), 181–216. https://doi.org/10.1177/0001839212453028

Bogost, I. (2015, January 15). The Cathedral of Computation. The Atlantic. https://www.theatlantic.com/technology/archive/2015/01/the-cathedral-of-computation/384300/

Bovens, M. (2007). Analysing and Assessing Accountability: A Conceptual Framework. European Law Journal, 13(4), 447–468. https://doi.org/10.1111/j.1468-0386.2007.00378.x

Brandeis, L. D. (1913, December 20). What publicity can do. Harper’s Weekly.

Bruns, A. (2019). After the ‘APIcalypse’: Social media platforms and their fight against critical scholarly research. Information, Communication & Society, 22(11), 1544–1566. https://doi.org/10.1080/1369118X.2019.1637447

Bruns, A., & Burgess, J. (2013). Crisis communication in natural disasters: The Queensland floods and Christchurch earthquakes. In K. Weller, A. Bruns, J. Burgess, M. Mahrt, & C. Puschmann (Eds.), Twitter and Society (pp. 373–384). Peter Lang.

Burrell, J. (2016). How the machine “thinks”: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 1–12. https://doi.org/10.1177/2053951715622512

Cohen, J. E. (2019). Between Truth and Power: The Legal Constructions of Informational Capitalism. Oxford University Press. https://doi.org/10.1093/oso/9780190246693.001.0001

Cornelius, K. B. (2019). Zombie contracts, dark patterns of design, and ‘documentisation’. Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1412

Covington, P., Adams, J., & Sargin, E. (2016). Deep Neural Networks for YouTube Recommendations. Proceedings of the 10th ACM Conference on Recommender Systems, 191–198. https://doi.org/10.1145/2959100.2959190

Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87. https://doi.org/10.1145/2347736.2347755

Doran, D., Schulz, S., & Besold, T. R. (2017). What Does Explainable AI Really Mean? A New Conceptualization of Perspectives. ArXiv. http://arxiv.org/abs/1710.00794

Douglass, B. (1980). The Common Good and the Public Interest. Political Theory, 8(1), 103–117. https://doi.org/10.1177/009059178000800108

Dourish, P. (2016). Algorithms and their others: Algorithmic culture in context. Big Data & Society, 3(2). https://doi.org/10.1177/2053951716665128

Espeland, W. (2015). Narrating Numbers. In R. Rottenburg, S. E. Merry, S.-J. Park, & J. Mugler (Eds.), The World of Indicators: The Making of Governmental Knowledge through Quantification (pp. 56–75). Cambridge University Press. https://doi.org/10.1017/CBO9781316091265.003

Etzioni, A. (2010). Is Transparency the Best Disinfectant? Journal of Political Philosophy, 18(4), 389–404. https://doi.org/10.1111/j.1467-9760.2010.00366.x

Ezrahi, Y. (1992). Technology and the civil epistemology of democracy. Inquiry, 35(3–4), 363–376. https://doi.org/10.1080/00201749208602299

Flyverbom, M. (2016). Transparency: Mediation and the Management of Visibilities. International Journal of Communication, 10, 110–122. https://ijoc.org/index.php/ijoc/article/view/4490

Fourcade, M., & Gordon, J. (2020). Learning Like a State: Statecraft in the Digital Age. Journal of Law and Political Economy, 1(1), 78–108. https://escholarship.org/uc/item/3k16c24g

Gillespie, T. (2018). Custodians of the Internet. Yale University Press.

Hegelich, S. (2020). Facebook needs to share more with researchers. Nature, 579, 473–473. https://doi.org/10.1038/d41586-020-00828-5

Helberger, N. (2020). The Political Power of Platforms: How Current Attempts to Regulate Misinformation Amplify Opinion Power. Digital Journalism, 8(3). https://doi.org/10.1080/21670811.2020.1773888

Helberger, N., Pierson, J., & Poell, T. (2018). Governing online platforms: From contested to cooperative responsibility. The Information Society, 34(1), 1–14. https://doi.org/10.1080/01972243.2017.1391913

Horwitz, J. (2020, October 23). Facebook Seeks Shutdown of NYU Research Project Into Political Ad Targeting. The Wall Street Journal. https://www.wsj.com/articles/facebook-seeks-shutdown-of-nyu-research-project-into-political-ad-targeting-11603488533

Hunt, R., & McKelvey, F. (2019). Algorithmic Regulation in Media and Cultural Policy: A Framework to Evaluate Barriers to Accountability. Journal of Information Policy, 9, 307–335. https://doi.org/10.5325/jinfopoli.9.2019.0307

Iordanou, C., Soriente, C., Sirivianos, M., & Laoutaris, N. (2017). Who is Fiddling with Prices?: Building and Deploying a Watchdog Service for E-commerce. Proceedings of the Conference of the ACM Special Interest Group on Data Communication - SIGCOMM, 17, 376–389. https://doi.org/10.1145/3098822.3098850

Kemper, J., & Kolkman, D. (2019). Transparent to whom? No algorithmic accountability without a critical audience. Information, Communication & Society, 22(14), 2081–2096. https://doi.org/10.1080/1369118X.2018.1477967

King, G., & Persily, N. (2019). A New Model for Industry–Academic Partnerships. PS: Political Science & Politics, 53(4), 703–709. https://doi.org/10.1017/S1049096519001021

Langley, P., & Leyshon, A. (2017). Platform capitalism: The intermediation and capitalisation of digital economic circulation. Finance and Society, 3(1), 11–31. https://doi.org/10.2218/finsoc.v3i1.1936

Larsson, A. O. (2016). Online, all the time? A quantitative assessment of the permanent campaign on Facebook. New Media & Society, 18(2), 274–292. https://doi.org/10.1177/1461444814538798

Leerssen, P., Ausloos, J., Zarouali, B., Helberger, N., & Vreese, C. H. (2019). Platform Ad Archives: Promises and Pitfalls. Internet Policy Review, 8(4), 1–21. https://doi.org/10.14763/2019.4.1421

Lessig, L. (1999). Code: And other laws of cyberspace. Basic Books.

Mattli, W., & Woods, N. (2009). In Whose Benefit? Explaining Regulatory Change in Global Politics. In W. Mattli & N. Woods (Eds.), The Politics of Global Regulation (pp. 1–43). https://doi.org/10.1515/9781400830732.1

Morgan, B., & Yeung, K. (2007). An introduction to Law and Regulation. Cambridge University Press. https://doi.org/10.1017/CBO9780511801112

Morstatter, F., Pfeffer, J., & Liu, H. (2014). When is it Biased? Assessing the Representativeness of Twitter’s Streaming API. ArXiv. http://arxiv.org/abs/1401.7909

Napoli, P. M. (2015). Social media and the public interest: Governance of news platforms in the realm of individual and algorithmic gatekeepers. Telecommunications Policy, 39(9), 751–760. https://doi.org/10.1016/j.telpol.2014.12.003

Obar, J. A. (2020). Sunlight alone is not a disinfectant: Consent and the futility of opening Big Data black boxes (without assistance). Big Data & Society, 7(1). https://doi.org/10.1177/2053951720935615

Parsons, C. (2015). Beyond Privacy: Articulating the Broader Harms of Pervasive Mass Surveillance. Media and Communication, 3(3), 1–11. https://doi.org/10.17645/mac.v3i3.263

Pasquale, F. (2015). The black box society: The secret algorithms that control money and information. Harvard University Press.

Power, M. (1997). The audit society. Rituals of verification. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198296034.001.0001

Powles, J., & Nissenbaum, H. (2018, December 7). The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence. OneZero. https://onezero.medium.com/the-seductive-diversion-of-solving-bias-in-artificial-intelligence-890df5e5ef53

Puschmann, C. (2019). An end to the wild west of social media research: A response to Axel Bruns. Information, Communication & Society, 22(11), 1582–1589. https://doi.org/10.1080/1369118X.2019.1646300

Rahwan, I., Cebrian, M., Obradovich, N., Bongard, J., Bonnefon, J.-F., Breazeal, C., Crandall, J. W., Christakis, N. A., Couzin, I. D., Jackson, M. O., Jennings, N. R., Kamar, E., Kloumann, I. M., Larochelle, H., Lazer, D., McElreath, R., Mislove, A., Parkes, D. C., Pentland, A. ‘Sandy’, … Wellman, M. (2019). Machine behaviour. Nature, 568(7753), 477–486. https://doi.org/10.1038/s41586-019-1138-y

Rieder, B. (2020). Engines of Order. A Mechanology of Algorithmic Techniques. Amsterdam University Press. https://doi.org/10.2307/j.ctv12sdvf1

Rieder, B., Matamoros-Fernández, A., & Coromina, Ò. (2018). From ranking algorithms to ‘ranking cultures’: Investigating the modulation of visibility in YouTube search results. Convergence, 24(1), 50–68. https://doi.org/10.1177/1354856517736982

Roth, C., Mazières, A., & Menezes, T. (2020). Tubes and bubbles topological confinement of YouTube recommendations. PLOS ONE, 15(4). https://doi.org/10.1371/journal.pone.0231703

Rowland, N. J., & Passoth, J.-H. (2015). Infrastructure and the state in science and technology studies. Social Studies of Science, 45(1), 137–145. https://doi.org/10.1177/0306312714537566

Sandvig, C., Hamilton, K., Karahalios, K., & Langbort, C. (2014). Auditing algorithms: Research methods for detecting discrimination on internet platforms. Data and Discrimination: Converting Critical concerns into productive inquiry, a Preconference at the 64th Annual Meeting of the International Communication Association, Seattle, WA. https://pdfs.semanticscholar.org/b722/7cbd34766655dea10d0437ab10df3a127396.pdf

Seaver, N. (2017). Algorithms as culture: Some tactics for the ethnography of algorithmic systems. Big Data & Society, 4(2), 1–12. https://doi.org/10.1177/2053951717738104

Sloane, M., & Moss, E. (2019). AI’s social sciences deficit. Nature Machine Intelligence, 1(8), 330–331. https://doi.org/10.1038/s42256-019-0084-6

Staab, P. (2019). Digitaler Kapitalismus: Markt und Herrschaft in der Ökonomie der Unknappheit. Suhrkamp.

Stolton, S. (2019, November 20). Vestager takes aim at ‘biopower’ of tech giants. EURACTIV. https://www.euractiv.com/section/copyright/news/vestager-takes-aim-at-biopower-of-tech-giants/

Suchman, L. A. (2007). Human-Machine Reconfigurations. Plans and Situated Actions (Second). Cambridge University Press. https://doi.org/10.1017/CBO9780511808418

Suzor, N. P., Myers West, S., Quodling, A., & York, J. (2019). What Do We Mean When We Talk About Transparency? Toward Meaningful Transparency in Commercial Content Moderation. International Journal of Communication, 13, 1526–1543. https://ijoc.org/index.php/ijoc/article/view/9736/0

van Dijck, J. (2020). Governing digital societies: Private platforms, public values. Computer Law & Security Review, 36. https://doi.org/10.1016/j.clsr.2019.105377

van Dijck, J., Poell, T., & De Waal, M. (2018). The platform society: Public values in a connective world. Oxford University Press. https://doi.org/10.1093/oso/9780190889760.001.0001

Yeung, K. (2017). 'Hypernudge’: Big Data as a mode of regulation by design. Information, Communication & Society, 20(1), 118–136. https://doi.org/10.1080/1369118X.2016.1186713

Ziewitz, M. (2019). Rethinking gaming: The ethical work of optimization in web search engines. Social Studies of Science, 49(5), 707–731. https://doi.org/10.1177/0306312719865607

Zuboff, S. (2019). The age of surveillance capitalism: The fight for a human future at the new frontier of power. Profile Books.

Footnotes

1.https://eur-lex.europa.eu/eli/reg/2019/1150/oj

2. See, for example, the response by AlgorithmWatch and other signatories to the European Commission’s planned Digital Services Act: https://algorithmwatch.org/en/submission-digital-services-act-dsa/.

3.https://www.congress.gov/bill/116th-congress/senate-bill/2763/all-info

4.https://www.rlp.de/fileadmin/rlp-stk/pdf-Dateien/Medienpolitik/ModStV_MStV_und_JMStV_2019-12-05_MPK.pdf

5. The ACM’s Statement on Algorithmic Transparency and Accountability (https://www.acm.org/binaries/content/assets/public-policy/2017_usacm_statement_algorithms.pdf), for example, explicitly mentions ‘auditability’ as a desirable principle.

6. For a discussion of the intricate history of ideas behind the concepts of the common good and public interest in the anglo-american realm and a definition of the latter see Douglass (1980, p. 114): ‘the public interest would come to mean what is really good for the whole people. And in a democratic society, this would mean what is really good for the whole people as interpreted by the people.’

7.https://tosback.org

8.https://pribot.org/polisis

9.https://pribot.org

10.https://dataforgood.fb.com/

11.https://socialscience.one

12.https://www.facebook.com/ads/library/

13.https://trends.google.com

14.https://www.bundeskartellamt.de/EN/Economicsectors/MineralOil/MTU-Fuels/mtufuels_node.html

15.https://creativecommons.tankerkoenig.de / https://de.wikipedia.org/wiki/Markttransparenzstelle_für_Kraftstoffe

16.https://developer.twitter.com/en/products/tweets/sample

17.https://www.heise.de/newsticker/meldung/36C3-BahnMining-offenbart-die-nackte-Wahrheit-hinter-der-DB-Puenktlichkeitsquote-4624384.html

18.https://www.pewinternet.org/2018/11/07/many-turn-to-youtube-for-childrens-content-news-how-to-lessons/

19.https://algotransparency.org

20.https://www.nielsen.com/us/en/solutions/measurement/online/

21.https://algorithmwatch.org/datenspende-unser-projekt-zur-bundestagswahl/

22.https://algorithmwatch.org/openschufa-warum-wir-diese-kampagne-machen/

23.https://algorithmwatch.org/instagram-algorithmus/

24.http://sheriff-v2.dynu.net/views/manual

25.https://www.csa.fr/Informer/Toutes-les-actualites/Actualites/Pourquoi-et-comment-le-CSA-a-realise-une-etude-sur-l-un-des-algorithmes-de-recommandations-de-YouTube

26.https://publicaties.rekenkamer.amsterdam.nl/handhaving-vakantieverhuurbestuurlijk-rapport/

27.https://adobserver.org

28.https://edps.europa.eu/sites/edp/files/publication/20-01-06_opinion_research_en.pdf

29.https://distill.pub/2019/activation-atlas/

30.https://www.technologyreview.com/s/613508/ai-fairer-than-judge-criminal-risk-assessment-algorithm/

31.https://www.technologyreview.com/s/607955/inspecting-algorithms-for-bias/

32.http://rankingdigitalrights.org

33. This aligns with Sandvig et al. (2014, p. 17), who call for ‘regulation toward auditability’.

34.https://blog.twitter.com/developer/en_us/topics/tools/2020/introducing_new_twitter_api.html

35. The European Commission is already hosting an Observatory on the Online Platform Economy (https://platformobservatory.eu/)—of which both authors are members—and it plans to create a digital media observatory (https://ec.europa.eu/digital-single-market/en/news/commission-launches-call-create-european-digital-media-observatory). However, both bodies have a thematically restricted mandate and lack any regulatory authority.

Personal information management systems: a user-centric privacy utopia?


1. Introduction

Online systems and services are driven by data. There are growing concerns regarding the scale of collection, computation and sharing of personal data, the lack of user control, individuals’ rights, and generally, who reaps the benefits of data processing (German Data Ethics Commission, 2019).

Currently, data processing largely entails the capture of individuals’ data by organisations, who use this data for various purposes, in a manner that is often opaque to those to whom the data relates. This general lack of transparency has meant that consent and other legal arrangements for the safe and responsible processing of personal data are considered rather ineffective (Blume, 2012; Cate & Mayer-Schönberger, 2013; Tolmie et al., 2016; German Data Ethics Commission, 2020).

Privacy Enhancing Technologies (PETs) are technologies that aim to help in addressing privacy concerns (The Royal Society, 2019). Personal data stores (PDSs), otherwise known as personal information management systems (PIMS), represent one class of such technology, focused on data management. In essence, a PDS equips an individual (user) with a technical system for managing their data (a ‘device’). Generally, a PDS device provides the user with technical means for mediating, monitoring and controlling: (i) the data captured, stored, passing through, or otherwise managed by their device; (ii) the computation that occurs over that data; and (iii) how and when the data, including the results of computation, is transferred externally (e.g., off-device, to third-parties).

Proponents of PDSs argue that they empower users by “put[ting] individuals in control of their data” (Crabtree et al., 2018). This is because PDSs provide means for ‘users to decide’ what happens to their data; in principle, third-parties cannot access, receive or analyse the data from a PDS without some user agreement or action. In this way, PDSs purport to offer a range of user benefits, from increased privacy and the ability to ‘transact’ (or otherwise monetise) their data, to better positioning users to gain insights from their own data (see subsection 2.3).

More broadly, PDSs seek to provide an alternative to today’s predominant form of data processing, where organisations collect, store and/or use the data of many individuals. As this often occurs within a single organisation’s technical infrastructure, there may be limited scope for individuals to uncover – let alone control – what happens with their data. The vision for PDSs is to decentralise data and compute, away from organisations, such that it happens with more user control.

PDS technology is nascent, but growing in prominence. Exemplar PDS platforms currently at various stages of development and availability include Hub of All Things & Dataswift (Dataswift) 1; Mydex, CitizenMe, Databox and Inrupt/Solid (Inrupt) 2 (which is led by Sir Tim Berners-Lee). As nascent technology, PDSs raise several areas for investigation by academia, policymakers, and industry alike. There is already work, for instance, on how PDSs might facilitate better accountability (Crabtree, 2018; Urquhart, 2019), and on the legal uncertainties surrounding the technology, particularly concerning data protection (Janssen et al., 2020; Chen et al., 2020).

This paper takes a broader view, questioning the extent to which PDS technology can actually empower individuals and address the concerns inherent in data processing ecosystems. After giving an overview of the technology, and its purported benefits in section 2, we examine, in section 3, some data protection implications of PDSs focusing on the user’s perspective: whether they support particular legal bases for processing personal data; the social nature of personal data captured by PDSs; and the relation of PDSs to data subject rights. In section 4, we argue that the broader information and power asymmetries inherent in current online ecosystems remain largely unchallenged by PDSs. Section 5 synthesises the discussion, indicating that many of the concerns regarding personal data are systemic, resulting from current data surveillance practices, and concluding that PDSs – as a measure that ultimately still requires individuals to ‘self-manage’ their privacy – only go so far. 3

2. Technology overview

PDSs represent a class of data management technologies that seek to localise data capture, storage and the computation over that data towards the individual. Generally, they entail equipping a user with their own device for managing their data. A device operates as a (conceptual) data ‘container’, in a non-technical sense of the word: a strictly managed technical environment in which data can be captured or stored or can pass through, and within which certain computation can occur. 4 Some devices are wholly virtual (e.g. Digi.me), hosted in the cloud, while others encompass particular physical equipment such as a box or hub (see e.g. Databox).

PDSs generally purport to empower users through their devices. Though offerings vary, generally PDSs provide technical functionality for:

  1. Local (within-device) capture and storage of user data. Mechanisms for users to populate their PDS with data from a range of sources, which may include their phones, wearables, online services, manual data entry, sensors, etc.
  2. Local (on-device) computation. Enabling computation to occur (software to execute) on the device, which generally entails some processing of data residing with the device.
  3. Mediated data transfers. Allowing control over the data transferred externally (off-device); including ‘raw’ user data, the results of computation, and other external interactions (e.g. calls to remote services).
  4. Transparency and control measures. Tooling for monitoring, configuring and managing the above. This includes governance measures for users to set preferences and constraints over data capture, transfer and processing; visualising and alerting of specific happenings within the device; etc.

The device’s technical environment (infrastructure) manages security aspects. This can include data encryption, managing and controlling user access to the device and its data, and providing means for isolating data and compute. Further, it also works to ensure adherence with any policies, preferences and constraints that are set (see #4 above). For instance, if a user specifies that particular data cannot be transferred to some party (or component), or should not be included in some computation, the device’s technical environment will ensure these constraints are respected.
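
As a rough illustration of such enforcement, the sketch below shows a device environment refusing transfers that a user's policy does not permit. The PDSDevice and TransferDenied names and the policy format are our own illustrative assumptions, not the design of any particular platform.

```python
class TransferDenied(Exception):
    """Raised when a requested transfer violates the user's policy."""


class PDSDevice:
    def __init__(self, user_policy: dict):
        # e.g. {"health": {"allow_transfer": False}, "energy": {"allow_transfer": True}}
        self.user_policy = user_policy
        self.store: dict[str, list] = {}

    def capture(self, category: str, record: dict) -> None:
        """Local (on-device) capture and storage of user data."""
        self.store.setdefault(category, []).append(record)

    def transfer(self, category: str, recipient: str, payload) -> None:
        """Mediated transfer: proceeds only if the user's policy allows it."""
        rule = self.user_policy.get(category, {})
        if not rule.get("allow_transfer", False):
            raise TransferDenied(f"'{category}' data may not leave the device for {recipient}")
        print(f"sending {payload!r} to {recipient}")  # stand-in for the actual network call


# Example: health data stays on-device, energy readings may be shared.
device = PDSDevice({"health": {"allow_transfer": False}, "energy": {"allow_transfer": True}})
device.capture("energy", {"kwh": 3.2})
device.transfer("energy", "utility.example", device.store["energy"])
```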

Core to many PDSs is allowing for computation (potentially any form of code execution, including analytics) to be ‘brought’ to the data. This occurs through an app: software that executes on a user’s device for processing that device’s data. 5 Some apps may provide the user with functionality without any external transfer of data, though apps will often transfer some data off-device (such as the results of computation). PDS proponents describe such functionality as of key industry interest, arguing that receiving only the results of computation (e.g. aggregated findings) avoids the sensitivities, overheads and resistance associated with receiving and managing granular and specific user data (see subsection 2.4). Apps operate subject to constraints: they must define what data sources they seek, the data they transfer, and other details; and users may put constraints on how apps behave, e.g. regarding the data that apps may access, process, and transfer. The device’s technical environment ensures adherence to these constraints. Legal mechanisms also operate to govern the behaviour and operation of PDS ecosystems (see subsection 2.2). 6
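
A minimal sketch of this 'computation to the data' pattern follows: a hypothetical app manifest declares the data sources sought and the outputs to be transferred, and the app's local execution returns only the declared aggregate. The manifest fields and function names are assumptions for exposition, not a format used by any specific PDS platform.

```python
# Hypothetical app manifest: what the app reads, what it may send off-device, and why.
manifest = {
    "app": "sleep-insights",
    "data_sources": ["wearable.sleep"],           # data the app asks to read
    "outputs": ["weekly_average_sleep_hours"],    # the only result eligible to leave the device
    "purpose": "aggregate sleep statistics for product research",
}


def run_app(device_records: list[dict], user_allows: set[str]) -> dict:
    """Execute the app locally and return only the declared, user-permitted output."""
    if not set(manifest["data_sources"]) <= user_allows:
        raise PermissionError("app requests data sources the user has not granted")
    hours = [record["hours"] for record in device_records]
    weekly_average = sum(hours) / len(hours) if hours else 0.0
    return {"weekly_average_sleep_hours": round(weekly_average, 2)}


# Example: the user has granted access to wearable sleep data only.
print(run_app([{"hours": 7.5}, {"hours": 6.0}], user_allows={"wearable.sleep"}))
```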

2.1 A multi-actor ecosystem

It is worth recognising that there are several actors within a PDS ecosystem; we now introduce those most pertinent to this discussion. The focus is on users, but because this article concerns empowerment and power, the other key actors also need to be in view.

Users are those individuals who hold a device, leveraging the PDS functionality to manage their data.

Organisations are those interested in processing user data. Here, we describe organisations as app developers, as they build apps that process user data for installation on user devices. Again, apps will often transfer some data to the organisation, such as the results of computation. PDSs may also support the transfer of data to an organisation without a specific app. This process is managed through the direct data transfer mechanisms provided by the device (which may itself be a form of app, packaged with the device).

Platforms are the organisations that provide the PDS and/or manage the PDS ecosystem. There will be a range of platforms that differ in their offerings. Often a platform’s core offering is equipping a user with a device; though this could vary from merely providing the codebase for users to compile and self-manage the operation of their devices, to providing the entire operational infrastructure—perhaps including hardware, managed cloud services for backup, and so forth (Janssen et al., 2020). Moreover, some platforms envisage hosting ‘app stores’ or ‘data marketplaces’ that broker between users and the organisations seeking to process their data, while many platforms require adherence with ‘best practices’, have defined terms of service, and may even have contractual agreements with users and organisations. In this way, platforms vary in their level of involvement in the operation of the PDS ecosystem.

2.2 Governance regimes

In addition to technical aspects, PDS platforms often entail legal governance mechanisms. These operate to help ensure that app behaviour, and data usage more generally, is compliant with user preferences, and platform requirements. Some of these are encapsulated in a platform’s Terms of Service (ToS), which commonly define how the platform can be used, and the platform’s position on the allocation of responsibilities and liabilities. Platform ToS often require app developers to have appropriate measures in place to safeguard users against unlawful processing (e.g. Dataswift’s acceptable use policy), and to safeguard users against accidental data loss or destruction (idem) while requiring them to, for instance, safely keep their passwords or to regularly update their PDSs for security purposes (e.g. Dataswift’s terms for users). Platforms may also have contracts with app developers, which contain business specific terms and conditions, governing their interactions with user data, the functionality of their apps etc. ToS and contracts might stipulate, for example, that app developers must fully comply with platform policies and principles regarding user data processing, where failure to do so may result in the platform terminating their data processing activities (example from Mydex ToS).

2.3 Purported user benefits

PDSs generally purport to provide functionality to empower users. Some claimed benefits for users include:

  • Users having granular control over the data captured about them, and how that data is shared and used (Article 29 Data Protection Working Party 2014; Crabtree et al., 2018; Urquhart et al., 2019);
  • Better protecting personal data (including ‘sensitive’ personal data) from access by third parties, by way of the technical functionality provided (Crabtree et al., 2018; Lodge et al., 2018);
  • Better informed user consent, by giving more information about data processing. This may be through various means, including the device’s monitoring functionality; the app’s data usage specifications; platform features, such as app stores ranking and describing app data usage, requiring transparency best practices, etc. (Mydata);
  • Compartmentalised data storage and computation to prevent apps from interacting with data (and other apps) inappropriately, inadvertently and without user agreement/intervention (e.g. Crabtree et al., 2018);
  • Providing opportunities for users to gain more insights from their data (e.g., Mydex; Mydata);
  • Allowing users to transact with or monetise their personal data (Ng & Haddadi, 2018);
  • Generally incentivising developers towards more privacy friendly approaches (Crabtree et al., 2018).

PDSs have also caught the attention of policymakers; the European Commission recently expressed that PDSs and similar tools have significant potential as “they will create greater oversight and transparency for individuals over the processing of their data […] a supportive environment to foster [their] development is necessary to realise [their] benefits” (European Commission, 2020). This potentially indicates that the European Commission might in the future define policy encouraging the development of these tools.

2.4 Purported organisational benefits

For organisations (app developers), the appeal of PDSs is the promise of access to more data—potentially in terms of volume, richness, velocity and variety—for processing. PDS enthusiasts argue that if users better understand how their data is being processed, and feel empowered by way of PDS’s control mechanisms, they may be less ‘resistant’ and harbour a greater ‘willingness’ for (managed) data sharing and processing (e.g., Control-Shift; Mydata; Digi.me; CitizenMe mention this in their descriptions). Similarly, given that PDSs will encapsulate a variety of user information, PDSs might offer app developers access to a broader range of data types than if they attempted to collect the data themselves (Mydata).

Though PDSs are typically described with reference to an individual, most aim to support ‘collective computation’, whereby the processing of data across many users or (particular) populations is enabled through apps operating on their devices (e.g., Mydata; Databox; CitizenMe; Digi.me). 7 Collective computations often entail some user or population profiling to support various organisational aims—customer insight, market research, details of product usage, or indeed, as is common in online services, to support a surveillance-driven advertising business model (as discussed in section 5). In this way, PDS platforms effectively provide a personal data processing architecture that operates at scale across a population. This is attractive for organisations, as PDS platforms with large user-bases offer access to a wider population and thus more data than the organisation would otherwise itself have access to. Importantly, this also comes without the costs, risks, and compliance overheads incurred in undertaking data collection, storage, and management ‘in-house’, using their own infrastructure (Crabtree et al., 2018).
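
The toy sketch below illustrates collective computation under these assumptions: each device returns only a local summary, and the organisation combines those summaries across users. All names are hypothetical, and real deployments would add consent checks, authentication, and so on.

```python
def local_result(device_records: list[float]) -> dict:
    """Runs on each user's device; returns an aggregate, not the raw records."""
    return {"n": len(device_records), "total": sum(device_records)}


def population_average(per_device_results: list[dict]) -> float:
    """Runs at the organisation: combines per-device summaries across many users."""
    n = sum(result["n"] for result in per_device_results)
    total = sum(result["total"] for result in per_device_results)
    return total / n if n else 0.0


# Example: three devices each contribute only their local summary.
print(population_average([
    local_result([7.2, 6.8]),
    local_result([5.5]),
    local_result([8.0, 7.5, 7.9]),
]))
```

Even in this reduced form, the asymmetry discussed in section 4.3 is visible: each user sees only their own records, while the organisation sees the population-level result.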

2.5 PDS platforms: the commercial landscape

Some predict that PDSs could generate substantial economic benefits for businesses and consumers alike (Control-Shift; Brochot et al., 2015; European Commission, 2020). Although the business models for organisations are likely similar to those already existing, the business models for the PDS platforms are unclear and remain under development (Bolychevsky & Worthington, 2018). A range of possible revenue streams for PDS platforms have been developed and proposed. These include:

  • Platforms charging organisations fees for access to the PDS ecosystem (e.g., an annual fee, Mydex), for access to the platform’s app store, per user download of their app, etc.;
  • Platforms charging organisations per ‘data transaction’ with a PDS device, where the type of transaction (access, computation, and/or transfer of data, including raw data, see e.g. Mydex) and/or the type of data requested (e.g. queries, behavioural data) often determines the price (see e.g. CitizenMe);
  • Organisations sharing revenue with the platform through in-app purchases (e.g. Digi.me);
  • Platforms charging organisations for support services (e.g. Mydex);
  • Users paying a subscription fee, or to unlock additional functionality (Digi.me);
  • Platforms selling, renting or leasing PDS devices to users, which could include service or maintenance contracts (Crabtree et al., 2018); or
  • Platforms in the public interest (e.g. PDS platforms for public health) might be ‘fee-free’, funded through, e.g., donations and public funds (see e.g. BBC-Box).

As PDSs are a developing area, the business models of platforms are nascent. In practice, one expects that platforms will likely employ a range of monetisation mechanisms.

3. Data protection

A key aim of PDSs is to give users greater visibility and control over the processing of their personal data. PDS architectures concern issues regarding personal data, and therefore the General Data Protection Regulation (GDPR) must be considered. GDPR centres around three legal roles: controllers (acting alone or together with others as joint controllers; Arts. 4(7), 26 GDPR), processors (including sub-processors; Arts. 4(8), 28(4) GDPR), and data subjects (Art. 4(1) GDPR). The role of a particular actor as a controller or processor is generally a question of their factual influence over data processing; how an actor describes their role (for example in contract) may be indicative, but will not be definitive (Article 29 Working Party, 2010).

GDPR tasks both controllers and processors with a range of responsibilities and obligations, the bulk of which fall on controllers, given their role in determining the nature of the data processing. Obligations for controllers include, among others, complying with data protection principles (Art. 5(1) GDPR), demonstrating that compliance (Art. 5(2) GDPR), and predicating their processing of personal data on one of the GDPR’s lawful grounds (Art. 6(1) GDPR). Typical rights afforded to data subjects (i.e. those whose personal data is being processed), which controllers are tasked with meeting, include the rights to object to data processing, to have their data erased, or to port data (subsection 3.3).

While PDS technologies and their governance models are still developing, many data protection issues remain unresolved. The assignment of roles and responsibilities in PDS systems is complex, given such ecosystems are largely shaped by the collaboration of multiple parties, including the key actors mentioned here. This reality can be difficult to reconcile with the GDPR’s approach, which assumes controllers who ‘orchestrate’ the data processing in an entire system. In practice, a PDS ecosystem can take a number of forms, and the legal position of those involved will depend on the circumstances. Issues of roles and responsibilities under the GDPR in different PDS contexts are explored in detail by Chen et al. (2020) and Janssen et al. (2020). In this paper, we consider three key ‘user-facing’ data protection considerations: (1) how PDSs, in being oriented towards consent, relate to GDPR’s lawful grounds; (2) how personal data often relates to more persons than just the PDS user; and (3) the relationship between PDSs and data subject rights.

3.1 Lawful grounds for processing

GDPR requires that processing is predicated on one of its lawful bases as defined by Art. 6(1) GDPR. Controllers must determine which lawful ground is most appropriate in a given situation, depending on specific purposes and contexts for use, the nature of parties involved, and their motivations and relationships, and of course, the requirements for the lawful basis on which they rely. However, due to the ePrivacy Directive, where the PDS entails a physical (hardware) device, consent will generally be required for app developers to process any data (Art. 5(3) ePrivacy Directive; Janssen et al., 2020). In this context, the only available basis for processing on such devices will be consent (Arts. 6(1)(a) & 7 GDPR; Recitals 32, 42, 43 GDPR) and, for special category data—particular classes of data deemed to require extra protections (Art. 9(1), Recitals 51-56 GDPR)—explicit consent. For ‘virtual’ PDS devices, such as those hosted in the cloud (currently by far the most common), legal bases other than consent may be available (unless the data is special category data, in which case explicit consent is often the only option).

PDS devices are fundamentally oriented towards supporting the grounds of (user) consent and contract (where the processing is necessary for the performance of a contract to which the user is a party) as the legal bases for processing. Importantly, both consent and contract are grounds that require agreement by the data subject to render the processing lawful. PDS platforms are generally designed explicitly to support these grounds, requiring active user agreement regarding data processing (Crabtree et al., 2018; Urquhart, 2019). PDSs generally provide functionality aimed at informing users, e.g. giving them information about an app and its related data processing, and requiring the user to take positive actions, e.g. agreeing to terms upon installing the app or configuring data usage preferences and policies, in order for that processing to occur.

There are also lawful grounds for processing, such as legal obligation, public interest or legitimate interest which allow the controllers—not the data subjects (users)—to decide whether processing can occur. That is, user consent is not required for certain public tasks (e.g. perhaps in taxation), or for legitimate controller interest (e.g. perhaps for the processing of certain data to detect fraud). The requirements vary by legal basis, and can include (depending on the ground) considerations like the necessity of that processing (Arts. 6(1)(b)—(f) GDPR), that controller interests are balanced with the fundamental rights of the data subject (Art. 6(1)(f) GDPR; Kamara & De Hert, 2018), and a foundation in compatible member state law (Arts. 6(1)(c) and (e) GDPR). These grounds for processing that are not based on specific and active user involvement or agreement are rarely considered in PDS architectures, and at present it is unclear how PDS architectures would support or reconcile with these grounds where they may apply (Janssen et al., 2020).

3.2 Social nature of personal data

Personal data is relational and social by nature; it often does not belong to one single individual, as much personal data is created through interactions with other people or services (Article 29 Working Party, 2017; Crabtree & Mortier, 2015).

In practice, a PDS device will likely capture data relating to multiple individuals other than the user—for example, through sensing data from other dwellers or visitors in and around someone’s home. This raises interesting questions regarding the mechanisms for one to control what is captured about them in someone else’s PDS. That is, there may be conflicting visions and preferences between the user and others regarding the use and processing of ‘joint’ data, and these others may also have data subject rights (see subsection 3.3). At present, PDSs generally give a device’s user greater control over the processing related to that device; functionality enabling the preferences and rights of others to be managed and respected has so far received little consideration. This is an area warranting further attention.

3.3 Supporting data subject rights

GDPR affords data subjects several rights regarding the processing of their personal data. These include the rights of access to their personal data (Art. 15), rectification of inaccurate personal data (Art. 16), erasure (Art. 17), to object (Art. 21), to restrict the processing of their data (Art. 18), to port their data to another controller in a commonly used machine-readable format (Art. 20 GDPR), and to not be subject to solely automated decision-making or profiling which produces legal or similarly significant effects (Art. 22 GDPR). Controllers are tasked with fulfilling these rights. Data subject rights are not absolute—GDPR imposes conditions on the exercise of some rights, and not all rights will apply in every situation.

Data subject rights have had little consideration in a PDS context. Again, to improve the transparency of processing, PDSs usually afford users some visibility over what occurs on-device and provide information on their device’s interactions (data exchanges) with organisations (Urquhart et al., 2018). They also generally offer certain controls to manage on-device processing. As such, some have suggested that PDSs may (at least for data within the PDS device) to some extent “negate” a user’s need to exercise certain data subject rights (Urquhart et al., 2018), as such mechanisms could potentially provide means for users themselves to restrict certain processing, and to erase, delete or port data, and so forth. However, current PDS tooling, at best, only gives users visibility and the ability to take action regarding processing happening on-device (see subsection 4.1). Data subject rights, however, are broader, and encompass more than simply giving users visibility over on-device data processing. Users will, for instance, have interests in the behaviour of organisations involved in processing.

GDPR requires controllers to account for data protection considerations, including those relating to rights, in their technological and organisational processes (Data protection by design, GDPR Art 25(1)). This has implications not only for app developers, but also for PDS platforms, who could provide mechanisms that specifically and more holistically facilitate users in exercising their rights. Though there may be questions as to whether this is legally obliged—for instance in light of the complexities regarding a platform’s roles and responsibilities given that Art 25(1) applies to controllers (see Chen et al., 2020; Janssen et al., 2020). Indeed, these considerations are exacerbated as some PDSs represent ‘open source’ projects, potentially involving a wide range of entities in the development, deployment and operation of the platform and/or device functionality. However, regardless of any legal obligation, any PDS platform should aim to better support users with regards to their data rights, given that this is wholly consistent with the stated aims of PDSs as ‘empowering users’.

Beyond PDS functionality that specifically aims at rights, there is potential for PDS transparency mechanisms to assist users with their rights more generally. For instance, PDSs might, by providing information, help users in detailing and targeting their rights requests. User observation of, or a notification by the platform indicating particular application behaviour, might encourage users to exercise their right ‘to find out more’, or perhaps encourage them to validate that their rights requests were properly actioned. This might help users to determine whether processing should continue, or help them confirm whether the information provided by the controller corresponds to the operations observed on-device.

The right to data portability grants users the right to receive copies of the data they provided to a controller in an electronic format, and to transfer that data or to have it transferred to another controller. This can only be invoked if the processing was based on the lawful grounds of consent or contract (Art. 20(1)(a) GDPR), and concerns only that data provided by data subjects themselves (Art. 20(1) GDPR; Article 29 Working Party, 2016; Urquhart et al., 2017).

Portability is considered a key means for users to ‘populate’ their PDSs by bringing their data from an organisation’s databases to the PDS (Art. 20 GDPR; Article 29 Working Party, 2019). Indeed, some PDS platforms describe the right as enabling users to ‘reclaim’ their data from organisations (e.g. CitizenMe; Dataswift; Digi.me), and envisage offering users technical mechanisms that leverage portability rights for populating their devices (idem). Subject access requests (Art. 15(3) GDPR) may also assist in populating devices, particularly given they are less constrained in terms of when they can be used, and usually result in more information than would be received from a portability request. However, subject access requests do not require that the data be returned in a machine-readable format. Without agreed-upon interoperability standards, using subject access requests (and indeed, even portability requests to some degree) to populate PDSs will often be impractical and cumbersome.
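
The sketch below illustrates the kind of mapping work involved in populating a device from a controller's export file. The export structure and field names are invented for illustration, since in practice every controller uses its own format, which is precisely why interoperability standards matter.

```python
import json


def import_export_file(path: str, local_store: dict) -> int:
    """Map records from a controller's data export into a PDS-style local store."""
    with open(path, encoding="utf-8") as f:
        export = json.load(f)
    records = export.get("activity", [])      # assumed top-level key; varies per controller
    local_store.setdefault("imported.activity", []).extend(
        {"timestamp": item.get("time"), "description": item.get("description")}
        for item in records                    # field names also differ between controllers
    )
    return len(records)
```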

PDSs’ transparency mechanisms are also relevant here, as they can work to improve the user’s position. This is because such mechanisms can expose the on-device computations, possibly including the results of those computations, and potentially in a meaningful technical format. This is useful not only for portability considerations (e.g. in a PDS context, potentially moving the results of computations across apps), but also in generally providing users with more knowledge and insight into the nature of data processing occurring.

4. Information asymmetries

PDS platforms state that they empower users by providing them with means for increased transparency and control, enabling users to take better, more informed decisions about whether to engage or, indeed, disengage with particular processing. However, systemic information and power asymmetries are inherent in current digital ecosystems, whereby the highly complex and largely opaque nature of data processing amplifies the asymmetries between data subjects and the organisations processing their data (Mantelero, 2014). These asymmetries, stemming from an unequal distribution of opportunities in terms of understanding, knowledge, prediction, risk assessment, and so forth (Mantelero, 2014), make it difficult if not impossible for even knowledgeable users to properly evaluate and come to genuinely informed decisions about the processing of their data (Solove, 2013; Solove, 2020).

The opaque nature of data processing is largely systemic because users of digital services often lack (or are prevented from gaining) knowledge or understanding of: (1) the practices of organisations capturing and processing their data, including the details, reasons for and implications of holding particular data or performing particular computation; (2) the data sharing practices of those organisations with third parties and beyond; (3) the technical details of the systems involved; (4) the data-driven, and indeed, often surveillance-driven business models (see section 5); and (5) the insights and power that organisations can gain through having access to data, particularly where data is aggregated or computation occurs at scale (collective computation). Legal issues may further contribute to systemic problems—including information asymmetries—within digital ecosystems (Cohen, 2019); for example, copyright, trade secrecy, or documents or databases owned by large organisations might work to restrict the information that is available to the public. However, these restrictions are not absolute and do not apply to every stakeholder. Under certain conditions, courts or regulators can be given access to data relating to trade secrets or databases not generally available to the public (Art. 58(1)(e); Recital 63 GDPR).

Crucially, PDSs only partially respond to these issues and therefore only partially address the systemic nature of the information asymmetries of digital ecosystems. Providing a localised, user-centric containerisation of data and processing may assist users in gaining some knowledge of what happens with their personal information, but only to a limited extent. While users might gain some greater understanding over the data processing relating to their device, PDSs themselves are unlikely to solve these systemic information asymmetries. Fundamentally, PDSs are grounded in the mistaken idea that with enough information presented in the right way, individuals will be able to overcome barriers that are ultimately structural and systemic in nature (Nissenbaum, 2011).

4.1 Organisational data processing practices remain largely opaque

An organisation’s intentions, motivations and behaviours may not always be clear to users (Burrell, 2016). Attempting to address this, PDSs require app developers to provide some information about their organisational processes and intentions. Such information (often encapsulated in ‘app manifests’) might include details of the types of data an app will process; the app developer’s purposes for that processing; the risks of the app; or with whom the app developer may share data received from the PDS (Crabtree, 2018; Janssen et al., 2020). 8 However, less discussed in PDS proposals is conveying information about why that particular data is necessary (as opposed to other, perhaps less sensitive data), why particular weights are attached to certain data in the analytics process, and, more broadly, why that particular processing needs to occur, and the possible data protection implications this may have. This is an area needing attention.

We now elaborate two additional aspects: (i) the lack of information available regarding data that flows beyond organisational boundaries, and (ii) how the opacity of app developers’ processes can hinder PDS platform’s governance processes. Note, however, that even if PDSs could provide additional information on developers’ processing practices, the utility of this for users is unclear. Moreover, this risks potentially creating a false sense of having adequately informed users while in actuality the problems caused by information asymmetries remain (this dimension is explored in subsection 4.2).

4.1.1 Transparency and control diminish as data moves across boundaries

Once data moves beyond a system or organisation’s boundaries, the visibility over that data typically diminishes, as does the ability to control any subsequent processing (Singh et al., 2017; Crabtree et al., 2018; Singh et al., 2019). So, while PDSs might provide users with insights into device-related processing, PDSs generally will not (at least at a technical-level) provide users with information about – let alone access to – data that has moved to app developers (and, indeed, beyond). Even in a PDS context, users will (still) therefore have little meaningful information regarding the specifics of the data actually being shared between organisations and third parties. 9

The fact that some data usage is essentially out of sight raises various risks, including, for instance, around secondary uses of data that a user would not have agreed with, e.g. undisclosed monetisation (Silverman, 2019), or unexpected or undesired inferences or profiling, which could be used to influence, nudge or manipulate (Wachter et al., 2019). Moreover, as many online services entail a ‘systems supply-chain’ (Cobbe et al., 2020) – whereby services from various organisations are used to deliver functionality – there may be little visibility regarding the specific organisations involved in processing once the data moves ‘off-device’.

Though these issues are not typically the focus of PDSs, they relate to the technology’s broader aims. PDSs might potentially assist where technical mechanisms can improve the visibility over data processing and transfer from the device to the first recipient (one-hop), and legal means can govern such transfers (subsection 2.2). For instance, Mydex stipulates in its ToS that app developers may not transfer user data that is obtained through the platform’s service to third-parties, except to the extent that this is expressly permitted in the relevant app developer notice (see, for another example, Dataswift). Through these measures, PDSs might better inform users of – and offer greater control over – what is initially transferred ‘off-device’. However, the ability to actually monitor, track and control data as it moves across technical and administrative boundaries is an area for research (e.g. see Singh et al., 2017; Singh et al., 2019; Pearson & Casassa-Mont, 2011).

4.1.2 Issues with opacity and non-compliance for PDS platforms

Many PDS platforms describe ToS and contractual arrangements with app developers, which define how app developers may process user data. However, organisational data processing opacities can also hinder platforms in uncovering and assessing the risks of non-compliant app and developer behaviour (Crabtree et al., 2018). Platforms’ monitoring and compliance measures might to some extent mitigate the implications of limited user understanding of app developers’ data processing practices, where non-compliance by a developer could result in termination of their processing, the app’s removal from the platform, payment of damages, etc (e.g. ToS of Mydex). This could entail log file analysis, app audits, and manual reviews, including ‘sandboxing’ (examining behaviour in a test environment), and reporting measures when non-compliance is detected on a device (comparable to software ‘crash reports’ in other contexts).
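
As one illustration of what such log analysis might look like, the sketch below flags transfer-log entries whose recipient an app never declared. The log structure, field names, and the declared-recipients mapping are assumptions made for exposition, not the reporting format of any actual platform.

```python
def undeclared_transfers(transfer_log: list[dict], declared: dict[str, set]) -> list[dict]:
    """Return log entries where an app sent data to a recipient it never declared."""
    return [
        entry for entry in transfer_log
        if entry["recipient"] not in declared.get(entry["app"], set())
    ]


# Example: the hypothetical 'fitness-insights' app declared only 'analytics.example'.
log = [
    {"app": "fitness-insights", "recipient": "analytics.example", "bytes": 412},
    {"app": "fitness-insights", "recipient": "adtech.example", "bytes": 9731},
]
print(undeclared_transfers(log, {"fitness-insights": {"analytics.example"}}))
```

Whether such a check runs on-device (and only a report leaves the device) or at the platform is exactly the design trade-off discussed above, since the latter gives the platform visibility it otherwise claims not to have.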

However, there are questions around whether platforms themselves can effectively detect or otherwise uncover non-compliance by app developers. Platform operators generally position themselves to not have direct access to user devices (including data, processing and logs thereof), which limits their visibility over what is happening ‘on the ground’. Platforms becoming actively involved in device monitoring, by gaining visibility over the happenings on user devices, brings additional data protection considerations, while effectively involving a device ‘backdoor’ which has security implications and could undermine the PDS ecosystem. Questions of incentives are also raised, e.g. regarding the propensity for a provider to take action against app developers where doing so has impacts on the platform’s income or business. These issues need further attention.

4.2. Users still require knowledge and expertise

PDSs are oriented towards data protection concerns, particularly regarding the difficulties in obtaining genuinely informed consent and offering users real control. But for this to be effective, users must also be able to understand the potential data protection implications of processing. This means PDS users will require some degree of data protection expertise and knowledge to enable them to comprehend the implications of certain computation and transfers. Though PDSs seek to provide users with more information about processing, and may offer some general guidance, it will not always be clear to users what the full implications of certain data processing or transfers are—not least given the risks are often contextual. A user might, for instance, allow an app developer to build a detailed profile, not realising such profiles could subsequently be used to influence, nudge or manipulate them and others (Wachter & Mittelstadt, 2019).

Similarly, an app’s or platform’s explanations and visualisations of data flows, technical parameters, configuration and preference management mechanisms, and so forth, can also be complex and difficult to understand for non-experts (Anciaux et al., 2019). Moreover, identifying where app behaviour does not comply with user preferences or is unexpected can be challenging even for expert users, let alone the non-tech-savvy. Users will therefore also require some technical expertise and knowledge to meaningfully interrogate, control and interact with the functionality of the platform (Crabtree et al., 2018).

As a result, though PDSs seek to better inform users, simply providing them with more information may not produce substantially better informed and empowered users. That is, the information asymmetries currently inherent in digital ecosystems may remain largely unaddressed, and many users may remain largely unempowered and under-protected.

There is on-going research by the PDS community on how platforms can make their transparency and control measures more effective (Crabtree et al., 2018). Default policies or usage of ‘policy templates’ might enable third parties (civil society groups, fiduciaries, etc) to set a predefined range of preferences (in line with certain interests and values) which users can easily adopt. Generally, mechanisms facilitating the meaningful communication and management of data protection risks and implications are an important area of research, not just for PDSs, but for digital ecosystems as a whole.
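As a rough illustration of what such a ‘policy template’ might look like, the sketch below encodes a hypothetical set of preferences published by a third party and a simple check applying it to an app’s request; the categories and rule names are invented for illustration and are not drawn from any existing platform:

```python
# Illustrative sketch of a 'policy template': a third party (e.g., a consumer
# group) publishes a predefined set of preferences that a user can adopt
# wholesale instead of configuring each setting individually.
PRIVACY_CONSCIOUS_TEMPLATE = {
    "allow_off_device_transfer": False,     # keep raw data on the device
    "allowed_purposes": {"service_operation"},
    "blocked_purposes": {"advertising", "profiling"},
}

def evaluate_app_request(template: dict, requested_purposes: set,
                         wants_raw_export: bool) -> bool:
    """Return True if an app's request is acceptable under the adopted template."""
    if wants_raw_export and not template["allow_off_device_transfer"]:
        return False
    if requested_purposes & template["blocked_purposes"]:
        return False
    return requested_purposes <= template["allowed_purposes"]

# A user adopting the template would have this check applied automatically:
print(evaluate_app_request(PRIVACY_CONSCIOUS_TEMPLATE,
                           {"service_operation"}, wants_raw_export=False))  # True
print(evaluate_app_request(PRIVACY_CONSCIOUS_TEMPLATE,
                           {"advertising"}, wants_raw_export=False))        # False
```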

4.3 App developers may still collect and process at scale

Many PDSs seek to support collective computations, allowing app developers to process user data at scale to generate insights from across a population (subsection 2.4). In practice, this likely contributes to further consolidating the information asymmetries between users and organisations. PDSs may help users to understand these asymmetries to some extent, as they allow users to generate insights into the personal data in their own PDSs. However, the fact that app developers can operate across user PDSs—and are encouraged by platforms to do so—means that they can process the data from many users, and thus remain better informed than individual users can ever be. Although an individual’s data may be interesting to that individual, it is analysing data at scale that can provide the insights into user behaviour and preferences that are often truly valuable to organisations. It is unlikely that PDSs will address this systemic issue by means of any of their measures; indeed, by enabling and encouraging collective computations, PDSs are likely to even further contribute to these asymmetries.

As we will explore next, these asymmetries do not only exist with respect to individual users, but also society as a whole. This is because in the current digital environment, power resides with organisations who have the ability to access and process data. In facilitating collective computations, PDSs continue to support organisations to process data at scale.

5. Discussion: PDSs, privacy self-management and surveillance capitalism

A range of commercial business models are surveillance oriented, where economic value is extracted by collecting and analysing extensive data about people’s behaviour, preferences, and interests (Andrejevic, 2011; Fuchs, 2011; Palmås, 2011; Zuboff, 2015). At present, this typically involves aggregating individual data, and analysing that aggregated data to identify patterns. The knowledge obtained through that analysis is used for various purposes. In the context of online services, where the issues are particularly pronounced, this includes algorithmic personalisation to keep users engaged with the service and to target advertising (Cobbe & Singh, 2019). Often this involves profiling, which poses threats to personal integrity, and online services often target user vulnerabilities for exploitation through addictive designs, dark patterns, and behavioural nudging (Yeung, 2017). Online service providers can also work towards vendor lock-in and systemic consumer exploitation. Given the central commercial and economic imperatives of most online services, nearly all data-driven business models involve (to some degree) the trading of data and insights for profit (German Data Ethics Commission, 2019). Note, however, that it is not only online service providers that are surveillance-oriented; PDSs themselves also encourage traditional off-line business models to be augmented with some form of user surveillance, for example, to observe the nature of product usage in a home. The extensive processing of personal data in surveillance-oriented or -supported business models raises a range of concerns (Kerber, 2016; Christl, 2017; Myers West, 2019).

As discussed in section 2, PDSs seek to address these concerns by giving users greater ‘control’ over their data and its processing, through more information and options regarding processing and then enforcing their choices (by bringing the data processing closer to the user and placing legal and technical constraints on it). In this way, as discussed in section 3, PDSs adopt an approach to privacy and data protection that is still centred on consent-based grounds for processing, working to achieve more effective ‘notice and consent’. Although the approach taken by PDSs may seem to empower users by giving them more ‘control’, (i) the problems with ‘notice and consent’ as a way of protecting users in digital ecosystems are well-established (Barocas & Nissenbaum, 2009; Sloan & Warner, 2013; Barth & De Jong, 2017; Bietti, 2020), and (ii) it does not fundamentally challenge the logic of those business models and surveillance practices. PDSs therefore remain firmly grounded in the logic of ‘privacy self-management’ (Solove, 2013; Solove, 2020), whereby individuals are expected to manage their own privacy and are themselves held responsible where they fail to adequately do so. This can be understood as part of a broader trend of ‘responsibilisation’ in Western societies (Hannah-Moffat, 2001; Ericson & Doyle, 2003; Brown, 2015): putting ever more responsibility on individuals to manage risks in various aspects of their lives, despite the existence of systemic issues beyond their control that can make doing so difficult if not impossible (such as the asymmetries described in section 4, which PDSs do not sufficiently alleviate).

Further, PDSs fail to deal with the realities of collective computations, whereby app developers process user data in aggregate and at scale (subsection 2.2), or with the social nature of personal data (subsection 3.3). Collective computations persist within—indeed, largely result from—the often commercial drivers for PDS platforms and apps. Through these computations, PDSs both allow and contribute to a further consolidation of power and information asymmetries (subsection 4.3). However, concerns about collective computations go beyond commercial processing, such as where platforms or app developers pursue public policy or security ends (rather than, or in addition to, commercial gains). This is of significant concern, given the rich, detailed and highly personal nature of the information that a PDS device might capture. Moreover, the social nature of personal data means that individual-level controls are sometimes inappropriate (subsection 3.2)—processing may affect a number of people, only one of whom will have had an opportunity to intervene to permit or constrain it. In all, the individualist approach taken by PDSs, rooted firmly in self-management, does not and cannot capture these more collective, social dimensions of privacy and data protection.

The inability of PDSs to adequately address these concerns speaks to a more fundamental issue with PDSs as a concept: they put too much onus on the individual and not enough focus on the business models (or other incentives for data processing). The root cause of the appropriation of users’ personal data is generally not, in fact, the failure of individuals to exercise control over that data, but the surveillance-supported business models that demand the data in the first place. These business models operate at a systemic level, supported by information asymmetries, commercial considerations, legal arrangements (Cohen, 2019), network effects, and other structural factors, and lie beyond the control of any individual user.

Indeed, the information asymmetries inherent in surveillance business models result in a significant asymmetry of power between users and app developers (Mantelero, 2014). As Lyon argues, through information asymmetries, surveillance “usually involves relations of power in which watchers are privileged” (Lyon, 2017, p. 15). This power asymmetry is at the core of how surveillance capitalism attempts to extract monetary value from individuals, by modifying their behaviour in pursuit of commercial interests (Zuboff, 2015). Yet, as discussed above, PDSs seek to ‘empower’ users without significantly dealing with those asymmetries. Nor do they address other systemic factors with structural causes that disempower users in favour of organisations. While PDSs seek to decentralise processing to users’ devices, then, it does not follow that power will also be decentralised to users themselves: decentralising processing does not necessarily imply decentralising power. Without a more systemic challenge to surveillance-based models for deriving value, shifting away from individualised forms of notice and consent and alleviating the effect of information asymmetries and other structural issues, the underlying power dynamic in those surveillance models—skewed heavily in favour of organisations rather than individuals—remains largely unchanged.

Relevant is what Fuchs describes as a form of academic ‘victimisation discourse’, where “privacy is strictly conceived as an individual phenomenon that can be protected if users behave in the correct way and do not disclose too much information” (Fuchs, 2011, p. 146), while issues related to the political economy of surveillance capitalism—advertising, capital accumulation, the appropriation of user data for economic ends—are largely ignored or unchallenged. Responses to these business models that are grounded in placing ever-greater responsibility onto users to actively manage their own privacy, in the face of systemic challenges such as endemic surveillance and data monetisation, are destined to fail. This is the case with PDSs as currently envisaged. Indeed, as previously noted, PDSs have even been described as a way of reducing user ‘resistance’ to data sharing, bringing about a greater ‘willingness’ to allow personal data to be processed (subsection 2.4). This not only explicitly accepts the logic of these business models, but appears to make them easier to pursue. In this way, PDSs following this approach might lull users into a false sense of security through the rhetoric of greater ‘choice’, ‘control’, and ‘empowerment’—despite the evidence that these are flawed concepts in light of the structural and systemic nature of the concerns—while in practice facilitating the very data extraction and monetisation practices that users may be trying to escape.

6. Concluding remarks

PDSs are nascent, but growing in prominence. Their proponents claim that PDSs will empower users to get more from their data, and to protect themselves against privacy harms, by providing technical and legal mechanisms to enforce their choices around personal data processing. However, as we have detailed, their ability to deal with the broader challenges associated with current data processing ecosystems appears limited. Regarding data protection, platforms, regulators and lawyers might work together on the specific data issues raised by PDSs, including how best to deal with issues concerning the rights of data subjects. However, despite any such efforts, and regardless of the purported benefits of PDSs, most of the issues inherent to the systemic information asymmetries and challenges of the current ecosystems remain. While PDSs might offer some helpful user-oriented data management tools, they are fundamentally grounded in the mistaken idea that, with enough information presented in the right way, individuals will be able to overcome barriers that are ultimately structural and systemic in nature.

References

Anciaux, N. (2019). Personal Data Management Systems: The security and functionality standpoint. Information Systems, 21, 13 – 35. https://doi.org/10.1016/j.is.2018.09.002

Andrejevic, M. (2011). Surveillance and Alienation in the Online Economy. Surveillance & Society, 8(3), 270 – 287. https://doi.org/10.24908/ss.v8i3.4164

Article 29 Data Protection Working Party. (2010). Opinion 1/2010 on the concepts of ‘controller’ and ‘processor’. (WP169 of 16 February 2010).

Article 29 Data Protection Working Party. (2014). Opinion 8/2014 on Recent Developments on the Internet of Things. (WP 223 of 16 September 2014).

Article 29 Data Protection Working Party. (2016). Guidelines on the right to data portability (WP242 rev.01 13 December 2016).

Barocas, S., & Nissenbaum, H. (2009). On Notice: The Trouble with 'Notice and Consent’. Proceedings of the Engaging Data Forum: The First International Forum on the Application and Management of Personal Electronic Information.

Barth, S., & De Jong, M. (2017). The privacy paradox – Investigating discrepancies between expressed privacy concerns and actual online behavior – A systematic literature review. Telematics and Informatics, 34(7), 1038 – 1058. https://doi.org/10.1016/j.tele.2017.04.013

Bietti, E. (2020). Consent as a Free Pass: Platform Power and the Limits of the Informational Turn. Pace Law Review, 40, 317 – 398.

Binns, R. (2020). Human Judgement in Algorithmic Loops: Individual justice and automated decision-making. Regulation & Governance, 1 – 15. https://doi.org/10.1111/rego.12358

Blume, P. (2012). The inherent contradictions in data protection law. International Data Privacy Law, 2(1), 26 – 34. https://doi.org/10.1093/idpl/ipr020

Bolychevsky, I., & Worthington, S. (2018, October 8). Are Personal Data Stores about to become the NEXT BIG THING? [Blog post]. @shevski. https://medium.com/@shevski/are-personal-data-stores-about-to-become-the-next-big-thing-b767295ed842

Brochot, G. (2015). Personal Data Stores [Report]. Cambridge University. https://ec.europa.eu/digital-single-market/en/news/study-personal-data-stores-conducted-cambridge-university-judge-business-school

Brown, W. (2015). Undoing the Demos: Neoliberalism’s Stealth Revolution. Zone Books.

Burrell, J. (2016). How the machine “thinks”: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 1–12. https://doi.org/10.1177/2053951715622512

Cate, F. H., & Mayer-Schönberger, V. (2013). Notice and consent in a world of Big data. International Data Privacy Law, 3(2), 67 – 73. https://doi.org/10.1093/idpl/ipt005

Chen, J. (2020). Who is responsible for data processing in smart homes? Reconsidering joint controllership and the household exemption. International Data Privacy Law. https://doi.org/10.1093/idpl/ipaa011

Christl, W. (2017). Corporate Surveillance in Everyday Life [Report]. Cracked Labs. https://crackedlabs.org/en/corporate-surveillance

Cobbe, J. (2020). What lies beneath: Transparency in online service supply chains. Journal of Cyber Policy, 5(1), 65 – 93. https://doi.org/10.1080/23738871.2020.1745860

Cobbe, J., & Singh, J. (2019). Regulating Recommending: Motivations, Considerations, and Principles. European Journal of Law and Technology, 10(3), 1 – 37. http://ejlt.org/index.php/ejlt/article/view/686

Cohen, J. E. (2019). Between Truth and Power: The Legal Constructions of Informational Capitalism. Oxford University Press. https://doi.org/10.1093/oso/9780190246693.001.0001

ControlShift. (2014). Personal Information Management Services – An analysis of an emerging market: Unleashing the power of trust [Report]. ControlShift.

Crabtree, A. (2018). Building Accountability into the Internet of Things: The IoT Databox Model. Journal of Reliable Intelligent Environments, 4, 39 – 55. https://doi.org/10.1007/s40860-018-0054-5

Crabtree, A., & Mortier, R. (2015). Human Data Interaction: Historical Lessons from Social Studies and CSCW. In N. Boulus-Rødje, G. Ellingsen, T. Bratteteig, M. Aanestad, & P. Bjørn (Eds.), ECSCW 2015: Proceedings of the 14th European Conference on Computer Supported Cooperative Work, 19-23 September 2015, Oslo, Norway (pp. 3–21). Springer International Publishing. https://doi.org/10.1007/978-3-319-20499-4_1

Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation), (2016).

E-Privacy Directive – Directive 2002/58/EC of the European Parliament and the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector, (2002). http://data.europa.eu/eli/dir/2002/58/2009-12-19

Ericson, R. V., & Doyle, A. (2003). Risk and Morality. University of Toronto Press.

European Commission. (2020). A European strategy for Data. European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1593073685620&uri=CELEX%3A52020DC0066

European Data Protection Board. (2019). Opinion 5/2019 on the interplay between the ePrivacy Directive and the GDPR, in particular regarding the competence, tasks and powers of data protection authorities (Opinion No. 5/2019; pp. 38 – 40). European Data Protection Board.

Fuchs, C. (2011). An Alternative view on the Privacy of Facebook. Information, 2(1), 140 – 165. https://doi.org/10.3390/info2010140

German Data Ethics Commission. (2019). Gutachten der Deutschen Datenethik Kommission [Expert opinion]. Datenethikkomission. https://datenethikkommission.de/wp-content/uploads/191015_DEK_Gutachten_screen.pdf

Hannah-Moffat, K. (2001). Punishment in Disguise: Penal Governance and Canadian Women’s Imprisonment. University of Toronto Press.

Janssen, H., Cobbe, J., Norval, C., & Singh, J. (2020). Decentralised Data Processing: Personal Data Stores and the GDPR [Forthcoming]. https://doi.org/10.2139/ssrn.3570895

Janssen, H., Cobbe, J., & Singh, J. (2019). Personal Data Stores and the GDPR’s lawful grounds for processing personal data. Data for Policy, Kings College London. https://doi.org/10.5281/zenodo.3234880

Kamara, I., & De Hert, P. (2018). Understanding the balancing act behind the legitimate interest of the controller ground: A pragmatic approach. (Working Paper No. 4/12; pp. 1 – 33). Brussels Privacy Hub.

Kerber, W. (2016). Digital Markets, data, and privacy: Competition law, consumer law and data protection. Journal of Intellectual Property Law & Practice, 11(11), 855 – 866. https://doi.org/10.1093/jiplp/jpw150

Lodge, T. (2018). Developing GDPR compliant apps for the edge. Proceedings of the 13th International Workshop on Data Privacy Management, 313 – 328. https://doi.org/10.1007/978-3-030-00305-0_22

Lyon, D. (2017). Surveillance Studies: An Overview. Polity Press.

Mantelero, A. (2014). Social Control, Transparency, and Participation in the Big Data World. Journal of Internet Law, 23 – 29. https://staff.polito.it/alessandro.mantelero/JIL_0414_Mantelero.pdf

Myers West, S. (2019). Data Capitalism: Redefining the Logics of Surveillance and Privacy. Business & Society, 58(1), 20–41. https://doi.org/10.1177/0007650317718185

Ng, I., & Haddadi, H. (2018, December 28). Decentralised AI has the potential to upend the online economy. Wired. https://www.wired.co.uk/article/decentralised-artificial-intelligence

Nissenbaum, H. (2011). A Contextual Approach to Privacy Online. Dædalus, 140(4), 32–48. https://doi.org/10.1162/DAED_a_00113

Palmås, K. (2011). Predicting What You’ll Do Tomorrow: Panspectric Surveillance and the Contemporary Corporation. Surveillance & Society, 8(3), 338 – 354. https://doi.org/10.24908/ss.v8i3.4168

Pearson, S., & Casassa-Mont, M. (2011). Sticky Policies: An Approach for managing Privacy across Multiple Parties. Computer, 44(9), 60 – 68. https://doi.org/10.1109/MC.2011.225

Poikola, A., Kuikkaniemi, K., & Honko, H. (2014). MyData – A Nordic Model for human-centered personal data management and processing [White Paper]. Open Knowledge Finland.

Selbst, A. D., & Powles, J. (2017). Meaningful information and the right to explanation. International Data Privacy Law, 7(4), 233 – 243. https://doi.org/10.1093/idpl/ipx022

Silverman, C. (2019, April 14). Popular Apps In Google’s Play Store Are Abusing Permissions And Committing Ad Fraud. Buzzfeed.

Singh, J. (2017). Big Ideas paper: Policy-driven middleware for a legally-compliant Internet of Things. Proceedings of the 17th ACM International Middleware Conference. https://doi.org/10.1145/2988336.2988349

Singh, J. (2019). Decision Provenance: Harnessing Data Flow for Accountable Systems. IEEE Access, 7, 6562 – 6574. https://doi.org/10.1109/ACCESS.2018.2887201

Sloan, R. H., & Warner, R. (2013). Beyond Notice and Choice: Privacy, Norms, and Consent (Research Paper No. 2013–16; pp. 1 – 34). Chicago-Kent College of Law. https://doi.org/10.2139/ssrn.2239099

Solove, D. (2013). Privacy Self-Management and the Consent Dilemma. Harvard Law Review, 126, 1888 – 1903. https://harvardlawreview.org/2013/05/introduction-privacy-self-management-and-the-consent-dilemma/

Solove, D. (2020, February 11). The Myth of the Privacy Paradox (Research Paper No. 2020–10; Law School Public Law and Legal Theory; Legal Studies). George Washington University. https://doi.org/10.2139/ssrn.3536265

The Royal Society. (2019). Protecting privacy in practice: The current use, development and limits of Privacy Enhancing Technologies in data analysis [Report]. The Royal Society. https://royalsociety.org/topics-policy/projects/privacy-enhancing-technologies/

Tolmie, P. (2016, February). This has to be the cats – personal data legibility in networked sensing systems. Proceedings of the 19th ACM Conference on Computer Supported Cooperative Work. https://doi.org/10.1145/2818048.2819992

Urquhart, L. (2018). Realising the Right to Data Portability for the Domestic Internet of Things. Personal and Ubiquitous Computing, 22, 317 – 332. https://doi.org/10.1007/s00779-017-1069-2

Urquhart, L. (2019). Demonstrably doing accountability in the Internet of Things. International Journal of Law and Information Technology, 2(1), 1 – 27. https://doi.org/10.1093/ijlit/eay015

Wachter, S., & Mittelstadt, B. (2019). A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI. Columbia Business Law Review, 2, 494 – 620. https://doi.org/10.7916/cblr.v2019i2.3424

Wachter, S., Mittelstadt, B., & Floridi, L. (2017). Why a right to explanation of automated decision-making does not exist in the general data protection regulation. International Data Privacy Law, 7(2), 76–99. https://doi.org/10.1093/idpl/ipx005

Wagner, B. (2019). Liable, but Not in Control? Ensuring Meaningful Human Agency in Automated Decision-Making Systems. Policy & Internet, 11(1), 104 – 122. https://doi.org/10.1002/poi3.198

Yeung, K. (2017). 'Hypernudge’: Big Data as a mode of regulation by design. Information, Communication & Society, 20(1), 118–136. https://doi.org/10.1080/1369118X.2016.1186713

Zuboff, S. (2015). Big other: Surveillance capitalism and the prospects of an information civilization. Journal of Information Technology, 30, 75 – 89. https://doi.org/10.1057/jit.2015.5

Footnotes

1. Note that the Hub of All Things (HAT) recently changed its name to Dataswift Ltd; Dataswift Ltd is the commercial enterprise that grew from the university-led HAT research project, which was tasked with building the decentralised HAT infrastructure and its governance model. Where we refer in the text to Dataswift, both the HAT project and the commercial enterprise Dataswift are considered within our analysis.

2. Note that Solid offers the technical infrastructure, while Inrupt is the company offering services that are built on that infrastructure. Where we refer to Inrupt, both the technical infrastructure and the company services come within our analysis.

3. This article builds on our earlier comparative analysis of commercial PDS offerings and different PDS formulations, as focused on data protection concerns (Janssen et al., 2020).

4. Note that a 'device' is conceptual, and can be underpinned by a range of technical architectures. In describing the data and processing 'within' a device, we refer to that logically governed by the device. This means, for example, that the data and compute might not necessarily occur all within a single technical component, but could potentially occur in various locations, e.g. across a range of (managed) cloud services.

5. Note that the terminology varies by platform; not all platforms would describe processing as occurring through apps, though generally there is some conceptually similar construct.

6. Note that despite the similar terms (devices, apps, app stores), PDSs differ from mobile ecosystems in that PDSs are governance oriented, with far richer and more granular controls. Moreover, the degree of resemblance will depend on the specific formulation of the PDS and its ecosystem – many different approaches are possible.

7. We use ‘collective computation’ simply to refer to computation that occurs across a range of user devices. There is potential for the methods facilitating such computation to employ privacy-enhancing mechanisms (e.g. The Royal Society, 2019).

8. Note that differences exist as to what PDSs require app developers to describe in their manifests. Databox envisages assessing the risk that an app developer intends to share the data with third parties, while other platforms might not envisage any risk assessment on this aspect (or it is not explicit from their documentation that they do).

9. Databox envisages indicating to users, as part of its risk assessment, whether app developers intend to transfer user data beyond the EU (which entails high risks to that data), or whether an app developer transfers personal data to other recipients (which also entails high risks to user data).

Political advertising exposed: tracking Facebook ads in the 2021 Dutch elections

Micro-targeting is the protagonist of political campaigning in COVID-19 times. In the absence of live campaign events such as rallies, political parties invest more than ever in intercepting their constituencies in their digital whereabouts. Not surprisingly, social media platforms are a central gear in the process. The Netherlands is no exception. Dutch citizens are called to elect the members of the House of Representatives (Tweede Kamer) on 15-17 March 2021. With the country in lockdown and internet penetration at 96 percent, the shift of political campaigning to the digital realm gives social media platforms a weighty stake in the game. According to the Reuters Institute Digital News Report 2020, 77 percent of Dutch adults consume news online, and 39 percent regularly use social media as a news source, with Facebook leading the ranking. Furthermore, unlike the United States, the Netherlands is a parliamentary democracy, in which multiple parties (as many as 37 in this round) take part in the elections. Many parties thus reach out to voters by means of targeted messages on Facebook. This raises the questions: how can we monitor political ads circulating on social media in the run-up to an election? How can we understand the impact of political micro-targeting on the shaping of political preferences?

The project Politieke Advertenties Analyse van Digitale Campagnes (Analysis of Political Ads in Digital Campaigns, henceforth PAADC) is tasked with monitoring and analysing Facebook sponsored content in relation to the Dutch general elections. Combining scraping Facebook posts with relevant survey data in the context of political elections, the project takes advantage of the synergy between digital methods and public opinion research to gain insights on the usage and effects of political micro-targeting, and the dynamics and popular perception of parties’ advertising strategies. This essay introduces the challenge of studying political micro-targeting amidst platform corporate policies and illustrates the PAADC methodology and its implications for the study of political micro-targeting. It reflects on two open questions for the study of political micro-targeting, namely the assessment of social media’s transparency initiatives and the methodological and political potential of collaborative, independent research centering on platform users.

The challenge

Ever since Facebook was accused of steering voting preferences in the 2016 US presidential campaign (Madrigal, 2017), the platform has been in the crossfire of policymakers, researchers, and organised civil society alike. As a result, Facebook Inc. has committed to increasing transparency in the functioning of its personalisation algorithms. In 2018 it launched Social Science One, an in-house research organisation hosted by Harvard University and aimed at providing selected research parties with (part of) the platform’s goldmine of user and traffic data. In the same year, Facebook Inc. made available its archive of political ads, today integrated in the broader Ad Library service, which “provides advertising transparency by offering a comprehensive searchable collection of all ads currently running from across Facebook apps and services”. In practice, the service allows users to manually and programmatically inspect Facebook’s vast collection of sponsored content, including (selected) details about sponsorships and placement logic (for an analysis of the quality of ad libraries see Leerssen et al., 2018). At the same time, however, Facebook Inc. has dramatically restricted the options for accessing its data through other established channels such as the Graph Application Programming Interface, or API (Puschmann & Ausserhofer, 2017). This controversial move was intended to limit abuse, but it ended up also restricting access to Facebook data for scholars committed to research in the public interest (Bruns, 2018). Meanwhile, the company has also actively pursued researchers investigating its political-ad-targeting practices, such as the New York University’s Ad Observatory, on the grounds that “scraping tools, no matter how well-intentioned, are not a permissible means of collecting information from us”, as the Wall Street Journal reported.

To the company’s supporters, initiatives of this kind testify to the good faith of the social media giant in making amends for the unintentional misbehaviors of the Facebook / Cambridge Analytica scandal. To detractors and skeptics, these moves appear instead as part of a carefully crafted ‘open washing’ strategy which calls for further vigilance. To be sure, relying on carefully edited company data sets raises some critical questions for research purposes. How complete and reliable are data made available through corporate-controlled channels? To what extent are they tailored to answer the pressing questions that emerge from users, researchers and other concerned stakeholders? And ultimately: who ‘owns’ social media-generated data, and is thus entitled to oversee collection, sharing, and repurposing?

The approach

Against this backdrop, PAADC intervenes by offering a novel methodology of noninvasive user audit (Sandvig et al., 2014) to generate social media data sets on political ads by engaging volunteer users. PAADC is a Dutch interdisciplinary research collaboration involving researchers of the University of Amsterdam, respectively from the Algorithms Exposed (ALEX) team at the Department of Media Studies and the Amsterdam School of Communication Research (ASCoR), the audience research organisation I&O, and the daily newspaper de Volkskrant. The core of PAADC’s innovative methodology is a browser extension, named PAADC-fbtrex, which repurposed the fbtrex browser extension developed by the ALEX team with the open source analysts of Tracking Exposed (Milan & Agosti, 2019). This tailor-made plugin allows for the collection of political ads classified as public posts as they appear on a user’s Facebook timeline.

In January 2021, the PAADC-fbtrex extension was downloaded and installed on the computers of a controlled group of 588 voting citizens selected by I&O (see methodological note), who agreed to collect data during the election campaign and in the aftermath of the vote (January 15 to March 28). As detailed in the privacy policy of the project, only so-called ‘public posts’ are collected, in line with the General Data Protection Regulation. In addition to volunteering their individual (public) Facebook data, the user group is asked to fill in surveys on political attitudes, voting preferences and media use at regular intervals, for a total of four surveys. In a nutshell, PAADC overcomes the limitations of aggregated, platform-generated data (e.g., provided by Facebook itself) and of self-reported data by linking user experience (i.e., what a user is served in her timeline) to survey data about perceptions and political preferences.

Our approach points to at least three novel directions of analysis which we explore next, including: the descriptive characterisation of political campaigning and micro-targeting; the exploratory investigation of the ‘political ad-sphere’; and the explanatory analysis of the impact of online political advertising.

Descriptive characterisation of political campaigning and micro-targeting

The data collected through the PAADC-fbtrex extension can be aggregated to produce real-time overviews of the micro-targeting strategies of parties and candidates. Content analysis can inform reports exploring the themes and focus areas of digital campaigns. Furthermore, matching the collected ads with user survey data allows us to dig into the logic of micro-targeting itself. It is worth noting that, whereas similar findings could be produced by interrogating the aforementioned Facebook Ad Library, the latter only returns information about individual ads and pre-defined, abstract targeting categories. Our methodology, instead, provides data on actual impressions as they appeared on the timelines of real users, cross-referencing them with detailed user-level survey data. A real-time overview of the data collection can be found here. Figure 1 shows the daily impressions per party over a fortnight in the heat of the electoral campaign.

Figure 1: Number of daily impressions over the period 15 February-1 March 2021 per political party (PAADC project)
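To illustrate the kind of aggregation behind Figure 1, the sketch below counts daily impressions per party with pandas; the toy data and the column names ('timestamp', 'advertiser_party', 'user_id') are assumptions for illustration, not the project’s actual export format or pipeline:

```python
import pandas as pd

# Toy impressions table; in practice this would come from the extension's export.
ads = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-02-15 09:12", "2021-02-15 18:40",
                                 "2021-02-16 10:05", "2021-02-16 10:07"]),
    "advertiser_party": ["PartyA", "PartyB", "PartyA", "PartyA"],
    "user_id": ["u1", "u2", "u1", "u3"],
})

# Count impressions per day and party.
daily_per_party = (
    ads.assign(day=ads["timestamp"].dt.date)
       .groupby(["day", "advertiser_party"])
       .size()
       .rename("impressions")
       .reset_index()
)

# Pivot into a day x party table, the shape of the series plotted in Figure 1.
print(daily_per_party.pivot(index="day", columns="advertiser_party",
                            values="impressions").fillna(0))
```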

Exploratory investigation of the ‘political ad-sphere’

Once we know to which (anonymised) users a certain ad has been served, it is possible to reconstruct the network of co-occurrence among the different political advertisers targeting different users. In other words, our methodology shows which combinations of ads users are exposed to on Facebook. Once linked to survey data, this allows us to gain insights into patterns of co-targeting, answering questions like: which parties tend to target the same user base? Which parties tend to be ‘omnivorous’ (i.e., aiming at all voters indistinctly) versus ‘specialised’ (i.e., aiming at a certain segment of the population) when it comes to targeting? Which advertisers tend to target users already aligned with their party, and which use advertising to reach supporters of competing parties and thus try to move votes?
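A minimal sketch of how such a co-targeting network could be reconstructed is given below; the column names and toy data are assumptions, not the PAADC schema:

```python
# Sketch of reconstructing the co-targeting network: two parties are linked
# when their ads were served to the same (anonymised) user.
import itertools
import pandas as pd
import networkx as nx

ads = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u2", "u3"],
    "advertiser_party": ["PartyA", "PartyB", "PartyA", "PartyC", "PartyB"],
})

G = nx.Graph()
for _, parties in ads.groupby("user_id")["advertiser_party"]:
    for a, b in itertools.combinations(sorted(set(parties)), 2):
        w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)   # edge weight = number of shared users

# Edge weights indicate which parties tend to target the same user base.
print(sorted(G.edges(data=True)))
```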

Explanatory analysis of the impact of online political advertising

The possibility to poll users’ political opinion in a longitudinal fashion—repeating a range of questions multiple times throughout the election campaign—together with data on which ads have concretely appeared on their timelines, paves the way for assessing the impact of political micro-targeting on voting behaviour and political opinions at large.
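Purely as an illustration of this design, the sketch below links per-user exposure counts to repeated survey measurements and fits a toy model; the variable names and model specification are assumptions, not the project’s actual analysis plan:

```python
import pandas as pd
import statsmodels.formula.api as smf

exposure = pd.DataFrame({          # impressions per user and party, per wave
    "user_id": ["u1", "u2", "u3"],
    "wave": [2, 2, 2],
    "party_x_impressions": [14, 0, 3],
})
surveys = pd.DataFrame({           # longitudinal survey responses
    "user_id": ["u1", "u1", "u2", "u2", "u3", "u3"],
    "wave": [1, 2, 1, 2, 1, 2],
    "sympathy_party_x": [5.0, 6.5, 3.0, 3.0, 4.0, 4.5],
})

# Join exposure onto the survey panel (no exposure recorded -> 0 impressions).
panel = surveys.merge(exposure, on=["user_id", "wave"], how="left").fillna(0)

# Toy model: does exposure relate to sympathy, controlling for survey wave?
model = smf.ols("sympathy_party_x ~ party_x_impressions + C(wave)", data=panel).fit()
print(model.summary())
```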

Our approach comes with three limitations to account for and, where possible, overcome. One issue relates to the size and representativeness of the sample. For the 2021 Dutch elections, the partnership with I&O allowed us to work with a fairly large, diverse and controlled sample of the voting population. Whereas the sample of users who agreed to install the PAADC-fbtrex extension does not fully satisfy the standards for statistical representativeness, its size and variability allow us to derive informed inferences on the relation between collected ads and survey variables. Future iterations, however, could aim at involving statistically representative samples to allow for solid generalisations. A second limitation has to do with the fact that our browser extension currently works only on the desktop version of Facebook and not on the mobile app. However, mobile access is the most popular among certain segments of the population: Dutch Facebook users prefer mobile access in 60.7 percent of cases. Harvesting the same information from the mobile app would improve the representativeness of our data. One solution to this problem is the use of ‘mobile experience sampling’, a complementary data collection strategy implemented with a subset of 120 respondents, who are uploading screenshots of ads as they appear on their Facebook timelines and answering short surveys. This will allow for cross-validation between desktop-based and mobile-based data. Finally, the process of metadata extraction at the level of the parser, as well as the function for distinguishing sponsored from non-sponsored content, has presented a number of challenges that required meticulous, continuous intervention, testing, and maintenance at the code level. Our experience exposes how certain metadata seem to be effectively shielded by the platform. We do not believe this to be a purely technical or methodological issue, but to a great extent a political one—which we explore next.

Open questions in the study of political micro-targeting

The PAADC design allows us to home in on two large open questions in the analysis of digitally personalised political campaigning, namely: (1) the assessment of social media’s transparency initiatives, and (2) the methodological and political potential of collaborative, independent research centering on platform users.

(1) Next to exploring online political advertising, a corollary of our research relates to the meta-theme of algorithmic auditing and platform politics. Specifically, it contributes to the development of methods for assessing platforms’ transparency initiatives, and to the study of the politics of technical obfuscation.

Methods for assessing platforms’ transparency assessment

Following the infamous data abuse scandals, Facebook Inc. has promoted an alleged ‘new era’ of transparency and collaboration with researchers, favouring initiatives nominally devoted to guaranteeing user privacy while allowing the impact of algorithms on the democratic process to be investigated. While these moves open up promising lines of research (see Venturini & Rogers, 2019), a question lingers behind them: how can we assess… these transparency assessments? The PAADC data set, consisting of actual posts served to real users, can form the basis for auditing... the Facebook transparency initiatives themselves. In other words: does the Facebook Ad Library return, as claimed, every political ad served by the platform, and does it provide truthful information about the target audience? Given the current regulations and contractual power structures, the answer to this question might come from two sources: unconditional faith in Facebook Inc., or a large data set of actual ads served to real users. We have the latter. And, we argue, society needs more such data sets.

The politics of technical obfuscation

However, similar objections concerning data set (in)completeness and data reliability can be advanced with regard to PAADC data as well. This open question touches upon an important issue worth exploring further: the politics of obfuscation put in place by platforms in order to restrain malicious actors and independent researchers alike from collecting data. Developing scraping tools today is literally a cat-and-mouse game, given that, as our experience shows, Facebook makes active efforts to mislead parsers by obfuscating metadata and unexpectedly (and repeatedly) restructuring its HTML code (see Mancosu & Vegetti, 2020). We hope PAADC will serve as a case study contributing to both the methodological and the political analysis of scraping proprietary platforms.
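The sketch below illustrates the kind of counter-measure a parser may need: reassembling visible text that has been split across nested elements before testing for a ‘Sponsored’ label. The HTML is a constructed example, not Facebook’s actual markup, and real obfuscation is typically more elaborate:

```python
from bs4 import BeautifulSoup

# Constructed example: the Dutch label 'Gesponsord' split across nested spans,
# which defeats naive substring matching on the raw HTML source.
OBFUSCATED_POST = """
<div class="post">
  <span><span>Ges</span><span>pon</span><span>sord</span></span>
  <p>Stem op ons!</p>
</div>
"""

def looks_sponsored(post_html: str, labels=("gesponsord", "sponsored")) -> bool:
    soup = BeautifulSoup(post_html, "html.parser")
    # Reassemble the text a user would actually see, then test for the label.
    visible_text = soup.get_text(separator="", strip=True).lower()
    return any(label in visible_text for label in labels)

print(looks_sponsored(OBFUSCATED_POST))  # True, despite the split-up label
```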

(2) The restructuring of Facebook’s data access policy following the data breach scandals has paradoxically corresponded to a ‘platformisation’ (see Poell et al., 2019) of its research affordances, further centralising its control over data. Whereas it is desirable that loopholes such as those that allowed the Facebook / Cambridge Analytica abuses are closed, these moves have dramatically restricted the possibilities for (resource-poor) researchers to conduct analyses in the public interest. However, there exists a third way between unrestricted access and complete centralisation: distributed, volunteered, privacy-preserving data donation infrastructures. This aspect touches upon two points associated with what we have called the contentious politics of data (Beraldo & Milan, 2019) as applied to our research design: i) the political claim of breaking Terms of Service in the public interest, and ii) the question of public participation and user awareness through engagement.

Breaking Terms of Service

Scraping the content of a proprietary platform happens in a legal ‘grey area’, as it often formally violates a platform’s Terms of Service (ToS). This might occasionally result in legal contention, as a study of the music streaming platform Spotify has exposed (Eriksson et al., 2018). Platform resistance to scraping, which Facebook.tracking.exposed was met with in the past (as denounced by the UK digital rights organisation Open Rights Group), has legitimate ethical foundations when it comes to preserving users’ privacy from being violated by third parties against their will. However, this justification does not hold true when users have explicitly provided consent for their data to be contributed to a certain research project. Ultimately, this boils down to a simple question: are data produced by the users who originate them, or by the platform that hosts them? No matter what the fine print, more or less consciously ‘agreed’ to by users, says, users should have a legitimate right to transfer their data to parties they trust and whose aims they deem valuable, as experimented with in the framework of data sharing/donation projects (e.g., Ohme et al., 2020). To be sure, the user base mobilised by the PAADC project bears no responsibility for the (potential) violation of the ToS on the part of the researchers. However, future initiatives like ours should articulate this tension further, prefiguring organised cases of digital civil disobedience or strategic litigation. A promising direction is to legally establish the user’s right to keep a personal copy of their data, as the unique combination of content served by the platform to a specific individual, which could then potentially be shared with selected others.

Awareness through engagement

This challenge invites a reflection upon the potential political and transformative element associated with initiatives similar to the PAADC project, and any future iterations of it. Doing research based on data consciously contributed by users can potentially mean doing research not so much on, but rather with them (Kazansky et al., 2019). The collective analysis of algorithmic recommendations around COVID-19 is a case in point: organised by Tracking Exposed, it resulted in the first crowdsourced analysis of YouTube's recommendation system during the pandemic. Further emphasising the collaborative, voluntary, and conscious nature of users’ recruitment to these initiatives can reconcile the research goals of these projects with the societal need for awareness raising and data literacy. Our hope is that consciously contributing their experience to our research goals might have stimulated our participants' own personal reflection on the hurdles of political micro-targeting.

Looking forward

As we approach the voting window, Dutch parties become increasingly active in online advertising to steer votes. Our data will help voters and policymakers understand the role and reach of political micro-targeting in the country, and the extent to which it influences the election outcome. It can also provide informed insights to political parties and candidates concerning how Facebook users consume their sponsored content. In the future, our methodology can be repurposed to explore micro-targeting during elections in other countries as well.

We believe that the various directions of analysis presented in this piece can be boiled down to one fundamental leitmotiv: the urgency of bringing to light practices otherwise invisible to the public eye. We argue that algorithmic personalisation cannot be solely and adequately investigated by means of aggregated, decontextualised data. On the contrary, one needs to engage with the user-level, actual impressions that only a natural experiment yields. One should also not audit algorithms via the data that platforms themselves (strategically) disclose: any platform-originated transparency initiative needs to be assessed in itself. And finally: one cannot do research with user data without directly intervening in the politics of data: researching with social media users, bypassing the intermediation of platforms, is a political statement in itself and, potentially, a transformative process for society as a whole.

Methodological note

The users involved in PAADC have been recruited by I&O research using sampling procedures typical of survey research, and are compensated for their participation as per standards in the field. Since the sample required the active use of Facebook from a web browser, it was not possible to obtain a representative sample. The initial sample of 781 respondents has an overrepresentation of male respondents (65.5 percent) and older age categories (40.5 percent older than 65). Among this initial sample, 588 participants have agreed to install the PAADC-fbtrex browser extension. This limitation forces us to be cautious about generalisations to the whole population of Dutch Facebook users.

Acknowledgements

This project has received funding from the Stimuleringfonds voor de Journalistiek; the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 825974-ALEX and No 639379-DATACTIVE), the University of Amsterdam’s Research Priority Area Amsterdam Centre for European Studies (ACES) and the Amsterdam School of Communication Research.

References

Beraldo, D., & Milan, S. (2019). From data politics to the contentious politics of data. Big Data & Society, 6(2). https://doi.org/10.1177/2053951719885967

Bruns, A. (2018). Facebook Shuts the Gate after the Horse Has Bolted, and Hurts Real Research in the Process. Internet Policy Review. https://policyreview.info/articles/news/facebook-shuts-gate-after-horse-has-bolted-and-hurts-real-research-process/786

Eriksson, M., Fleisher, R., Johansson, A., Snickars, P., & Vonderau, P. (2018). Spotify Teardown. Inside the Black Box of Streaming Music. MIT Press.

Kazansky, B., Torres, G., van der Velden, L., Wissenbach, K. R., & Milan, S. (2019). Data for the social good: Toward a data-activist research agenda. In A. Daly & M. Mann (Eds.), Good Data (pp. 244–259). Institute of Network Cultures. https://data-activism.net/wordpress/wp-content/uploads/2019/02/data-activist-research.pdf

Leerssen, P., Ausloos, J., Zarouali, B., Helberger, N., & De Vreese, C. H. (2018). Platform Ad Archives: Promises and Pitfalls. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1421

Madrigal, A. C. (2017, October 12). What Facebook Did to American Democracy. The Atlantic. https://www.cs.yale.edu/homes/jf/MadrigalFeb2018-2.pdf

Mancosu, M., & Vegetti, F. (2020). What You Can Scrape and What Is Right to Scrape: A Proposal for a Tool to Collect Public Facebook Data. Social Media + Society, 6(3). https://doi.org/10.1177/2056305120940703

Milan, S., & Agosti, C. (2019). Personalisation algorithms and elections: Breaking free of the filter bubble. Internet Policy Review. https://policyreview.info/articles/news/personalisation-algorithms-and-elections-breaking-free-filter-bubble/1385

Ohme, J., Araujo, T., de Vreese, C. H., & Piotrowski, J. T. (2020). Mobile data donations: Assessing self-report accuracy and sample biases with the iOS Screen Time function. Mobile Media & Communication, 2050157920959106. https://doi.org/10.1177/2050157920959106

Poell, T., Nieborg, D., & Van Dijck, J. (2019). Platformisation. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1425

Puschmann, C., & Ausserhofer, J. (2017). Social Data APIs. Origins, Types, Issues. In M. T. Schäfer & K. van Es (Eds.), The Datafied Society. Studying Culture through Data (pp. 147–154). Amsterdam University Press.

Sandvig, C., Hamilton, K., Karahalios, K., & Langbort, C. (2014, May 22). Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms. Data and Discrimination: Converting Critical Concerns into Productive Inquiry, Seattle, Washington. http://social.cs.uiuc.edu/papers/pdfs/ICA2014-Sandvig.pdf

Venturini, T., & Rogers, R. (2019). “API-Based Research” or How can Digital Sociology and Journalism Studies Learn from the Facebook and Cambridge Analytica Data Breach. Digital Journalism, 7(4), 532–540. https://doi.org/10.1080/21670811.2019.1591927

A step back to look ahead: mapping coalitions on data flows and platform regulation in the Council of the EU (2016-2019)

Section I: Introduction

The objective of this article is to inform our understanding of upcoming policy developments with novel data on the policy-making process of recent EU Digital Single Market (DSM) legislative files. This study draws on a new data set collected as part of a multi-year research programme on the power of member states in negotiations of the Council of the EU (or “Council”). This novel data set includes information on the initial policy preferences and issue salience of all member states and EU institutions on the main controversial issues of the following legislative negotiations: the regulation on the free flow of non-personal data, the European electronic communication code directive, and the directive on copyright in the DSM.

During these negotiations, member states have discussed extensively whether to (de-)regulate data flows and introduce new (legal and financial) obligations on internet platforms. These policy controversies will remain at the centre of the EU’s political agenda for years to come, with the launch of negotiations on the Data Governance Act (DGA), Digital Services Act (DSA) and Digital Markets Act (DMA).

These new legislative processes need to be seen as a continuation of previous EU negotiations, and would thus benefit from being understood in light of the coalition patterns previously mobilised by member states in the Council. Often considered the most powerful institution of the EU legislative system, the Council is nonetheless known for the opacity of its policy-making processes, which has greatly limited academic attempts to uncover its inner workings (Naurin & Wallace, 2008). This research explores this black box, building on a public data set (Arregui & Perarnaud, 2021) based on 145 interviews conducted in Brussels with Council negotiators and EU officials between 2016 and 2020.

By highlighting the main controversies and coalition patterns between member states on the regulation of data flows and internet platforms as part of three negotiations, this research provides relevant analytical tools to approach future EU digital policy-making processes. It underlines in particular how the ability of certain member states to form and maintain coalitions may determine decision outcomes. Given its success in the adoption process of the free flow of data regulation, the “digital like-minded group” (or the D9+ group) could indeed be activated in the course of the next negotiations on the DGA, DSA and DMA. This paper also argues that the capacity of large member states, such as Germany and France, to formulate their policy preferences early in the process could be a key determinant of their bargaining success. Moreover, while the Council is expected to remain strongly divided regarding the regulation of data flows and internet platforms, this paper indicates why the European Parliament (EP) could have a significant role in these discussions. It also signals that member states are not equal in their capacity to engage with Members of the European Parliament (MEPs), thus suggesting that the ones with more structured channels of engagement with the EP may be more likely to be successful in upcoming negotiations.

The following section offers a brief literature review on EU negotiations and Council policy-making processes in relation to the DSM. Then, the methodological approach and data set are presented. The empirical section is divided into two parts: the first maps the constellation of preferences and issue salience of member states in the three legislative files, and the second uncovers the main coalition patterns. The findings are then discussed in light of their implications for upcoming EU negotiations.

Section II: Unpacking EU digital negotiations

In recent years, EU digital policies have attracted growing attention from scholars, partly due to the acceleration of EU legislative activities on key internet-related issues, such as data protection and cybersecurity. This trend reflects a broader pattern of states’ increasing engagement with internet governance and policy-making to exercise power in and through cyberspace (Deibert & Crete-Nishihata, 2012; DeNardis, 2014; Harcourt et al., 2020; Radu et al., 2021).

The literature has acknowledged the increasingly active role of the EU in internet policies, as illustrated by recent regulatory and policy initiatives in the fields of data governance (Borgogno & Colangelo, 2019), privacy (Ochs et al., 2016; Bennett & Raab, 2020; Laurer & Seidl, 2020), copyright (Meyer, 2017; Schroff & Street, 2018) and cybersecurity (Christou, 2019). Recent research has also investigated the nature and determinants of a 'European approach' in regulating large internet companies and safeguarding competition (Radu & Chenou, 2015), in balancing competing policy objectives such as national security and data protection (Dimitrova & Brkan, 2018), or in promoting its ‘digital sovereignty’ (Pohle & Thiel, 2020).

EU decisions have direct implications for member states, companies and citizens, but also for third countries (Bradford, 2020), as illustrated by the recent reform of the EU data protection framework (Bendiek & Römer, 2018). Reflecting its ambition to increase its 'cyber power' (Cavelty, 2018) on the international stage, the EU has progressively established cyber partnerships with third countries to engage on digital issues (Renard, 2018), building on the recognised role of the EU over the past two decades in public policy aspects of internet governance (Christou & Simpson, 2006).

But EU digital policy can also be seen as a field of struggle (Pohle et al., 2016), with major divides among governments. As emphasised by Timmers in the case of EU cybersecurity policies, the wide diversity of interests of member states can be challenging for EU policy-making (Timmers, 2018) and thus usher in competing political dynamics in the Council. Though the literature provides refined accounts of the discourse and role of the EU in internet governance debates, more limited are political scientists’ attempts to unpack the complex political processes and controversies structuring EU digital policies, and identify the "winners and losers" of policy developments from the perspective of national governments.

Exploring this gap, this study draws on recent research on the decision-making system of the EU, grounded in rational choice institutionalist analysis. Due to the intergovernmental design of the Council, a large part of the scholarship on the Council embraces a rationalist perspective, giving national decision-makers the lead role and assuming that member states determine their actions according to their national preferences and their own calculation of utility (Naurin & Wallace, 2008). This scholarship assumes that negotiation outcomes are shaped by strategic interactions between goal-seeking governments, with bounded rationality, operating within a set of institutional constraints (Lundgren et al., 2019).

A major part of the decisions in the Council is adopted by consensus, which makes the bargaining phase of the decision-making process more consequential than the voting phase. This is why recent research on member states’ influence in the EU decision-making system focuses on the actual negotiation processes at play (Thomson et al., 2006; Thomson, 2011). This scholarship is primarily driven by the Decision-making in the European Union (DEU) project, followed more recently by the Economic and Monetary Union (EMU) Positions data set (Wasserfallen et al., 2019), which has led to more generalisable findings on power distribution and bargaining processes in the Council.

I argue that these analytical tools are well suited to the study of EU negotiations on DSM policies. They offer an established methodology to map the constellation of preferences on key controversial issues, and to document the determinants and patterns of coalitions in the Council. The following section describes the methodological steps taken to collect the data on which this article draws, as well as the structure of the data set.

Section III: Data and methodology

This research draws on a new data set documenting recent EU legislative processes, covering the initial preferences of member states and EU institutions on controversial policy issues, as well as their decision outcomes. The DEU III data set (Arregui & Perarnaud, 2021) builds on 145 semi-structured interviews conducted in Brussels with representatives of member states and EU institutions, and covers 16 recent negotiations. Four of them are directly related to the DSM: the adoption processes of the regulation on the free flow of non-personal data (EU Regulation 2018/1807), the Geoblocking regulation (EU Regulation 2018/302), the European electronic communication code directive (EU Directive 2018/1972), and the directive on copyright in the DSM (EU Directive 2019/790). The selection criteria for the legislative dossiers were the negotiation rules (qualified majority voting in the Council), the adoption period (between 2016 and 2019), and the high level of ‘controversiality’ of the policy issues under discussion.

In this data set, information on actors’ policy positions and their salience is represented spatially using ‘scales’, following an established methodology (Thomson et al., 2006; Arregui & Perarnaud, 2021). During face-to-face interviews conducted in Brussels, negotiators were asked to identify the main controversies raised among member states once the Commission had introduced the legislative proposal. The policy experts were then asked to locate the positions of all actors along the policy scale, and to estimate the level of salience that actors attached to each controversial issue. Every estimation provided had to be justified through evidence and substantive arguments. A number of validity and reliability tests on the DEU III data set (for instance, systematically comparing experts’ judgments and documents) have corroborated previous analyses of the validity and reliability of the DEU I data set by Thomson et al. (2006).
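To make this spatial representation more concrete, the following minimal sketch shows one way a single controversy could be encoded as positions and salience scores on a 0–100 scale, together with a crude salience-weighted summary of the Council’s centre of gravity. All values, country codes and the weighting function are invented for illustration; they are not drawn from the DEU III data set and do not reproduce the bargaining models used in the DEU literature.

```python
# Hypothetical encoding of one DEU-style controversy: each actor gets a
# position on a 0-100 policy scale and a salience score (0-100) indicating
# how much it cares about the issue. All figures are invented for illustration.
actors = {
    "FR":  (10, 90),   # favours extensive derogations, high salience
    "DE":  (20, 85),
    "EE":  (95, 95),   # favours free flow of data, high salience
    "DK":  (90, 80),
    "NL":  (90, 70),
    "ES":  (30, 40),
    "COM": (100, 75),  # European Commission
}

def salience_weighted_mean(positions):
    """Salience-weighted average position: a rough summary of where the
    'centre of gravity' lies on one controversy (not a bargaining model)."""
    total = sum(sal for _, sal in positions.values())
    return sum(pos * sal for pos, sal in positions.values()) / total

print(round(salience_weighted_mean(actors), 1))
```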

This methodological approach has its own weaknesses, already identified in the literature (Princen, 2012). In relation to the study of actors’ influence, the first limitation is the clear focus of the DEU data set on the negotiation phase of the EU policy cycle. As the literature on agenda-setting (Princen, 2009) suggests, national governments and other stakeholders can invest significant resources in the preparatory steps of legislative negotiations, dynamics that are not thoroughly addressed in the data set. Similarly, the data set does not cover the implementation of EU legislation or the actual compliance of member states with decision outcomes. Yet we know that governments can repeat the influence efforts observed during negotiations once legislative decisions have been adopted (Blom-Hansen, 2014). Though not immune from the traditional biases of expert surveys and spatial models of politics, the data set allows us to analyse and compare the structure of the constellation of preferences in the Council across the range of policy controversies under study. In a prior analysis of this data set, Perarnaud (forthcoming) shows that bargaining success is unevenly distributed among member states in DSM negotiations, and suggests that these asymmetries relate in part to variations in the resources and coordination mechanisms that national governments can mobilise in Brussels. This research, by contrast, follows a policy-oriented approach and focuses on the political controversies that shaped these negotiations, with a view to informing our analysis of upcoming legislative processes.

The following section presents the main policy controversies under study. The analysis focuses in particular on three overarching issues: the regulation of data flows, the introduction of new legal obligations for internet platforms, and new financial obligations for internet platforms. The coalition patterns observed in these negotiations are then described, with an emphasis on the ‘digital like-minded group’ in the Council.

Section IV: Analysis

The three EU negotiations illustrate the extent to which member states were divided regarding the regulation of data flows, the introduction of new legal obligations for internet platforms, as well as new financial obligations for internet platforms.

Data flows

The adoption process of the regulation on the free flow of non-personal data in the EU was characterised by sharp divisions in the Council. This is well illustrated by numerous protracted influence efforts led by several member states, even prior to the publication of the legislative proposal by the European Commission in September 2017.

Before the actual launch of the proposal, certain member states had repeatedly called on the European Commission to put forward a legislative initiative on the free flow of data. For instance, in a letter sent in December 2016 to Donald Tusk (the European Council President), sixteen heads of state and prime ministers asked for measures to end data localisation practices (the letter was supported by Belgium, Bulgaria, Czech Republic, Denmark, Estonia, Finland, Ireland, Latvia, Lithuania, Luxembourg, Netherlands, Poland, Slovakia, Slovenia, Sweden and the United Kingdom 1). France and Germany were initially opposed to this initiative. After the legislative proposal was internally drafted by the services of the European Commission, France was reportedly concerned about its scope and successfully pressured the Commission to delay its official publication between 2016 and 2017 2. Indeed, the Regulatory Scrutiny Board (RSB), the body with the power to issue opinions to the college of commissioners on draft legislation, adopted two consecutive negative opinions on the Commission draft proposal in 2016 and 2017. One of the main arguments given to justify these two negative opinions was that the draft proposal also covered personal data, and thus overlapped with the newly adopted General Data Protection Regulation (GDPR).

After the negotiations officially started, the main controversial issue dividing member states concerned the scope of the derogations that could interfere with the principle of free flow of non-personal data. While some advocated only very limited derogations to the principle of free flow of data, others favoured extensive derogations for different purposes (security, culture, public archives). Proponents of the principle of the free flow of data wanted to keep the scope of the regulation as broad as possible, as envisioned initially by the European Commission. Estonia, Denmark, Ireland, Czech Republic, Poland, the United Kingdom, Netherlands and Sweden were the most active member states in this group. These member states coordinated their influence efforts at the Brussels level as part of the “digital like-minded group”, a coalition that will be detailed further in the following sections. Though sharing a common goal, these member states were driven by different motives. For Estonia, one of the main driving forces behind this reform, the free flow of data was both an economic and a political priority. Owing to its highly digitalised economy and society, the Estonian government saw an economic and political interest in supporting more data transfers within the EU, while increasing its resilience against potential attacks on its data infrastructures by third countries 3. Interest groups representing the tech industry in Brussels also advocated for the removal of data localisation obstacles (DIGITALEUROPE, 2017). The free flow of data in the EU was considered a very positive change by most internet platforms, as it could lead towards the formal establishment of the free flow of (non-personal) data between the EU and third countries (such as the United States).

A range of member states were, however, much less positive towards the establishment of a fully-fledged principle of free flow of data within the EU. France and Germany supported a number of derogations to this principle, in particular for the storage of public data, as well as exemptions for national security purposes. Their position can be explained by their competitive disadvantage in cloud and data storage solutions at the global level, but also by their concerns regarding cybersecurity and intellectual property 2. Though with less issue salience, their call for more derogations to the principle of free flow of data was backed by Spain, Hungary, Austria and Greece in the first steps of the negotiations. This process thus highlighted two diverging approaches in the Council regarding the regulation of non-personal data flows, a controversial issue that appeared salient for France, Germany, Estonia, Denmark, Ireland, Czech Republic, Poland and the United Kingdom.

New legal obligations for internet platforms

A range of recent EU negotiations has considered whether and how to introduce new legal obligations for internet platforms. I focus here in particular on the recent directive on copyright in the DSM and the reform of the EU telecommunication code. These two legislative instruments contemplated the introduction of new obligations for internet platforms from different angles. Like the regulation on the free flow of data, both were part of the European Commission’s DSM strategy.

The electronic communication code directive was proposed on 14 September 2016 by the European Commission (COM/2016/0590). One of the key provisions of this complex directive proposed to include new communication services, known as over-the-top (OTT) services, within its scope. The extent to which these services needed to be regulated by the same rules as telecom operators was thus a very controversial issue. The European Commission proposed to include OTTs within the scope of the directive, while providing derogations for certain types of services. The draft directive differentiated between number-based services, which connect users or companies via the public phone network, and number-independent services, which do not route communication through the public telephone network and do not use telephone numbers as identifiers (such as Facebook Messenger, Gmail or Apple FaceTime). The Commission had proposed to explicitly bring number-based communication services within the scope of the end-user rights provisions of the directive, but to include number-independent ones only for a limited set of provisions.

Member states were strongly divided on the extension of telecom rules to OTT service providers. For instance, France and Spain took a maximalist approach and wanted to introduce wide-ranging requirements for OTTs. Germany and Poland supported a similar stance, but with less salience. Spain even proposed that these services should be covered by the telecom general authorisation regime in each member state, like any other telecom provider 4. On the other side of the spectrum, several member states (Sweden, Finland, Denmark, Czech Republic, Netherlands, Luxembourg, Ireland, United Kingdom and Belgium) argued that regulating number-independent services was unjustified and could hamper innovation in the EU. In its proposal, the European Commission had intended to take a step towards the position of France and Spain, without alienating other member states initially opposed to this inclusion 5.

The adoption process of the copyright directive, and in particular of its article 17 (formerly known as ‘article 13’), generated even deeper divisions among member states (Bridy, 2020; Dusollier, 2020), and is also relevant for mapping the constellation of member states’ preferences in relation to the regulation of internet platforms. Member states were divided over the obligations and rules that internet platforms should follow to protect rights holders’ content. They had different views on how to address rights holders’ challenges in preventing copyright infringement online, and on the type of obligations and liabilities that should be placed on internet platforms. Several member states (including Denmark, Finland, the Netherlands, Czech Republic, Estonia, Luxembourg, Sweden and the UK) argued for a light-touch regulation of internet platforms, while agreeing with a requirement for platforms to introduce a redress and complaint mechanism for users. Though in favour of the Commission approach, this group opposed a provision aiming to introduce technical requirements for providers of ‘large amounts’ of copyrighted content, which would oblige platforms to use ‘automatic filters’ (or ex ante measures) to screen for copyright infringements 6. For Denmark, Finland, Ireland and the Netherlands, this issue was highly salient due to the costs and legal risks it would potentially impose on the large and small digital companies headquartered on their territory. Germany supported new ex ante measures to protect copyright holders, but with a derogation for small and medium enterprises (SMEs) and start-ups with a turnover of less than €20 million per year. The position of Germany was highly salient given the strong interest of German copyright holders in forcing platforms to monitor content uploads, combined with the German ministry of economy’s red line of maintaining a carve-out for small platforms. On the other side of the political spectrum, a number of member states, led by France and Spain, promoted the introduction of more stringent obligations for platforms (regardless of their size), via the introduction of civil liabilities for internet platforms and ex ante measures to protect copyright holders. The high salience for France and Spain was driven by the interest of their large content industries in regulating and holding internet platforms liable, and forcing them to pay their ‘fair share’ 7. Other member states (such as Portugal, Cyprus and Greece) supported the position defended by France, but without significant interests at stake.

New financial obligations for internet platforms

The adoption process of the copyright directive also led to heated discussions on the introduction of new financial obligations for internet platforms, in particular through the introduction of a “neighbouring right” that would allow press publishers to receive financial compensation when their work is used by internet platforms such as Google News (Papadopoulou & Moustaka, 2020). A large number of member states were opposed to the introduction of a fully-fledged neighbouring right for press publishers. This group was led by Finland, the Netherlands and Poland, which perceived this change as detrimental to their digital sector, and in particular to news-related start-ups. These member states were also concerned that, by regulating the practice of hyperlinking, the provision would restrict users’ access to information and affect the core functioning of the internet. Other member states, led by France and Germany, pushed for the introduction of a neighbouring right for press publishers established in the EU. The high salience of this issue for France and Germany relates to the significant interest of their national press publishers and news agencies (including the Bertelsmann media group in Germany) in being compensated for the use of their works 8. Germany had already introduced a similar provision in its national legislation 9. Other member states such as Portugal, Italy and Spain also supported this introduction, though investing less political capital, due to the more limited benefits this new right would bring to their publishing industries.

These cases illustrate the overall distribution of preferences and issue salience in recent EU negotiations on data flows and the regulation of internet platforms. The next section turns to actual patterns of coordination between member states with similar policy preferences, in order to assess their ability to form and maintain coalitions around common interests.

Section V: Member states’ coalition patterns on DSM files in the Council

As is commonly understood, and as the previous section clearly illustrates, member states tend to be significantly divided regarding the regulation of data flows and internet platforms at the EU level. However, we also know that not all member states are equally equipped to advance their policy preferences at the EU level (Panke, 2012). Not all national negotiators benefit from the same level of ‘network capital’ with their counterparts (Naurin & Lindahl, 2010), nor do they possess the same administrative resources to formulate and defend their positions (Kassim et al., 2001). These asymmetries are partly determined by a set of domestic factors, such as political stability, financial means, administrative legacies, and bargaining style. In a forthcoming article, I argue that they have a direct, though not systematic, effect on the bargaining success of member states (Perarnaud, forthcoming). Qualified majority voting rules in the Council indeed incentivise member states to find coalition partners, either to build political momentum around common policy preferences or to constitute a blocking minority to challenge competing dynamics. But coalition dynamics require a particular set of resources and capabilities that not all member states can mobilise to the same extent. Coalitions are thus key to analysing previous and future EU policy-making processes, and this section presents the coalition patterns observed in the three DSM legislative files under study.
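Since the blocking-minority logic mentioned above is ultimately arithmetic, a short sketch may help fix ideas. It encodes the post-Lisbon ‘double majority’ rule: a qualified majority requires at least 55% of member states representing at least 65% of the EU population, and a blocking minority must include at least four member states representing more than 35% of the population. The scenario and population shares below are rough approximations chosen for illustration only; they are not taken from the negotiations under study.

```python
def qualified_majority(members_in_favour, pop_share_in_favour, total_members=28):
    """Post-Lisbon 'double majority': at least 55% of member states
    representing at least 65% of the EU population (28 members, including
    the UK, during the period under study)."""
    return (members_in_favour / total_members >= 0.55
            and pop_share_in_favour >= 0.65)

def blocking_minority(members_against, pop_share_against):
    """A blocking minority needs at least four member states representing
    more than 35% of the EU population."""
    return members_against >= 4 and pop_share_against > 0.35

# Illustrative scenario: four large and mid-sized member states oppose a
# proposal; the population shares are approximate and for illustration only.
print(blocking_minority(4, 0.15 + 0.18 + 0.10 + 0.02))   # True  (~45% of population)
print(blocking_minority(3, 0.15 + 0.18 + 0.10))          # False (only three states)
```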

The digital like-minded group

The digital like-minded group is both an intriguing and little-known coalition at the Council level. Though it shares similar features with other issue-based Council groupings (see for instance the Green Growth Group in the context of EU negotiations related to the environment), it is particularly worth studying given its novelty and relative success in advancing policy preferences in the context of DSM policies.

The digital like-minded group is a largely Brussels-based informal group of national negotiators, which gathers mostly at attaché level, though ambassadors of the respective member states can also meet in this format. The coalition was originally launched when the European Commission released its DSM strategy in 2015 3. It mobilised concretely for the first time during negotiations on the NIS directive (EU Directive 2016/1148), notably to devise common strategies regarding the possible introduction of new obligations for internet services. Representatives of Denmark and Estonia in Brussels initiated this coalition 3, and by 2019 it gathered 17 member states in total. Despite its influence activities, the group remains an informal alliance and does not have any public presence.

Member states in the digital like-minded group share a liberal approach to internal market and digital issues, and have in common being digitally “ambitious”, though not necessarily digitally advanced. For instance, Bulgaria belongs to the digital like-minded group, despite being one of the least digitalised member states in the EU (according to the Digital Economy and Society Index). This coalition can mobilise both during and prior to negotiations. For instance, before the adoption of the Commission proposal on the free flow of data, a number of joint letters from heads of state were forwarded by this group (Fioretti, 2016). These efforts to liaise with EU leaders are also exemplified by a number of more recent letters, for instance on the ‘Digital Single Market mid-term review’ in 2017 10 or in the preparation of EU leaders’ meetings in 2019 11. As part of legislative negotiations, the digital like-minded group can meet regularly to discuss text compromises and help align member states’ influence efforts. This strategy proved successful in the negotiations on the free flow of data, in which the high level of coordination between members of the ‘digital like-minded group’ allowed for the formulation of effective strategies to contain the initial concerns forcefully expressed by France, and to dismantle a blocking minority (Perarnaud, forthcoming). The format of the digital like-minded group is only used by member states when they share a rather homogeneous position on a particular subject matter. For instance, members of the digital like-minded group were significantly divided in the context of the copyright directive, and thus could not leverage this format.

Within the digital like-minded group, negotiators from the most digitally advanced member states appear prominent (Sweden, Denmark, Finland, Estonia, Belgium, the Netherlands, Luxembourg, Ireland and the UK). This group of member states also meets at ministerial level, in what is known as the Digital 9 group (D9), launched by the Swedish minister for EU affairs and trade, Ann Linde, on 5 September 2016. According to an interviewee 12, this initiative partly stemmed from recommendations published in a report on the “European Digital Front-Runners”, carried out by the Boston Consulting Group and funded by Google, which invited member states considered digital front-runners to coordinate their action “at both political and policy levels […] for true digitization success and to communicate their strong commitment to the execution of the digital agenda” (Alm et al., 2016). This development illustrates the indirect lobbying channels that corporate actors can use to secure influence in relation to the DSM, in addition to their more traditional repertoire of actions at the EU level (Laurer & Seidl, 2020; Christou & Rashid, 2021).

Since October 2017, “digitally advanced” member states have met at ministerial level as part of a larger group, known as the D9+, which also includes Czech Republic and Poland (The Digital Hub, 2018). The D9 initiative was indeed not perceived positively by all member states within and outside this group. Some negotiators argued that the nine member states did not hold sufficient voting power to influence legislative negotiations, whereas certain member states, such as Poland and Czech Republic, still wanted to cooperate with like-minded member states but found themselves excluded. These considerations were the main drivers behind the creation of the D9+ group. Yet the overlap between these formats was identified as a challenge by two interviewees 13.

The Franco-German alliance

Unlike the “digital front-runners”, France and Germany did not appear to coordinate their influence efforts in the three negotiations under study. This can be partly explained by differences in their national preferences with respect to the regulation of internet platforms. Though they generally share common concerns, Germany and France had different, and sometimes divergent, interests to defend. In addition, repeated delays in the formulation of the German position did not allow for strong coordination mechanisms between Paris and Berlin on key controversial issues. The position of Germany on article 13 of the copyright directive (article 17 in the final text) was agreed two years after the publication of the Commission proposal, and the German government only settled on its position in the negotiations on the free flow of data regulation during the final stages of the adoption process of the Council’s position in 2018. These delays mainly originate from the horizontal coordination structure between ministries and Länder (Sepos, 2005), especially as the positions of the German ministry of justice and the ministry of economy can be difficult to reconcile on digital matters. The Franco-German alliance on DSM policies was thus only visible at high level on a limited number of occasions 14, and rarely at the level of negotiators.

Though the research fieldwork did not allow for interviews with all the negotiators involved in these processes, it should be noted that there was no evidence of other Council coalitions being mobilised by member states. The literature on coalitions in the Council indicates that national representatives generally coordinate their position and strategy with their counterparts on an ad hoc basis depending on the issues at stake (Ruse, 2013), as illustrated here by the cases of Spain and Italy. Both could be considered pivotal in the shaping of these decisions. Spain’s progressive shift in the negotiations on the regulation on the free flow of data, for instance, is essential to understanding the outcome of these dynamics, and shows how member states not belonging to coalition groups can still exert influence in EU policy-making processes.

Section VI: Conclusion

These negotiation processes highlight the competing interests of EU member states and institutions with regard to key controversial aspects of the EU’s DSM, and the coalitions they mobilise to amplify their message in Brussels.

These descriptive findings can be of great use for understanding the next political sequence initiated by the European Commission over the regulation of data flows and internet platforms. They provide new insights into the structure and salience of member states’ preferences on these controversial issues, and the mechanisms at their disposal to gain influence.

While the observation of coalition patterns was specifically focused on the Council, it should be noted that member states can also approach other EU institutions to advance their preferences. Previous studies (Panke, 2012; Bressanelli & Chelotti, 2017) show that not all Brussels-based national negotiators regularly engage with MEPs in order to channel national interests and ‘tame’ the different political positions voiced inside the EP. These variations partly relate to differences in the size of national delegations in the EP, but also to member states’ negotiation style and administrative resources (Perarnaud, forthcoming). Interestingly, the member states with the most structured channels of communication with EU institutions are the ones that have expressed the highest salience on issues related to data flows and internet platform regulation. As shown by previous studies (Panke, 2012), France, the Netherlands, Sweden, Finland and Czech Republic are indeed among the member states with the most powerful connections with the EP and the European Commission. Given the existing divides among member states in the Council, the European Parliament could have a greater say in the upcoming negotiations, possibly giving more leverage to the member states with structured mechanisms to liaise with MEPs.

The negotiations on the DSA, DMA and DGA legislative dossiers will generate significant debates and controversies at the EU level. Many detailed proposals and ideas are being voiced to shape the EU’s approach to a myriad of policy issues related to the regulation of data flows, competition and content moderation (Graef & van Berlo, 2020; Gillespie et al., 2020). Policy proposals presented in the context of this new phase should take into account the informal power balance between member states in the Council, and existing asymmetries in their capabilities to defend national positions in Brussels. As strong coordination mechanisms between the most digitally advanced countries of the EU appear to have granted them significant influence over large member states in recent years (Perarnaud, forthcoming), future research on these processes should carefully study coordination processes and coalition patterns, as well as their implications. The recent joint statement released by the D9+ group (D9+ group, 2019) ahead of the DSA negotiations is a good indicator of the relevance of studying the structure of preference allocation and coalitions in this new political sequence.

Acknowledgements

I thank Roxana Radu, Oles Andriychuk and the editors for their valuable comments on previous versions of this article.

References

Alm, E., Colliander, N., Deforche, F., Lind, F., Stohne, V., & Sundström, O. (2016). Digitizing Europe: Why Northern European frontrunners must drive the digitization of the EU economy [Report]. Boston Consulting Group. https://www.beltug.be/news/4922/European_Digital_Front-Runners_The_Boston_Consulting_Group_publishes_new

Arregui, J., & Perarnaud, C. (2021). The Decision-Making in the European Union (DEUIII) Dataset (1999-2019) [Data set]. Repositori de dades de recerca. https://doi.org/10.34810/DATA53

Bendiek, A., & Römer, M. (2018). Externalizing Europe: The global effects of European data protection. Digital Policy, Regulation and Governance, 21(1), 32–43. https://doi.org/10.1108/DPRG-07-2018-0038

Bennett, C. J., & Raab, C. D. (2020). Revisiting the governance of privacy: Contemporary policy instruments in global perspective. Regulation & Governance, 14(3), 447–464. https://doi.org/10.1111/rego.12222

Blom-Hansen, J. (2014). Comitology choices in the EU legislative process: Contested or consensual decisions? Public Administration, 92(1), 55–70. https://doi.org/10.1111/padm.12036

Borgogno, O., & Colangelo, G. (2019). Data sharing and interoperability: Fostering innovation and competition through APIs. Computer Law & Security Review, 35(5), 105314. https://doi.org/10.1016/j.clsr.2019.03.008

Bradford, A. (2020). The Brussels effect: How the European Union rules the world. Oxford University Press. https://doi.org/10.1093/oso/9780190088583.001.0001

Bressanelli, E., & Chelotti, N. (2017). Taming the European Parliament: How member states reformed economic governance in the EU (Research Paper No. 2017/54). Robert Schuman Centre for Advanced Studies.

Bridy, A. (2020). The Price of closing the ‘value gap’: How the music industry hacked EU copyright reform. Vanderbilt Journal of Entertainment & Technology Law, 22(2), 323–358. https://scholarship.law.vanderbilt.edu/jetlaw/vol22/iss2/4

Cavelty, M. D. (2018). Europe’s cyber-power. European Politics and Society, 19(3), 304–320. https://doi.org/10.1080/23745118.2018.1430718

Christou, G. (2019). The collective securitisation of cyberspace in the European Union. West European Politics, 42(2), 278–301. https://doi.org/10.1080/01402382.2018.1510195

Christou, G., & Rashid, I. (2021). Interest group lobbying in the European Union: Privacy, data protection and the right to be forgotten. Comparative European Politics, 19, 380–400. https://doi.org/10.1057/s41295-021-00238-5

Christou, G., & Simpson, S. (2006). The internet and public–private governance in the European Union. Journal of Public Policy, 26(1), 43–61. https://doi.org/10.1017/S0143814X06000419

Deibert, R., & Crete-Nishihata, M. (2012). Global governance and the spread of cyberspace controls. Global Governance, 18(3), 339–361. https://doi.org/10.1163/19426720-01803006

DeNardis, L. (2014). The global war for internet governance. Yale University Press. https://doi.org/10.12987/yale/9780300181357.001.0001

DIGITALEUROPE. (2017, September 19). Free Flow of Data proposal brings the Digital Single Market strategy back on track [Press release]. DIGITALEUROPE.

Dimitrova, A., & Brkan, M. (2018). Balancing national security and data protection: The role of EU and US policy-makers and courts before and after the NSA affair. JCMS: Journal of Common Market Studies, 56, 751–767. https://doi.org/10.1111/jcms.12634

Directive (EU) 2016/1148 of the European Parliament and of the Council of 6 July 2016 concerning measures for a high common level of security of network and information systems across the Union.

Directive (EU) 2018/1972 of the European Parliament and of the Council of 11 December 2018 establishing the European Electronic Communications Code (Recast).

Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC.

Dusollier, S. (2020). The 2019 Directive on Copyright in the Digital Single Market: Some progress, a few bad choices, and an overall failed ambition. Common Market Law Review, 57(4), 979–1030. https://kluwerlawonline.com/JournalArticle/Common+Market+Law+Review/57.4/COLA2020714

European Commission. (2021). Digital Economy and Society Index (DESI) 2019. European Commission Directorate-General for Communications Networks, Content and Technology. http://semantic.digital-agenda-data.eu/dataset/DESI

Fioretti, J. (2016, May 23). EU countries call for the removal of barriers to data flows. Reuters. https://www.reuters.com/article/uk-eu-digital-data-idUKKCN0YE06O

Gillespie, T., Aufderheide, P., Carmi, E., Gerrard, Y., Gorwa, R., Matamoros-Fernández, A., Roberts, S. T., Sinnreich, A., & Myers West, S. (2020). Expanding the debate about content moderation: Scholarly research agendas for the coming policy debates. Internet Policy Review, 9(4).

Graef, I., & van Berlo, S. (2020). Towards smarter regulation in the areas of competition, data protection, and consumer law: Why greater power should come with greater responsibility. European Journal of Risk Regulation, 25(1). https://doi.org/10.1017/err.2020.92

D9+ group. (2019). D9+ non-paper on the creation of a modern regulatory framework for the provision of online services in the EU. https://www.gov.pl/web/digitalization/one-voice-of-d9-group-on-new-regulations-concerning-provision-of-digital-services-in-the-eu

Harcourt, A., Christou, G., & Simpson, S. (2020). Global Standard Setting in Internet Governance. Oxford University Press.

The Digital Hub. (2018). Minister Breen chairs meeting of D9+ EU countries. https://www.thedigitalhub.com/news/minister-breen-chairs-meeting-d9-eu-countries/

Kassim, H., Menon, A., & Peters, B. G. (Eds.). (2001). The national coordination of EU policy: The European level. Oxford University Press.

Laurer, M., & Seidl, T. (2020). Regulating the European data-driven economy: A case study on the General Data Protection Regulation. Policy & Internet. https://doi.org/10.1002/poi3.246

Lundgren, M., Bailer, S., Dellmuth, L. M., Tallberg, J., & Târlea, S. (2019). Bargaining success in the reform of the Eurozone. European Union Politics, 20(1), 65–88. https://doi.org/10.1177/1465116518811073

Meyer, T. (2017). The Politics of Online Copyright Enforcement in the EU. Palgrave Macmillan.

Naurin, D., & Lindahl, R. (2010). Out in the cold? Flexible integration and the political status of Euro opt-outs. European Union Politics, 11(4), 485–509. https://doi.org/10.1177/1465116510382463

Naurin, D., & Wallace, H. (2008). Unveiling the Council of the European Union. Palgrave Macmillan UK.

Ochs, C., Pittroff, F., Büttner, B., & Lamla, J. (2016). Governing the internet in the privacy arena. Internet Policy Review, 5(3), 1–13. https://doi.org/10.14763/2016.3.426

Panke, D. (2012). Lobbying institutional key players: How states seek to influence the European Commission, the Council Presidency and the European Parliament. Journal of Common Market Studies, 50, 129–150. https://doi.org/10.1111/j.1468-5965.2011.02211.x

Papadopoulou, M.-D., & Moustaka, E.-M. (2020). Copyright and the Press Publishers Right on the Internet: Evolutions and Perspectives. In T.-E. Synodinou, P. Jougleux, C. Markou, & T. Prastitou (Eds.), EU Internet Law in the Digital Era: Regulation and Enforcement (pp. 99–136). Springer International Publishing. https://doi.org/10.1007/978-3-030-25579-4_5

Perarnaud, C. (forthcoming). Why do negotiation processes matter? Informal capabilities as determinants of EU member states’ bargaining success in the Council of the EU [Doctoral dissertation]. Universitat Pompeu Fabra.

Pohle, J., Hösl, M., & Kniep, R. (2016). Analysing internet policy as a field of struggle. Internet Policy Review, 5(3). https://doi.org/10.14763/2016.3.412

Pohle, J., & Thiel, T. (2020). Digital sovereignty. Internet Policy Review, 9(4). https://doi.org/10.14763/2020.4.1532

Princen, S. (2009). Agenda-setting in the European Union. Palgrave Macmillan. https://doi.org/10.1057/9780230233966

Princen, S. (2012). The DEU approach to EU decision-making: A critical assessment. Journal of European Public Policy, 19(4), 623–634. https://doi.org/10.1080/13501763.2012.662039

Proposal for a Directive of the European Parliament and of the Council establishing the European Electronic Communications Code (Recast), COM/2016/0590 final.

Radu, R., & Chenou, J. M. (2015). Data control and digital regulatory space(s): Towards a new European approach. Internet Policy Review, 4(2). https://doi.org/10.14763/2015.2.370

Radu, R., Kettemann, M. C., Meyer, T., & Shahin, J. (2021). Normfare: Norm entrepreneurship in internet governance. Telecommunications Policy, 45(6), 102148. https://doi.org/10.1016/j.telpol.2021.102148

Regulation (EU) 2018/302 of the European Parliament and of the Council of 28 February 2018 on addressing unjustified geo-blocking and other forms of discrimination based on customers’ nationality, place of residence or place of establishment within the internal market and amending Regulations (EC) No 2006/2004 and (EU) 2017/2394 and Directive 2009/22/EC, (2018).

Regulation (EU) 2018/1807 of the European Parliament and of the Council of 14 November 2018 on a framework for the free flow of non-personal data in the European Union.

Renard, T. (2018). EU cyber partnerships: Assessing the EU strategic partnerships with third countries in the cyber domain. European Politics and Society, 19(3), 321–337. https://doi.org/10.1080/23745118.2018.1430720

Ruse, I. (2013). (Why) Do neighbours cooperate? Institutionalised coalitions and bargaining power in EU Council negotiations. Budrich UniPress.

Schroff, S., & Street, J. (2018). The politics of the Digital Single Market: Culture vs. Competition vs. Copyright. Information, Communication & Society, 21(10), 1305–1321. https://doi.org/10.1080/1369118X.2017.1309445

Sepos, A. (2005). The national coordination of EU policy: Organisational efficiency and European outcomes. Journal of European Integration, 27(2), 169–190. https://doi.org/10.1080/07036330500098227

Thomson, R. (2011). Resolving Controversy in the European Union. Cambridge University Press.

Thomson, R., Stokman, F. N., Achen, C. H., & König, T. (Eds.). (2006). The European Union decides. Cambridge University Press.

Timmers, P. (2018). The European Union’s cybersecurity industrial policy. Journal of Cyber Policy, 3(3), 363–384. https://doi.org/10.1080/23738871.2018.1562560

Wasserfallen, F., Leuffen, D., Kudrna, Z., & Degner, H. (2019). Analysing European Union decision-making during the Eurozone crisis with new data. European Union Politics, 20(1), 3–23. https://doi.org/10.1177/1465116518814954

Appendix 1: Extracts from the DEU III data set (Arregui & Perarnaud, 2021)

The following figures illustrate the distribution of the policy positions (axis X), and their intensity (axis Y), for all member states and EU institutions on the main controversial issues under study. The vertical arrow indicates where research respondents located the decision outcome on the policy scale.

Figure 1: Structure of the controversy on ‘derogations’ during the adoption process of the regulation on free flow of non-personal data (2017/0228/COD).

Figure 2: Structure of the controversy on the ‘value gap’ during the adoption process of the directive on copyright in the DSM (2016/0280/COD).

Figure 3: Structure of the controversy on the introduction of a ‘neighbouring right’ for press publishers during the adoption process of the directive on copyright in the DSM (2016/0280/COD).

Figure 4: Structure of the controversy on the inclusion of ‘OTTs’ during the adoption process of the European electronic communication code directive (2016/0288/COD).

Footnotes

1. The United Kingdom (UK) was still part of the European Union at the time, until formally leaving in January 2020.

2. Interview with MS representative, 19/09/2018, Brussels.

3. Interview with MS representative, 11/09/2018, Brussels.

4. Interview with MS representative, 20/06/2018, Brussels.

5. Interview with an EU official, 10/05/2019, Brussels.

6. Interview with MS representative, 20/09/2018, Brussels.

7. Interview with MS representative, 07/09/2018, Brussels.

8. Interview with Council Secretariat official, 11/09/2018, Brussels.

9. Germany adopted in 2013 an ancillary copyright law for press publishers (‘Presseverleger-Leistungsschutzrecht, Achtes Gesetz zur Änderung des Urheberrechtsgesetzes’, 7 May 2013).

10. For more, see: https://euractiv.eu/wp-content/uploads/sites/2/2017/06/170620_HOSGs-EUCO-digital-letter-FINAL.pdf

11. For more, see: https://images.politico.eu/wp-content/uploads/2019/03/Leadersjointletter_MarchEUCO_260219.pdf

12. Interview with MS representative, 18/09/2018, Brussels.

13. Interviews with MS representatives, 14/09/2018 and 18/09/2018, Brussels.

14. See for instance the Franco-German joint statement released in the context of a high-level ministerial meeting in October 2015: https://www.economie.gouv.fr/files/files/PDF/Declaration_conference_numerique_finale_FR.pdf

Pandemic platform governance: Mapping the global ecosystem of COVID-19 response apps

Introduction

On 11 March 2020, the World Health Organisation (WHO) officially declared the coronavirus (COVID-19) outbreak a global pandemic. By definition, a pandemic signals an ‘out of control’ contagion that threatens an entire population and implies a shift away from containment strategies towards extraordinary governance conditions (French et al., 2018). The WHO further stated: ‘it’s a crisis that will touch every sector, so every sector and every individual must be involved in the fight’ (WHO, 2020a, p. 3). Given the central role of platforms and apps in everyday life (van Dijck et al., 2018; Morris and Murray, 2018), this call to action would also necessarily involve working with big tech companies. Almost immediately, however, concerns were raised by civil society organisations and academic researchers about the development of apps to intervene in the COVID-19 crisis. These ranged from risks to civil liberties, given the apps’ potentially excessive surveillance capacities, to doubts about their actual effectiveness, particularly for digital contact tracing, among other concerns (Ada Lovelace Institute, 2020; Kitchin, 2020; Privacy International, 2020). For major platform companies such as Google and Apple, therefore, getting ‘involved in the fight’ would include making carefully negotiated decisions about how to regulate their emerging COVID-19 app ecosystems, and how to balance the concerns and priorities of multiple stakeholders.

Critical questions regarding how platforms govern stem in part from a recognition that, as intermediating or multi-sided techno-economic systems, platform companies like Apple and Google have begun to resemble political actors by utilising a layering of interrelated yet distinct mechanisms to control and exploit innovation (van Dijck et al., 2018; Klonick, 2017; Suzor, 2018). Platforms like app stores, for instance, use both technical and legal regulatory means to govern their relationship with third-party software developers, end-users, and other stakeholders (Eaton et al., 2011; Gillespie, 2015; Greene and Shilton, 2018; Tiwana et al., 2010), while navigating ‘external’ legal frameworks from national and supranational institutions (Gorwa, 2019). Moreover, from a public policy perspective, platform corporations are also increasingly understood as political actors beyond strictly the terms of market power, since they have become powerful gatekeepers of societal infrastructure that require new forms of regulatory engagement (Khan, 2018; Klonick, 2017; Suzor, 2018). This is especially the case due to their entanglement with public communication, education, and healthcare, among other domains. Indeed, as a recent European Commission report on platform power observes, ‘the COVID-19 crisis has made the societal and infrastructural role taken up by platforms even more apparent’ (Busch et al., 2021, p. 4).

The exceptional conditions of the pandemic have produced equally exceptional responses from platform companies concerning the development of COVID-19 apps. Their interventions have, accordingly, shaped the complex and dynamic relations between software developers, users, and governments during the crisis. This article presents an exploratory, systematic empirical analysis of this COVID-19 app ecosystem and draws attention to how layered platform governance and power relations have mediated the app response to the pandemic as a singular global emergency. We use the term ‘ecosystem’ to refer to a platform and the collection of (mobile) apps connected to it (Tiwana et al., 2010). Both the Android and iOS mobile platforms technically produce distinct COVID-19 app ecosystems with their own apps, despite being organisationally interconnected, since many developers produce apps for both Android and iOS.

The numerous socio-political risks and issues identified with COVID-19 apps suggest an obvious need for critical observation of this domain of platform activity (Rieder and Hofmann, 2020). Rapid research outputs have assessed how the powerful global technology sector ‘mobilised to seize the opportunity’ and how the pandemic ‘has reshaped how social, economic, and political power is created, exerted, and extended through technology’ (Taylor et al., 2020). Critical commentators, moreover, have drawn attention to how specific protocological interventions by platform companies, such as the development of the GAEN (Google/Apple Exposure Notification) system, demonstrated the significant asymmetries between national governments and the platform companies controlling these processes (Veale, 2020). Likewise, Milan et al. (2020) have explored the ‘technological reconfigurations in the datafied pandemic’ from the perspective of underrepresented communities. Efforts to broadly map, document and categorise COVID-19 apps, meanwhile, have mainly originated from computer science, with an interest in security and cryptography (Ahmed et al., 2020; Levy and Stewart, 2021; Samhi et al., 2020; Wang et al., 2020), or from public health research aiming to evaluate apps according to policy-related frameworks (Davalbhakta et al., 2020; Gasser et al., 2020). Other scoping studies have been conducted by the European Commission (Tsinaraki et al., 2020), yet such research has not systematically analysed the mediating role of platforms and app stores as sites of socio-technical innovation and control (Eaton et al., 2011). Albright’s study is notable for stressing how ‘hundreds of public health agencies and government communication channels simultaneously collapsed their efforts into exactly two tightly controlled commercial marketplaces: Apple’s iOS and Google’s Play stores’ (2020, n.p.). However, a comprehensive empirical analysis of the specific ways that platform governance has played out in the emergence of COVID-19 apps has largely been missing.

Drawing from multi-situated app studies (Dieter et al., 2019), we address this gap by empirically mapping COVID-19 apps across Google’s Play store and Apple’s App Store ecosystems. By analysing apps in multiple infrastructural situations, moreover, we draw attention to how platform governance is layered across different dimensions. Specifically, this includes: the algorithmic sorting of COVID-19 apps; the kinds of actors involved in app development; the types of app responses; the geographic distribution of the apps; the responsivity of their development (i.e., how quickly apps are released or updated); how developers frame their apps and address their users; and the technical composition of the apps themselves. While we recognise the above-mentioned importance of the GAEN protocol used to facilitate digital contact-tracing through mobile apps, it is not included in this study because it had not yet been widely implemented at the time of this analysis. 1 Similarly, while access to mobile device sensors (e.g., GPS sensors, Bluetooth adapters, etc.) is governed and controlled at the level of Google and Apple’s mobile operating systems (i.e., Android and iOS) as well as through app permissions requested from users, this study focused primarily on governance by the app stores. 2 Finally, we offer an assessment of our findings across these layers in relation to key themes in discussions of platform governance, particularly around the dominance and public legitimacy of platforms as private governors, and suggest some implications for policy considerations that stem from the eventfulness of global crisis-driven platform interventions.

App stores’ responses to the COVID-19 pandemic

On 14 March 2020, three days after the initial pandemic declaration, Apple announced significant restrictive changes to its App Store policies. Apple would now evaluate all apps developed for the coronavirus disease with a heightened degree of attention. Reiterating their mantra of the App Store as ‘a safe and trusted space’, Apple affirmed a commitment ‘to ensure data sources are reputable’ as ‘Communities around the world are depending on apps to be credible news sources’ (Apple Developer, 2020a, n.p.). This would mean only accepting authoritative apps ‘from recognized entities such as government organisations, health-focused NGOs, companies deeply credentialed in health issues, and medical or educational institutions’ (Apple Developer, 2020a, n.p.). For Apple, this also meant that ‘Entertainment or game apps with COVID-19 as their theme will not be allowed’ (Apple Developer, 2020a, n.p.). On the same day, Google published an editorial campaign page on Google Play titled ‘Coronavirus: Stay Informed’ with a list of recommended apps for being ‘informed and prepared’ about coronavirus, including apps from organisations like Centers for Disease Control and Prevention (CDC), American Red Cross, News360, the WHO, and Twitter (Google Play, 2020, n.p.). Shortly before this ‘Stay Informed’ campaign, Google/Alphabet CEO Sundar Pichai had outlined measures in place across their range of services to deal with the unique challenges of the crisis, stressing that Google Play’s policies already would prohibit app developers from ‘capitalizing on sensitive events’ and restrict the distribution of medical or health-related apps that are ‘misleading or potentially harmful’ (Pichai, 2020, n.p.).

As the pandemic spread and intensified throughout the year, both companies continued to update their editorial and policy positions for managing COVID-19 apps, while elaborating a set of regulatory mechanisms and developing new standards and techniques to control what had become an exceptional niche of software development activity. In May 2020, Google Play released its official developer guidelines for COVID-19 apps. In addition to setting Google up as an information matchmaker, ‘connecting users to authoritative information and services’, Google outlined economic limits on COVID-19 apps – that is, any apps that meet their eligibility requirements (Google Help, 2020b) – noting they could ‘not contain any monetisation mechanisms such as ads, in-app products, or in-app donations’ (Tolomei, 2020, n.p.). Similarly, it restricted content that contained ‘conspiracy theories, misleading claims, “miracle cures” or dangerous treatments, or any patently false or unverifiable information’ (Google Help, 2020b, n.p.). In an update to its App Store Review Guidelines, meanwhile, Apple required that apps providing services ‘in highly-regulated fields’, such as healthcare, ‘should be submitted by a legal entity that provides the services, and not by an individual developer’, and that medical apps ‘must clearly disclose data and methodology to support accuracy claims relating to health measurements’, alongside new policies for collecting health-related data (Apple Developer, 2020b, n.p.). To ensure this, Apple claims that ‘every app is reviewed by experts’ based on its App Store Review Guidelines (Apple Developer, 2020b, n.p.). Both stores also added new pandemic-related requirements to their general app store policies (e.g., around health and medical advice) and expedited the app review process so that COVID-19 apps could be approved more quickly (Google Help, 2020a, n.p.; Google Help, 2020b, n.p.; Tolomei, 2020, n.p.).

Such policy changes indicated a suspension of ‘business-as-usual’ for COVID-19 apps, as particular mechanisms around competition and monetisation – typically central to the app economy – were altered by the platform companies to support the emergence of a unique space of software development. Moreover, these policy changes are also implemented through different layers of technical agency, from unique modes of algorithmic curation (i.e., Google’s editorial filter) to new protocols for developers (e.g., GAEN). In this respect, they signal broader changes that ultimately extend throughout the platform infrastructure. In what follows, we map how these layered changes initiate a form of pandemic platform governance that unfolds through an interplay between a platform’s affordances for app development, the emergence of app ecosystems around platforms, and the platform’s regulatory mechanisms, which together simultaneously enable generativity and control (Eaton et al., 2011; Tiwana et al., 2010). That is, these governance mechanisms become central to the creation, evolution, and regulation of the COVID-19 app ecosystems that have emerged around Google’s Android and Apple’s iOS mobile platforms. In turn, they support the efforts of a heterogeneous network of third-party actors that aim to intervene in and manage the unfolding pandemic as a crisis – whether or not these aims were ultimately achieved.

Demarcating pandemic app ecosystems

Since app stores are the primary environments for distributing mobile apps, we can use them to locate, demarcate, and characterise collections of mobile apps (Dieter et al., 2019). Our research focused on the two most popular app stores worldwide, Google Play for Android apps and Apple’s App Store for iOS apps, 3 and queried their supported countries and locations for [COVID-19]-related search terms. We first compared the results and analysed the types of actors behind the development of COVID-19 apps, based on the developer listed for the app 4 and information on the app details page, and second compared what types of responses they offer to the pandemic by examining available information in the app stores, including developer name, developer identifier, app descriptions, app icons, app screenshots, and developer websites. In both cases, apps can belong to multiple categories, as they may offer various response types and may be developed in collaboration between different actors. Third, we examined app development responsivity across countries by retrieving all app version updates to account for the release dynamics in pandemic crisis responses. This responsiveness is enabled by the generative conditions provided by platforms, which allow for unprompted innovation (Zittrain, 2008), but it stresses the capacity of developers, rather than of platforms, to respond quickly in the face of the uncertainties of the pandemic. Fourth, we conducted a content analysis of the app descriptions to examine how developers rhetorically position their apps in terms of the techniques used, and how they engage with data and privacy issues. Finally, we examined the building blocks developers use in their app software packages to build COVID-19 apps. Due to the strict technical governance of iOS apps by Apple, we focused on the embedded software development kits (SDKs, i.e., collections of software libraries and tools commonly used by app developers) in Android apps. We used the AppBrain API to retrieve the embedded SDKs. 5 We collected all the data in mid-2020, when most countries already had one or more apps listed in the app stores. Google Play data were collected on 29 June (editorial subset) and on 16 July (non-editorial subset); Apple’s App Store data were collected on 20 July. Versions were retrospectively retrieved from App Annie.

In the initial phase of demarcating our data sets, we noticed that both stores have distinct logics and mechanisms for surfacing, organising, and ranking apps. We queried the 150 supported Google Play ‘countries’ and the 140 supported App Store ‘countries and regions’ for [COVID], [COVID-19], [corona], and related keywords using custom-built app store scrapers. 6 Apple's App Store returned ranked lists of 100 apps per country for our search queries, resulting in a total source set of 248 unique iOS apps. Google Play, however, did not produce such ranked lists. Instead, it rerouted all COVID-19 queries to a relatively small set of pre-selected apps in each local store.
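As a rough illustration of what such per-country querying involves on the iOS side, the sketch below uses Apple's public iTunes Search API rather than the custom scrapers referenced above (see footnote 6); the country and query lists are illustrative subsets, and the deduplication by app identifier mirrors the construction of a source set of unique apps.

```python
# Minimal sketch: querying Apple's public iTunes Search API per country
# storefront for COVID-19-related keywords. This illustrates the general
# approach only; it is not the custom scraper used in the study.
import requests

COUNTRIES = ["us", "de", "in", "br"]          # ISO country codes (illustrative subset)
QUERIES = ["covid", "covid-19", "corona"]     # search terms of the kind used in the study

def search_app_store(term: str, country: str, limit: int = 100) -> list[dict]:
    """Return ranked app results for one query in one country storefront."""
    resp = requests.get(
        "https://itunes.apple.com/search",
        params={"term": term, "country": country, "media": "software", "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

# Build a de-duplicated source set keyed by Apple's app identifier.
source_set: dict[int, dict] = {}
for country in COUNTRIES:
    for term in QUERIES:
        for app in search_app_store(term, country):
            source_set.setdefault(app["trackId"], app)

print(f"Unique iOS apps found: {len(source_set)}")
```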

Typically, app stores are organised through an algorithmic logic of sorting and ranking, complemented with an editorial logic of ‘best of’ and ‘editor’s choice’ lists (Dieter et al., 2019; Gillespie, 2014). For COVID-19-related search queries, Google Play solely relies on an editorial strategy (i.e., a search query filter) to surface a highly curated set of COVID-19 apps per country. A user searching for COVID-19-related terms is automatically redirected to Google’s editorially curated list of COVID-19 apps, and specifically those of the user’s home location only. We found that we could easily circumvent this editorial filter by exposing it to simple misspellings (e.g., [COVIID], [coronna], etc.), after which Google Play returned a more extensive list of relevant apps. Consequently, we captured two complementary source sets for Google Play: (a) an ‘editorial’ set of app responses per country with 247 unique apps, and (b) a ‘non-editorial’ set of 163 additional apps through misspellings. These 163 ‘additional’ apps were present in Google Play, but Google Play's editorial filter prevents these apps from surfacing for standard [COVID-19] search queries. In addition, there are also apps that are not included in our data set (e.g., the German luca response app) because they do not mention ‘coronavirus’, ‘COVID-19’, ‘pandemic’, or related keywords (Google Help, 2020b), despite being part of the pandemic response. While this is a limitation to our method, it also attests to the governance of this app ecosystem through controlling the terms used on app details pages (as only apps from recognised sources are eligible to use COVID-19-related keywords in their titles or descriptions).
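The demarcation of the two Google Play subsets can be expressed as simple set logic. The sketch below assumes a hypothetical search_play(term, country) function standing in for the custom Google Play scraper (Google Play offers no official search API); the misspelled terms are examples of the kind used to bypass the editorial filter.

```python
# Sketch of the editorial vs non-editorial demarcation logic, assuming a
# hypothetical search_play(term, country) that wraps a Google Play scraper
# and returns a list of package names for a query in a given local store.
STANDARD_TERMS = ["covid", "covid-19", "corona"]
MISSPELLED_TERMS = ["coviid", "coronna", "covvid"]   # simple misspellings bypass the filter

def collect(terms, country, search_play):
    """Union of package names returned for a list of query terms."""
    apps = set()
    for term in terms:
        apps.update(search_play(term, country))
    return apps

def demarcate(country, search_play):
    """Split results into the curated 'editorial' set and the additional apps
    that only surface via misspelled queries ('non-editorial' set)."""
    editorial = collect(STANDARD_TERMS, country, search_play)
    broad = collect(STANDARD_TERMS + MISSPELLED_TERMS, country, search_play)
    non_editorial = broad - editorial
    return editorial, non_editorial
```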

The global ecosystem of pandemic response apps

In what follows, we present results from our analysis of the [COVID-19]-related app ecosystems of Google Play (Android) and App Store (iOS).

Source sets and actor types identified

We first compared the app distribution in our data sets and the different actors involved in their production. Figure 1 shows the distribution of COVID-19 apps across both stores and further distinguishes between the editorial and non-editorial Google Play apps. Individual apps are colour-coded to represent actor types: government, civil society, health authority, academic, and private actors.

Figure 1: Demarcated source sets (Google Play and App Store). Light green: Android app ecosystem (Google Play source set); light blue: iOS app ecosystem (App Store source set). Illustration: authors and DensityDesign Lab.

The most striking finding is the large number of apps that feature in only one store. While the apps shared across stores (N=136) tend to be made by government actors, many government-made apps are only available in one store. About 70% (N=134) of government apps within the Google Play editorial set do not have an iOS equivalent in the App Store. While more fine-grained analysis is needed to understand these differences, one likely factor is the different market shares of the respective mobile operating systems and app stores across countries. To illustrate, Android has a 95% market dominance in India (Statcounter, 2021), and this country produced the highest number of Android COVID-19 apps overall, as we detail below. Another contributing factor is Android’s more permissive (open) architecture, as compared to Apple’s restrictive (closed) iOS architecture style and governance (Eaton et al., 2011); specifically, the more permissive use of sensors on Android devices, which are key to developing contact-tracing applications. The variance suggests divergent national strategies for implementing apps across platforms, which has consequences for users who may be presented with a different selection of COVID-19 apps based on their mobile operating system and corresponding app store.

There are also notable differences in the composition of actors developing COVID-19-related apps in each store (Figure 2). Government-produced apps are the most prevalent in both stores, positioning governments as key official and recognised sources outlined in the app stores’ policies. However, they are significantly more prevalent in Google Play (65%, N=267), and even more so in the Google Play editorial set (79%, N=195), compared to the App Store (48%, N=121). One outcome of Google’s editorial strategy is an increased presence and visibility of these government-made apps, yet curiously 42% of Google Play’s government-made apps did not make it into the editorial source set, indicating that being a government actor alone is not enough to make the editorial list.

In contrast, private actor apps are relatively more prevalent in the App Store (41%) than in Google Play (32%). The privately-developed iOS apps are predominantly from commercial actors offering healthcare solutions. While most also exist as Android apps, they do not surface in our Google Play data sets, signalling how Google and Apple have different criteria for retrieving health companies and organisations as official and recognised sources. Additionally, the COVID-19 app response conditions gave rise to governmental actors seeking app development collaborations with private actors for Android (N=26) and iOS (N=12) apps. These collaborations were often explicitly mentioned in the app description. Further, a small but significant number of apps have been developed with the involvement of academic researchers (e.g., Covid Symptom Study); civil society actors (e.g., Stopp Corona from the Austrian Red Cross, or the WHO apps); or health authorities (e.g., the French Covidom Patient to monitor COVID-19 patients after a hospital visit). While fewer in number, the presence of these other actor types contributes to the credibility and legitimacy of the apps and the ecosystem at large.

Figure 2: Actor types identified behind [COVID-19]-related apps (Android and iOS), based on the listed developer names and app descriptions. Note: apps can belong to multiple categories. Illustration: authors.
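The actor-type coding above was based on manual inspection of developer names and app details pages. As a purely illustrative sketch of how a first pass at such a classification could be automated, the snippet below matches assumed keyword lists against developer names and descriptions; the patterns and the default 'private' category are illustrative assumptions, not the coding scheme used in the study.

```python
# Illustrative first-pass classification of actor types from developer names
# and app descriptions. The keyword lists are assumptions for illustration;
# the study's coding relied on manual inspection of app details pages.
import re

ACTOR_PATTERNS = {
    "government":       r"\b(ministry|government|govt|federal|municipality)\b",
    "health authority": r"\b(health authority|public health|hospital|nhs)\b",
    "academic":         r"\b(university|institute|college|research)\b",
    "civil society":    r"\b(red cross|ngo|foundation|world health organization)\b",
}

def classify_actor(developer_name: str, description: str) -> set[str]:
    """Return the (possibly multiple) actor types matched for one app."""
    text = f"{developer_name} {description}".lower()
    matched = {actor for actor, pattern in ACTOR_PATTERNS.items() if re.search(pattern, text)}
    return matched or {"private"}   # default to private/commercial if nothing matches
```

In practice, such a heuristic would only pre-sort apps for manual verification, since developer names are often ambiguous.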

Geographical distribution of apps by country

After exploring the distribution of apps and actor types across platforms, we focused on their geographical distribution. The App Store’s ranked lists of apps are less country-specific and show a high overlap between countries and regions. Google Play, whose editorial filter surfaces only country-specific COVID-19 apps, allows for a more distinctive geographic image (Figure 3). In this store, we find that most countries offer a small selection of country-specific apps, coupled with two WHO apps (OpenWHO: Knowledge for Health Emergencies and WHO Info). As early as 15 February, a month before the pandemic was officially declared, the WHO stated that ‘we’re not just fighting an epidemic; we’re fighting an infodemic’ (Zarocostas, 2020, p. 676). To combat COVID-19 dis/misinformation, the WHO had begun working closely with more than 50 major platform companies, including Google, to implement solutions to fight the emerging infodemic (WHO, 2020b). This collaboration, initiated by the WHO, resulted in ensuring that ‘science-based health messages from the organisation or other official sources appear first when people search for information related to COVID-19’ on participating platforms (WHO, 2020b, n.p.), as we observe in Google Play with the surfacing of the WHO apps.

Figure 3: Geographical distribution of [COVID-19]-related Android apps by country or region. Illustration: authors and DensityDesign Lab.

Measured in terms of downloads, most countries have one primary, government-provided app among their country-specific apps. There are, however, notable exceptions. While India has one dominant government-provided app (Aarogya Setu), which was made mandatory for government and private sector employees during the early stages of the pandemic, India offers 61 apps in total, far more than any other country. Upon closer inspection, we found that India had a multi-tiered response, with many apps developed for specific regions by local governments (Bedi and Sinha, 2020). In contrast, countries such as Taiwan, Denmark, Iceland, Portugal, and Uruguay offered only one app (in addition to the WHO apps), all of which are government-provided. We also see countries where non-government apps are dominant or highly prevalent (Philippines, Thailand, Mauritius, Netherlands, Canada) or where the dominant app involves multiple actors in its production, including collaborations between governmental and private actors (Germany, Czechia, Austria, Kyrgyzstan). In some countries, we found multiple apps reflecting a regional or state-based app response, strategies with multiple apps with distinctive features, or competing (non-governmental) apps and strategies.

It is worth noting two final observations about geographical distribution. First, China is notably missing from our study because it banned Google Play. To battle the pandemic, China has relied on Health Code, a mini-programme developed by Alipay and WeChat, which generates a colour-based health code for travelling (Liang, 2020). Instead of developing new COVID-19 apps, China integrated Health Code into two dominant mobile payment apps. Second, the two WHO apps surface for every country, with one notable exception: the United States. Not only did the WHO apps not make it onto the editorial list, but direct search queries for these apps redirected to the US editorial list, where the WHO apps did not feature. In April 2020, President Trump halted funding to the WHO after criticising its handling of the COVID-19 pandemic. A few months later, in July, President Trump moved to officially withdraw the US from the WHO. The omission of the two WHO apps in the US may reflect broader geopolitical dynamics and suggests that the editorialisation of Google Play’s app ecosystem may not be conducted by Google alone. The editorial lists reflect a generally benevolent platform strategy to steer users to what is perceived to be the most appropriate apps; in this case, however, we see the editorial logic used for more overtly political purposes, amounting to a form of censorship (even though these WHO apps exist in the US store).

Pandemic response types

To understand the type of responses COVID-19 apps offer, we inquired into what kind of apps these actors built. This allows us to identify which response types are dominant, and which emerge with the distinct governance mechanisms of each store and the actors in each ecosystem.

While contact-tracing apps have received the most attention in news reporting, we found many different response types (Figure 4(a)). In both stores, 50–60% of all apps offer news and information on the pandemic, developed by various types of actors (Figure 4(b) and (c)). The prominence of authoritative information, updates and data may result from the WHO’s collaboration with platform companies to ‘immunize the public against misinformation’ by connecting users to official sources (WHO, 2020b).

At the time of the analysis, over 20% of apps engage with contact-tracing and exposure notification; these are typically built by government actors or in collaboration with private actors (Figure 4(b) and (c)). We find a diversity of potential surveillance forms beyond contact-tracing: over 48% of apps offer different kinds of symptom checkers or reporting tools, ranging from keeping a diary to the solicitation of medical and personal data. These are connected to private companies or academic research, or aligned with public healthcare. About 15% of all apps offer tools for remote healthcare developed by governmental and private actors.

Figure 4(a) to (c): Comparison of response types represented by [COVID-19]-related apps (Google Play vs App Store). Note: apps can belong to multiple categories. Illustration: authors.

We also found new categories compared to existing literature, such as mental health apps to deal with psychological pressures during the pandemic. We further found apps soliciting data for research studies, such as the German Corona-Datenspende, which lets users donate data from various devices to assist academic studies on COVID-19. When comparing the two stores, we find that networked medicine apps (for healthcare workers to communicate and interact within a system) are more prevalent in the App Store, while crisis communication, quarantine compliance, and informant apps (to report people breaking COVID-19 rules to authorities) are mostly or only available in Google Play.

Notably, quarantine compliance, informant, movement permit, and crisis communication apps are primarily built by government actors. We found apps facilitating crowd-sourced state surveillance in Argentina, Chile, and Russia. These ‘social monitoring’ apps enable reporting on the suspicious behaviour of others. In Bangladesh and India, governmental apps call on citizens to report ‘possibly affected people’ to ‘free the country’ as part of their ‘citizen responsibility’. In Lithuania and India, we observed the gamification of the pandemic response, where users can participate in daily health monitoring or symptom tracking to collect points that can be redeemed for rewards or discounts.

Developer responsivity

To analyse how rapidly the COVID-19 app ecosystem emerged and evolved, we examined how responsive app developers have been to the pandemic. We use the term responsivity as a measure or proxy for the dynamics of software updates during the crisis and its openness to unprompted innovation (Zittrain, 2008). Responsivity is defined by how quickly apps are released and is measured by the number of app updates per time interval. It captures a sense of how actively a country/developer is working on those apps and how invested countries are in the response that the app represents.
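As a minimal sketch of this measure, the snippet below counts releases per month for each app from a list of release dates (of the kind retrievable from a version-history source such as App Annie); the input structure and the example package name are assumptions for illustration.

```python
# Sketch: computing responsivity as the number of releases per month for each
# app, given release dates retrieved from a version-history source such as
# App Annie. The input structure and field names are assumptions.
from collections import Counter
from datetime import date

def responsivity(releases: dict[str, list[date]]) -> dict[str, Counter]:
    """Map each app to a Counter of (year, month) -> number of releases."""
    return {
        app: Counter((d.year, d.month) for d in dates)
        for app, dates in releases.items()
    }

# Illustrative input: one hypothetical government app with three releases.
releases = {
    "gov.example.covidapp": [date(2020, 3, 20), date(2020, 4, 2), date(2020, 4, 18)],
}
print(responsivity(releases)["gov.example.covidapp"])
# Counter({(2020, 4): 2, (2020, 3): 1})
```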

Figure 5 shows the Android apps per country plotted on a timeline, indicating when countries first introduced them in transparent circles and updated them in coloured squares. It shows that early app development commenced almost immediately after the official declaration of the pandemic with most countries launching their apps in March–April 2020. Interestingly, we found that several apps existed before the crisis started. These are primarily pre-existing e-government apps, medical apps for communicating with health professionals and apps providing healthcare information. While conforming with the new platform policies of Apple and Google that prioritise releases from official and recognised entities, these repurposed apps signal the developers’ agile response in using existing apps and app functionalities to deal with the crisis.

Figure 5: Responsivity of [COVID-19]-related app developers by country (Android only), 2013 – August 2020. Circles are initial releases (i.e., app launches); squares are any additional releases (i.e., app updates); scaled by the total number of releases. Data: App Annie. Illustration: authors and DensityDesign Lab.

Existing research on ‘app evolution’ has found that around 14% of apps are updated regularly on a bi-weekly basis (McIlroy et al., 2015), while developers abandon the vast majority of apps shortly after release (Tiwana, 2015). By contrast, surveying the average pace of updates for the COVID-19 apps per country demonstrates a high level of responsivity, particularly in India, Brazil, and the United Arab Emirates. Zooming into specific examples such as Colombia’s CoronApp (the most frequently updated app in our data) reveals how agile development was coordinated with ongoing government injunctions to handle the pandemic. Inspecting the changelogs (‘What’s New’) reveals recurring efforts to synchronise app functionalities with state emergency decrees.

From an inverse perspective on responsivity, a relative absence of development activity can also prompt further research into pandemic governance. Denmark and the UK show limited responsivity, which may indicate delays in developing COVID-19 apps, not least due to public controversies. In June 2020, Denmark’s data protection agency prohibited its app from processing personal data until further notice (Amnesty International, 2020). The app has since been relaunched after addressing multiple privacy issues. England and Wales, meanwhile, initially experimented with an app that used a centralised approach to data collection, but this was eventually abandoned (Sabbagh and Hern, 2020). The findings can thus also reflect cases of backlash and legal contestation, specifically related to data protection and privacy.

Finally, an essential aspect of pandemic app store governance is the degree to which the app stores actively enforce their policies by removing apps. While it is difficult to establish whether the developer or the app store removed an app, and for what reason, two large-scale analyses found that after 1.5–2 years, Google Play (Wang et al., 2018) and the App Store (Lin, 2021) had removed almost half of the apps in their stores. In our data set, Google Play removed only 7.5% (N=31) and the App Store only 6.0% (N=15) of all apps after eight months. This is even lower than the rate reported by Samhi et al. (2020) for COVID-19 apps, who observed that 15% of COVID-19-related apps had been removed in the first two weeks after data collection in June 2020. COVID-19 apps are subject to ‘an increased level of enforcement’ during the app review phase and are thus likely more thoroughly screened and removed sooner (Google Help, 2020b).

Discursive positioning of response apps

In the next step, we analysed how the apps discursively present themselves to users and how they engage with existing technology and data and privacy debates. Textual app descriptions address users in particular ways to inform them about the apps’ functionalities and use cases, and persuade users to download them. We examined whether apps explicitly mentioned specific techniques and data/privacy concerns in their descriptions, and measured their keyword frequency. The techniques listed in Figure 6(a) and (b) indicate how developers convey different COVID-19 app responses to users. It includes prominent terms like location, notification, track/trace, alongside implementation terms like GPS, Bluetooth, alert, smart, or platform, and even mentions of machine-learning algorithms and artificial intelligence to identify COVID-19 symptoms. We also found related terms such as video, chat, messaging, and bots – often used in relation to remote healthcare and diagnosis. Overall, the distribution of these terms is similar in both app ecosystems, suggesting a similar discourse around techniques is used.

Figure 6(a) and (b): Resonance of technique-related terms used in [COVID-19]-related app titles and/or descriptions (Android and iOS). Illustration: authors.

Next, we identified the presence of terms related to data/privacy solutions or concerns. Figure 7(a) and (b) show relatively high use of terms describing how apps deal with collected data, including anonymous, encrypted, sensitive, or locally stored data. We also find occasional claims that apps delete data, securely transmit data via HTTPS, or process data adhering to the EU General Data Protection Regulation (mostly European apps). As such, these apps express their compliance with the app stores’ policies, which have additional requirements for collecting and using personal or sensitive data to support COVID-19-related (research) efforts (Apple Developer, 2020b; Google Help, 2020b). Overall, we observe that the app response to the pandemic is primarily framed as a data/privacy-sensitive one. Half of iOS app descriptions (N=126) and 40% of Android apps (N=158) mention data/privacy terms, showing how app developers address their users’ potential privacy concerns. It bears emphasising, of course, that the mere presence of these discourses does not mean the operations of these apps conform to such stated capacities and values (Kuntsman, Miyake and Martin, 2019).

Figure 7(a) and (b): Resonance of data/privacy-related terms used in [COVID-19]-related app titles and/or descriptions (Android and iOS). Illustration: authors.
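A minimal sketch of this resonance measure, counting how many app descriptions mention each term at least once, is given below; the term lists are small subsets of those shown in Figures 6 and 7, and the simple substring matching is an illustrative simplification of the analysis.

```python
# Sketch: measuring the 'resonance' of technique and data/privacy terms, i.e.,
# the number of app descriptions mentioning each term at least once. Matching
# here is simple substring search; the term lists are illustrative subsets.
TECHNIQUE_TERMS = ["location", "notification", "bluetooth", "gps", "trace"]
PRIVACY_TERMS = ["anonymous", "encrypted", "gdpr", "sensitive", "privacy"]

def term_resonance(descriptions: list[str], terms: list[str]) -> dict[str, int]:
    """Count, for each term, the number of descriptions that contain it."""
    lowered = [d.lower() for d in descriptions]
    return {term: sum(term in d for d in lowered) for term in terms}

# Illustrative input: two hypothetical app descriptions.
descriptions = [
    "Exposure notification app using Bluetooth. All data is anonymous.",
    "Symptom checker with GPS location sharing, GDPR compliant.",
]
print(term_resonance(descriptions, PRIVACY_TERMS))
```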

Development of response apps

Finally, we inquired into the development of apps from a technical perspective, drawing attention to software development kits (SDKs) as the building blocks for mobile app development, enabling developers to implement particular frameworks and external functionalities. In this context, Google and Apple are essential players with their app stores as means of distribution, and their central role as infrastructure providers offering and controlling the means of production. They function as an ‘obligatory passage point’ for the production and distribution of apps in which their SDKs function as mechanisms of generativity and control, enabling platforms to govern the development of apps (Blanke and Pybus, 2020; Pybus and Coté, 2021; Tilson et al., 2012). This analysis, however, only focuses on Android apps due to Apple’s very restrictive technical governance of iOS apps.

For our 410 Android apps, we find 7,335 SDKs in total, with an average of 19 SDKs per app (28 apps returned no data from AppBrain). Seventy-nine apps contain no libraries at all, suggesting that they have not been built with standard development tools such as Android Studio and may have been coded from scratch, or perhaps that developers are cautious about implementing third-party code in this ecosystem. Among these are apps from the Indian, Nepalese, and Vietnamese governments. The high average number of SDKs shows developers’ reliance on these libraries for building apps and for accessing (third-party) functionality. Figure 8 shows that the majority of the embedded SDKs are development tools (98.4%, N=7,217), followed by advertising network libraries (1.06%, N=78) and social libraries (0.54%, N=40). The main development tools concern user interface components, networking, app development frameworks, Java utilities, databases, and analytics. We find very few advertising libraries due to Google’s policy restrictions on COVID-19 app monetisation. Interestingly, we found most of them in apps built by governments. For example, we detected Google’s AdMob SDK in government-made apps from India, Qatar, and Singapore, and the Outbrain SDK in government-made apps from Australia, Argentina, Italy, and the United Arab Emirates.

Figure 8: Software libraries embedded in [COVID-19]-related apps (Android only). Nodes are library tags (left), library types, their developers or owners, and their open-source availability (right); scaled by the number of occurrences. Highlighted are libraries developed/owned by Google (dark green). Illustration: authors and DensityDesign Lab.

When looking at the developers behind the SDKs, we find 134 unique actors. We observe a strong dependency on Google: 56% of all apps rely on at least one Google-owned SDK, and apps rely on 11 Google-owned SDKs on average (Figure 9). We further find 70 individual developers, most of them on GitHub, offering specific solutions such as data serialisation, data conversion and image cropping. Moreover, 81% of all apps use one or more open-source libraries, with the average app using 15 open-source SDKs. We find that Google dominates the means of production by owning the most libraries: not just the ‘core’ Android ones, but also those used to embed maps and app analytics. By focusing on the ownership of these libraries, we highlight the material conditions of platforms like Google and of apps as ‘service assemblages’ (Blanke and Pybus, 2020), which reveals some of the deeper ways in which pandemic platform governance, and platform power more generally, manifests.

Figure 9: Developers behind software libraries embedded in [COVID-19]-related apps by country or region (Android only). Circles (pies) are library developer distributions per country; horizontal axis: continents; vertical axis: % of open source libraries. Illustration: authors and DensityDesign Lab.
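As a minimal sketch of how such dependency figures can be derived, the snippet below computes the share of apps with at least one Google-owned SDK and, among those apps, the average number of Google-owned SDKs, from an app-to-SDK mapping of the kind retrievable via the AppBrain API; the owner lookup table and the exact basis of the average are illustrative assumptions.

```python
# Sketch: quantifying dependence on Google-owned SDKs from an app -> SDK
# mapping (as retrieved via the AppBrain API). The owner lookup table here
# is an illustrative assumption.
SDK_OWNER = {
    "com.google.firebase": "Google",
    "com.google.android.gms.maps": "Google",
    "com.squareup.okhttp3": "Square",
}

def google_dependency(app_sdks: dict[str, list[str]]) -> tuple[float, float]:
    """Return (share of apps with >=1 Google SDK, mean Google SDKs among those apps)."""
    counts = [
        sum(SDK_OWNER.get(sdk) == "Google" for sdk in sdks)
        for sdks in app_sdks.values()
    ]
    dependent = [c for c in counts if c > 0]
    share = len(dependent) / len(counts) if counts else 0.0
    mean = sum(dependent) / len(dependent) if dependent else 0.0
    return share, mean
```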

Conclusion: Governing the pandemic response

A key starting point for our analysis of COVID-19 apps was to go beyond the critical analysis of single apps within a national context. As we have shown, COVID-19 apps also need to be understood relationally, situated within infrastructures and embedded in the context of platform governance. Such an understanding recognises from the beginning that platform companies occupy a central role in app ecosystems, exercised through diverse mechanisms and agencies that operate across different layers (Gorwa, 2019), and mediated by the relationships between governments, citizens and other actors.

In this article, we demonstrated and discussed how the two dominant COVID-19 app ecosystems have taken shape during the pandemic through acts of exceptional platform governance. We observed unique techniques of control determining which apps make it into the stores, how they are positioned and accessed in the stores, who they are developed by, and what kinds of functionality they may have (including restrictions on ads and other economic features). Nevertheless, the platforms’ technical affordances have provided generative means for a diversity of responses to emerge, with individual apps negotiating these governing conditions as part of their development.

First, we observed a broad alignment of states, international organisations and platform companies in terms of the recognised need to act or get ‘involved in the fight’. While tensions have come to predominantly define the relations between platform companies and national governments in terms of competition, privacy, taxation or content moderation (e.g. Busch et al., 2021; Gorwa, 2019; Khan, 2018; Klonick, 2017; Suzor, 2018), the pandemic re-directs these powerful actors around a global threat in specific ways. This includes the related infodemic and the need to maintain the perception of legitimate authority during the roll-out of apps whose data-gathering powers may otherwise face strong resistance. Such tensions obviously remain, yet they are thrown into relief by the context of the crisis (as the omission of the WHO apps in the US demonstrates), which allows for a unique empirical mapping of the asymmetries, power relations and points of potential negotiation that shape platform governance more generally.

Secondly, pandemic platform governance has initially supported the production of app ecosystems which are partially ‘sandboxed’ from the economic activity that typically constitutes platform scenarios. Although COVID-19 apps without a doubt further entrench the economic dominance of platforms overall, during this early period we observe a heightening of their role as ‘regulatory intermediators’ within this specific niche by connecting citizens with government services and other authorities (Busch, 2020). In the case of Google, for instance, this intermediation is heavily steered through specialised modes of editorialisation. How this role changes over time, however, should remain subject to ongoing critical observation.

Third, this repurposing of platform infrastructures for ostensibly public ends significantly intensifies the intermediation of platform companies and governments. Platform companies increasingly act as a quasi-critical global infrastructure (yet with limited public oversight), organising and managing the emerging app ecosystem across national contexts while also providing the means of distribution (the stores) and production (the SDKs, but also, in this case, the GAEN protocols). For their part, national governments are cast in the role of complementors, developing apps under the regulatory conditions of the platform companies, often in partnership with other actors. How governments act in this novel role varies significantly in terms of the apps they develop (app responses), their partnerships (actor types), and ongoing activity (responsivity).

Fourth, several aspects of the COVID-19 app ecosystem help legitimise the production and distribution of apps to respond to the pandemic. Within the apps’ descriptions, we detect discourses around specific digital technologies, data and privacy; with apps signalling their technical competence, awareness of data protection issues and data policies. Whether the apps actually abide by these stated claims is another question, yet it is telling that both solutionist and privacy protection discourses are mobilised within this niche for purposes of persuasion and reassurance. How these kinds of discourses might contribute to further blurring distinctions between figures of the user and citizen is a point for further inquiry.

Finally, within the context of the pandemic, mobile app platforms have facilitated heterogeneous configurations of governance, while still systematically shaping the activities of complementors. That is, despite the tightening of platform control under pandemic conditions, there exists a wide diversity of pandemic app responses that can raise different issues within distinct spheres of sovereign governance and authority. Thus, with platform companies acting as facilitators, we see a diverse range of national strategies, exceptions and outliers. While the operations of pandemic platform governance are global in scale, they can nevertheless produce scenarios where Argentinian citizens are snitching on each other through informant apps, United Kingdom citizens participate in academic symptom studies, and US citizens are uniquely denied access to the WHO information apps.

Pandemic platform governance, therefore, foregrounds how platforms have adopted and negotiated their new role as a marketplace serving commercial interests in ordinary times and additional public interests in exceptional circumstances. While precedents for this role exist in e-government and e-health apps and services, the pandemic has accelerated and intensified these dynamics. By mapping the ecosystems of available COVID-19 apps, therefore, we learn how mobile platforms have responded to the global pandemic and infodemic with additional extraordinary measures to demarcate public interest niches from the wider commercial environment of the app store. The question for policymakers and citizens is how this new governance might continue to evolve in future now that platforms have come to play a key role in mediating public values and global governmental responses to the pandemic.


Acknowledgements

Authors listed in alphabetical order. Thanks to Jason Chao and Stijn Peeters for their assistance in developing the Google Play and App Store scrapers for this study, and to Jason for uploading the Android application package (APK) files to the Internet Archive’s ‘COVID-19_Apps’ collection. Thanks also to Giovanni Lombardi, Angeles Briones, Gabriele Colombo, and Matteo Bettini (DensityDesign Lab) for their assistance with some of the graphics included in this article. Further, we thank those who participated in our data sprints during the 2020 Digital Methods Summer School (University of Amsterdam) and the ‘Exploring COVID-19 app ecologies’ (Aarhus University) and ‘Mapping the COVID-19 App Space’ workshops (Centre for Digital Inquiry). Finally, we thank the editors and reviewers, Michael Veale, Kaspar Rosager Ludvigsen, Angela Daly, and Frédéric Dubois, whose constructive and attentive comments greatly improved the article.

Data availability

The data that support the findings of this study are openly available in the Open Science Framework (OSF) at https://doi.org/10.17605/osf.io/wq3dr. Additionally, the available Android application package (APK) files of the COVID-19 Android apps covered in this study are openly available and preserved in the ‘COVID-19_Apps’ collection of the Internet Archive at https://archive.org/details/COVID-19_Apps.

References

Ada Lovelace Institute. (2020). Exit through the App Store? [Rapid evidence review]. Ada Lovelace Institute. https://www.adalovelaceinstitute.org/news/exit-through-the-app-store-uk-technology-transition-covid-19-crisis/

Ahmed, N., Michelin, R. A., Xue, W., Ruj, S., Malaney, R., Kanhere, S. S., Seneviratne, A., Hu, W., Janicke, H., & Jha, S. K. (2020). A Survey of COVID-19 Contact Tracing Apps. IEEE Access, 8, 134577–134601. https://doi.org/10.1109/ACCESS.2020.3010226

Albright, J. (2020, October 28). The Pandemic App Ecosystem: Investigating 493 Covid-Related iOS Apps across 98 Countries [Medium Post]. Jonathan Albright. https://d1gi.medium.com/the-pandemic-app-ecosystem-investigating-493-covid-related-ios-apps-across-98-countries-cdca305b99da

Amnesty International. (2020, June 15). Norway halts COVID-19 contact tracing app a major win for privacy. Amnesty International, News. https://www.amnesty.org/en/latest/news/2020/06/norway-covid19-contact-tracing-app-privacy-win/

Bedi, P., & Sinha, A. (2020). A Survey of Covid 19 Apps Launched by State Governments in India. The Centre for Internet and Society. https://cis-india.org/internet-governance/stategovtcovidapps-pdf

Blanke, T., & Pybus, J. (2020). The Material Conditions of Platforms: Monopolization Through Decentralization. Social Media + Society, 6(4). https://doi.org/10.1177/2056305120971632

Busch, C. (2020). Self-regulation and regulatory intermediation in the platform economy. In M. C. Gamito & H.-W. Micklitz (Eds.), The role of the EU in transnational legal ordering: Standards, contracts and codes (pp. 115–134). Edward Elgar Publishing.

Busch, C., Graef, I., Hofmann, J., & Gawer, A. (2021). Uncovering blindspots in the policy debate on platform power. European Commission Expert Group for the Observatory on the Online Platform Economy.

Davalbhakta, S., Advani, S., Kumar, S., Agarwal, V., Bhoyar, S., Fedirko, E., Misra, D. P., Goel, A., Gupta, L., & Agarwal, V. (2020). A Systematic Review of Smartphone Applications Available for Corona Virus Disease 2019 (COVID19) and the Assessment of their Quality Using the Mobile Application Rating Scale (MARS). Journal of Medical Systems, 44(9), 164. https://doi.org/10.1007/s10916-020-01633-3

Apple Developer. (2020). Ensuring the Credibility of Health & Safety Information. News and Updates. https://developer.apple.com/news/?id=03142020a

Apple Developer. (2021). App Store Review Guidelines. https://developer.apple.com/app-store/review/guidelines/

Dieter, M., Gerlitz, C., Helmond, A., Tkacz, N., van der Vlist, F. N., & Weltevrede, E. (2019). Multi-Situated App Studies: Methods and Propositions. Social Media + Society, 5(2), 1–15. https://doi.org/10.1177/2056305119846486

Eaton, B., Elaluf-Calderwood, S., Sorensen, C., & Yoo, Y. (2011). Dynamic structures of control and generativity in digital ecosystem service innovation: The cases of the Apple and Google mobile app stores. London School of Economics and Political Science. http://eprints.lse.ac.uk/47436/

French, M., Mykhalovskiy, E., & Lamothe, C. (2018). Epidemics, Pandemics, and Outbreaks. In A. J. Treviño (Ed.), The Cambridge Handbook of Social Problems (pp. 59–78). Cambridge University Press. https://doi.org/10.1017/9781108550710.005

Gasser, U., Ienca, M., Scheibner, J., Sleigh, J., & Vayena, E. (2020). Digital tools against COVID-19: Taxonomy, ethical challenges, and navigation aid. The Lancet Digital Health, 2(8), 425–434. https://doi.org/10.1016/S2589-7500(20)30137-0

Gillespie, T. (2014). The relevance of algorithms. In T. Gillespie, P. J. Boczkowski, & K. A. Foot (Eds.), Media technologies: Essays on communication, materiality, and society (pp. 167–194). MIT Press.

Gillespie, T. (2015). Platforms Intervene. Social Media + Society, 1(1). https://doi.org/10.1177/2056305115580479

Google Help. (2021). Requirements for coronavirus disease 2019 (COVID-19) apps. Play Console Help. https://support.google.com/googleplay/android-developer/answer/9889712?hl=en

Google Play. (2020, March 14). Coronavirus: Stay informed [App store]. Google Play. https://play.google.com/store/apps/topic?id=campaign_editorial_3003109_crisis_medical_outbreak_apps_cep

Gorwa, R. (2019). What is platform governance? Information, Communication & Society, 22(6), 854–871. https://doi.org/10.1080/1369118X.2019.1573914

Greene, D., & Shilton, K. (2018). Platform Privacies: Governance, Collaboration, and the Different Meanings of “Privacy” in iOS and Android Development. New Media & Society, 20(4), 1640–1657. https://doi.org/10.1177/1461444817702397

Google Help. (2021). Inappropriate Content. Policy Center. https://support.google.com/googleplay/android-developer/answer/9878810

Khan, L. M. (2018). Sources of tech platform power. Georgetown Law Technology Review, 2(2), 325–334. https://georgetownlawtechreview.org/sources-of-tech-platform-power/GLTR-07-2018/

Kitchin, R. (2020). Civil Liberties or Public Health, or Civil Liberties and Public Health? Using Surveillance Technologies to Tackle the Spread of COVID-19. Space and Polity, 24(3), 362–381. https://doi.org/10.1080/13562576.2020.177058

Klonick, K. (2018). The New governors: The People, rules, and processes governing online speech. Harvard Law Review, 131, 1598–1670. https://harvardlawreview.org/2018/04/the-new-governors-the-people-rules-and-processes-governing-online-speech/

Kuntsman, A., Miyake, E., & Martin, S. (2019). Re-thinking digital health: Data, appisation and the (im)possibility of ‘opting out’. Digital Health, 5, 1–16. https://doi.org/10.1177/2055207619880671

Levy, B., & Stewart, M. (2021). The evolving ecosystem of COVID-19 contact tracing applications [Preprint]. ArXiv. http://arxiv.org/abs/2103.10585

Liang, F. (2020). COVID-19 and Health Code: How Digital Platforms Tackle the Pandemic in China. Social Media + Society, 6(3). https://doi.org/10.1177/2056305120947657

Lin, F. (2021). Demystifying Removed Apps in iOS App Store [Preprint]. ArXiv. http://arxiv.org/abs/2101.05100

McIlroy, S., Ali, N., & Hassan, A. E. (2016). Fresh apps: An empirical study of frequently-updated mobile apps in the Google Play store. Empirical Software Engineering, 21(3), 1346–1370. https://doi.org/10.1007/s10664-015-9388-2

Milan, S., Treré, E., & Masiero, S. (Eds.). (2020). COVID-19 from the Margins: Pandemic Invisibilities, Policies and Resistance in the Datafied Society. Institute of Network Cultures. https://networkcultures.org/blog/publication/covid-19-from-the-margins-pandemic-invisibilities-policies-and-resistance-in-the-datafied-society/

Morris, J. W., & Murray, S. (2018). Appified: Culture in the Age of Apps. University of Michigan Press.

Pichai, S. (2020, March 6). Coronavirus: How we’re helping [Blog post]. The Keyword. https://blog.google/inside-google/company-announcements/coronavirus-covid19-response/

Privacy International. (2021). Fighting the Global Covid-19 Power-Grab. Privacy International Campaigns. https://privacyinternational.org/campaigns/fighting-global-covid-19-power-grab

Pybus, J., & Coté, M. (2021). Did you give permission? Datafication in the mobile ecosystem. Information, Communication & Society. https://doi.org/10.1080/1369118X.2021.1877771

Rieder, B., & Hofmann, J. (2020). Towards platform observability. Internet Policy Review, 9(4). https://doi.org/10.14763/2020.4.1535

Sabbagh, D., & Hern, A. (2020). UK abandons contact-tracing app for Apple and Google model. The Guardian. https://www.theguardian.com/world/2020/jun/18/uk-poised-to-abandon-coronavirus-app-in-favour-of-apple-and-google-models

Samhi, J., Allix, K., Bissyandé, T. F., & Klein, J. (2021). A First Look at Android Applications in Google Play related to Covid-19 [Preprint]. ArXiv. http://arxiv.org/abs/2006.11002

Statcounter. (2021). Mobile Operating System Market Share India. Statcounter Global Stats. https://gs.statcounter.com/os-market-share/mobile/india

Suzor, N. (2018). Digital constitutionalism: Using the rule of law to evaluate the legitimacy of governance by platforms. Social Media + Society, 4(3). https://doi.org/10.1177/2056305118787812

Taylor, L., Sharma, G., Martin, A., & Jameson, S. (Eds.). (2020). Data Justice and COVID-19: Global Perspectives. Meatspace Press. https://shop.meatspacepress.com/product/data-justice-and-covid-19-global-perspectives

Tilson, D., Sorensen, C., & Lyytinen, K. (2012). Change and Control Paradoxes in Mobile Infrastructure Innovation: The Android and iOS Mobile Operating Systems Cases. 2012 45th Hawaii International Conference on System Sciences, 1324–1333. https://doi.org/10.1109/HICSS.2012.149

Tiwana, A. (2015). Platform Desertion by App Developers. Journal of Management Information Systems, 32(4), 40–77. https://doi.org/10.1080/07421222.2015.1138365

Tiwana, A., Konsynski, B., & Bush, A. A. (2010). Research Commentary—Platform Evolution: Coevolution of Platform Architecture, Governance, and Environmental Dynamics. Information Systems Research, 21(4), 675–687. https://doi.org/10.1287/isre.1100.0323

Tolomei, S. (2020, April 6). Google Play updates and information: Resources for developers. [Blog post]. Android Developers Blog. https://android-developers.googleblog.com/2020/04/google-play-updates-and-information.html

Tsinaraki, C., Mitton, I., Dalla Benetta, A., Micheli, M., Kotsev, A., Minghini, M., Hernandez, L., Spinelli, F., & Schade, S. (2020). Analysing mobile apps that emerged to fight the COVID-19 crisis (JRC 123209). European Commission. https://ec.europa.eu/jrc/communities/en/community/citizensdata/document/analysing-mobile-apps-emerged-fight-covid-19-crisis

van Dijck, J., Poell, T., & Waal, M. (2018). The Platform Society (Vol. 1). Oxford University Press. https://doi.org/10.1093/oso/9780190889760.001.0001

Veale, M. (2020). Sovereignty, privacy and contact tracing protocols. In L. Taylor, G. Sharma, A. Martin, & S. Jameson (Eds.), Data Justice and COVID-19: Global Perspectives (pp. 34–39). Meatspace Press.

Wang, H., Li, H., Li, L., Guo, Y., & Xu, G. (2018). Why are Android Apps Removed From Google Play? A Large-Scale Empirical Study. 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 231–242.

Wang, L., He, R., Wang, H., Xia, P., Li, Y., Wu, L., Zhou, Y., Luo, X., Guo, Y., & Xu, G. (2020). Beyond the Virus: A First Look at Coronavirus-themed Mobile Malware [Preprint]. ArXiv. http://arxiv.org/abs/2005.14619

WHO. (2020a). Virtual press conference on COVID-19. World Health Organization. https://www.who.int/docs/default-source/coronaviruse/transcripts/who-audio-emergencies-coronavirus-press-conference-full-and-final-11mar2020.pdf

WHO. (2020b, August 25). Immunizing the public against misinformation. World Health Organization. https://www.who.int/news-room/feature-stories/detail/immunizing-the-public-against-misinformation

Zarocostas, J. (2020). How to fight an infodemic. The Lancet, 395(10225), 676. https://doi.org/10.1016/S0140-6736(20)30461-X

Zittrain, J. (2008). The Future of the Internet—And How to Stop It. Yale University Press.

Footnotes

1. While the Google-Apple Exposure Notification (GAEN) protocols were introduced on 20 May 2020, we found that only 8 out of the 410 Android apps in our source set included the GAEN API in their AndroidManifest.xml file by November.

2. While not discussed in this article, the collected data and information about the permissions requested by each app is openly available in the Open Science Framework (OSF).

3. Google’s mobile platform Android has a 71.18% market share worldwide, followed by Apple’s iOS with 28.19% (Statcounter, 2021). As a consequence of the platform companies tightly connecting their app stores to their mobile operating systems, Google’s Play store (except in China) and Apple’s App Store have become the key distribution channels of apps worldwide.

4. For the purposes of this article, we interpret the ‘developer name’ listed on the app store details page as the actor responsible for the development of that app. However, the actor listed as the ‘developer’ on the app details page is not necessarily, or not always, the same as the developer of that app (e.g. when the ‘developer’ merely listed the app in the app store, without having developed it).

5. AppBrain API specification, https://www.appbrain.com/info/help/api/specification.html

6. The app store scrapers have been developed by the App Studies and Digital Methods Initiatives and are available at: http://appstudies.org/tools/.
