Putting Meaningful Transparency at the Heart of the Digital Services Act

Why Data Access for Research Matters & How we can Make it Happen

Position

30 October 2020

#publicsphere

As organizations committed to upholding democratic values and fundamental rights, we see an urgent need to oblige internet platforms to meet a higher standard of transparency. We propose that the EU should maintain the key principles of the limited liability regime outlined in the e-Commerce Directive and introduce binding transparency frameworks that enable privacy-respecting access to data for scrutiny. We welcome the European Commission’s proposed Digital Services Act (DSA), and urge EU policymakers to use this “Modern Rulebook for Digital Services” in Europe as an opportunity to hold platforms to account.

Digital Society Deliberates Online – but we Know Little About the Gatekeepers

In the spring of 2020, AlgorithmWatch found clear indications that Instagram’s newsfeed algorithm was nudging female content creators to reveal more skin by making semi-nude photos more visible in their followers’ feeds [1]. When we confronted Facebook, Instagram’s parent company, with our findings, they declined to answer our questions and accused us of “misunderstanding how Instagram works.” They also rejected any request to back up this statement with their own data, making it impossible to validate their claims. Instagram’s response to the AlgorithmWatch investigation is emblematic of a much deeper problem: content-hosting intermediaries [2] rely on extremely intrusive data collection practices to power their opaque algorithmic systems, but when independent watchdogs try to understand the effects of these practices, data for public-interest scrutiny is a scarce resource.

At the same time, as the COVID-19 crisis has shown, algorithmically driven communications intermediaries play a central and ever-expanding role in modern society. They are inextricably linked to how we coordinate remotely at school or work, how we find and consume information, and how we organize our social movements or exercise key democratic rights. Large parts of our media and communications infrastructure are governed by algorithmic systems, and we need better tools to understand how these systems are impacting our democracies.

The Stakes Are High — Shaping the Digital Information Ecosystem for the Future

Public authorities bound by fundamental rights cannot ban “low-quality” content or demand its suppression as long as it is legal [3]. This is a fundamental basis of freedom of expression and must not be undermined. Nevertheless, it is crucial to acknowledge that algorithmically driven content curation, on social media platforms in particular, can introduce a host of risks that affect the functioning of the communication processes necessary for democracy [4].

From Germany’s Interstate Media Treaty to France’s Avia Law to the EU’s most recent proposal on preventing the dissemination of terrorist content online, previous approaches to tackling such risks have been overlapping, incoherent, or otherwise fragmented [5].

The Digital Services Act (DSA), meant to overhaul the e-Commerce Directive, is an opportunity for a clean slate. The limited liability framework outlined in the e-Commerce Directive is the right approach to dealing with illegal user-generated content, but it must be enhanced and refined. Instead of introducing measures that oblige or encourage platforms to proactively monitor speech, the existing limited liability regime should be strengthened through more rigorous and clearer procedural standards [6] for notice and action and, most importantly, through transparency frameworks that empower independent third parties to hold platforms to account.

Self-Regulatory Transparency does not Go Far Enough

Improved user-facing transparency is necessary [7] and can offer much-needed insight into the personalized results presented to individual users, but it cannot provide insight into the collective influence of platforms. To assess and monitor how platforms apply their community standards, or to address collective societal risks like disinformation, polarization, and bias, we must rely on evidence from independent third parties. But for the journalists, academics, and civil society actors tasked with understanding and scrutinizing opaque algorithmic “black boxes”, such evidence is difficult to generate. Independent researchers encounter tremendous challenges in accessing reliable data from platforms [8], and in recent years platforms have further restricted access to their public Application Programming Interfaces (APIs), making it nearly impossible to hold companies accountable for illegal or unethical behavior [9].

For researchers and watchdogs, it has become clear that self-regulatory transparency frameworks are “incomplete, ineffective, unmethodical, and unreliable” [10]. The concentration of data in the hands of a few private companies has a deep impact on the overall health of the digital public sphere.

Towards Accountability in Platform Governance: Empowering the Digital Fourth Estate

To this end, we applaud the EU Parliament Committees’ emphasis on the need to audit algorithms used in content moderation and curation [11]. However, meaningful monitoring of automated decision-making (ADM) systems also requires scrutiny of system inputs and outputs, a task best suited to independent academics, journalists, and civil society actors. We welcome the establishment of the new EU Digital Media Observatory and the Commission’s efforts to provide researchers with tools to better understand disinformation [12], but we are convinced that the EU should move beyond its siloed approach to tackling specific online harms. For this reason, we propose that the DSA should introduce comprehensive data access frameworks that empower civil society and pave the way for true accountability.

Drawing on best practices in privacy-respecting data-sharing governance models, these data access frameworks should include:

1. Binding rules outlining who can directly access data or apply for access, what specific data can be accessed [13], and how and by whom that data is to be gathered and checked before disclosure.

  • Disclosure obligations should differentiate between dominant players and smaller intermediaries, as defined according to indices of annual turnover, market share, user base, and/or gatekeeping impact. We propose that the scope of these recommendations be limited to dominant platforms [14].
  • Disclosure obligations should be based on the technical functionalities of the platform service, rather than on more ambiguous and politically charged conceptions of harm such as ‘disinformation’, ‘political advertising’, and ‘hate speech’.
  • Technical features might include: high-level aggregate audience metrics; advertising [15] and micro-targeting; search features; feeds, ranking, and recommendation; and content moderation (including removal, but also other measures such as demonetization or fact-checking). A hypothetical sketch of such feature-based disclosure categories follows this list.

2. An EU institution with a clear legal mandate to enable access to data and to enforce transparency obligations across the EU27 in cases of non-compliance.

  • The institution should act as an arbiter when deciding on requests for confidentiality from the disclosing party (based on, e.g., intellectual property or data protection law). Barriers to gaining access to predefined data should be minimized. The institution should maintain relevant access infrastructures such as virtual secure operating environments, public databases, websites, and forums. It should also be tasked with pre-processing disclosed data and with periodically auditing disclosing parties to verify the accuracy of disclosures.
  • Furthermore, the mandate should comprise collaboration with multiple EU and national-level competent authorities, such as data protection authorities, cyber-security agencies, and media regulators, to minimize the risk of capture or negligence. The legal framework should explicitly outline the different levels of oversight and how they interact. Because trust in government bodies differs widely across Member States, establishing tiered safeguards and guarantees of independence is critical. To prevent conflicts of competence and minimize the politicization of the framework, it is advisable to limit the institution to the role of a ‘transparency facilitator.’
  • The institution should proactively support relevant stakeholders. The freedom of scientific research must be explicitly enshrined. In this spirit, the proposed institution must also proactively facilitate uptake by providing tools and know-how to stakeholders, including journalists, regulators, academics, and civil society. The institution might also explore engaging the broader European public in the development of research agendas (see, e.g., lessons from the Dutch National Research Agenda [16]) or incubating pilot projects that connect users and researchers through fiduciary models. Independent centers of expertise on AI/ADM at the national level, as proposed by AlgorithmWatch and Access Now [17], could play a key role in this regard and support building the capacity of existing regulators, government, and industry bodies.

3. Provisions that ensure data collection is privacy-respecting and GDPR-compliant.

  • Because of the sensitive nature of certain types of data, there are legitimate concerns regarding threats to user privacy. The Cambridge Analytica scandal should serve as a cautionary tale: any misuse of data by researchers would severely undermine the integrity of the transparency framework.
  • It is imperative that the institution uphold the GDPR’s data protection principles, including (a) lawfulness, fairness, and transparency; (b) purpose limitation; (c) data minimization; (d) accuracy; (e) storage limitation; and (f) integrity and confidentiality.
  • The proposed data access entity should take inspiration from existing institutions like Findata, the Finnish health data framework [18], which integrates the necessary safeguards (both technical and procedural) for data subjects, including online rights-management systems that allow citizens to exercise their data subject rights easily.
  • Granular data access should only be enabled within a closed virtual environment controlled by the independent body. As was the case with the Findata framework, it is advisable for the Commission to consider testing key components of the framework in pilot phases. A minimal illustration of one such privacy-preserving safeguard also follows this list.
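
To make recommendation 1 concrete, the following is a minimal sketch, in Python, of how disclosure categories keyed to technical functionality (rather than to harm categories) might be modeled. All class names, fields, and values here are hypothetical illustrations, not part of the proposal itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TechnicalFeature(Enum):
    """Disclosure categories keyed to platform functionality, not to harm type."""
    AUDIENCE_METRICS = "high-level aggregate audience metrics"
    ADVERTISING = "advertising and micro-targeting"
    SEARCH = "search features"
    RANKING = "feeds, ranking and recommendation"
    MODERATION = "content moderation"


@dataclass
class DisclosureObligation:
    feature: TechnicalFeature
    applies_to_dominant_only: bool  # obligations differentiated by market power
    reporting_interval_days: int    # how often disclosures must be refreshed
    fields: List[str]               # the specific data points to be disclosed


# Hypothetical example: a content moderation obligation covering removals
# as well as softer measures such as demonetization and fact-checking labels.
moderation_obligation = DisclosureObligation(
    feature=TechnicalFeature.MODERATION,
    applies_to_dominant_only=True,
    reporting_interval_days=30,
    fields=["removals", "demonetizations", "fact_check_labels"],
)
```

One advantage of this structure: keying obligations to functionality keeps the rules stable even as contested definitions of ‘disinformation’ or ‘hate speech’ shift.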
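
Similarly, here is a minimal sketch of one data-minimization safeguard the proposed institution could apply before disclosure: aggregate statistics are released only when each group is large enough that no individual user can be singled out (a simple k-anonymity-style threshold). The threshold value, record format, and function name are illustrative assumptions, not a prescribed mechanism.

```python
from collections import Counter
from typing import Dict, Iterable

# Illustrative threshold: any aggregate bucket with fewer records than this
# is suppressed, so released statistics cannot single out individual users.
MIN_GROUP_SIZE = 100


def minimized_aggregate(records: Iterable[dict], group_key: str) -> Dict[str, int]:
    """Count records per group, dropping groups below the disclosure threshold."""
    counts = Counter(record[group_key] for record in records)
    return {group: n for group, n in counts.items() if n >= MIN_GROUP_SIZE}


# Example: per-country counts of posts that received a fact-check label.
records = [{"country": "DE"}] * 250 + [{"country": "MT"}] * 7
print(minimized_aggregate(records, "country"))  # {'DE': 250}; 'MT' is suppressed
```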

A Critical Piece of the Puzzle

A healthy democracy depends on a strong and healthy public sphere—and most importantly, a strong and healthy fourth estate. When journalists, academics, and civil society are free to challenge and scrutinize power, powerful actors are held in check, and the public can make more informed decisions. While there is no silver bullet for all of the challenges linked to the platform economy, we are convinced that the proposals outlined above serve as critical baseline demands for improved accountability in the digital public sphere.

Resources

These recommendations are based on the findings of three studies commissioned from the Mainz Media Institute and the Institute for Information Law at the University of Amsterdam. To read these reports in full:

Are Algorithms a Threat to Democracy?
The Rise of Intermediaries: A Challenge for Public Discourse
Professor Dr. Birgit Stark and Daniel Stegmann, M.A.
with Melanie Magin, Assoc. Prof. & Dr. Pascal Jürgens

Designing platform governance:
A normative perspective on needs, strategies, and tools to regulate intermediaries
Prof. Dr. Matthias Cornils

Operationalizing Research Access in Platform Governance
What to Learn from Other Industries?
Dr. Jef Ausloos, Paddy Leerssen, & Pim ten Thije

Signatories

Organizations

Access Now
AlgorithmWatch
Alternatif Bilişim
AMO (Asociace pro mezinarodni otazky) – Association for International Affairs
ApTI Romania
Center for Media, Data and Society at the Central European University’s School of Public Policy
Centrum Cyfrowe
Civil Liberties Union for Europe
Defend Democracy
Democracy Reporting International
European Center for Not-for-Profit Law
European Digital Rights
European Policy Centre
Global Forum for Media Development
HateAid
Homo Digitalis
IT-Pol
Lithuanian Journalism Centre
Media Development Centre Bulgaria
The Open Society European Policy Institute
Panoptykon Foundation
SHARE Foundation Belgrade
Stiftung Neue Verantwortung
Xnet

Individuals

Dr. Marco T. Bastos,
Ad Astra Fellow - University College Dublin,
School of Information and Communication Studies

Prof. Dr. Lance Bennett,
Ruddick C. Lawrence Professor of Communication,
Emeritus, University of Washington, Seattle

Dr. Alan Borning,
Professor Emeritus, Paul G. Allen School of Computer Science & Engineering,
University of Washington

Prof. Dr. Joanna J. Bryson,
Professor of Ethics and Technology at the Hertie School of Governance

Prof. Dr. Claes de Vreese,
Professor and Chair of Political Communication
at the Amsterdam School of Communication Research, University of Amsterdam

Prof. Dr. Natali Helberger,
Distinguished University Professor of Law and Digital Technology,
with a special focus on AI at the University of Amsterdam

Prof. Dr. Slava Jankin,
Director, Data Science Lab at the Hertie School of Governance

Dr. Christian Katzenbach,
Senior Researcher, Alexander von Humboldt Institute
for Internet and Society, Berlin

Prof. Dr. Sophie Lecheler,
Department of Communication, University of Vienna

Prof. Dr. Simon Munzert,
Hertie School of Governance

Prof. Dr. Barbara Pfetsch,
Institute for Media and Communication Studies, Freie Universität Berlin

Prof. Dr. Nathaniel Persily,
James B. McClatchy Professor of Law, Stanford Law School

Prof. Dr. Cornelius Puschmann,
ZeMKI, University of Bremen

Prof. Dr. Daniela Stockmann,
Professor of Digital Governance, Hertie School Centre for Digital Governance

Prof. Dr. Linnet Taylor,
Tilburg Institute for Law, Technology and Society

Prof. Dr. Rebekah Tromble,
Director, Institute for Data, Democracy & Politics, George Washington University

Prof. Dr. Cristian Vaccari,
Professor of Political Communication and Co-Director,
Centre for Research in Communication and Culture, Loughborough University


1. Judith Duportail et al. (2020): Undress or fail: Instagram’s algorithm strong-arms users into showing skin.

2. We use the term “content hosting intermediary” to refer to online services that provide third-party content, including, for example, user-generated contributions as well as media content in any form (text, image, audio, or video).

3. Matthias Cornils et al. (2020): Designing Platform Governance: A Normative Perspective on Regulatory Needs, Strategies, and Tools to Enhance the Information Function of Intermediaries. At the same time, international human rights law (e.g. Art. 10(2) ECHR) imposes very strict requirements on the conditions under which states can restrict freedom of expression and information, notably the principles of legality, legitimacy, and necessity and proportionality.

4. Birgit Stark et al. (2020): Are Algorithms a Threat to Democracy? The Rise of Intermediaries: A Challenge for Public Discourse.

5. Matthias Cornils et al. (2020): Designing Platform Governance: A Normative Perspective on Regulatory Needs, Strategies, and Tools to Enhance the Information Function of Intermediaries.

6. Such standards should include complaint management and redress mechanisms, including put-back obligations. When embedded in a co-regulatory approach, independent dispute settlement bodies such as those proposed by EDRi can play an important complementary role in ensuring that users’ fundamental rights are upheld. For further details, see: EDRi (2020): Platform Regulation Done Right. EDRi Position Paper on the EU Digital Services Act.

7. For further elaboration on recommendations for user-facing transparency, see: Panoptykon Foundation (2020): Panoptykon Foundation’s submission to the consultation on the Digital Services Act Package.

8. Madeline Brady (2020): Lessons Learned: Social Media Monitoring During Elections: Case Studies from Five EU Elections 2019-2020 (Democracy Reporting International).

9. Researchers depend on private data-sharing partnerships and privileged access to platform data, which further entrenches platform power and creates chilling effects among researchers, who fear losing access to platform data. Researchers who depend on what little data is available complain about its poor quality: frequently, data is not available in machine-readable format, or it is clearly inaccurate. The consequences are twofold: the impact of platforms on society remains severely understudied at a systemic level, and the research that does exist skews heavily towards the most transparent platforms, causing substantial distortions. For further details, see Nikolas Kayser-Bril (2020): For researchers, accessing data is one thing. Assessing its quality another; and Under the Twitter streetlight: How data scarcity distorts research.

10. For further elaboration on these recommendations, see Jef Ausloos et al. (2020): Operationalizing Research Access in Platform Governance: What to Learn from Other Industries? The problems of inconsistent reporting and of a lack of relevant data for adequate monitoring are also clearly stated by the European Regulators Group for Audiovisual Media Services (ERGA) in relation to the enforcement of the European Commission’s Code of Practice on Disinformation.

11. European Parliament Committee on Legal Affairs (2020): Draft report with recommendations to the Commission on a Digital Services Act: adapting commercial and civil law rules for commercial entities operating online, PE650.529v01-00, 22 April 2020 (JURI report); European Parliament Committee on the Internal Market and Consumer Protection (2020): Draft report with recommendations to the Commission on a Digital Services Act: Improving the functioning of the Single Market, PE648.474v02-00, 24 April 2020 (IMCO report).

12. European Commission (2019): Commission Launches Call to Create the European Digital Media Observatory.

13. It is essential that disclosure rules remain flexible and subject to updates and revisions by the proposed independent institution.

14. Additionally, these recommendations should be viewed separately from any ex-ante legislation that will apply on the basis of increased ‘data power’ that enables dominant platforms to engage in ‘gatekeeping’ practices that hamper new entrants and fair competition in the market. These recommendations are intended to apply prior to the activation of any ex-ante legislation and to remain in effect regardless of the obligations that a future ex-ante instrument will enforce. For nuanced criteria that characterize dominant platforms/intermediaries, see EDRi (2020): Platform Regulation Done Right. EDRi Position Paper on the EU Digital Services Act, p. 16.

15. For more details on proposed disclosure rules in the area of advertising see the European Partnership for Democracy’s joint statement on Universal Advertising Transparency by Default.

16. Beatrice de Graaf et al. (2017): The Dutch National Research Agenda in Perspective: A Reflection on Research and Science Policy in Practice.

17. AlgorithmWatch (2020): Our response to the European Commission’s consultation on AI.

18. See the Findata case study in Jef Ausloos et al. (2020): Operationalizing Research Access in Platform Governance: What to Learn from Other Industries?

Read more on our policy & advocacy work on the Digital Services Act.