Platforms’ promises to researchers: first reports missing the baseline

by John Albert

An initial analysis shows that platforms have done little to “empower the research community” despite promises made last June under the EU’s revamped Code of Practice on Disinformation.

16 February 2023

Last Thursday, dozens of tech companies—including major social media platforms like Facebook, YouTube, TikTok, and Twitter—delivered the first baseline reports meant to detail their efforts to combat disinformation in the EU.

The reports, which are publicly available, speak to platforms' progress on a litany of commitments made in June under the framework of the 2022 Code of Practice on Disinformation. These include, among other things, promises on demonetizing disinformation, political ad transparency, working with fact-checkers, and facilitating researchers' access to data.

Although the code is voluntary, its commitments may become binding for the largest platforms once the European Commission begins monitoring and enforcing them this summer under the Digital Services Act (DSA). For violating the DSA, Big Tech platforms with more than 45 million users in the EU could face fines of up to 6 percent of their global revenues. Fulfilling their commitments under the code of practice is one way to ease their compliance obligations.

So how did they do? Not so well, according to NGOs and fact-checking groups closely involved with and monitoring the Code. Twitter has been widely criticized for submitting an incomplete report, and progress from the other major platforms has been decidedly mixed.

Specifically when it comes to providing researchers with better access to data, critics and European Commission Vice President for Values and Transparency Věra Jourová agree that the reports leave much to be desired.

There are some positive indicators in this area—for example, the major platforms (minus Twitter) have said they are actively supporting the European Digital Media Observatory (EDMO) in developing an independent body to enable secure data sharing with independent researchers. But generally speaking, platforms have thus far reported precious few concrete measures to back up their promises.

This initial analysis goes deeper into the reports to examine what the disclosures from Facebook, TikTok, YouTube, and Twitter say, and don’t say, about their efforts to “empower the research community.”

Facebook & Instagram

A breadth of transparency efforts can’t cover up CrowdTangle’s slow death

In describing its commitments to researchers, Facebook’s report is thin on providing concrete facts and figures. In that sense, this “baseline” report does not offer much of a baseline beyond pointing to existing programs and rephrasing disclosures that are already public.

That said, Facebook’s report does cover a relatively broad assortment of existing researcher programs and transparency reports that the company publishes quarterly. It also points to its Ad Library API, which allows researchers to search for ad-related data, and to a research platform that shares data with select independent researchers to study “coordinated inauthentic behavior,” i.e. online campaigns that seek to manipulate public debate.

There is one glaring omission in Facebook’s report, however.

The report follows the structure of the code, which contains 44 commitments and 128 specific measures. In Commitment 26, signatories promise APIs that allow researchers to search through and analyze public data in real time—which is already possible through Facebook’s CrowdTangle, a data analytics tool which researchers use to monitor disinformation circulating on Facebook and Instagram.

What’s missing from Facebook’s report, then, are any concrete measures on the uptake, swiftness, and acceptance levels of CrowdTangle—which is clear when looking at this graphic:

*The table is probably empty because the company has stopped accepting new researchers to CrowdTangle*

Why the empty table? It’s likely because CrowdTangle is being phased out, as was reported last year, and the program has long stopped accepting new applicants.

Once CrowdTangle goes away, the question is how Facebook plans to replace it in line with its commitments in the code, and whether a diverse range of researchers (including civil society researchers and journalists) will be able to access such tools. The company has launched a new Researcher API, but this is currently available only to academics who are hand-selected by Facebook.

TikTok
The Chinese-owned company has a shorter track record on transparency—and an extra burden of proof

As the youngest of the major social media platforms to sign onto the code, TikTok doesn’t have a long history of transparency—and it’s a sensitive subject for the company, given fears of Chinese state influence corrupting the app (TikTok is owned by ByteDance, a Chinese company). Egregiously, TikTok admitted in November that it had used its app to spy on reporters in an effort to discover the source of leaks from inside the company.

TikTok has nevertheless made splashy announcements in an effort to assuage skeptics. The company plans to open a Transparency Center for moderation and data practices in Los Angeles and promises to make its source code available for inspection (the center has already been toured by select journalists; a full opening is said to be delayed due to the pandemic). TikTok has also announced a forthcoming API for researchers in line with Commitment 26 of the code of practice (unlike Facebook and Twitter, which already have established researcher APIs, TikTok at least has an excuse for not being able to report baseline metrics on its API as it hasn’t yet launched).

In Commitment 28, signatories also promise to support good faith research into disinformation that involves their services. To that end, TikTok points to its European Safety and Advisory Council formed in 2021, its representatives' participation in research-focused events, and general engagement with the research community.

A crucial aspect of this commitment is platforms' promise to not prohibit or discourage public interest research into disinformation on their platforms, and to not take adversarial action against researcher users or accounts that undertake or participate in good faith research. Fulfilling this commitment would make amends for past behavior by Facebook in muzzling public interest research from the NYU Ad Observatory and AlgorithmWatch. We’ll see how TikTok responds to AlgorithmWatch’s recently launched data donation project to study TikTok’s “For You” recommender feed.

Google Search & YouTube
Google shows the other platforms how a box-ticking exercise is done

What jumps out in Google’s report is an apparent willingness to follow the script set out by the code of practice, not only reporting on its various commitments but providing at least some general baseline metrics.

First, Google describes its publicly available research tools, including Google Trends and Fact Check Explorer, and provides actual (if unverified) data on the number of users of each tool, broken down by EU member state. The company also shares some approximate numbers regarding the uptake of its YouTube Research Program, which launched in July 2022. This program promises to expand access to global YouTube video metadata via an API for academic researchers, but with fewer than 15 unique researchers having accessed the API thus far, YouTube has minimal granular data to report on it.

Google goes on to cite its philanthropic efforts to illustrate its commitment to supporting good faith research into disinformation. This includes the company’s €25 million donation to help launch the European Media and Information Fund, which provides grants to projects aimed at increasing media literacy and fighting online disinformation. Google claims to have no role in assessing the grant applications. But the research community should exercise caution when accepting the company’s money, lest Google encroach on research independence the way it ensnares journalism.

None of this is to say that Google submitted an exemplary report when it comes to researcher access to data. The company still has much work to do in expanding its new flagship researcher program, for example, and providing more granular information regarding its research tools. But this baseline report practically shines compared to the other major platforms given how low the bar was set—with none lower than the final platform on this list.

Twitter
Once a leader in platform research, the company’s programs have gone AWOL under Elon Musk

Twitter’s first implementation report is woefully incomplete. It is also obtuse with regard to its commitments to public interest research. The company touts itself as an industry leader in access to data, yet under Elon Musk it has in fact either dismantled or cut off access to virtually all of its research programs.

For example, Twitter highlights the Twitter Moderation Research Consortium, which shared data with vetted external researchers to shed light on state-backed information campaigns conducted on the platform. But the program went dark in November once the team managing the project was gutted.

Twitter also links to previous research the company has done on issues such as political bias in algorithmic content recommendations. The problem is that most of the team responsible for this research—Twitter’s so-called “Ethical AI” team—was also fired in November.

Finally, Twitter notes its longstanding API program, which researchers can apply to access. Yet the company started paywalling its API as of 13 February, placing a new financial burden on public interest researchers and effectively cutting off access to those unable to afford it.

Restricting its API is another step backward for Twitter’s compliance with the code of practice. The move has compelled over 100 research organizations and over 500 individuals to sign an open letter calling on Twitter to ensure its API remains easily accessible for journalists, academics, and civil society, and calling on policymakers to require that this vital infrastructure remain easily accessible.

Based on this evidence, anything Twitter says about its “commitment” to public interest research and its preparedness to comply with EU regulations could aptly be described as a “bad joke.”

Second reports due in July should face greater scrutiny

One of the goals of the code of practice is to ensure that independent researchers are empowered, rather than embattled, by social media platforms. That’s because allowing researchers to analyze platform data is one of the best ways to understand the spread of online disinformation and other potential risks that social media pose to individuals and society.

Yet platforms' approaches to independent researchers—as evidenced by their actions, not their words—have been uneven at best, and outright hostile at worst. Based on these first baseline reports, it seems platforms have made little progress in their approaches (in the case of Twitter and Facebook’s CrowdTangle, there has been notable regression vis-a-vis researchers).

We should expect a fuller indication of major platforms' efforts in July, when the second round of implementation reports comes due. The data provided in those reports should be subject to a much higher degree of scrutiny, including from independent auditors and researchers under the DSA’s data access framework.

Whether platforms are held to their commitments under the code of practice will depend on enforcement by the European Commission, which has the power to issue major fines against the largest platforms for failing to comply with certain commitments in the code and, by extension, the DSA. The Commission may start investigations into potential DSA violations as early as September 2023.
