“Time to evaluate COVID-19 contact-tracing apps,” wrote Adam Kucharski, Luca Ferretti, Chris Wymant, Christophe Fraser, and other influential researchers in a correspondence article on the Nature Medicine website in February 2021. Only a “rigorous assessment” of the effectiveness of digital contact tracing “allows public-health benefits to be weighed against unwanted effects for individual people and society,” they added. “Stringent evaluation is needed to develop contact-tracing apps into an accepted and ethical tool for future outbreaks of other infectious diseases”.
More than one year has passed since the first introduction of digital contact tracing apps in several countries around the world – some months later, the MIT Technology Review’s ‘Covid Tracing Tracker’ listed around 50 globally, 22 of them in the European Union, according to the ‘Knowledge Hub’ maintained by human rights watchdog Liberties. And yet, even if some initial data have been produced, and these early findings can and should be discussed, evidence around the real-world effectiveness of digital contact tracing apps is still contradictory, a review by AlgorithmWatch of the available literature and app usage data has found.
While it should by now have become apparent that these apps should not be used under a “tech-solutionist” framing – i.e., that the mere deployment of contact tracing technology won’t “solve” the pandemic by itself – findings on their effectiveness diverge significantly from country to country, and, at times, from study to study.
Methods
To start with, we don’t even have a generally shared and recognized definition of the “effectiveness” of a digital contact tracing app. According to the literature review provided by George Grekousis and Ye Liu in ‘Digital contact tracing, community uptake, and proximity awareness technology to fight COVID-19: a systematic review’, it is possible to conceive of the effectiveness of digital contact tracing as both “the number of contacts identified through digital contact tracing”, and – as they themselves do – “the actual effect of digital contact tracing on reducing the effective reproductive
Uptake and effectiveness in a complicated relationship
Some of the findings are also counter-intuitive. A link between uptake and potential efficacy is by now prominent in the literature: the more downloads (and actual usage), the better the protection. For months, this translated into a narrative, popular in mainstream media, according to which only apps that reached the download threshold of 60% of the population highlighted in a debate-defining Oxford study would be effective in combating the spread of COVID-19. However, this 60% threshold was never claimed in the study. As the authors themselves note in an article on TechReview – titled, ‘No, coronavirus apps don’t need 60% adoption to be effective’ – these apps start “to have a protective effect” at “much lower levels”.
This is good news, as none of the apps used in Europe have so far been downloaded by 60% of the respective country’s population. Actually, data gathered by Liberties show that many didn’t even come close at the time of writing: Italy (17%), Spain (19%), Poland (4%), and Croatia (2%) among them. Others came closer, though: Germany (33%), England and Wales (36%), Denmark (38%). One would, therefore, expect a much greater impact in England, for example, than in Italy.
And while in general – as we’ll see later in more detail – this seems to be the case according to available evidence, there are noticeable exceptions. In the Liberties’ ‘Knowledge Hub’, for example, the downloads chart is topped by Ireland and Finland, with 49% and 45% respectively. However, the number of downloads hasn’t necessarily translated into effective contact tracing responses in either country.
Medical experts consulted by the Iltalehti newspaper in Finland claim that the “Koronavilkku” app has not produced many benefits so far. And, in Ireland, a paper commissioned by the Oireachtas (the National Parliament of the Republic of Ireland) found that “even though the up-take is higher than [in] other parts of the world, the effects of the app may be minor”.
A paper by Trinity College Dublin researchers, Stephen Farrell and Doug Leith, further analyzed six months of data from the Irish “Covidtracker” app, from October 2020 to April 2021, only to find that,
“Over that period only 25% of the expected number of tested-positive app users uploaded keys, which are required for that user’s app instance to fulfill its primary function as part of the contact tracing system. For recent months we see only 15% of expected uploads”.
The authors go on to set out a stark critique of the overall digital contact tracing endeavor:
“This data seems to further indicate that “technology-first” solutions may be ineffective and may be yet another indication that the overall process followed worldwide with BLE (Bluetooth Low Energy)-based COVID tracking apps was flawed, and could usefully be contrasted with the time-proven “test-before-deployment” strategy followed by those involved in vaccine development”.
The deployment of digital contact tracing apps before any actual evidence of their effectiveness was, in other words, problematic, according to Farrell and Leith.
Literature reviews find no conclusive evidence of the effectiveness of contact tracing apps
If we had to wait for the evidence, however, we probably wouldn’t have even deployed these apps yet. The only general conclusion an informed observer can draw from the available literature reviews is that the evidence we currently have is inconclusive as to the actual effectiveness of these apps and their contribution to the fight against COVID-19.
For example, Grekousis and Liu analyzed eight modeling studies that compare digital contact tracing to manual contact tracing. The results were contradictory: “Four,” they wrote, “reported that digital contact tracing alone (i.e., without manual tracing) significantly reduces infections or Reff (effective reproductive number) more than manual tracing”. However, “two reported only a marginal improvement of Reff”, and two reported, “manual contact tracing reduces Reff or infections more than digital tracing alone”. Their conclusion? “The above heterogeneous outcomes indicate that more evidence is needed to decide on the effectiveness of each strategy when applied alone”.
The study, supported by the National Natural Science Foundation of China and published online in May 2021, also specifies that “no empirical studies are included as none exists”, and that their review included “English literature only”. However, their conclusion is arguably shared by many in the field: “Policy-wise, the take-home message of this review is that, the success of digital contact tracing depends on a complex interplay of app uptake in the community, proximity awareness technologies, and public’s trust”.
Similarly, the data-based, “comprehensive” analysis contained in ‘COVID-19 digital contact tracing applications and techniques: A review post initial deployments’, authored by a team of researchers from Pakistan, Australia, Sweden and China, concludes
The authors add another dimension potentially relevant for assessing the efficacy of the apps, arguing that “from a global perspective, digital contact tracing may be suitable for developed countries. However, in developing and underdeveloped countries, digital contact tracing frameworks may not achieve their full potential”. This has crucial implications in terms of social justice and fairness, especially when coupled with findings from perception surveys such as the one conducted in France. It concluded that “the most economically precarious people, who are more at risk of SARS-CoV-2, are also the most reluctant to use a contact tracing app”. This could mean that optimal uptake strategies “should be combined with a reduction in inequalities by acting on structural determinants”, addressing issues related to equal opportunities in terms of digital connectivity and literacy.
Other studies even concluded that there may be no evidence that digital contact tracing ever worked, including for previous outbreaks. In ‘Automated and partly automated contact tracing: a systematic review to inform the control of COVID-19’, published by The Lancet in November 2020, author Isobel Braithwaite and colleagues reviewed “automated or partly automated contact tracing” tools analyzed in studies on SARS, MERS, Ebola, and SARS-CoV-2 viruses and published between January 1, 2000, and April 14, 2020. And again, “no empirical evidence of the effectiveness of automated contact tracing (regarding contacts identified or transmission reduction) was identified”, which means that “large-scale manual contact tracing is therefore still key in most contexts”.
These results seem to be part of a larger trend affecting health apps more generally. As Jessica Morley, John Powell, and Luciano Floridi put it after a scoping study concerning apps on Apple’s App Store, “the results show that the evidence available to support the claims made by the health apps analysed is often unavailable or of questionable quality”.
At the same time, recent data and studies produced specifically in response to the COVID-19 pandemic complicate the situation by providing some early – if contested – evidence around the positive contributions that contact tracing apps can bring about when used in conjunction with traditional methods and a broader public health strategy. Therefore, in the following sections, we will further analyze the main findings and trends highlighted in the available literature on specific apps and their data.
Decentralized exposure notification apps work, studies claim
A number of studies have been frequently quoted in the media in recent months as providing early evidence that “decentralized” exposure notification apps – i.e., the ones adopting the GAEN (Google/Apple Exposure Notification) protocol developed by Apple and Google – work in helping contact tracing efforts contain COVID-19 infections.
Wymant, Ferretti, Fraser, and other colleagues investigated the effectiveness of contact tracing apps deployed in England and Wales in a paper published in Nature in May 2021, titled ‘The epidemiological impact of the NHS COVID-19 app’. And the findings they illustrate are significant.
Digital contact tracing, they write, reaches a larger number of contacts on average than manual contact tracing (4.4 vs. 1.8), with the app believed to be especially useful in identifying contacts outside an individual’s household.
Based on the estimates obtained through the two approaches adopted in the paper (one based on
The researchers also provided interesting results concerning the relationship between uptake and effectiveness. According to their calculations, a 1% increase in app use translates into a 0.8-2.3% reduction in infections, meaning that “on average, each confirmed COVID-19-positive individual who consented to notification of their contacts through the app prevented one new case”.
The authors recognize that there are some limitations to their study. First of all, that it is an observational study: “no randomized or systematic experiment resulted in different rates of app uptake in different locations”. Also, even though their statistical approach tried to include “adjustments for confounders”, researchers admit that “it is still possible that changes in app use over time and across geographies reflect changes in other interventions, and that our analysis incorrectly attributes the effects to the app”.
Another critique came from Florian Gallwitz, professor of media informatics at the Technical University of Nuremberg Georg Simon Ohm. According to Gallwitz, the study does not provide an answer to “the question: what percentage of the people – who were warned by the app – were actually infected?”. This number cannot be derived from the NHS app for structural reasons, so the authors had to estimate it through “numerous model assumptions and complex statistical calculations”.
They could have just used figures from the Danish app as a guide, Gallwitz contends, as “a suitable basis for calculating the number sought is directly recorded and published” in that system. However, the numbers recorded in Denmark are far lower than those provided in the authors’ estimates. “With an average of only 0.9 percent, the actual rate of positive tests after the app warning in the past few months has been many times lower than the estimate from Oxford (i.e. by Wymant et al.) and in the same order of magnitude one could probably achieve with a dice”. Given that this discrepancy is not explained by Wymant and colleagues, and that comparability issues plague the possibility of generalizing obtained results, Gallwitz concludes that “there is still no credible evidence that the app warnings are related to actual infection events”.
Positive findings have also been claimed in a Nature study (‘A population-based controlled experiment assessing the epidemiological impact of digital contact tracing’) that describes a 4-week experiment held in San Sebastián de la Gomera, the capital of La Gomera in the Canary Islands, during July 2020. There, researchers simulated four outbreaks of COVID-19 among the 10,000 inhabitants, tracked their behavior, and then computed 7 key performance indicators to evaluate the Spanish “Radar COVID” app’s effectiveness. These indicators included: “adoption”, “adherence” (i.e., whether or not the app was used 10 days after download), “compliance” (whether or not codes were entered into the app), and overall detection rates (“the average number of close-contacts of a given infected individual which are notified by the app”).
The authors claim that the results obtained provide “much needed empirical evidence on the usefulness of DCT (digital contact tracing) as a complementary nationwide epidemiological tool for the containment of COVID-19”, with the app allegedly detecting “6.3 close-contacts per primary simulated infection”, a 33% adoption rate and “relatively high adherence and compliance”.
However, the researchers themselves caution about generalizing their findings, as integration with public health policies remains crucial. It is impossible to reach a conclusion on long-term adherence to the app during a 4-week experiment and, most of all, infections are merely simulated: “since people in La Gomera are aware of this fact, we cannot extract any ‘behavioural’ conclusion of this study – e.g. we could not conclude whether those people that have downloaded the app are more or less
A much more heated debate took place in Switzerland – home to some of the researchers who most actively contributed to the global debate around the architecture of contact tracing apps, shifting many countries to adopt “decentralized”, “privacy-preserving” exposure notification systems, an approach that is widely regarded as a best practice example.
On the one hand, some studies provide evidence that the ‘SwissCovid’ app made a positive contribution to the fight against the pandemic. On the other, substantial critiques have been put forward, among others, by Belgian mathematician Paul-Olivier Dehaye and by Serge Vaudenay, head of the Security and Cryptography Laboratory (LASEC) within the School of Computer and Communication Sciences at the Swiss Federal Institute of Technology in Lausanne (EPFL) – though, interestingly, this critique has not received much attention in the media and has not reverberated widely in the public debate.
Controversies concern, for example, a preprint paper published in December 2020, ‘Digital proximity tracing app notifications lead to faster quarantine in non-household contacts: results from the Zurich SARS-CoV-2 Cohort Study’. In it, the authors surveyed a population-based sample (393 index cases and 261 close contacts) about the use of the Swiss app. After statistical analysis, they concluded that they had “found evidence for a possible time advantage through the app in non-household settings, with app-notified contacts entering quarantine on average one day earlier than those not notified by the app”.
This strong claim is allegedly backed by the fact that “8 (19%) of 43 app notified contacts received the app warning before being reached by MCT (manual contact tracing)”. The authors themselves recognize that “more in-depth studies are clearly warranted” to corroborate their claim, but nonetheless, they argue that their findings “constitute the first evidence that DPT (digital proximity tracing, in which no localization data are collected) may be effective in reaching close contacts faster than MCT. Albeit small, such a time difference may be relevant in reducing transmission in the population”.
According to Dehaye, however, “these statements unfortunately do not reflect the data accurately”. For example, if the app has indeed reached only 8 of the 43 notified contacts before manual contact tracing did, this then “quite directly shows that the app notifications are slower than manual contact tracing, the exact opposite of the main claim of the paper, as relayed in its title”, wrote Dehaye in a long and detailed critical analysis published on Medium in March 2021 (‘Evidence of methodological bias in analyzing contact tracing app efficacy’).
Also, Dehaye noted that SwissCovid could only have made a difference “if someone had received a phone call from contact tracing authorities, decided to ignore it, and then received a notification from the app and acted on the notification instead”. Trusting the app more than the health authorities would however be “a terrible outcome”, he wrote, “given what we know of the app’s accuracy” – namely, that the underlying Bluetooth technology suffers from several structural limitations when computing the proximity of contacts at risk of spreading COVID-19.
Another Swiss study “based on the data collected during the initial deployment of the SwissCovid app”, ‘Early evidence of effectiveness of digital contact tracing for SARS-CoV-2 in Switzerland’, investigated the reasons why 7,842 individuals sought RT-PCR testing, finding that 41 did so because of the country’s contact tracing app. Given that only part of the sample provided such reasons, the authors estimated that the total number of those who were motivated by the app should be higher (65).
“It is impossible to say if those 41 cases had other reasons in addition to SwissCovid”, countered Vaudenay in ‘The Dark Side of SwissCovid’, a strongly-worded piece of criticism in which the EPFL professor casts further methodological doubts over the results obtained by the authors. For example, in the survey “doctors could also tick a fourth possible reason "other" and specify it explicitly but those others reasons were neither reported in the article nor analyzed”, he wrote. Also, it was not possible to ascertain whether those individuals “actually self-isolated between the notification and the result of the test”. Given these methodological flaws, “this is no evidence that SwissCovid has been any [use] at all”, Vaudenay concluded.
But Vaudenay’s critique goes much deeper, suggesting structural sabotage of the debate around digital contact tracing solutions in the country at the hands of “scientific lobbying”:
“A vast conflict-of-interest factory around SwissCovid was built by having the same people being the developers, the evaluators, the advisors, the communicators, the operators, and the rulers. Alternate solutions were trashed. Criticisms were downplayed. Potential threats were ignored. Positive evaluation reports were posted and negative ones were not. Pseudo-scientific proofs were forged to support pre-agreed conclusions. Press was manipulated to spread conclusions. Essentially, SwissCovid begs Swiss residents to keep faith in that SwissCovid is or will be useful without any objective evidence. The security and privacy claims are smokescreens”.
As a result “people were deceived”, he concluded: facts were hidden from view (“We can see that it played no role to avoid the second and third waves and their semi-lockdown”) and conclusions around the app’s effectiveness were incorrectly drawn (“We doubt SwissCovid has any utility, besides generating academic praises”).
Norway’s ‘Smittestopp’ app allegedly worked – before being halted for privacy abuse
Should we have adopted a “centralized” app architecture instead, then? Some researchers think the answer is affirmative. In ‘Privacy versus Public Health? A Reassessment of Centralised and Decentralised Digital Contact Tracing’, authors Lucie White and Philippe van Basshuysen argue that the promise of contact tracing apps “has been all but abandoned, with governments now downplaying the potential efficacy of the measure, and suggesting that it will have, at best, a limited role among a host of other mitigation measures”.
This was in their view not an inevitable outcome, however, and the culprit – at least in part – resides with the mainstreaming of decentralized solutions, in which proximity data are computed on each individual’s phone, rather than collected and analyzed in a central database maintained by health authorities. Such a centralized app configuration would have instead allowed for “reporting before a confirmed test”, thus showing “promise in increasing the efficiency of the measure”, the authors argue.
A preprint study published in March 2021 on the effectiveness of Norway’s centralized ‘Smittestopp’ (‘Nationwide rollout reveals efficacy of epidemic control through digital contact tracing’) contact tracing app seems to corroborate White and van Basshuysen’s claim. Based on the analysis of a “real world contact dataset” including millions of contacts “between 545,354 phones (i.e., 12.5% of the adult population in Norway)”, the authors claim to have been able to provide a first measure of “the real-world app performance”, finding an 80% “technological contact tracing efficacy”.
This means that, according to the authors’ estimates, “at least 11.0% of the discovered close contacts could not be identified by manual contact tracing”. Allegedly, this shows that “significant impact can be achieved for moderate uptake numbers” (28%, at the time of the analysis).
The authors seem, however, to downplay the cost that this implied in terms of the privacy of Smittestopp users. While dismissively reminding readers that “the app was eventually suspended in June (2020), because of a combination of low infection rates and privacy concerns”, they fail to mention the severity of the accusations that led the government to halt the use of the app.
In an Amnesty investigation published in June 2020, the Smittestopp contact tracing app was labeled “deeply intrusive”, and bundled with some of the world’s worst privacy offenders in the digital contact tracing realm—namely, Bahrain’s ‘BeAware Bahrain’ and Kuwait’s ‘Shlonik’: together, these apps “stood out as among the most alarming mass surveillance tools assessed by Amnesty, with all three actively carrying out live or near-live tracking of users’ locations by frequently uploading GPS coordinates to a central server”, the investigation found.
Also, the authors of the Smittestopp study seem to deviate from White and van Basshuysen’s hypothesis: according to them, their own results do not provide proof that centralized apps work better than decentralized ones. Actually, they expect the opposite to be true, even arguing that “limited experiments in controlled environments do (…) support the assumption that ENS (‘Exposure Notification Systems’, based on a decentralized architecture) will have an efficacy comparably to or better than we have observed in the deployment of Smittestopp”.
Data from several countries point to widespread failure
We’ve seen how the literature provides a mixed picture in terms of the efficacy of digital contact tracing apps. However, since several countries deployed their own solutions throughout the pandemic, many additional data are yet to be rigorously analyzed.
Israel, for example, was touted as a model in providing a swift – and tech-based – response to the pandemic. Therefore, it is rather surprising to record that experts at The Israel Democracy Institute “describe the implementation of the 'Shield' (HaMagen) [GPS-based contact tracing] application as a colossal failure with 98% of Israelis who downloaded the second version subsequently removing it from their mobile devices”.
Other flaws in digital contact tracing apps also emerged where they were least expected. Japan, home to many technology giants, saw its digital contact tracing program collapse under the weight of tech glitches that affected the system “from the get-go”, wrote The Japan Times in February 2021. Bugs and glitches also affected the Dutch contact tracing app (third-party apps could access user data on Android phones), while a Portuguese cybersecurity student discovered a vulnerability potentially affecting all GAEN Bluetooth apps.
In Australia, the failure of the digital contact tracing ‘COVIDSafe’ app has been certified by none other than Victoria’s health minister, Martin Foley. Asked whether it was used in response to the latest outbreak of COVID-19, Foley answered: “No. Not to my knowledge, and I’m sure in such a rare event it would have been brought to my attention”.
Actual usage data seem to confirm this statement, as the COVIDSafe app still hasn’t found any contacts this year, and it “has still identified just 17 people in total”, according to InnovationAus reporter Denham Sadler. Earlier, much larger figures provided by authorities at a Senate hearing turned out to be incorrect and the country’s Digital Transformation Agency was forced to issue a correction: of the 561 close contacts identified by COVIDSafe, 544 were identified by manual contact tracers.
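The corrected figures can be checked with a few lines of arithmetic. The sketch below is purely illustrative: it uses only the two numbers quoted above, and the variable names are our own rather than anything taken from the Digital Transformation Agency’s reporting.

```python
# Figures quoted in the text (corrected Senate hearing numbers).
contacts_attributed_to_app = 561   # close contacts initially credited to COVIDSafe
also_found_manually = 544          # of those, already identified by manual tracers

# Contacts that only the app surfaced.
app_only = contacts_attributed_to_app - also_found_manually
share_app_only = app_only / contacts_attributed_to_app

print(f"Contacts found by the app alone: {app_only}")          # 17
print(f"Share of the attributed contacts: {share_app_only:.1%}")  # 3.0%
```

The difference matches the “just 17 people in total” reported by InnovationAus, roughly 3% of the contacts originally attributed to the app.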
Italy’s ‘Immuni’ app was also criticized for not being able to help contain the outbreaks that affected the country throughout 2020 and the first half of 2021. And while rigorous, peer-reviewed research is still missing, Oxford researcher Luca Ferretti – who investigated England and Wales’s NHS app – claimed in an interview with Italian
Similarly, in Canada the federal ‘COVID Alert’ app sent out 35,000 exposure notifications during April 2021, which resulted in the identification of “at least 400 COVID-19 cases”, claimed Health Canada in a CBC report. And while this contribution might not seem negligible, it has to be seen in proportion to the over 165,000 cases recorded in the provinces and territories that adopted the app over the same month.
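To put these Canadian figures in proportion, a rough back-of-the-envelope calculation may help. The sketch below rests on stated assumptions: it treats “at least 400” and “over 165,000” as point values, so the resulting percentages are approximations rather than official statistics, and the variable names are our own.

```python
# Figures quoted in the text for April 2021 (Health Canada, via CBC).
notifications_sent = 35_000
cases_identified = 400      # reported as "at least 400"
cases_recorded = 165_000    # reported as "over 165,000"

# Share of all recorded cases that were identified thanks to the app.
share_of_cases = cases_identified / cases_recorded

# Identified cases per exposure notification sent.
cases_per_notification = cases_identified / notifications_sent

print(f"Share of recorded cases identified via the app: {share_of_cases:.2%}")
print(f"Identified cases per notification sent: {cases_per_notification:.2%}")
```

Under these assumptions, the app accounted for roughly a quarter of one percent of the cases recorded that month, and about one notification in ninety was linked to an identified case.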
Lastly, in the US, only 13 States reached the 15% uptake threshold that
Why? Ladyzhets’s interpretation is worth reporting in full, as it summarizes two broad and important trends seen at play at many other latitudes: “Our failure to answer that question is partially due to the fractured nature of the system. But it’s also because specific research to measure this technology’s effectiveness simply was not a priority”.
In an emergency, even finding one single additional case through an app helps. Or at least, that’s how many policy-makers intended the whole digital contact tracing effort and its priorities. As a result, a shared, rigorous understanding of what “effectiveness” actually means for a digital contact tracing or exposure notification app, and how it is measured, is still lacking – and might not even be possible at all.
Democratic countries are not the only ones affected by such uncertainty. Evidence is inconclusive even around the contribution of digital contact tracing apps in Wuhan, China, the epicenter of the first outbreak of the pandemic, which were widely reported in the media as instrumental in quickly combating its spread. As the paper ‘Decoding China’s COVID-19 ‘virus exceptionalism’: Community-based digital contact tracing in Wuhan’ concludes, “more rigorous positivist research is needed to better understand the causal effect between the community engagement in digital contact tracing and viral spread control effectiveness”.
As argued throughout this evidence-based analysis, the effectiveness of digital contact tracing apps in fighting the COVID-19 pandemic is still in question more than a year after their initial deployment.
Some studies seem to provide early evidence of a positive contribution in terms of both identifying infected individuals and reducing infection rates, but several definitional and methodological issues remain unresolved – in addition to many actual deployments indicating a moderate impact on COVID-19 infection dynamics at best.
Also, the results obtained are hardly comparable with each other, if at all in many circumstances. This is only natural, given that public health policies are highly contextual, and include much more than mere technologies. In other words, digital tools deployed as part of these policies must always be studied by reference to their broader socio-technological context. But it still hinders the formulation of an overall, informed, and pragmatic evaluation of the contribution of automated systems to these broader endeavors.
Our analysis, which of course lacks the
We do hope that this initial analysis can provide useful suggestions for identifying open questions for further, more rigorous literature reviews of digital contact tracing systems.
Born out of a global emergency, these systems have too often been deployed within a tech-solutionist framework, and as such with little justifying data and almost no democratic discussion – even when contact tracing data opaquely ends up in the hands of the police, as is increasingly the case in democracies (Australia), technocracies (Singapore), and authoritarian regimes (China) alike.
A better understanding of the evidence produced through digital contact tracing systems is, therefore, essential not only to better judge their effectiveness, but also – and most importantly – to promote an informed democratic debate around how, when, and why digital contact tracing makes sense – and how, when, and why it must end.