‘Trustworthy AI’ is not an appropriate framework

AlgorithmWatch and members of the ELSI Task Force of the Swiss National Research Programme 75 on Big Data comment on the EU HLEG on AI’s Draft Ethics Guidelines


On December 18th, 2018, the European Commission’s High-Level Expert Group on Artificial Intelligence (EU HLEG on AI) published their Draft Ethics guidelines for trustworthy AI (PDF). After a consultation period that ended on February 1st, the HLEG is supposed to publish a final version of the guidelines in March 2019. Based on this final version, the HLEG is tasked with producing a second so-called deliverable, the AI Policy and Investment Recommendations, scheduled to be published in mid-2019. These recommendations are meant to give the European Commission guidance “on how to strengthen Europe's competitiveness in AI, including guidance for a strategic research agenda on AI and on the establishment of a network of AI excellence centres”.

The HLEG is composed of 52 experts from a variety of fields, heavily dominated by business representatives (24), followed by academics (17), civil society (5) and others (6 – like the European Union Agency for Fundamental Rights). The group’s general objective is “to support the implementation of the European strategy on Artificial Intelligence”.

Because of its composition, the HLEG has been criticised for being dominated by business interests, but also for being pressured into delivering results without sufficient time to deliberate, using a convoluted process of meeting in parallel in 14 different working groups.

Individuals and organisations responding to the consultation can choose to have their answers published.

Participation in the consultation was possible through the EU AI Alliance platform via a form that allowed for responses to the four major sections of the Draft Ethics Guidelines (Introduction: Rationale and Foresight of the Guidelines, Chapter I: Respecting Fundamental Rights, Principles and Values – Ethical Purpose, Chapter II: Realising Trustworthy AI, Chapter III: Assessing Trustworthy AI) and general comments. We publish our response here in this structure; responses to the chapters are further divided into Major comments and Minor comments where we deemed this appropriate.

Principal author of this document is Michele Loi of the Digital Society Initiative and Institute for Biomedical Ethics and the History of Medicine, University of Zurich.

The response was co-authored and endorsed by:

Introduction: Rationale and Foresight of the Guidelines

Major comments:

The specific role of trustworthiness as the focus of ethical guidelines should be clarified:

  1. what makes AI trustworthy in addition to ‘reliable’, or ‘ethical’? What is the relation between these concepts? What are the differences between them? What is the advantage of having a code for ‘trustworthy AI’ rather than an ordinary ethical code?
  2. is AI supposed to be trustworthy or the people behind it, or a combination of both?

Furthermore, is there only a single relation of trust, or different ones? Are the same recommendations relevant for trust between AI systems and the computer scientists who create and maintain them, and for trust between an AI system (including the humans responsible for its maintenance) and its users?

More generally, the guidance would be greatly improved by a list of potential candidates for trustworthiness. The document remains very vague on who is to trust and who, or what, is to be trusted. It is not even clear what the implied ‘identity criteria’ are for the AI systems that are supposed to be regulated. For example, if we consider AIs in which algorithms fed on data are embedded (as in many services based on machine learning algorithms), then one could ask how updates of the AI affect its identity and, a posteriori, the trust relationship with humans (users, engineers, controlling agencies etc.) that has been established and possibly nurtured so far. At what stage of any update does the original AI stop being ‘itself’ and mutate into something different and distinct from the original? Is the re-training of a machine learning algorithm (even in case of a single modification of the original training dataset) enough to trigger an identity shift? Should we reconsider the trust relationship established so far, or is the AI still ‘itself’?

Finally, the document fails to highlight what is truly special about trust-based relations and a trust-based society. While it emphasizes transparency both at the level of fundamental principles and in practices, there seems to be no realization that transparency may not be at the centre of trust-based relations. Indeed, one can argue that one of the distinguishing characteristics of placing trust in others is precisely the willingness to rely on a third party without the ability, or even the need, to check what the other party does. This is not to say that transparency is useless in a trust-based society. But transparency appears to play a complementary role: while most people use AI because they trust it, few people are expected to be inquisitive if there is trust.

On the definition of AI: the definition describes AI as a system that acts in the physical or digital world. But many potential software applications that these guidelines seem to intend to address are not artificial agents. For example, statistical models that provide assistance to human decision makers, without substituting them, do not act in the physical or digital world. Are the guidelines not intended to address the concerns raised by those models? Or, if they are, should the definition of AI be revised?

“Bias is a prejudice for or against something or somebody, that may result in unfair decisions” (p. iv): it may be worth stressing that statistical bias will be intrinsic to decisions based on statistical predictions in a context in which features of interest are not distributed homogeneously in different sub-groups of the population (e.g. men and women being unequally likely to be liable for road accidents and to be involved in violent crimes after release from prison) and that the use of statistical criteria as a basis of decision making can be controversial, depending on the context, irrespective of issues of accuracy and bias, especially when the role for human judgment is limited or absent altogether. Among prejudices, one could perhaps also mention people’s experiences (including education, cultural and religious background) as a source of bias, for humans as well as for models trained on human data. Generally, it may be worth stressing that ‘getting rid of bias’ is not a sensible or feasible policy goal. Instead, policy requires making deliberate, reasoned, if possible principled and publicly legitimated choices concerning which biases to accept and which to mitigate or neutralize when optimizing models. The unavoidability of some form of bias/discrimination/unfairness in decision making that relies on statistical predictions results from the ‘fairness trade-offs’ between different definitions of bias and discrimination, highlighted by the computer science literature of the past few years.

Minor comments:

Complementary set: what are examples of NOT trustworthy AIs? What are the legal consequences of having them either offline or online in IT systems of companies or agencies? Is there the possibility of drawing an analogy with the personal use of drugs?

  1. humans’ physical and moral integrity,
  2. personal and cultural sense of identity and
  3. the satisfaction of their essential needs.

Why not include the three elements of human dignity as distinct principles?

The principle of justice (p.10)

This is an extremely important principle, but its discussion appears to be incomplete in two different ways, concerning respectively the aspect of fairness/discrimination in statistical prediction and justice in the utilization of data resources.

Concerning the first, the guideline prescribes that “positives and negatives resulting from AI be evenly distributed”. This is very unclear. The language of positives and negatives seems to refer to the context of statistical prediction. If so, first, it is unclear who the subjects of distributive justice are: legally ‘protected groups’? Vulnerable populations? (The two are not the same.) Second, it is unclear what ‘evenly distributed’ means. For example, if different groups have different baseline distributions of the predicted attribute (e.g. violent reoffending for prisoners released on parole), should ‘evenly distributed’ entail that women and men should have the same probability of being released, or should it mean that women and men who do not reoffend should have the same probability of being released, etc.? There is also an issue of trade-offs with the requirement of avoiding bias, since it is mathematically proven that enforcing both aforementioned ‘even distribution’ criteria comes at the expense of predictions being unbiased in a different sense (e.g. equally likely to be correct for the different groups). The document lacks any reference to the often discussed question of trade-offs between fairness objectives/metrics and fails to discuss what a possible role of future public institutions could be with respect to providing guidance on how to resolve these ‘hard questions’ of machine learning fairness (which have sometimes been referred to as the “trolley problem for machine learning”). The section on discrimination (p.15) mentions that data always carry some sort of bias; but the bias may not be in the data, but rather in the inferences drawn from them. See below our commentary on that section.
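The ambiguity of ‘evenly distributed’ can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration: the two groups, their sizes, their base rates of reoffending and the decision rule are hypothetical and are not taken from the guidelines or from any real dataset. The function name `release_metrics` is likewise ours.

```python
def release_metrics(reoffend_detained, reoffend_released,
                    no_reoffend_detained, no_reoffend_released):
    """Summarise a release decision for one group from four hypothetical counts."""
    total = (reoffend_detained + reoffend_released
             + no_reoffend_detained + no_reoffend_released)
    released = reoffend_released + no_reoffend_released
    non_reoffenders = no_reoffend_detained + no_reoffend_released
    return {
        # Reading 1: probability of being released, whatever the later outcome.
        "release_rate": released / total,
        # Reading 2: probability of release among those who do not reoffend.
        "release_rate_non_reoffenders": no_reoffend_released / non_reoffenders,
        # 'Equally likely to be correct': share of released people who do not reoffend.
        "released_and_no_reoffence": no_reoffend_released / released,
    }

# Hypothetical groups of 100 people each, with different base rates of
# reoffending (50% in group A, 20% in group B). The same rule is applied to
# both: it detains 80% of reoffenders and releases 80% of non-reoffenders.
group_a = release_metrics(40, 10, 10, 40)
group_b = release_metrics(16, 4, 16, 64)

for name, metrics in (("A", group_a), ("B", group_b)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these made-up numbers, the second reading is satisfied (both groups release 80% of non-reoffenders), yet the overall release rates differ (0.5 against 0.68), and so does the share of released persons who do not reoffend. Which of these quantities ought to be equalised is exactly the question the guideline leaves open.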

Concerning the question of data resources, it is remarkable that the drafters of this document steer away from mentioning that fair access to data resources is one of the fundamental questions of justice of the data economy. Large data-driven companies, especially US corporations whose data are their largest economic assets, have accumulated a wealth of data on European citizens whose potential for social benefit is underexplored and underexploited. On the one hand, it is difficult to apply fair rates of taxation to these companies, which are able to ‘shop around’ for the most favourable rates. On the other, the potential of the data to benefit society is limited because the data is stored in their silos. European society could benefit from a bold proposal on how to make the data resources accumulated by these companies work for the benefit of EU societies. In particular, large companies collecting big data about large populations may have information that, if made more broadly accessible, could be used to develop AI-driven innovation in the public sector, including in the context of scientific research.

Chapter I: Respecting Fundamental Rights, Principles and Values – Ethical Purpose

The term ‘values’ is used here to identify more concrete entities than principles and rights, for example ‘informed consent’ is described as a value. This is unusual for both philosophical ethics, where values identify broader and more general concerns, such as equality, efficiency, freedom, etc., and everyday language (do lay people really think of ‘informed consent’ as a value?).

Understandably, the guidance is not committed to one ethical framework in particular and does not provide a decision rule to resolve potential conflicts and trade-offs which may arise. Any such framework would certainly be considered more controversial than a list of principles and rights to be weighed against each other in a context-sensitive way. However, it is to be expected that conflicts and trade-offs arise at the level of principles, rights and values. Thus, the guidance could be improved by providing some indication of procedures for assessing trade-offs.

The draft wants to establish elaborate monitoring and assessment routines for AI, which are aimed at the public discussion and should ensure public ‘trust’ in AI systems. Therefore, the five guiding principles provided are treated as the main normative basis of judgements of AI trustworthiness and are complemented by rights, values and checklists at different levels of abstraction. What seems to be missing is an indication of some procedure to attach weights to the different principles and to resolve disagreements about which principle should have priority in a given situation. Or at least, the limitations of an ethical framework that at the most fundamental level relies on prima-facie ethical principles to be traded off against each other intuitively could be explicitly acknowledged, and the need to develop forms of ethical deliberation to solve these trade-offs could be mentioned.

On social scoring (p. 12). The paragraph moves abruptly from ‘normative citizen scoring’ concerning ‘all aspects and on large scale’ to scoring in limited social domains. The document recognizes that scoring in a limited social domain refers in some cases to established social practices, including practices such as education and driving licences, that are commonly accepted, at least when AI is not used. But what is meant here by ‘normative citizen scoring’ is entirely unclear: an example seems to be needed. Concerning the opt-out option for domain-specific social scoring, why should there be an opt-out option when AI is used but not when AI is not used? E.g. should statistical models used to identify tax evasion be handled differently, by providing an opt-out option, when AI is used, but not otherwise? What does it mean to have an opt-out option? Does it imply a right to be judged without the use of the AI? And what does that mean? To be judged by a human without the use of knowledge from statistical models? If not, why do traditional statistical models (e.g. actuarial tables in insurance) differ from AI in terms of implying a right to opt out from scoring? Otherwise, suppose that the right to opt out from social scoring by AI implies a right to opt out from any social scoring (irrespective of whether AI is used). If so, should the person who opts out bear the cost of not being socially scored? These costs can be considerable, as they may include being unable to obtain credit (in the absence of a creditworthiness score), being unable to drive (in the absence of a driving licence), and being unable to obtain an education (in the absence of grades). If, finally, the person who opts out from social scoring should not pay the price of her opting out decision, how could the sustainability of practices that rely on social scoring be guaranteed? E.g. email relies on the scoring of email senders to activate spam filters: should a spammer have the right to opt out from this scoring and yet be allowed to send spam around?

On section 5:

5.1 – Identification without consent

Interestingly, ‘identification without consent’ does not refer to companies identifying individuals without asking their consent. Rather, this section addresses the possibility that, even if companies ask and obtain the (formally) informed consent of individuals, the informed consent provided online by citizens should not be taken at face value. This section of the document contains one of the strongest statements in any public document so far about the inadequacy of the ‘notify and consent’ strategy for dealing with privacy/data protection, a strategy that is, and has been for decades, the main procedural solution to achieve privacy and autonomy without sacrificing either. The authors write that, in the light of the fact that

‘consumers give consent without consideration’,

there is

“an ethical obligation to develop entirely new and practical means by which citizens can give verified consent to being automatically identified by AI or equivalent technologies.”

We also believe that the system of privacy/data protection revolving around the current version of online informed consent as its main pillar is largely inadequate, at least for high-stakes decisions based on personal data. Yet this much needed critical section raises more doubts and puzzles than it solves, in the context of this document:

  1. There seems to be a contradiction between scepticism about informed consent as a procedural solution of difficult governance issues and elevating ‘informed consent’ to the status of a value.
  2. While it is undeniable that online informed consent procedures are insufficient to legitimize the uses of identifying technologies, it is not so clear what could substitute for them in the context of AI used for online services. The section criticizes informed consent as inadequate in this context but, for lack of alternatives, leaves no option for identification technologies other than the extreme one: that identification technologies cannot be justified on the basis of a preference or desire of the consumer until radically new forms of consent (of what kind?) are developed. The only exceptions are goals (such as detecting fraud, or terrorist financing) where the justification of re-identification is independent from the informed consent of the subject of surveillance.
  3. The strong claim that ‘consumers give consent without consideration’ raises the problem of informed consent as an instrument of legitimation in general for AI, not only in relation to the specific identification technologies in question in this section of the document. If informed consent is not to be relied upon as a legitimation mechanism, because it is always given ‘without consideration’, this leaves a huge regulatory void, as informed consent is one of the cornerstones, if not the most important cornerstone, of the existing regime of data protection. Unfortunately, the document does not provide any hint as to what could replace informed consent as the cornerstone of a future regulatory regime for AI. In particular, it is not clear which of two opposing strategies the group recommends:
     (a) improving informed consent procedures, with the goal of ascertaining that online behaviours correspond to authentic acts of consent and that they are adequately informed;
     (b) developing an alternative framework of data governance that downplays the importance of informed consent as a pillar of justification for data-based services.

This indication would be highly relevant for both policy and business practice. Endorsing strategy (a) could lead to guidelines and regulations that further stress the need to simplify the language used in informed consent procedures, as already prescribed by the GDPR. They should provide new criteria for the level of clarity and understanding to be reached, and they should deal with the hard constraint deriving from the fact that people’s willingness to spend time managing their privacy is extremely limited. It is thus unclear whether experiments with new ways of conveying information (e.g. short videos?) and of assessing the validity of the process (e.g. measurements of the time spent reading privacy policies, short tests to assess their knowledgeability?) could provide a viable solution. Endorsing (b) would lead to reshaping the realm of consumer choices, moving away from assigning a dominant weight to the principle of consumer choice and autonomy in regulatory choices pertaining to consumer privacy. This is, of course, not a new issue. The centrality of informed consent to privacy protection in fair information practices has been the subject of much dispute since these practices emerged. Critics of informed consent have maintained that informed consent does not substantively limit data collection against the interest of the data controllers, and that it merely provides a perception of privacy protection that is formalistic and enables the accumulation of power by data controllers. This is because citizens often have no choice but to provide their informed consent if they are to access the benefit of certain services. This appears to be still the case in the post-GDPR era, as every person who browses the Internet daily realizes. Those arguing against the centrality of informed consent have always maintained that privacy protection must take a less neutral stance and actively oppose surveillance, beyond attempting to protect efficient exchanges of information. It is not clear if the working group endorses this position or merely asks for better ways to determine if informed consent has been given (leaving it to the research community to solve the problem).

5.5 – Potential longer-term concerns

The scientific basis of the assessment of potential long-term harm does not appear anywhere in the document, hence it is difficult to assess the plausibility of this section. On a general note, we believe that the inclusion of merely hypothetical ideas of long-term harm lacking scientific evidence risks weakening the overall respectability and practical relevance of this document.

Chapter II: Realising Trustworthy AI

Major comments:

One methodological problem with this section seems to be that it fails to examine the matter at hand at a coherent level of abstraction. For example, at p.15, the text includes a recommendation to ensure that data from the same person does not end up in both the training and the test set. This is a very specific recommendation about a specific way to mitigate statistical bias, but why does it deserve such special status?
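To show how narrow and implementation-level this particular recommendation is, here is a minimal sketch of what it amounts to in practice. The helper `split_by_person`, the `person_id` field and the toy records are hypothetical and chosen only for illustration; the guidelines do not prescribe any particular procedure.

```python
import random

def split_by_person(records, test_fraction=0.25, seed=0):
    """Split records so that each person's data lands entirely in one set."""
    # Work on persons, not on individual records.
    person_ids = sorted({record["person_id"] for record in records})
    rng = random.Random(seed)
    rng.shuffle(person_ids)
    # Reserve a share of the persons (at least one) for the test set.
    n_test = max(1, round(len(person_ids) * test_fraction))
    test_ids = set(person_ids[:n_test])
    train = [r for r in records if r["person_id"] not in test_ids]
    test = [r for r in records if r["person_id"] in test_ids]
    return train, test

# Toy data: eight records belonging to five hypothetical persons.
records = [
    {"person_id": pid, "feature": i}
    for i, pid in enumerate(["a", "a", "b", "c", "c", "c", "d", "e"])
]
train, test = split_by_person(records)

train_ids = {r["person_id"] for r in train}
test_ids = {r["person_id"] for r in test}
print("persons in both sets:", train_ids & test_ids)
```

A rule at this level of detail sits oddly next to principles such as justice or autonomy, which is the inconsistency in abstraction noted above.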

Moreover, the section makes no mention of the trade-offs that may arise from the need to implement different rights, principles and values. For example, the recommendation to keep track of all data fed to the AI system (p.15 sect. II) and the similar claim that AI “systems should document both the decisions they make and the whole process that yielded the decisions, to make decisions traceable” (p. 20) may conflict with the requirement to protect the privacy of the persons affected by the decision. This demand could, further, be in conflict with intellectual property rights and security disclosures. Models and systems are often considered a trade or governmental secret. How should this be regulated if the value of transparency and intellectual property rights or security are at odds with each other? How should these processes be implemented and how can it be ensured that these rules of traceability and auditability are not violated?

Similarly, “the capability to describe, inspect and reproduce the mechanisms through which AI systems make decisions and learn to adapt to their environments, as well as the provenance and dynamics of the data that is used and created by the system” will tend to deliver a system that is transparent and potentially vulnerable to manipulation. Transparency is sometimes alleged to conflict with protecting an AI decision system from malicious or self-serving attempts to manipulate its outcomes by strategically responding to them, which may lead to unfairness (e.g. between persons with different degrees of understanding of the logic behind the algorithm). We are not claiming that it is always socially desirable that the logic of an algorithm should be kept opaque. But since this is an objection that is sometimes raised, by stakeholders and even regulators, against the demand for more algorithmic transparency, it would be useful if the working group were able to provide some advice on the matter to future regulators.

The section (5) on non-discrimination (p. 16) should stress that discrimination does not necessarily derive from the data (e.g. biased social practices producing the data or incomplete data) but is, in a certain sense, an unavoidable feature of all decisions grounded in statistical predictions (which typically are only imperfectly accurate, and even if perfectly accurate, may still be objectionable). Bias can also arise from data that perfectly represent the ‘ground truth’. This is because a model may appear discriminatory if it treats individuals in different groups in very different ways, even if there is no bias in the data and the data somehow ‘justify’ this. For example, suppose that data from online learning platforms truthfully report that women are less likely than men to select a STEM subject when given the choice. Even if the statistic maintains external validity over time, society may reasonably object to a recommendation algorithm that recommends STEM subjects more often to men than to women. Hence, it seems important to introduce and stress the idea that the goal of ‘avoiding discrimination’ only makes sense relative to a prior value judgment about the kinds of inequalities that society deems permissible, even desirable, and those that are considered unjustifiable. The report could stress the importance of promoting a wide societal debate about the nature of bias, unfairness, and discrimination with statistical predictions, that attempts to reconcile conceptualizations from common sense, ethics, law and statistics. We invite the Expert Group to provide recommendations on advancing a more transparent and informed debate about the fairness metrics that have been proposed in the field of computer science, which appears to be crucial for their political legitimation. This may include the promotion of policies that advance the public understanding of the different forms of unfairness and discrimination that may arise through the use of AIs and, more broadly, statistical models (including those already in use) both in high-stake decisions and low-stake decisions with serious cumulative effects.

It is good that the AI HLEG is recognising the specific theoretical and methodological characteristics of AI development. The question of accountability, and the shift of this accountability to the user, no matter if the system is a ‘black box’ or not, could be an important guideline for AI development and regulation.

The epistemic (methodological) values of traceability and auditability could, however, be at odds with the epistemic and scientific features of AI development. AI development and research are heavily influenced by the epistemic cultures of the disciplines of informatics and computer science. Most branches of computer science are concerned with ‘making things’, like computers, algorithms or software which should solve a specific problem for governments or businesses. A possible problem, however, is that the instrument of accountability is pointless if society lacks persons with the information, skills, motivation and time to assess the achievement of the relevant desiderata by AI systems. Some information will only be distributed within the companies and even the skills necessary to make sense of information made public are very unequally distributed in the population. The majority of the population hopes to be able to ‘trust’ AI. But trust is only well placed if a more competent, motivated, inquisitive, and sceptical minority is able to assess if such trust is well placed. Recognizing the relation of dependency between an expert community and the broad population should lead to:

On the other hand, relying entirely on experts and whistleblowers is inadequate due to the complexity of some problems arising from the inaccuracy and unfairness of AI. Some of these may be hard to predict ex-ante, before specific real world biased outcomes and predictions arise, solely based on an analysis of the AI system and the data on which it is trained. Some ex-ante assessments are going to be especially problematic in the case of neural networks due to the ‘black box’ of model explainability. Thus it is also important to promote a sensitivity to AI ethical issues directly in the potentially affected population.

Finally, due to the diffusion of neural networks and other ‘black box’ algorithms, if every AI should comply with the values of traceability and auditability, this could mean that most current AI development, which does not comply with these values, is a dead end if the guidelines’ principles are taken seriously. Therefore, a common acceptance of these values in public and business sectors is not likely. The guidelines could clarify how this challenge might be tackled.

Minor comments:

Chapter III: Assessing Trustworthy AI

It will be a challenge to make these recommendations content-dependent. It would be helpful if examples and illustrations were soon produced, also outside this document.

General Comments

Special status of AI?

Should AI be held to special ethical standards to be trustworthy, or to the same standards as non-AI-involving social practices in the same domain and fulfilling the same function as AI? For example, social scoring is an old and established social practice that predates AI. As the authors recognize, driving licences and grades at school are forms of social scoring. General questions: are the general principles used to specify goals of trustworthy AI analogous to those of non-AI-based practices? If not, why should they differ?
