What role for regulators in developing a credible AI audit industry?

30/05/2022

AI audits are used to check and verify that algorithmic systems are meeting regulatory expectations and not producing harms (either unintended or intended). Globally, the regulatory requirements for AI audits are rapidly increasing:

the draft EU AI Act mandates conformity assessments for high-risk applications of AI. Typically, this is an internal governance audit but, in specific high-risk cases, external audits are required;
the Canadian government has mandated Algorithmic Impact Assessments for federal government institutions;
in the United States, senators have proposed draft legislation for an Algorithmic Accountability Act of 2022, which would require impact assessments when companies use automated systems to make critical decisions; and
New York City passed a bill in November 2021 requiring hiring vendors to conduct annual bias audits of their AI systems, a move likely to be followed by other state and local governments.

These audit requirements raise many questions: who should these AI auditors be? What training and qualifications should they have? What standards should algorithmic systems be assessed against? What role should audit play in demonstrating compliance?

The four UK regulators with a stake in the digital economy (the telecoms regulator Ofcom, the competition regulator the CMA, the privacy regulator the ICO and the financial regulator the FCA, collectively the Digital Regulation Co-operation Forum or DRCF) have recently published a discussion paper canvassing views on the potential roles regulators could play in the development of an AI audit industry.

So why do regulators need to be involved if the market is starting to deliver?

The DRCF says that regulators ‘have an interest in establishing trust in the audit market, so that organisations and people can be sure that audits have credibility.’ Voluntary standards have an important role, and the DRCF also says that ‘there are often pull factors for companies to comply, such as technical standards translating regulatory requirements into product or process design.’

The discussion paper noted recent positive developments in AI auditing tools:

Technology companies have begun to develop technical algorithmic auditing tools, including Facebook’s Fairness Flow, IBM’s AI Fairness 360 Toolkit, and Google’s Model Cards for Model Reporting or Fairness Indicators in TensorFlow.
Industry associations have been working on voluntary standards such as the IEEE’s Ethically Aligned Design.

While this ‘nascent audit ecosystem’ provides a promising foundation, the DRCF expressed concern that ‘this risked becoming a “wild west” patchwork where entrants were able to enter the market for algorithm auditing without any assurance of quality.’

Why AI auditing is not a ‘tick a box’ exercise

While AI auditing can draw on the general world of audit, the DRCF points out that AI auditing has its own unique challenges:

machine learning algorithms are data driven and probabilistic in nature and the only way to know the exact output in all circumstances is to test with all possible inputs. That is practically impossible to achieve in a test environment before ‘going live’ with an AI, and even in the real world, there is always a risk of unforeseen issues arising.
there can be feedback loops under which algorithms adapt their behaviour based on how other algorithms (or humans) respond to their actions. This can make it impossible to know what the outputs will be when simulating the algorithm output in an isolated test environment – or again in an operating environment after an audit is done.
re-training of AI is necessary to maintain performance as real-world conditions change. However, after retraining there is no guarantee that previous performance metrics are still valid, and it is possible that new biases or other issues may be introduced.
some models are now re-trained on an individual user’s device with local data, so different users will then have models that behave in diverging ways.
where parts of an algorithmic system are built on elements from several different providers, identifying where in the supply chain the auditing should take place could be challenging.
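
To illustrate the retraining point above, one possible approach is to re-run the same audit metrics on a fixed holdout set each time a model is retrained, and to flag any group whose results deteriorate. The sketch below is illustrative only and is not drawn from the discussion paper: the function names, the 0/1 decision format and the tolerance of 0.02 are all assumptions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a model maps a feature vector to a 0/1 decision,
# and an example pairs a feature vector with the correct decision.
Model = Callable[[Sequence[float]], int]
Example = Tuple[Sequence[float], int]


def accuracy(model: Model, examples: List[Example]) -> float:
    """Share of examples on which the model's decision matches the label."""
    if not examples:
        raise ValueError("cannot audit an empty set of examples")
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


def compare_after_retraining(
    old_model: Model,
    new_model: Model,
    holdout_by_group: Dict[str, List[Example]],
    tolerance: float = 0.02,  # assumed threshold, not a regulatory figure
) -> Dict[str, Dict[str, object]]:
    """Re-run the same per-group metric before and after retraining."""
    report: Dict[str, Dict[str, object]] = {}
    for group, examples in holdout_by_group.items():
        before = accuracy(old_model, examples)
        after = accuracy(new_model, examples)
        report[group] = {
            "before": before,
            "after": after,
            # Flag the group if accuracy fell by more than the tolerance.
            "regressed": (before - after) > tolerance,
        }
    return report
```

A check like this only covers the metrics and groups the auditor thought to include, which is consistent with the DRCF's point that an audit cannot guarantee the absence of unforeseen issues.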

Know your types of AI audits

The DRCF says that the starting point to building a credible AI audit industry is to codify the different audit tools, as set out below:

	Governance audit	Empirical audit	Technical audit
Description	Assessing whether the correct governance policies have been followed.	Measuring the effect of an algorithm using inputs and/or outputs.	Looking “under the bonnet” of an algorithm at data, source code and/or method.
Methods	Impact assessment, compliance audit (including transparency audit), conformity assessment.	Scraping audit, mystery shopper audit.	Code audit, performance testing, formal verification.
Example	The draft EU AI Act has mandated conformity assessments for high-risk applications of AI.	ProPublica investigated the COMPAS recidivism algorithm by comparing predicted rates of reoffending with those that materialised over a two-year period.	Internal peer code review has become common practice in Google’s development workflow.

Source: Auditing algorithms: the existing landscape, role of regulators and future outlook
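
As a rough illustration of the empirical audit column in the table above, the sketch below compares an algorithm's predictions with the outcomes that later materialised and reports a false positive rate for each group. It is a minimal sketch under stated assumptions, not the method used in any particular investigation: the record layout (group, predicted high risk, actually reoffended) and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (group, predicted_high_risk, reoffended)
Record = Tuple[str, bool, bool]


def false_positive_rates(records: Iterable[Record]) -> Dict[str, float]:
    """For each group, the share of people who did NOT reoffend
    but were nevertheless predicted to be high risk."""
    negatives = defaultdict(int)  # people who did not reoffend
    flagged = defaultdict(int)    # of those, people predicted high risk
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                flagged[group] += 1
    # Groups with no non-reoffenders never appear in `negatives`,
    # so there is no division by zero here.
    return {group: flagged[group] / count for group, count in negatives.items()}
```

An auditor needs only inputs and observed outcomes for this kind of comparison, which is why an empirical audit can be run without access to source code. Which error rate is the right one to compare across groups is itself a judgment the audit standard would need to settle.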

The discussion paper provides the following example of how the three different types of audits might fit together in addressing whether an AI program effectively addressed the risks of ‘hate speech’:

‘A governance audit could review the organisation’s content moderation policy, including its definition of hate speech and whether this aligns with relevant legal definitions… The audit could assess whether there is appropriate human oversight and determine whether the risk of system error is appropriately managed through human review. An empirical audit could involve a ‘sock puppet’ approach where auditors create simulated users and input certain classifications of harmful, harmless or ambiguous content and assess whether the system outputs align with what would be expected in order to remain compliant. A technical audit could review the data on which the model has been trained, the optimisation criteria used to train the algorithm and relevant performance metrics.’
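
The ‘sock puppet’ step in the example above can be sketched in a few lines: the auditor prepares probe content with an expected outcome for each item, submits it to the system, and records where the system's response differs. This is a hedged sketch only. The `moderate` callable stands in for whatever interface the audited system exposes, and the outcome labels "remove", "allow" and "review" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Probe:
    text: str      # content posted by a simulated user
    expected: str  # assumed labels: "remove", "allow" or "review"


def run_sock_puppet_audit(
    moderate: Callable[[str], str],  # hypothetical stand-in for the audited system
    probes: List[Probe],
) -> Dict[str, object]:
    """Submit each probe and collect the cases where the system's
    outcome differs from the outcome the auditor expected."""
    if not probes:
        raise ValueError("at least one probe is required")
    mismatches: List[Tuple[Probe, str]] = []
    for probe in probes:
        actual = moderate(probe.text)
        if actual != probe.expected:
            mismatches.append((probe, actual))
    return {
        "total": len(probes),
        "match_rate": (len(probes) - len(mismatches)) / len(probes),
        "mismatches": mismatches,
    }
```

The list of mismatches is arguably more useful than the headline match rate, because ambiguous content is where a human reviewer would be expected to step in.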

The risks of the Big Four auditing Big Tech

While the DRCF supports the professionalisation of AI audit, it also notes concerns that AI audit may settle into a comfortable, captive relationship between the Big Four accounting firms and the big global technology firms.

The discussion paper canvasses proposals to ‘facilitate better audits through introducing specific algorithmic access obligations’; in effect, arming academics and civil society groups to undertake their own audits of AI used by business. The discussion paper said that ‘[p]roviding greater access obligations for research or public interest purposes and/or by certified bodies could lessen current information asymmetries, improve public trust, and lead to more effective enforcement.’

But the discussion paper also acknowledged that it would be important to carefully consider the costs and benefits of any mandated access to organisations’ systems, and canvassed three approaches:

only provide access to the elements needed to undertake an empirical audit (i.e., an audit of outcomes), which would respect intellectual property by not requiring access to the ‘black box’ of the AI system;
control who has access to different elements of the algorithmic system, for example by limiting access to respected academic institutions with expertise in AI. Auditors could be required to operate under a non-disclosure agreement for the data they inspect; or
expand the use of regulatory sandbox environments to test algorithmic systems and check for harms in a controlled environment. Regulators could collect data from organisations, for example on the safety and bias of algorithmic systems. Regulators could share sufficiently anonymised data with select third parties such as researchers to enable further investigation and reporting.

The discussion paper also canvassed approaches which, in effect, ‘crowd sourced’ AI auditing:

‘The public may also benefit from a way of reporting suspected harms from algorithmic systems, alongside the journalists, academics and civil society actors that already make their concerns known. This reporting could include an incident reporting database that would allow regulators to prioritise audits. It could also comprise some form of popular petition or super complaint mechanism through which the public could trigger a review by a regulator, subject to sensible constraints.’

The risk of AI audits that lead nowhere

Audits are only of benefit if there is a broader governance system which can take up the problems discovered by an audit of an AI system and retool the AI system to solve the problem.

The discussion paper canvasses enhanced powers for regulators:

prohibiting organisations from using the system until the organisation has addressed and mitigated the harm.
establishing red lines where algorithmic systems cannot be used based on their perceived risk to the public, building on the right to restrict processing of personal data under the UK GDPR.
sharing insights (through co-regulatory models) regulators gain from audits on how algorithmic systems can create harm, and how this can be mitigated. This can help inform algorithmic design from the outset or allow companies to gain a better understanding of how they should audit their own algorithmic systems.

The discussion paper also canvasses ‘self-help’ remedies for consumers. It notes that, unlike in other areas such as privacy, individuals harmed by poorly performing AI do not necessarily have remedies:

‘Auditing can indicate to individuals that they have been harmed, for example from a biased CV screening algorithm. It can provide them with evidence that they could use to seek redress. However, there is an apparent lack of clear mechanisms for the public or civil society to challenge outputs or decisions made with algorithms or to seek redress.’

So, what specific roles for regulators?

Given the above problems in growing a credible AI audit market, the discussion paper seeks views on six hypotheses on the appropriate roles for regulators:

Hypothesis	Potential benefits	Potential drawbacks
H1: There may be a role for some regulators to clarify how external audit could support the regulatory process, for example, as a means for those developing and deploying algorithms to demonstrate compliance with regulation, under conditions approved by the regulator.	Organisations gain greater certainty and clarity over how to demonstrate compliance with regulations, and greater competition in the audit market is stimulated by higher demand from those organisations.	May reduce firms’ flexibility to devise and adopt innovative approaches to audit. Further, regulators cannot always determine compliance (for example, where this is left to the courts). Thus, guidance can only make parties more likely to comply with the law.
H2: There may be a role for some regulators in producing guidance on how third parties should conduct audits and how they should communicate their results to demonstrate compliance with our respective regimes.	Guidance that helps third parties understand what types of audits are more likely to be appropriate for demonstrating compliance could also address the requirements of multiple regimes, saving costs for audited organisations. Such guidance could also lower the barrier to entry to the algorithm auditing market by creating a level playing field.	Regulators need flexibility to be able to adapt guidance on types of auditing that are deemed ‘sufficient’, to adapt to the emergence of new harmful practices as the use of algorithms evolves. In addition, guidance may be too high level and therefore risk being misinterpreted, without sufficiently demonstrative examples.
H3: There may be a role for some regulators in assisting standards-setting authorities to convert regulatory requirements into testable criteria for audit.	Third-party auditors, whether from industry, academia, or civil society, understand how they can support regulatory compliance. Creating testable criteria also lowers barriers to entry to auditing companies.	It may not be possible or appropriate to reduce some regulatory requirements to testable criteria.
H4: Some regulators may have a role to provide mechanisms through which internal and external auditors, the public and civil society bodies can securely share information with regulators to create an evidence base for emerging harms. Such mechanisms could include a confidential database for voluntary information sharing with regulators.	Such mechanisms could form an important complement to formal regulation in terms of oversight of algorithms and their impacts on individuals and society. When appropriate, the information gathered by regulators could lead to the launching of more formal investigations or other actions. Regulators could also share insights from audits to the benefit of sectors understanding how algorithms can create harms.	Information may be poor quality or opaque, thus reducing the insights that may be gathered from it.
H5: There may be a role for some regulators in accrediting organisations to carry out audits, and in some cases these organisations may certify that systems are being used in an appropriate way (for example, through a bias audit) in order to demonstrate compliance with the law to a regulator.	Accreditation of auditors reduces the need for regulatory audits and the associated costs to organisations. Greater numbers of accredited auditors can improve the trust and use of algorithmic systems.	Accreditation of auditors without attendant requirements about how transparent audits need to be, to appropriate parties or the public, risks undermining accountability for the impacts of the algorithmic system used.
H6: For some regulators there may be a further role to play in expanding the use of regulatory sandboxes (where a regulator has power to do so) to test algorithmic systems in a controlled environment.	Where regulatory sandboxes are joined up, organisations developing and deploying algorithmic systems can test their systems and understand whether they align with regulatory requirements before they are deployed. This saves them time and limits costs of regulation in the longer term.	If regulatory sandboxes are not joined up, organisations may have to approach multiple regulators to test their systems’ compliance.
