Artificial Intelligence Principles                                                                           Released: January 2022

The Association of Test Publishers (ATP), the international trade organization representing the testing/assessment industry, acknowledges the life-enhancing potential of Artificial Intelligence (AI) when used appropriately, while equally recognizing that the inappropriate application of AI to testing scenarios can result in bias or discriminatory effects on individual test takers.  While international regulation of AI is under active consideration, almost no final laws/regulations are in place to provide the industry with guiding benchmarks for these principles.  Nevertheless, to assist ATP members in achieving accountability in their use of AI systems in testing scenarios, the Association has developed five (5) principles for AI development and adoption, which, taken together, create a framework for constructive, responsible uses of AI systems.  Ultimately, the ATP encourages every testing organization to use AI systems responsibly, in an ethical and trustworthy manner.

The ATP acknowledges the excellent work of the World Health Organization (WHO), the Organization for Economic Co-operation and Development (OECD), the European Commission, and many other organizations, academic institutions, and nation states that are putting forth statements of principle and proposed regulations for the fair and equitable use of AI in process and practice.  The ATP intends to continue to update and maintain the testing industry’s position as new material becomes available and as an international regulatory consensus emerges.

To put these Principles into useful context for the testing industry, it is necessary to distinguish between AI and automated systems that do not rise to the level of AI.  As Director of Google Research, Dr. Peter Norvig, has noted, “AI is all about figuring out what to do when you don't know what to do. Regular programming is about writing instructions for the computer to do what you want it to do, when you do know what you want it to do.  AI is for when you don't.”[1]  Thus, traditional/conventional software that merely automates human decisions, especially by applying pre-determined rules for item development, test delivery, and scoring, should not generally be considered AI.[2]

Based on that key distinction, an AI system is one that perceives its environment and takes actions through “learning, reasoning, or modeling” of data to maximize its chances of success.[3]  Thus, AI adapts to unforeseen circumstances by evaluating potential actions; this ability to evaluate actions is what differentiates AI from conventional computing.  For example, if one is programming a car to drive from A to B, the "conventional computing way" is to program a set of instructions (e.g., "turn left, go forward for five blocks, then turn left again to end up at the fourth house on the left"), which works only for one specific route from A to B.  By comparison, the "AI way" is to program actions (e.g., "turn left", "go forward", "turn right") along with a utility (e.g., "what is the distance to the destination?"), which allows the AI system to adapt its route by evaluating whether each action brings it closer to the destination; the same AI system may also enable these actions to be taken safely by programming "stop" and "slow down" functions so that the car avoids hitting other cars, pedestrians, or objects.
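The driving illustration above can be sketched in a few lines of code. This is a minimal, hypothetical sketch (the action names, grid coordinates, and `drive` function are illustrative, not drawn from any ATP specification): instead of hard-coding one route, the program is given a set of actions and a utility (distance to the destination) and selects whichever action brings it closer.

```python
# The "AI way" sketched above: actions plus a utility, evaluated at each step,
# rather than a fixed instruction list that works for only one route.

def distance(pos, goal):
    """Utility: Manhattan distance from the current position to the destination."""
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

# Available actions, expressed as movements on a simple street grid.
ACTIONS = {
    "go north": (0, 1),
    "go south": (0, -1),
    "go east":  (1, 0),
    "go west":  (-1, 0),
}

def drive(start, goal, max_steps=100):
    """Repeatedly evaluate every action and take the one that minimizes the utility."""
    pos, route = start, []
    for _ in range(max_steps):
        if pos == goal:
            break
        action = min(ACTIONS, key=lambda a: distance(
            (pos[0] + ACTIONS[a][0], pos[1] + ACTIONS[a][1]), goal))
        pos = (pos[0] + ACTIONS[action][0], pos[1] + ACTIONS[action][1])
        route.append(action)
    return route

# The same program adapts to any start/destination pair.
print(drive((0, 0), (2, 1)))
```

The point of the sketch is that no route is ever programmed: the route emerges from evaluating candidate actions against the utility, which is the adaptive behavior the text attributes to AI.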

Consequently, these Principles especially focus on machine learning (ML), a major category of AI algorithms, which enables an AI system to improve through experience.  The process of “learning” happens when the ML system re-evaluates its understanding of its environment by minimizing the discrepancies between its output and the data it knows as the “ground truth.”  For example, when using AI to implement an email spam filter, one would first train the ML system to extract facts from emails (e.g., the sender of the email, the number of recipients, the subject line, the content, attachments). In application, the ML system would then find the optimal combinations of known facts that allow it to mark emails as spam in a way that matches the original labels in the training data as closely as possible.
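The spam-filter example above can be made concrete with a toy classifier. This is an illustrative sketch only (the training emails, labels, and function names are invented for the example, and the word-count scoring stands in for the far more sophisticated models used in practice): the system learns from labeled "ground truth" examples which facts best reproduce the labels, rather than following a hand-written rule.

```python
# A toy version of the training loop described above: extract simple facts
# (here, words) from labeled emails, then classify new emails by how well
# their facts match the patterns learned from the "ground truth" labels.
from collections import Counter

def tokens(email):
    return email.lower().split()

def train(emails, labels):
    """Count how often each word appears in spam versus non-spam training emails."""
    spam_words, ham_words = Counter(), Counter()
    for email, label in zip(emails, labels):
        (spam_words if label == "spam" else ham_words).update(tokens(email))
    return spam_words, ham_words

def classify(email, spam_words, ham_words):
    """Mark the email as spam if its words carry more spam than non-spam evidence."""
    score = sum(spam_words[w] - ham_words[w] for w in tokens(email))
    return "spam" if score > 0 else "not spam"

training_emails = ["win a free prize now", "meeting agenda attached",
                   "free prize claim now", "project status update"]
training_labels = ["spam", "not spam", "spam", "not spam"]

model = train(training_emails, training_labels)
print(classify("claim your free prize", *model))  # resembles the labeled spam
```

Nothing in `classify` was written by hand to recognize any particular email; the behavior comes entirely from the labeled training data, which is the "learning" the text describes.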

On the other hand, automated decision-making is generally NOT AI, because the system is merely a computer program automating a human function or set of functions using a predetermined algorithm.  For example, in scoring a test, the automated system is built to use the scoring key exactly the same way a human scorer would use it; there is no learning, nor any adaptation of facts, to reach the desired outcomes.
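By contrast with the learning example, the automated scoring described above reduces to a fixed rule. In this sketch (the item identifiers and answer key are hypothetical), the program applies a predetermined key exactly as a human scorer would, and its behavior never changes with experience:

```python
# Automated decision-making, not AI: a predetermined scoring key applied
# mechanically, with no learning and no adaptation.

SCORING_KEY = {"Q1": "B", "Q2": "D", "Q3": "A"}

def score(responses):
    """Award one point per response that exactly matches the key."""
    return sum(1 for item, correct in SCORING_KEY.items()
               if responses.get(item) == correct)

print(score({"Q1": "B", "Q2": "C", "Q3": "A"}))  # → 2
```

However many tests this program scores, its output for a given response sheet is always the same, which is precisely why it does not rise to the level of AI under the distinction drawn above.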

These high-level Principles are intended to encourage accountability and prudent self-governance by individual testing organizations. They also serve to discourage the carte blanche deployment of AI systems and technologies that lack rigorous validation, offer vaguely defined metrics for detecting harmful effects, or provide insufficient documentation critical for third-party system audits and post-deployment monitoring.

With this definition, scope, and these illustrations in mind, the ATP AI guidance consists of the following Principles:

  • Transparency.  The use of AI techniques at any point in the test development, delivery, and administration/scoring lifecycle (including determining test taker integrity) must be openly disclosed to all stakeholders, including test takers, and, where appropriate, the AI techniques should be capable of inspection by knowledgeable third parties, whether for legislative, regulatory, or scientific purposes.[4]  In this context, a testing organization is accountable for the accuracy of its disclosure(s), and such transparent disclosure should be designed to explain the use of AI in order to establish trust with its customers/stakeholders, as well as compliance with applicable privacy laws/regulations.
  • Human-In-the-Loop (HIL).  The use of AI techniques for test development, administration/delivery, integrity, and scoring is a means to support the holistic evaluation of a scenario, not a replacement for trained, qualified, or licensed individuals in arriving at an outcome.  AI techniques are able to provide predictive insight and prescriptive actions (next best action, etc.), but should not operate autonomously based on prescriptive conclusions; there should always be reliance on, or access to, human-in-the-loop input at one or more key points in the testing process as a means of challenging inaccuracy, ensuring fairness, and preventing bias/discrimination in the AI system.
  • Balanced Utilization.  A testing organization that includes AI techniques in its implementation process[5] should consider carefully whether there is evidence of bias/discrimination that must be ameliorated or minimized.  The organization should equally consider whether and to what extent it is possible to offset that risk by adopting procedures that allow the participant to “opt out” and/or by providing an alternative method of testing or test delivery.  Similarly, the testing organization should consider whether and to what extent any AI techniques are truly data-driven, especially in using test takers’ personal data to reach decisions.  It is through such careful examination and balancing mechanisms that the testing organization is able to evaluate and manage any risks of using AI.
  • Fair and Unbiased.  AI techniques utilized in test processing (including development, delivery, and scoring) must be universally available to all parties, without discrimination in participation or in results.  Where a test is intentionally designed to measure some skill which may be discriminatory[6], it would be inappropriate for an AI system to amplify or introduce unintended bias or discrimination.  Critically, however, testing organizations should be clear that this AI Principle is not equivalent to the psychometric principles of validity, reliability, and fairness (i.e., the Standards for Educational and Psychological Testing (2014)) applicable to every assessment; while the terms sound or appear similar, and their effects on testing outcomes are both important, this Principle deals exclusively with AI systems and must not be confused or commingled with psychometrics.
  • Responsible Custodians.  A testing organization must act responsibly to assure that AI techniques utilized within the testing process operate as part of a holistic solution, are documented through appropriate research establishing their fairness, and are secure and auditable.  Moreover, the AI system’s consumption and retention of data, including reliance on personal data of test takers, must align with jurisdictional regulations and statutes.  Data retention and processing must be continuously assessed for risk, compliance, and remediation needs.  Additionally, responsible use of an AI system by the testing organization requires remediation and re-documentation if bias/discrimination is detected, whether prior to use or subsequent to its introduction.  Responsibility also requires the developer of an AI system to cooperate with entities that implement the AI system, to ensure that both organizations have access to relevant information about the AI system in making decisions about its use.


The ATP urges every testing organization to integrate these Principles into the planning, development, and deployment phases of the technical lifecycle of AI in testing.  Aligning every AI system to these Principles will yield a process that is reliable, replicable, and scalable across a variety of testing programs.  Equally important, reliance on these Principles will provide a testing organization with a foundation for complying with any AI regulations that are eventually adopted.

[1]Talati, A. (2018, September 12). CS 6601 Artificial Intelligence. Retrieved from Subtitles To Transcripts:

[2] Council of the European Union, Interinstitutional File: 2021/0106(COD), Doc. No. 8115/20, Presidency compromise text to Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts (Nov. 29, 2021) (hereinafter the “Compromise Text”).

[3] Id.

[4] Generally, AI models and techniques can be implemented in two fashions: either deterministic, where the implementation does not change based on subsequent events, or dynamic, where the implementation continues to self-modify through subsequent events.  

[5] A test method that has been developed utilizing AI techniques, but deployed in a predictive and deterministic fashion quite probably does not have an alternative method that can be implemented for an “opt out” scenario.  Moreover, if the AI system does not employ personal data, the privacy implications for test takers may be minimal or even non-existent.

[6] A test may be discriminatory by design.  For example, a test that depends on visual stimuli would not be available to a visually impaired individual, and no alternative may exist; that testing stream is inaccessible to that individual by design.