Part 3: Policy, Regulation & Legislation
Mapping the AI Terrain: Why Policymakers Must Differentiate to Regulate
Executive Summary
A common pitfall for policymakers trying to make AI safe is to treat AI as a single entity, or as one new feature that has emerged in the software industry. This is not reality. AI is evolving, and each generation has produced different model types that are trained on distinct data and yield distinct outcomes. This chapter presents a framework policymakers can use to reason about AI: it identifies the major model types and, for each one, describes the data used to create the model and its output, that is, what the model is capable of doing.
It is a fundamental principle that to control something, you must understand it. The goal of this chapter is to provide a working understanding of AI, without descending into unnecessary technical detail, to support more informed policymaking. This will not be a technical discussion; the text in this chapter will be understandable to anyone familiar with basic AI terminology.
The Three Waves of AI
Artificial Intelligence (AI) is progressing in waves. Before ChatGPT burst onto the scene, Prescriptive AI was mainstream. It uses structured data (typically just numbers) to generate a basic prediction. The next wave, Generative AI, unlocked the information found in unstructured data (usually documents). It can generate new content in a way that gives the appearance of genuine thought. This was a big step forward in the pursuit of Artificial General Intelligence (AGI), and a significant new challenge for policymakers. Today, we are on the threshold of Agentic AI, which allows AI models to plan and take action: an even bigger step toward AGI, and an even scarier use of AI for policymakers. Notably, unlike ocean waves, which crash into the shore and are gone forever, these three flavors of AI will coexist indefinitely. There will always be prescriptive AI for basic predictions, generative AI will continue to be used for interpreting unstructured data, and agentic AI will be employed for advanced reasoning and taking action.
Classifying AI into three waves is more than an organizational exercise. Each flavor of AI uses different data during training, produces different outcomes during inference, and relies on very different models. Most importantly, each form of AI can misbehave in very different ways. Consequently, policymakers need to understand precisely what they are trying to control. A collection of policies that does not consider all three waves, with their unique inputs, outputs, and model types, will fall short.
Wave 1: Prescriptive AI
The first wave of AI was prescriptive AI, also known as traditional AI, which is relatively straightforward. Given an input to a model, a prediction is made. The prediction could be a continuous value (regression) or an assignment of the input to a category or class (classification). These capabilities are useful for tasks like classifying images, sorting emails, and forecasting sales.
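To make the distinction concrete, here is a minimal sketch using scikit-learn (an assumption about tooling; any machine learning library would do) that builds one model of each kind: a regression model that predicts a number and a classification model that predicts a category. The data is invented purely for illustration.

```python
# A minimal sketch of Wave 1 prescriptive AI, assuming scikit-learn is installed.
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict monthly sales (a number) from advertising spend.
ad_spend = [[10], [20], [30], [40]]           # structured input: just numbers
sales = [105, 198, 310, 402]
reg = LinearRegression().fit(ad_spend, sales)
print(reg.predict([[25]]))                    # a single numeric prediction

# Classification: assign an email to a category from two numeric features.
features = [[0, 1], [1, 0], [1, 1], [0, 0]]   # e.g., counts of suspicious words
labels = ["spam", "not spam", "spam", "not spam"]
clf = LogisticRegression().fit(features, labels)
print(clf.predict([[1, 1]]))                  # a single categorical prediction
```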
This first wave brought us neural networks: complex systems of interconnected nodes (neurons) that learn from training data and make predictions based on patterns identified in that data. Neural networks can identify patterns in inputs with hundreds or even thousands of values and can easily outperform models built from hand-coded algorithms. Before neural networks, models were built using pre-built algorithms with clearly coded steps to achieve a specific outcome. If a model created by an algorithm produced an undesirable result, the data and the algorithm could be reviewed to identify and fix the problem.

Unfortunately, similar issues with neural networks cannot be resolved by examining the trained neurons within the network. Their values, known as the parametric memory of the model, are difficult to interpret in relation to the data. This is the transparency problem, and there are two reasons for it. First, a neural network can contain billions of these parameters. Second, they are just numbers that make subtle adjustments to the signals passing through the network. For example, suppose a neural network is used to approve mortgage applications and produces biased results. You cannot examine its parameters and conclude that a few of them caused a loan to be rejected based on the applicant's ethnicity. These parameters are not like lines of code in a computer program that can be easily read and understood.
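To see why parametric memory resists inspection, consider the following sketch, which assumes scikit-learn's MLPClassifier as a tiny stand-in for a neural network. The trained parameters are arrays of raw floating-point numbers; nothing in them explains why any particular prediction was made.

```python
# A sketch of the transparency problem, assuming scikit-learn's MLPClassifier
# as a tiny stand-in for a neural network.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # toy inputs
y = [0, 1, 1, 0]                        # toy labels
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000,
                    random_state=0).fit(X, y)

# The model's "parametric memory": layers of raw floating-point weights.
for layer, weights in enumerate(net.coefs_):
    print(f"layer {layer}: shape {weights.shape}")
    print(weights)   # just numbers; nothing says *why* a prediction was made
```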
Dealing with a neural network's lack of transparency is straightforward. Although the parameters are uninterpretable, they are derived from the data used to train the model, and that data should be fully understood before training begins. Ethnicity, religious beliefs, and sexual orientation should never be used during model training. During training, the model will use every piece of data presented to it to find a pattern between input and output, even when doing so is the wrong thing to do. And even with clean training data, bias can still infiltrate a model, because humans ultimately decide which data to use, and humans are not infallible. Consequently, models that directly impact people should never be given autonomy; there should always be a human in the middle. Returning to the previous example: the neural network could not have drawn a biased conclusion about the mortgage application if ethnicity had not been used during training. And even if the training data is deemed clean, the model should never be the sole criterion for rejecting or approving a loan. A human familiar with these issues should always review outputs that could negatively impact another human.
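Both safeguards can be expressed in a few lines. The sketch below assumes pandas and uses hypothetical column names, a hypothetical model, and a hypothetical review queue; it drops protected attributes before training and routes any adverse decision to a human reviewer.

```python
# A sketch of the two safeguards: clean training data and a human in the middle.
# The column names, model, and review_queue are hypothetical.
import pandas as pd

PROTECTED = ["ethnicity", "religion", "sexual_orientation"]  # never train on these

def prepare_training_data(applications: pd.DataFrame) -> pd.DataFrame:
    # Remove protected attributes so the model cannot learn patterns from them.
    return applications.drop(columns=PROTECTED, errors="ignore")

def decide(application, model, review_queue: list) -> str:
    # Human in the middle: the model's output is advice, never the final word.
    prediction = model.predict([application])[0]
    if prediction == "reject":
        review_queue.append(application)   # a human makes the final call
        return "pending human review"
    return "approved"
```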
Wave 2: Generative AI
The second wave of AI brought us generative AI, which generates new content. This can take the form of answering a question, summarizing a long and complex document, or even holding an entire conversation, at a level of fluency capable of passing the Turing test.
Generative AI introduced large language models (LLMs) to the industry. LLMs are neural networks that are even more complex than those introduced during Wave 1. They are based on the Transformer architecture (encoder/decoder) outlined in the 2017 paper "Attention Is All You Need." Today, cutting-edge LLMs, known as frontier models, from OpenAI (ChatGPT), Google (Gemini), and Anthropic (Claude) have parameter counts in the billions, and they will soon exceed a trillion. Because LLMs are challenging and expensive to train, generative AI has also seen a rise in open-source models and in their adoption by enterprises. Many of the open-source models that enterprises use today as a starting point for their generative AI efforts themselves have billions of parameters.
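For a sense of scale, the sketch below uses the Hugging Face transformers library (an assumption about tooling) to report a model's parameter count. GPT-2 is used here only because it is small enough to download quickly; the open-source models enterprises actually start from are typically thousands of times larger.

```python
# A sketch of inspecting an open-source model's scale, assuming the
# Hugging Face transformers library is installed.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
print(f"{model.num_parameters():,} parameters")   # roughly 124 million for GPT-2
```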
Consider how this new form of AI differs from traditional AI, as the differences will inform policymaking. First, the training input (or training dataset) is different: it is text. Organizations building frontier models scrape the internet to create a massive corpus of text for their training dataset. Enterprises building LLMs for internal use, or to gain a competitive advantage, train them on a custom corpus of documents containing proprietary information and, in many cases, private information about their customers. Another input that can be sent to an LLM is additional text added to the request to help the LLM generate an answer. This process of adding extra text to a user's original request is known as Retrieval Augmented Generation (RAG). It is typically done by enterprises seeking to provide additional knowledge from their custom corpus to the LLM during inference. This helps the LLM generate a more accurate answer and reduces hallucinations.
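RAG is simpler than its name suggests. The sketch below shows the essential pattern: retrieve relevant text, augment the prompt with it, then generate. The retriever here uses naive keyword overlap, and llm() is a hypothetical stand-in for any model API; production systems typically use vector search instead.

```python
# A minimal RAG sketch. retrieve() scores documents by keyword overlap, and
# llm() is a hypothetical stand-in for a real model API.
CORPUS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
]

def llm(prompt: str) -> str:
    return f"[model response to: {prompt[:60]}...]"   # placeholder

def retrieve(question: str, corpus: list, k: int = 1) -> list:
    # Rank documents by how many words they share with the question.
    words = set(question.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, CORPUS))
    # Augment the user's request with retrieved text before inference.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm(prompt)

print(answer("How long do I have to return a purchase?"))
```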
The output is also different. Unlike traditional AI, where the output is a number, the output of generative AI is text. This text is generated from the request sent to the LLM (known as a prompt), the parametric memory of the LLM created during training, and, as stated previously, any documents sent to the LLM during inference (RAG). If bias, hateful text, or personal information leaks into any part of this process, it will appear in the output.
One additional difference should concern policymakers. As stated previously, enterprises commonly use open-source LLMs as a starting point for their generative AI efforts. Open-sourcing an LLM is not the same as open-sourcing code, because of the transparency issue described earlier. When software engineers examine open-source code, they can determine precisely how it behaves. This is not possible with an LLM that has billions of parameters. To aggravate the problem, many organizations that open-source their LLMs do not open-source the training dataset used to create them; they consider it a proprietary secret. As a result, engineers who want to use an open-source LLM cannot review the training dataset to ensure that bias, hate, and personal information have not been trained into the model.
There is one last fact about generative AI that will help us better understand the third AI wave, described in the next section. Generative AI uses what is known as "zero-shot prompting" to get an answer from an LLM. In other words, the LLM is asked to create a response as quickly as possible using only "top of mind" information: information readily accessible from parametric memory plus whatever was sent with the prompt (RAG). If the LLM were a human, this would be like asking it to perform as follows: "Please respond to my request from start to finish in one pass, without using the backspace key, the delete key, or the arrow keys to go back and redo any part of your answer. Furthermore, do not break my request down into smaller tasks, and do not review your reply for accuracy." This is sometimes referred to as asking the LLM to think fast. The third wave of AI changes this by allowing the LLM to think slowly.
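In code, thinking fast is a single call: no decomposition, no self-review. The sketch below uses a hypothetical llm() stand-in for any model API.

```python
# "Thinking fast": one pass, no decomposition, no self-review.
# llm() is a hypothetical stand-in for a real model API.
def llm(prompt: str) -> str:
    return f"[model response to: {prompt[:60]}...]"   # placeholder

def think_fast(request: str) -> str:
    return llm(request)   # whatever comes out first is the final answer

print(think_fast("Summarize our refund policy."))
```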
Wave 3: Agentic AI
Today, we are entering the third wave of AI: agentic AI. Agentic AI systems can plan, take action, and even revise the original plan to improve results. This is made possible by technological advances over the LLMs of Wave 2. These newer LLMs are sometimes referred to as Large Action Models (LAMs) because they can take action. LAMs can be trained to understand APIs, databases, and other AI models, and they have also improved significantly in their ability to write code. In theory, this new AI can access data locked behind APIs and databases and autonomously execute the code it writes in order to take action. While the first two waves of AI focused on making predictions and generating content, respectively, we now have an AI capable of interacting with its environment. Let's delve a little deeper into the internal workings of agentic AI, as this is also a key point of differentiation from the other AIs. Let's start with the thought process.
Think about your own thought process when answering a question. You break the original question down into smaller questions that are easier to answer, put those answers together to form an answer to the original question, and then, just before you speak, review the answer and possibly revise it. Humans think this way, and advances in LAMs now allow agentic AI to think this way as well, a process known as thinking slowly. A LAM used for agentic AI can develop a plan, review the plan for areas of low confidence, and revise it as necessary before taking action. This is important for policymakers, because this process can and should be governed. Next, let's take a closer look at how LAMs interact with their environments.
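A sketch of that plan-review-revise loop appears below, again with a hypothetical llm() standing in for a real model API. The point for policymakers is that each iteration of the loop is an observable step that can be logged, capped, or gated behind human sign-off.

```python
# A sketch of "thinking slowly": plan, critique, revise, then act.
# llm() is a hypothetical stand-in for a real model API.
def llm(prompt: str) -> str:
    return f"[model response to: {prompt[:60]}...]"   # placeholder

def think_slowly(request: str, max_revisions: int = 2) -> str:
    plan = llm(f"Break this request into numbered steps: {request}")
    for _ in range(max_revisions):
        critique = llm(f"Identify low-confidence steps in this plan:\n{plan}")
        if "no issues" in critique.lower():
            break   # the plan survived its own review
        plan = llm(f"Revise the plan to address:\n{critique}\n\nPlan:\n{plan}")
    # Each revision above is a governable checkpoint: log it, cap it, or
    # require human approval before the final step executes.
    return llm(f"Execute this plan step by step:\n{plan}")

print(think_slowly("Refund every order affected by yesterday's outage."))
```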
There are two types of LAMs (or agents) emerging in the industry today: tool-based and code-based. Tool-based LAMs use APIs, databases, and LLMs to pull additional data into the LAM. In many ways, this is similar to retrieval-augmented generation (RAG), which allows an LLM used for generative AI to access real-time knowledge from an organization's document collection. Tool-based agents take action by calling APIs, databases, and LLMs. The tools used differ for each request, based on the prompt sent to the agent, and the agent determines at run time how to orchestrate them. In short, tool-based agents leverage existing enterprise assets to solve problems the LAM could not solve on its own.
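The sketch below illustrates that run-time dispatch. The tool registry and the choose_tool() step are hypothetical; real agent frameworks differ in the details, but the core idea is the same: the model picks a tool per request, and the agent executes the call.

```python
# A sketch of a tool-based agent. Tools wrap existing enterprise assets;
# the registry and choose_tool() are hypothetical.
def get_order_status(order_id: str) -> str:   # wraps an enterprise API
    return f"order {order_id}: shipped"

def query_customers(sql: str) -> str:         # wraps a database
    return "[rows...]"

TOOLS = {"get_order_status": get_order_status,
         "query_customers": query_customers}

def choose_tool(prompt: str):
    # In a real agent, the LAM itself selects the tool and its argument
    # based on the prompt; hard-coded here purely for illustration.
    return "get_order_status", "A-1001"

def agent(prompt: str) -> str:
    name, arg = choose_tool(prompt)   # decided at run time, per request
    return TOOLS[name](arg)           # the "action" is a tool call

print(agent("Where is order A-1001?"))
```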
Code-based agents generate and execute code to carry out the action they are asked to perform, giving the agent all the power of a programming language. There are ways to limit the code that is generated, and it is also possible to restrict the compute cycles used for a given request.
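One common guardrail is to run the generated code in a separate process with a hard timeout, which bounds the compute a single request can consume. The sketch below assumes only Python's standard library; the generated_code string stands in for model output, and real deployments would add OS-level isolation on top.

```python
# A sketch of guardrails for a code-based agent: execute generated code in a
# separate interpreter process with a hard timeout. generated_code is a
# hypothetical example of model output.
import subprocess
import sys

generated_code = "print(sum(range(1000)))"   # pretend the LAM wrote this

def run_sandboxed(code: str, timeout_seconds: int = 5) -> str:
    # The timeout caps compute cycles; exceeding it raises TimeoutExpired.
    # Real deployments add containers, seccomp, and no network access.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout_seconds,
    )
    return result.stdout

print(run_sandboxed(generated_code))
```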
Complex agents will use both of these techniques to accomplish their goals. As of today, little has been done to establish guardrails around these techniques, but they will require attention as agentic AI gains momentum.
Summary
Traditional AI introduced neural networks and the transparency problem, which put the focus on governing the training data and on placing a human between a prediction and the resulting action. Generative AI exacerbated the transparency problem by introducing far larger neural networks, known as large language models (LLMs). Due to the cost and complexity of training LLMs, generative AI has also seen an increase in the use of open-source LLMs with not-so-open training sets. Generative AI also introduced a data path into LLMs during inference: retrieval augmented generation (RAG). Placing a human in the middle of generative AI is impossible due to the nature of the output and its intended use. Finally, agentic AI, still in its early phases, promises even more complications for policymakers. Agentic AI can plan, take action, call APIs and other data sources, and even execute the code it generates. Due to the nature of the problems agentic AI is designed to solve, it will be hard to put a human in the middle; many of these systems are being designed for autonomous behavior.
Policies for AI need to account for the specific type of AI that requires governance and for the unique characteristics of each. As this chapter has shown, not all models are alike: they use different data for training and perform different actions. Additionally, the models themselves may come from the open-source community. Finally, as we saw with agentic AI, implementation details (tools vs. code) matter for policymaking.
© 2026 Keith Pijanowski. All rights reserved.