Nat Meysenburg
Technologist, Open Technology Institute
Last Monday, the White House released an executive order that instructs the government to play a larger role in advancing “responsible global technical standards for AI development.” The very next day, Mozilla posted an open letter signed by dozens of technologists, researchers, and others in the artificial intelligence (AI) field. This “Joint Statement on AI Safety and Openness” argues that in order “to mitigate current and future harms from AI systems, we need to embrace openness, transparency, and broad access.” This letter is the latest in an ongoing debate about open versus closed architectures for AI models, a topic increasingly central to discussions about securing AI, as well as those about regulating it.
Open source AI models might pose different risks than their closed-platform counterparts, but those risks need to be weighed against the many potential benefits of using open models. Much of the innovation we will see in AI is a function of the insight and creativity that comes from mixing open source code with use cases its original developers did not imagine. And while open-sourced models are available for bad actors to abuse, popular models can build on one of open source’s greatest strengths: the possibility that they can be vetted and improved by a broader range of security experts and developers—or, as the open letter puts it, “the idea that tight and proprietary control of foundational AI models is the only path to protecting us from society-scale harm is naive at best, dangerous at worst.” As legislative bodies around the world—including in the United States—consider various approaches to AI governance, we need to make sure we’re being thoughtful in how we address concerns without limiting the potential of open source AI models.
To explore what it means to open source an AI model, and how these models might be used responsibly, we’ll dig into some of the technical aspects of Meta’s large language model (LLM), known as LLaMA.
An LLM is part of a subset of AI known as generative AI. Broadly speaking, generative AI refers to technologies that are trained on very large amounts of data and use that background information to generate similar content. Generative AI can be used to make images, music, and—as in the case of ChatGPT and many other LLMs—text. In this case, open sourcing an LLM means making the language model itself available for download, along with software that developers can use to work with the model.
Building this sort of large, general-use model is expensive. Meta lists this among its reasons for releasing LLaMA: the “compute costs of pretraining LLMs remains prohibitively expensive for small organizations.” Meta also recognizes that the costs are not only monetary, and that more people training LLMs “would increase the carbon footprint of the sector.” And they have a point: According to Meta’s own estimates, training version 2 of LLaMA consumed an enormous amount of electricity and produced hundreds of tons of carbon emissions.
When LLaMA 2 was released earlier this year, Meta published an accompanying Responsible Use Guide. The guide offers developers using LLaMA 2 for their LLM-powered project “common approaches to building responsibly.” Reading the guide, one notices two things. First, Meta is placing a huge amount of trust in downstream developers to behave responsibly. And second, even developers with the best intentions may struggle to act responsibly. The Responsible Use Guide warns developers that they are making “decisions that shape the objectives and functionality” of their LLM project in ways that “can introduce potential risks.” It instructs them to take care to “examine each layer of a product” so that they can determine where those risks might arise.
Overall, while the Responsible Use Guide offers a good discussion of development considerations, it is also frustratingly unspecific. It could benefit from discussing hypothetical use cases as a way to explore how a team might approach these issues. And even though there is a deep technical discussion in the accompanying white paper, the Responsible Use Guide could be made more technical—or at least do more to point toward the technical documentation most relevant to responsible use.
Still, Meta’s guide implicitly demonstrates that it takes a great deal of forethought, planning, and testing to use AI responsibly. Even with all that effort, things might still go wrong. This is likely why the guide spends so much time on the need to develop mechanisms for users to report problems (e.g., a button to push when the AI generates troubling content) and to set up teams that react to those reports when they come in.
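A reporting mechanism of the kind the guide describes can start out very simply. The sketch below is purely illustrative (none of these class or function names come from Meta's tooling): a "report this response" hook that records troubling outputs in a queue for a human review team.

```python
# A minimal, hypothetical sketch of a user-reporting mechanism: flagged
# model outputs are stored for later human review. All names here are
# illustrative placeholders, not part of LLaMA or Meta's tooling.

from dataclasses import dataclass, field

@dataclass
class ProblemReport:
    prompt: str       # what the user asked
    response: str     # what the model generated
    reason: str       # why the user flagged it

@dataclass
class ReviewQueue:
    reports: list = field(default_factory=list)

    def report(self, prompt: str, response: str, reason: str) -> None:
        """Called when a user presses the 'report this response' button."""
        self.reports.append(ProblemReport(prompt, response, reason))

    def pending(self) -> int:
        """How many reports are waiting for the review team."""
        return len(self.reports)

queue = ReviewQueue()
queue.report("a user prompt", "a troubling answer", "violent content")
print(queue.pending())  # 1
```

The interesting design work is not in the queue itself but in staffing the team that drains it, which is exactly the organizational commitment the guide asks of developers.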
To better appreciate the complexity that goes into this decision making, it helps to understand some of LLaMA’s layered structure: how some of its layers interact, and how each layer can be fine-tuned in ways that may affect other layers. LLaMA’s base layer is its foundation model, a huge pretrained and pre-tuned (mostly) English language model. The model is remarkably large, with two trillion tokens (AI jargon for the small chunks of text a model processes) used in its training. Meta distributes three sizes of the model, each with a different parameter count (7B, 13B, 70B). In AI neural networks, parameters are the numerical values that encode relationships learned from the training data—for instance, the strength of the connections between tokens. A higher parameter count means the model can represent more connections between its tokens. Pretraining, as the guide puts it, is the process in which “a model builds its understanding of the statistical patterns across the sample of human language contained in its training data.” The foundation model is what powers the AI’s ability to understand prompts and generate human-sounding content.
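To make the jargon concrete, the toy sketch below counts the parameters in a miniature fully connected network. Every weight and bias is one parameter; LLaMA 2's sizes (7B, 13B, 70B) are counts of exactly this kind of value, just billions of them (and arranged in a transformer architecture rather than the simple network assumed here).

```python
# Toy illustration of what a "parameter count" measures in a neural
# network. This is a simple fully connected network, not LLaMA's actual
# architecture; the point is only that parameters are countable numbers.

def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Each adjacent pair of layers contributes one weight per
    input-output connection plus one bias per output neuron.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights plus biases
    return total

# A miniature network: 8 inputs -> 16 hidden neurons -> 4 outputs.
print(count_parameters([8, 16, 4]))  # (8*16 + 16) + (16*4 + 4) = 212
```

Scaling the same counting exercise up to billions of values gives a feel for why training and even storing these models is expensive.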
LLaMA allows developers to further pretrain their LLaMA instance with domain-specific information. For example, if a team of developers is building an AI to help prospective students compare information about different colleges, they might need to add a large body of data about colleges. The foundation model is used to understand the questions and craft answers, but the answers are further shaped by all the extra data LLaMA is trained on. In this example it’s college data, but it could be information about anything from music discographies to catalogs of auto parts.
After further training, developers can also write rules that prohibit certain types of questions from being asked and prevent the AI from giving certain kinds of answers. This can be done either via simple rules (e.g., instructing the model that a list of words is offensive), or by training the AI on other texts. To be sure, there are other places where developers will make tuning decisions, but the interaction of these three layers illustrates the complexity of fine-tuning AI.
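The "simple rules" variant of this layer can be as basic as a word-list check applied to prompts before they ever reach the model. The sketch below is a hedged illustration; the blocked terms and function name are placeholders, not anything from Meta's tooling.

```python
# A minimal sketch of a rules-based input filter: reject prompts that
# contain any term on a blocked list, before the model sees them.
# BLOCKED_TERMS is a stand-in for a real offensive-language list.

BLOCKED_TERMS = {"badword", "slur"}

def passes_input_filter(prompt: str) -> bool:
    """Return True if the prompt contains no blocked term."""
    words = {word.strip(".,!?").lower() for word in prompt.split()}
    return not (words & BLOCKED_TERMS)

print(passes_input_filter("Which colleges offer biology degrees?"))  # True
print(passes_input_filter("Tell me a badword joke"))                 # False
```

Rules like this are easy to write and easy to evade, which is why the guide also points toward the heavier option of training the model itself on additional texts.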
Returning to the earlier example of a team building an AI that compares college data for prospective students, let’s say this team has a deep commitment to making sure their chatbot cannot be coaxed into providing answers that describe or recommend violent behavior. The team could attempt to remove all violent language from the foundation model (because it’s open source, they can attempt to do that). However, if the foundation model doesn’t contain any violent language, how will the AI process a later instruction to filter out violent questions and answers? In other words, if you over-tune the base model, there isn’t enough contextual information left for the AI to understand what violence is. Furthermore, filtering at those later stages is deeply context-specific. For example, developers of a chatbot designed to help with HR functions would presumably exclude descriptions of body parts that would be completely appropriate for an AI chatbot designed to answer medical questions.
With its Responsible Use Guide, Meta is relying on development teams to not only envision the positive ways their AI system can be used, but to understand how it might be abused, attempt to abuse it themselves in the testing process, and mitigate against that abuse. It is a lot to ask, but it is not too much. There are plenty of examples of security flaws rooted in a lack of care or attention to security and abuse during the development process. The guide shows that Meta is thinking deeply about how LLaMA can be abused—but it should do more than ask developers to be attentive and thoughtful in the face of complexity, and should provide more specific guidance for developers as they set out to build with AI.
Creating an ecosystem of developers building AI for their niche use cases is not only one of the potential benefits of open source AI—it also points to an important element of responsible use. Meta’s guide makes only a brief, one-paragraph mention of defining use cases, but a well-understood use case may be one of the best tools available for developers interested in preventing the abuse of their AI projects. One of the risks of big LLMs is that in trying to do (or know) too much, they become easier to trick or more likely to generate convincing nonsense. By clearly scoping what the AI is doing, developers can shrink the amount of downstream misuse they have to consider, including privacy risks that could result from overbroad data collection. Returning to the college data example: if the AI is trained just on that data and is supposed to answer only questions about higher education, any prompt asking about anything else can be rejected with a “that’s not a college data question” response. There’s still plenty of room for mischief within prompts about higher ed, but the scope of the problem likely doesn’t include thinking through scenarios like the AI producing chlorine gas recipes. Thinking of all the things that could go wrong when using a model as powerful as LLaMA is largely impossible, but tightly defining a use case for the AI can meaningfully limit downstream harms.
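Use-case scoping can be sketched in a few lines. The example below is hypothetical: the keyword list is a crude stand-in for a real topic classifier, and the model call is a placeholder, but it shows how an out-of-scope prompt gets turned away before the model can do anything with it.

```python
# A hedged sketch of use-case scoping for the hypothetical college-data
# chatbot: reject any prompt that does not look like a higher-education
# question. The keyword check and model-call placeholder are illustrative;
# a production system would use a proper intent classifier.

COLLEGE_KEYWORDS = {"college", "university", "tuition", "admissions",
                    "degree", "campus", "enrollment", "financial aid"}

def generate_college_answer(prompt: str) -> str:
    # Placeholder for the actual call into the fine-tuned model.
    return "[model answer about: " + prompt + "]"

def answer(prompt: str) -> str:
    text = prompt.lower()
    if not any(keyword in text for keyword in COLLEGE_KEYWORDS):
        return "That's not a college data question."
    return generate_college_answer(prompt)

print(answer("How do I make chlorine gas?"))
print(answer("What is tuition at a state university?"))
```

The payoff is exactly the one described above: entire categories of misuse never have to be reasoned about, because the out-of-scope prompt is rejected at the front door.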
Beyond LLaMA, limiting the abuse of tools built on open source AI models will look different than doing so on large closed models like ChatGPT—but different need not mean worse. Open source models give developers options that are not possible with models trying to do everything for everyone. And it is in those places—AI doing specific things for some people—where we might see some of AI’s most profound positive impacts.