The Spectrum of Openness
There is no easy binary that opposes “open” and “closed” in the case of artificial intelligence (AI) models. Instead, openness should be viewed as a spectrum.1 This more flexible understanding of openness fosters productive conversations about both the varied benefits of openness and the marginal risks associated with open models relative to closed models or what is already publicly available online.2
“Open-source AI” is a term whose definition is still in flux and is likely to be defined differently by different stakeholders. The Open Source Initiative (OSI), a nonprofit that advocates for the benefits of open source and acts as a multistakeholder standards body that maintains a widely used definition of open source software as a public benefit, released its first definition of the term “open source AI” in October 2024.3 The OSI definition clarifies that to be open, an entire AI system must be considered, both in its “fully functional structure and its discrete structural elements.” OSI further notes that “the requirements are the same, whether applied to a system, a model, weights and parameters, or other structural elements.”4
Efforts such as OSI’s to align on a specific, criteria-based definition of open-source AI are helpful, as different actors currently use the term in different, often self-serving, ways. In a 2023 paper, David Gray Widder, Sarah West, and Meredith Whittaker described the phenomenon of “openwashing,” in which a model developer misleadingly claims the mantle of openness for public relations gains while actually providing access to their model in a way that “should be understood as ‘closed.’”5 They suggest that models fall on “gradients of openness” in which the term “open” can describe models that “offer vastly differing levels of access.”
The authors offer three attributes for understanding the openness of models: (1) transparency, (2) reusability, and (3) extensibility. Transparency denotes “the ability to access and vet source code, documentation and data;” reusability is “the ability and licensing needed to allow third parties to reuse source code and/or data;” and extensibility is “the ability to build on top of extant off-the-shelf models, ‘tuning’ them for one or another specific purpose.”6 These attributes are a useful framework for examining a model’s software components.
These attributes also highlight many of the same principles required by OSI’s definition. For OSI, an open-source AI system must allow users to use it “for any purpose and without having to ask permission,” to study how the system works, to modify it, and to share it “with or without modification.” These overlaps suggest a growing agreement around the term “open,” particularly the need to include transparency, access, and modification in the definition.
Relevant to the ongoing discussions about how to define open models is the question of access to training data. Some models describe themselves as open and provide code and model weights but do not provide access to the data used to train the model. A group of scholars recently suggested using the term “open-access AI” in this context, arguing that “‘open-source AI’ is a misnomer for such models” due to “meaningful differences in access, control, and development.”7 OSI’s definition similarly regards access to training data as an essential test in determining whether or not a model is truly open source.
We believe there are at least five key ways in which a model manifests openness, whether it is a large foundation model or a more narrowly tailored one:
- Open code that can be downloaded, modified, shared, and used by others;
- Open licenses that allow third parties to use the model;
- Transparency about model inputs (data sources, model weights8);
- Transparency about envisioned threats from models and ways to mitigate undesirable downstream effects (e.g., malicious actors fine-tuning the model to cause clear harms); and
- Open standards for interconnection and communication among AI models that allow people and companies to switch between models (portability) and for models to interoperate with one another.
To illustrate the concept of a spectrum of openness, we offer a simplified breakdown with examples. We have chosen this range of attributes as an exercise in illustratively drawing the line between models, recognizing that the spectrum could consist of many more attributes and reflect greater nuance.
The exercise of defining open-source AI, or even placing AI models along such a spectrum, demonstrates that the emerging requirements for an AI model to be considered open are similar to those in earlier free and open-source software projects. To examine, as OSI suggests, the entire system of an AI, we must investigate more attributes than simply the code or the weights. We must think more broadly of models as software projects. The history of open-source software is full of instructive examples of how to (and not to) structure and maintain large software projects, and it is also full of examples of unintended consequences that have shaped tech.
The term “open source” is used throughout this report in a way that includes consideration of all software licensing that meets both the Free Software Foundation’s definition of free software9 and the Open Source Initiative’s “Open Source Definition” (OSD).10 We have chosen to use “open-source” to refer to code as it resonates with the current discourse around open models and not because of a particular preference or recommendation for existing open software licenses. We use the term “open model” throughout this report to echo that discussion but recognize that the lexicon around AI and openness is changing and may ultimately need more terminology, like “open access,” to meaningfully distinguish among model types in the future.
Much of the prevailing discourse around open models focuses on risks and fails to fully account for the significant societal benefits of open models to public transparency and accountability, to unexpected innovation and competition, to education and research, and to security. The following sections explore each of these benefits in further detail.
Citations
- “Several speakers challenged the notion of a binary between ‘open’ and ‘closed’ models, pointing toward a spectrum of options regarding the level of access to system components such as datasets, code, model cards, and model weights.” Amanda Leal, Towards Effective Governance of Foundation Models and Generative AI: Takeaways from the Fifth Edition of The Athens Roundtable on AI and the Rule of Law (Future Society, March 2024), 32. See also Dual-Use Foundation Models.
- “One thing we have already learned is the importance of focusing on the marginal or differential risks and benefits of open weights. For example, we need to measure the risks of open-weight models relative to the risks that already exist today from widely-available information, or from closed models. We have also been encouraged to hear that this is not a binary choice of ‘open’ vs. ‘closed.’ Rather there is a broader ‘gradient of openness’ that we need to consider and that may offer broader options for policy.” Alan Davidson, “National Security and Open Weight Models: Remarks of Alan Davidson,” National Telecommunications and Information Administration, March 22, 2024.
- “The Open Source AI Definition 1.0,” Open Source Initiative, October 28, 2024.
- “The Open Source AI Definition 1.0.”
- David Gray Widder, Sarah West, and Meredith Whittaker, “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI,” SSRN, August 17, 2023.
- Widder, West, and Whittaker, “Open (For Business).”
- Parth Nobel, Alan Z. Rozenshtein, and Chinmayi Sharma, “Open-Access AI: Lessons From Open-Source Software,” Lawfare, October 25, 2024.
- Model weights are the numerical values an AI model assigns to a piece of information to show the relative strength of the connection between it and another piece of information.
- “What is Free Software? The Free Software Definition,” Free Software Foundation, December 27, 2016.
- “The Open Source Definition,” Open Source Initiative, March 22, 2007.