Defining Data Governance
The term “data” refers to information created, processed, saved, and stored digitally by a computer in ones and zeros, or binary format. Network connections or devices allow this data to be transferred from one computer to another. A distinction also needs to be drawn between “data” (machine-readable ones and zeros, or “code”) and “information” (what that data means to humans).[1] Such data and information can have different implications depending on their type (e.g., whether they pertain to finance, health, social media, or law enforcement).
Based on these definitions and distinctions, we generally define data governance as the rules for how governments interact with the private sector, as well as with other governments, when it comes to managing data to determine who has access to it and the ways in which those with access can use it. As previously articulated, this includes the design and enforcement of standards, policies, and laws.
We understand that the term “data governance” has many different meanings depending on the context and the perspective of various stakeholders. For the purposes of our roundtable, our goal was to have a structured discussion about data governance as it applies to the following three issues (in no particular order):
- National security/law enforcement: a government’s interest in ensuring access to data for purposes of domestic and international security; other governments’ converse concerns about misuse of that data; and desires to protect data against foreign collection;
- Economic growth/innovation: objectives to create and access large datasets for research and development of data-intensive technologies like machine learning/artificial intelligence, as well as for cross-border transactions and ecommerce; and
- Content moderation policies and practices: competing demands on what is and is not permissible content, and possible ways to manage that conflict while also ensuring the free flow of data.
Across these three distinct areas, there are different types of tools or “levers” that set the terms around how data is collected, used, transferred, and stored. These levers essentially set the bar for concepts such as “trust” in Prime Minister Abe’s concept of “Data Free Flow with Trust.” Since the concept of trust is quite vague (with significant debate regarding the degree to which regulation even enhances trust),[2] a core objective of this project is to consider how the various levers at play can and should be configured to achieve certain safeguards. We discuss these levers in the next section.
Additionally, the issue of how governments support data flows across borders (or, conversely, how governments restrict those flows) is a major focal point across each of the three aforementioned areas of data governance. The term “data localization,” for example, appears frequently in policy discussions to mean restrictions on the ability of firms to transfer data from domestic sources to foreign countries; in other words, the opposite of free data flow.[3] In reality, the term can have several different meanings, which fall along a spectrum of severity.
On the most permissive side of the spectrum is “mirroring,” where a country requires that a copy of data be stored on a server within that country before it is allowed to be sent out. Partial data localization could mean that restrictions exist only on certain domain names or on data from specific sectors like health or finance. China’s system is stricter than that of many other countries in that the government requires firms to store certain kinds of data on servers inside the country, while allowing transfer in or out under certain conditions. There still appears to be a regulatory gray zone in which multinationals in China can send certain kinds of data outside the country, but the extent to which this will remain the case is unclear, given the significant weight given to national security in Beijing’s approach to data regulation.
In other cases, however, data localization may be implemented in an even stricter manner by requiring local storage and local processing while prohibiting outbound transfer altogether. This could mean foreign firms cannot access and use data to create value outside of that geographic area. Russia and India already take such an approach with some kinds of data (payment data, in India’s case), and other countries are increasingly considering it. But at least for now, with a few exceptions, most governments have yet to implement these stricter forms of data localization.
The above pathways all fall under the “data localization” umbrella of policy options. But localized storage and processing requirements are by no means the only policy option available for limiting free flows of data; countries could also potentially implement some form of algorithmic filtering in order to allow or disallow certain kinds of data, possibly even from certain places, to flow into or out of their borders.[4] This could focus on anything from sensitive personal health information to political online content, depending on factors such as the government’s policy priorities and its technical capabilities.
We discuss the key challenges for enabling cross-border data flows as part of Theme 1 later in this report. Before that, however, we turn in the next section to the “levers” of data governance and their relationships at supranational, national, and subnational levels.
Citations
- This is a distinction one of us previously established in: Robert Morgus and Justin Sherman, “The Idealized Internet vs. Internet Realities” (Washington, DC: New America, 2018).
- Daniel Castro and Eline Chivot, “The GDPR Was Supposed to Boost Consumer Trust. Has it Succeeded?” European Views, June 6, 2019.
- World Trade Report, “How do we prepare for the technology-induced reshaping of trade?” (2018).
- We have both noted, in a range of places and contexts, how hypothetical limitations on the flow of AI-related data around the world (e.g., code for neural networks and training data sets) stand in stark contrast to the current state of AI research, which remains incredibly open. See: Justin Sherman, “U.S. Tech Needs Hard Lines on China,” Foreign Policy, May 3, 2019; Samm Sacks, “Smart Competition: Adapting U.S. Strategy Toward China at 40 Years,” Testimony before the House Foreign Affairs Committee, May 8, 2019; and Justin Sherman, “The Pitfalls of Trying to Curb Artificial Intelligence Exports,” World Politics Review, June 6, 2019.