Why Numbers Can Be Neutral but Data Can’t
This month is Mathematics Awareness Month (MAM). Celebrated every April since 1986, MAM seeks to increase the visibility of mathematics as a field of study as well as the public’s understanding of and appreciation for math. A different theme is chosen each year, and this year’s theme was the future of prediction: exploring how mathematics and statistics enable us to make predictions–about the weather, the spread of disease, and even which students are at risk of not succeeding in college. This year’s celebration also asked participants to contemplate the role math will play in uncovering novel predictions in the years to come.
Math and numbers are ubiquitous in our daily lives. We use them, for example, to figure out whether it’s cheaper to order takeout or prepare a meal at home, or to calculate how much to leave for a 15 percent tip. This might be why it’s so easy to conflate mathematics, and its inherent use of numbers, with data about people. A number is an arithmetical value, expressed as a word, symbol, or figure, representing a particular quantity. Data, on the other hand, is information expressed as numbers. And numbers and math are related, as mathematics is the abstract science of numbers, quantity, and space.
Our use of “big data,” or extremely large data sets, has only added to a numbers-driven world. Netflix and Amazon are two examples of companies that use big data to market their products to users’ individual interests. MAM dedicating its theme to big data in our everyday lives–both the opportunities and dangers–speaks to this phenomenon. Higher education has not been immune to this trend. In fact, the field has been dominated by discussion about how big data can best be used to improve outcomes for institutions and other stakeholders and help students meet their goals.
Higher education is an avid proponent of using and appreciating mathematics–virtually every college campus offers a major or minor in mathematics, and mathematics is a requirement for completing most academic programs. Even so, higher education administrators, faculty, advisors, and consultants aren’t exempt from this proclivity to confuse mathematics, numbers, and data. This is especially true in how the field communicates these concepts. A data awareness month might be useful if it helped higher education parse out the key differences among them.
Poor Communication
Data is often referred to and seen as neutral. It is also often believed that what matters is how data is used, such as using student data to ensure that students–or bunnies–swim and do not drown. A similar sentiment has been expressed about education technology, which often requires using or gathering student data to ensure students’ success. Take, for instance, Condoleezza Rice, who said at a recent edtech conference:
“Technology is neutral. It’s not good or bad. It’s how it is applied that matters.”
The truth is, data–and education technology for that matter–are not and cannot be neutral. While often presented as numbers, and derived from applied mathematics–such as statistics–data are not synonymous with numbers or with math. A data scientist, analyst, or statistician can make computations using data (applied mathematics) and present data about people as numbers (mathematical symbols or objects created in abstraction), but this doesn’t make data about people neutral. What differentiates data from numbers is that numbers are mathematical abstractions: ideas. Because numbers are symbols or objects used in math, they can be neutral. But data, originating from the real world and real people, cannot.
Higher education administrators, faculty, advisors, and consultants poorly communicate what data is, and how applied mathematics–like statistics–is helpful in analyzing data to solve or understand problems in higher education. Attempting to simplify education jargon, and the term data-driven in particular, an NPR story highlighted its use of a text editor that restricts you to the 1,000 most common words in the English language. The resulting definition for data-driven was:
“We should decide things using numbers.”
Here again data and numbers are treated as synonymous when they are not. While it may seem insignificant, any attempt to codify commonly used language affects the way we understand and communicate what we as practitioners, policy makers, and others do. Thus, any effort built on a faulty understanding and loose diction can be detrimental.
As a final example, in an account of how using data propelled a university to realize it needed to change its advising strategy, one administrator conflates numbers and data. He reportedly stated:
“All of a sudden we’re talking about real numbers.”
It’s unclear if unreal numbers even exist. Imaginary numbers, yes. But unreal numbers when those numbers represent students? It’s more likely that what was meant by real was that the data was objective, neutral. But, as I’ve said before, this isn’t possible.
The power of data in higher education isn’t what’s in question: using data appropriately can result in impressive gains for a college and its students. This same university saw its fall-to-spring retention for first-time freshmen and sophomores surpass 90 percent, an increase of 3.4 percentage points from the previous year. However, it is precisely because of data’s power and promise to help institutions move the needle on student success that fully understanding what data is, and properly communicating about it, is essential.
Remedy 1: Rethink Data
How can we avoid conceptualizing and communicating about data as an abstraction, purely numerical in nature? One way is to acknowledge that when we communicate and share numbers, we are often really talking about data on people, systems, and norms, none of which are abstractions or neutral. This way of thinking about data is best seen in a discussion about using data to measure social impact from Acumen, a non-profit that raises charitable donations to invest in companies, leaders, and ideas that are changing the way the world tackles poverty. The authors state:
“It’s all too easy to forget that data is about human beings and their behaviors. Data is not an abstraction. The social development sector is prone to forgetting this. We often collect data with little regard for the people behind the numbers.”
“Data encodes the stories of our lives, capturing not only our tastes and interests but also our hopes and fears. Data isn’t an abstract idea or a set of numbers or qualitative responses. It can be and is, ultimately, human.”
In a similar vein, at a recent discussion on governments’ reliance on data-driven risk assessments and the human rights implications of this trend, Helen Nissenbaum, a professor at New York University, said:
“We talk of data as if it’s raw fuel of algorithmic analysis. It’s collected with a purpose and not unbiased.”
Although the discussion was about the implications of predictive analytics for human rights, this same caution can and should be applied to higher education. This is especially true as the use of predictive analytics and other algorithm-based tools continues to gain momentum in academia.
Experts working in education have expressed similar views. For example, Mimi Onuoha, a fellow at Data & Society, a research institute focused on social, cultural, and ethical issues arising from data-centric technological development, wrote:
“Every data set involving people implies subjects and objects, those who collect and those who make up the collected. It is imperative to remember that on both sides we have human beings.”
A research analyst at Data & Society pointedly wrote:
“The assumption of ‘objective data’ frees people from acknowledging structural inequality.”
Has the field asked directly what underlying forces make higher education the best equipped to understand, and yet possibly more inclined to forget, that the common denominator in data is people?
Remedy 2: Rethink Math Education
In addition to a new way of thinking about data, higher education might also need to take seriously the appeals to make undergraduate math education more applicable to real-world problems. By doing so, we might not only reduce math as an early stumbling block for students in their college careers, but also equip the next generation of leaders with a more nuanced understanding of what data is, where it comes from, and ways to use and analyze it, along with the ability to communicate about it. This includes the many ways we can mishandle data at each stage.
There have been national calls to make math more relevant and overhaul how it is taught on college campuses–away from the abstract to the practical. Transforming Post-Secondary Education in Mathematics (TPSE Math) is a project begun by nationally recognized mathematics education leaders in 2011 to push for this change. Among the things it calls for is an entry-level math course that is relevant to the career goals and interests of every student at every college.
TPSE Math isn’t alone, however, in calling for math not only to be taught better, but also to teach the relevant skills students will need to contribute to solving real-world problems. Andrew Hacker, a professor emeritus at Queens College of the City University of New York, draws a distinction between math and arithmetic and says that colleges should focus on upgrading the latter and teaching it better. Math means algebra, trigonometry, and calculus, all part of what he calls the “enigmatic orbit of abstractions.” And for Hacker, arithmetic is the quantitative literacy that people actually need. Hacker has gone as far as to say that students, educators, and the like should learn to be skeptical about numbers, especially when they’re situated in the real world. This dovetails with many others’ understanding that data has its imperfections, one being its non-neutral nature.
Remedy 3: Convene and Train
Changing the way math is taught at the undergraduate level is a policy change that will understandably take, among other things, time, effort, resources, organizing of key stakeholders, and political will. At the very least, however, the field could devise ways to bring together institutional leaders, policymakers, and anyone who makes decisions using data to engage in conversation and trainings on the true essence of data–where it comes from, and the ways it’s analyzed. The goal would be to help the field remember that data doesn’t exist in the abstract, but in the real world.
An initial step could even be to critically examine what others–outlined earlier–have already said on the subject. And ultimately, convenings and trainings could keep our conversations about data grounded closer to its origin: among people, institutions, systems, norms, and values. We might then be able to strengthen our understanding, communication, and transparency around how we collect, analyze, interpret, and communicate data with these same people, processes, and structures in mind.