“What we study is not always what is actually out there. It is often what we’re interested in, or what’s easiest to discover.” –Samuel Arbesman
Samuel Arbesman, a mathematician and network scientist at Harvard, begins The Half Life of Facts – Why Everything We Know Has an Expiration Date, his fun romp through the science of science, with a few cheeky examples of scientific “facts” that have differed depending on the time period. In the first half of the 20th century, it was widely known that a human cell has 48 chromosomes; in the latter half of the century, it became widely known that there are, in fact, only 46.
In Chapter 1, Arbesman walks through the mathematical regularities of the growth and decay of knowledge, aptly using the half-life of radioactive material as a metaphor. In Chapter 2, “The Pace of Discovery,” he provides an enjoyable introduction to scientometrics, beginning with a story about Derek J. de Solla Price, considered the founder of scientometrics, or the “science of science.” Price stacked every issue of a British scientific journal chronologically against the wall of his apartment and realized in an idle moment that the heights of the volumes conformed to a specific shape: an exponential curve. This discovery led Price to focus his research on scientometrics and to publish Little Science, Big Science, in which he calculates doubling times (the time a quantity takes to double under exponential growth) for various components of science and technology.
Price found, for example:
| Domain | Doubling Time (in years) |
| --- | --- |
| Number of entries in a dictionary of national biography | 100 |
| Number of universities | 50 |
| Number of important discoveries; number of chemical elements known; accuracy of instruments | 20 |
| Number of scientific journals; number of chemical compounds known; membership of scientific institutes | 15 |
| Number of asteroids known; number of engineers in the United States | 10 |
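To make the doubling times above more concrete, here is a minimal sketch that converts a doubling time into an implied yearly growth rate and a growth factor over a century. It assumes steady exponential growth, which is a simplification; the function names are my own, not Price's or Arbesman's, and only three rows of the table are used.

```python
def annual_growth_rate(doubling_time_years: float) -> float:
    """Constant yearly growth rate implied by a doubling time (assumes steady exponential growth)."""
    return 2 ** (1 / doubling_time_years) - 1


def growth_factor(doubling_time_years: float, years: float) -> float:
    """How many times larger the quantity becomes after `years`."""
    return 2 ** (years / doubling_time_years)


# Doubling times taken from the table above.
doubling_times = {
    "universities": 50,
    "scientific journals": 15,
    "asteroids known": 10,
}

for domain, t in doubling_times.items():
    rate = annual_growth_rate(t)
    factor = growth_factor(t, 100)
    print(f"{domain}: about {rate:.1%} per year, roughly x{factor:.0f} per century")
```

A 50-year doubling time works out to a fourfold increase per century, while a 10-year doubling time compounds to more than a thousandfold, which is why small differences in doubling time matter so much.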
Arbesman also cites Harvey Lehman, who published in the journal Social Forces an attempt to count major contributions in different areas of study, and provides the following expanded table:
| Field | Doubling Time (in years) |
| --- | --- |
| Medicine and hygiene | 87 |
| Philosophy | 77 |
| Mathematics | 63 |
| Geology | 46 |
| Entomology | 39 |
| Chemistry | 35 |
| Genetics | 32 |
| Grand opera | 20 |
Chapter 2 also introduces some of the hallmarks of bibliometrics: the h-index and journal impact factors, as well as the study of scientific discoveries, which Arbesman calls “eurekometrics.” Some of the names mentioned in Chapter 2 include: Jorge Hirsch, Harriet Zuckerman, Arthur C. Clarke, Nicholas Christakis, Tyler Cowen, Galileo, Isaac Newton, and Stanley Milgram.
Chapter 3, “The Asymptote of Facts,” tackles the decay of knowledge. Relying primarily on citation analysis, Arbesman examines how long it takes for papers to become “out of date.” This approach makes it possible to determine the half-life of a field: the time it takes for half of the literature in that field to stop being cited. He uses a variety of examples to illustrate the “long tail” of knowledge as it decays.
Arbesman cites a 2008 work by Rong Tang examining scholarly books in different fields, finding the following half-lives by field:
| Field | Half-life (in years) |
| --- | --- |
| Physics | 13.07 |
| Economics | 9.38 |
| Math | 9.17 |
| Psychology | 7.15 |
| History | 7.13 |
| Religion | 8.76 |
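As a rough illustration of what these half-lives imply, the sketch below estimates the share of a field's books that would still be cited after a given number of years. It assumes pure exponential decay, which is my simplification; the “long tail” Arbesman describes suggests real citation decay is slower at the far end. The function name is hypothetical.

```python
def fraction_still_cited(years: float, half_life_years: float) -> float:
    """Share of the literature still cited after `years`, assuming pure exponential decay."""
    return 0.5 ** (years / half_life_years)


# Half-lives taken from the table above.
half_lives = {
    "Physics": 13.07,
    "Economics": 9.38,
    "Math": 9.17,
    "Psychology": 7.15,
    "History": 7.13,
    "Religion": 8.76,
}

for field, half_life in half_lives.items():
    share = fraction_still_cited(20, half_life)
    print(f"{field}: about {share:.0%} still cited after 20 years")
```

By definition, exactly half of the literature remains after one half-life and a quarter after two, so a field with a 7-year half-life loses most of its cited literature within two decades under this model.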
“[W]hen people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together.” –Isaac Asimov
Some of the names mentioned in Chapter 3 include: Marjory Courtenay-Latimer, John Hughlings Jackson, Sean Carroll, and Kevin Kelly.
Chapter 4, “Moore’s Law of Everything,” explores the intersection of technological progress via Moore’s Law and human knowledge. The chapter is key to understanding how the field of computational social science emerged, and why scientometrics has grown: it’s now possible to do, computationally, what was an incredibly painstaking process to do manually. The advent of computation and the exponential growth in computing power has changed what humans are able to know. Topics in Chapter 4 include: carrying capacity, logistic curves, information transformation, innovation, scientific prefixes as evidence of progress, actuarial escape velocity, population growth, and travel distances.
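Since the chapter touches on carrying capacity and logistic curves, here is a small sketch of the standard logistic function, which grows almost exponentially at first and then levels off as it approaches a ceiling. The parameter values are invented for illustration and do not come from the book.

```python
import math


def logistic(t: float, carrying_capacity: float, growth_rate: float, midpoint: float) -> float:
    """Standard logistic curve: near-exponential early growth that saturates at the carrying capacity."""
    return carrying_capacity / (1 + math.exp(-growth_rate * (t - midpoint)))


# Hypothetical parameters, chosen only to show the shape of the curve.
K, r, t_mid = 1000.0, 0.2, 50.0

for year in (0, 25, 50, 75, 100):
    print(f"t={year:3d}: {logistic(year, K, r, t_mid):8.1f}")
```

At the midpoint the curve sits at exactly half the carrying capacity; before it, growth looks exponential, and after it, each additional year adds less and less.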
“Technological growth facilitates changes in facts, sometimes rapidly, in many areas: sequencing new genomes…; finding new asteroids (often done using sophisticated computer algorithms that can detect objects moving in space); even proving new mathematical theorems through increasing computer power.” –Samuel Arbesman
Some of the names mentioned in Chapter 4 include: Clayton Christensen, Rodney Brooks, Jonathan Cole, Henry Petroski, Aubrey de Grey, Bryan Caplan, Michael Kremer, Thomas Malthus, and Robert Merton.
In Chapter 5, “The Spread of Facts,” Arbesman gently introduces network science and mentions some of his work with Nicholas Christakis, explaining how behaviors (e.g., health behaviors) and information have empirically been shown to move through networks. The spread of information through networks introduces the possibility, or perhaps the eventuality, of fact-transmission errors or even all-out fabrications. Arbesman uses the children’s game of “telephone” to illustrate how easily a piece of knowledge can be distorted as it moves through the network. Without stealing his thunder, I’ll also note that Arbesman chooses some fantastic examples of misinformation spread in this chapter: Popeye the Sailor and dinosaurs! More mundane, but relevant to scientometrics, is the problem of inaccurate citations “entering the wild,” only to be replicated and spread to an almost unbelievable degree. Some of the names mentioned in Chapter 5 include: Gottfried Leibniz, Jukka-Pekka Onnela, Jeremiah Dittmar, Mark Granovetter, James Fallows, David Liben-Nowell, and Jon Kleinberg.
In Chapter 6, “Hidden Knowledge,” Arbesman goes deeper into network science and computational techniques for studying networks: he introduces random graphs, preferential attachment, evolutionary programming, meta-analysis, and the academic citation and networking product Mendeley along with several other software products. Some of the names mentioned in Chapter 6 include: Albert-László Barabási, Réka Albert, Herbert Simon, William Shakespeare
In Chapter 7, “Fact Phase Transitions,” Arbesman describes in greater detail the use of mathematical tools from physics to investigate the underlying regularity in the change of knowledge. Topics include Ising models, Fermat’s Last Theorem, and human space exploration.
Chapter 8, “Mount Everest and the Discovery of Error,” explores one of my personal favorites as an applied social scientist: measurement, and its importance to all of science, human knowledge, and understanding. Arbesman uses the change in the “fact” of the height of Mt. Everest during the 20th century (as measurement techniques either improved or, in the case of GPS, were invented and deployed) to note yet another source contributing to the decay of knowledge. Measures of length were similarly inconsistent until scientists finally arrived at the use of the speed of light to define the meter.
“As our measurements become more precise, the speed of light doesn’t change; instead, the definition of a meter does.”
Arbesman defines precision and accuracy (or reliability and validity for the psychologists and social scientists among us). Precision refers to the consistency of measurements over time; accuracy refers to how similar measurements are to the real value. Importantly, he also discusses error: “all methods are neither perfectly precise nor perfectly accurate; they are characterized by a mixture of imprecision and inaccuracy. But we can keep trying to improve our measurement methods. When we do, changes in precision and accuracy affect the facts we know, and sometimes cause a more drastic overhaul in our facts.” [emphasis mine]
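The distinction between precision and accuracy can be sketched with a few numbers. The example below uses two made-up sets of measurements of a quantity whose true value is assumed to be known: one set is tightly clustered but off target, the other is centered on the true value but scattered. All values and function names are hypothetical.

```python
import statistics


def accuracy_error(measurements: list[float], true_value: float) -> float:
    """Bias: how far the average measurement sits from the true value."""
    return statistics.mean(measurements) - true_value


def imprecision(measurements: list[float]) -> float:
    """Spread: sample standard deviation of repeated measurements."""
    return statistics.stdev(measurements)


true_value = 100.0  # hypothetical "real" value
precise_but_inaccurate = [103.1, 102.9, 103.0, 103.0]  # consistent, but biased
accurate_but_imprecise = [96.0, 104.0, 99.0, 101.0]    # centered, but noisy

for label, data in [
    ("precise but inaccurate", precise_but_inaccurate),
    ("accurate but imprecise", accurate_but_imprecise),
]:
    print(f"{label}: bias={accuracy_error(data, true_value):+.2f}, spread={imprecision(data):.2f}")
```

The first instrument would report nearly the same wrong answer every time, while the second would be right on average but unreliable on any single reading; improving either property can change the “fact” that ends up in the record.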
“Statistics is the science that lets you do twenty experiments a year and publish one false result in Nature.” –John Maynard Smith
Next, Arbesman jumps into a discussion of probability, its importance in science, and the woefully misunderstood p-value. He discusses the problem of publishing false results, issues of replication in science, and the dangers of poor measurement.
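The arithmetic behind John Maynard Smith's quip can be sketched briefly. Assuming the conventional significance threshold of 0.05 and twenty independent experiments in which no real effect exists (both assumptions are mine, chosen to match the quote), the code below computes the expected number of false positives and the chance of getting at least one.

```python
def expected_false_positives(n_experiments: int, alpha: float) -> float:
    """Expected number of 'significant' results when no real effect exists."""
    return n_experiments * alpha


def prob_at_least_one_false_positive(n_experiments: int, alpha: float) -> float:
    """Chance that at least one of n independent null experiments passes the threshold."""
    return 1 - (1 - alpha) ** n_experiments


n, alpha = 20, 0.05  # twenty experiments at the conventional 0.05 threshold
print(f"expected false positives: {expected_false_positives(n, alpha):.1f}")
print(f"chance of at least one:   {prob_at_least_one_false_positive(n, alpha):.0%}")
```

Under these assumptions, one false positive per twenty experiments is the expectation, and the chance of at least one is close to two in three, which is part of why replication matters.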
“When you cannot measure, your knowledge is meager and unsatisfactory.” –Lord Kelvin
“If you can measure it, it can also be measured incorrectly.” –Samuel Arbesman’s corollary to Lord Kelvin
The penultimate chapter, Chapter 9, “The Human Side of Facts,” discusses the human aspects that so often contribute to getting facts wrong: cognitive bias, self-serving bias, representativeness bias, theory-induced blindness, change blindness, and language change, to name a few. Some of the names mentioned in this chapter include: Daniel Kahneman, John Maynard Keynes, Michael Chabon, Thomas Kuhn, John McWhorter, Isaac Newton, and Henry Kissinger.
Finally, in Chapter 10, “At the Edge of What We Know,” Arbesman discusses the pace of information change and whether, and how, humans can cope. His answer is, essentially, “quite well,” and he points to a variety of examples where humans seem to be doing mostly okay despite their limits. One such limit is Dunbar’s Number of “150 to 200 people we can know and have meaningful social ties,” which, surprisingly, still seems to hold, considering that the average number of Facebook friends was 190 at the time of writing. Arbesman also points to a variety of technologies that will enhance our capabilities as we bump against our limits. Some of the names mentioned in this chapter include: Kathryn Schulz, Carl Linnaeus, Robin Dunbar, Chris Magee, and Jonathan Franzen.
The Half Life of Facts – Why Everything We Know Has an Expiration Date includes almost 20 pages of endnotes and citations for those wishing to dig deeper. It’s a particularly enjoyable read for science and measurement geeks, or for anyone wanting to know more about the science of science and why what we know is always in flux. I particularly recommend Chapter 8, since measurement is so fundamental to science and to our knowledge of the world.