
Why humanities citation statistics are like Eskimo words for snow

citation · statistics · metrics

Published by the author on LinkedIn on the 30th of March, 2016 and on the Blog of Joel Barnes.

‘X percent of journal articles in the humanities are never cited.’ How often have we seen this claim made? Much like the fabled Eskimo words for snow, the clue that it’s probably bunkum lies in the fact that X varies wildly depending on who’s speaking. And in that it doesn’t really matter to the speaker what X is, as long as it’s a lot.
 
Some recent(ish) efforts to claim that most journal articles in the humanities are rarely or never cited by other researchers have led me to consider the completeness of available citation data, and some of the unspoken assumptions that inform interpretations of what data we have. As it turns out, there are good reasons for treating existing statistics with extreme caution at the very least.
 
Claims about low or non-citation rates in the humanities generally have their empirical bases in studies conducted in the sciences that examine citation rates across a range of disciplines. Recently, similar data have also come from the Google Scholar Metrics h5 index.
 
One study, conducted at the Institute for Scientific Information (ISI) in Philadelphia and reported in Science in December 1990 and January 1991, suggested that either 93% or 98% of humanities journal articles go uncited, depending on how one cuts the data. The latter claim was recently recycled by Steven Pearlstein in the Washington Post; in response, Libby Nelson in Vox Education pointed to the 93% figure as the original researchers’ preferred calculation.
 
The second source, Google’s h5 index, is a complicated metric that I won’t go into here, but according to Patrick Dunleavy, Professor of Political Science at the London School of Economics, it appears to show humanities scholars citing one another’s work at a paltry fraction of the rate of those in the sciences, especially the medical and life sciences.
 
Other figures arise, frankly, from hearsay. In 2014, Dahlia Remler of the City University of New York sought to debunk a 90% non-citation figure (for all disciplines) that had been doing the rounds online. Sure enough, it had no examinable basis, having been claimed by the editor of the magazine Physics Today, who took it from a presentation he once attended that could not be reproduced.
 
In the blog post, however, Remler also claimed that 82% of humanities articles go uncited, even though the figure does not in fact appear in the source she gives for it. To be sure, her source, a paper published online by two Canadian academics, does show low humanities citation rates, but these are hedged around by so many caveats as to suggest that, for the humanities, the figures are basically meaningless. In her post, Remler did note good cause, of the sort discussed below, to be careful of the statistics she gave. This has not stopped the 82% figure, as an absolute non-citation rate shorn of such qualifications, gaining a lease of life of its own among academics and others who have been less than careful in examining their sources (one such even takes the liberty of rounding the figure down to 80%). In seeking to head one myth off at the pass, Remler inadvertently generated another.
 
Citation figures that have a proper empirical basis typically come from examinations of article citations in other journal articles over a five-year window following publication. On this front, the h5 index is less generous than more conventional measures, allowing publications only up to five years to be cited: at the time of writing, up to five years prior to June 2015.
 
Crucially, citations of articles in books are always excluded. Citations are also counted only in selected journals. The ISI database covered only the ‘top 10% of all scientific journals published worldwide’, while the h5 index excludes publications with fewer than 100 articles published over 2010–14, which, as Katie Barclay of the University of Adelaide pointed out on Twitter, means a great many humanities journals are not counted.
 
The significance of other limitations notwithstanding, I focus here just on the exclusion of books and the five-year citation window. These seem to me to be the major shortcomings of the existing datasets. Both are likely to be seriously underestimated by non-humanities scholars who approach citation statistics from the disciplinary norms of the natural or social sciences.
 
The five-year window, perhaps appropriate enough for disciplines in which knowledge develops rapidly and just as rapidly becomes obsolete, is inadequate for the humanities. Long humanities publication lead-times mean many citations fall outside this window; at the same time, much humanities research has a much longer shelf-life—better measured in decades than years—than that in other fields.
 
The exclusion of books from citation data is similarly likely to give any humanities scholar pause. The difficulty though, precisely because of that exclusion, is that we don’t know how different the data would look if books were included. Does leaving books out mean that we should subject citation statistics to a caveat, or does that exclusion fatally undermine the representativeness of the data?
 
A small experiment suggests itself. Though the experiment is very modest and only exploratory, the results hint that, despite the extraordinary richness of modern scholarly databases, citation rates in the humanities remain extremely uncertain. Both the exclusion of books and the five-year window begin to look like serious impediments to ascertaining any meaningful statistics.
 
My experiment involves choosing a single article, tracking down all the citations to it that I can find, both in journal articles and in books, and comparing these with the citations listed in Google Scholar. (Note: the Google Scholar Metrics h5 index is based on the Google Scholar database, but is further narrowed down by the exclusion of books and, as above, certain journal articles. Google Scholar is, however, a good indication of what Google’s algorithm is capable of finding in the first instance, before these further exclusions are made.)
 
I chose Olive Anderson, ‘The Political Uses of History in Mid Nineteenth-Century England’, Past & Present 36 (1967): 87–105. I picked this article mainly because I am familiar with it from my own research, and I know that it continues to be widely cited today. Its age would usually see it excluded from the body of articles subjected to citation quantification, but there are reasons for choosing an older article that I will come back to below.
 
Using a combination of my own research knowledge and notes, Google Scholar, and keyword searches in Google Books and JSTOR, I found a total of 49 sources citing Anderson’s article, 14 in journal articles and 35 in books (these are listed below). As of March 2016, Google Scholar lists just 27, or around 55%, of these citations. It identifies all 14 article citations that I found, but just 13 of the 35 book citations, around 37%.
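The percentages above follow directly from the four counts given in this paragraph. As a minimal sketch for readers who want to check the arithmetic (the variable and function names are my own, chosen for illustration; only the counts come from the text):

```python
# Counts reported in the paragraph above (as of March 2016).
found = {"journal articles": 14, "books": 35}       # citations found by hand
in_scholar = {"journal articles": 14, "books": 13}  # of those, listed in Google Scholar


def hit_rate(listed: int, total: int) -> float:
    """Share of the hand-found citations that Google Scholar also lists."""
    return listed / total


total_found = sum(found.values())        # 49
total_listed = sum(in_scholar.values())  # 27

for kind in found:
    rate = hit_rate(in_scholar[kind], found[kind])
    print(f"{kind}: {in_scholar[kind]}/{found[kind]} = {rate:.0%}")

overall = hit_rate(total_listed, total_found)
print(f"overall: {total_listed}/{total_found} = {overall:.0%}")
```

Run as written, this prints 100% for journal articles, 37% for books and 55% overall.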
 
The severe limitations of Google’s dataset are apparent. While Google Scholar is good at identifying journal article citations, its hit rate for book citations is only around a third, and its ability to identify even these seems often to rely on publishers’ ebooks, where these exist. There appears to be only limited linkage between Google Scholar and Google’s own OCR’d Books dataset.
 
I make no claim for the completeness of the list. It is very likely not exhaustive, since my method of finding citations beyond those listed in Google Scholar relies mainly on keyword searches in databases that are not themselves exhaustive. It is instructive that I have been able to include Peter Mandler’s History and National Life and Jeremy Black’s Using History only because of my own reading. Neither citation appeared in my online searches, probably because both are set to ‘no preview’ in Google Books. There are no doubt other sources with which I am not familiar. Any additional found citations would only further downgrade Google Scholar’s hit rate. The incompleteness of the Google Scholar data described here is a best-case scenario.
 
Scholars who use Google Scholar as a research tool therefore need to be aware of its inherent strengths and weaknesses. As for citation statistics based on the h5 index, Google’s weakness on book citations is a moot point, since as noted above the index excludes these anyway, but the overall balance between found book and journal citations is suggestive.
 
Precisely because of the exclusion of books from datasets we cannot know if Anderson’s article is representative; more systematic studies would be welcome. But if it is more or less typical, and something like 70% of citations (in this case 35 out of 49) are in books, this would cast serious doubt on the meaningfulness of any article-only citation metric. In his blog post on the h5 index, Dunleavy claimed that its supposed completeness had put paid to ‘we can’t be compared with STEM’ special pleading in the humanities. It has done nothing of the sort. In a disciplinary context in which book citations appear to be, at the very least, more common than journal article citations, citation metrics that ignore books are at best dangerously misleading and at worst next to useless.
 
Finally, I come back to my reasons for choosing an older article. What is significant here is the slippage, greased by assumptions imported from the sciences, between not being cited within five years and never being cited. These assumptions are often made explicit when citation metrics move from a research to a journalism or a marketing context: ‘never cited by another researcher’, ‘not even cited once’ and ‘fail to get cited at all’ are the sorts of phrases then used.
 
In this regard, it is worth noting that of the 49 identified citations, not one of them is in research published in the five years after 1967. The earliest is in P. B. M. Blaas’s Continuity and Anachronism, published 11 years later. This continued shelf-life is invisible if we base calculations only on research published within the last five years. Had five-year citation windows been imposed in the early 1970s, Anderson’s article would likely have been written off as another entry in the dreaded ‘never cited’ category.
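The effect of a five-year window can be illustrated with a rough sketch. The years below are the distinct publication years of Blaas’s book and the fourteen journal articles in the list at the end of this post, not all 49 citing works; the names, and the choice to treat the window as 1967 to 1972 inclusive, are my own assumptions for illustration:

```python
PUBLISHED = 1967   # year Anderson's article appeared
WINDOW_YEARS = 5   # assumed window: 1967 through 1972 inclusive

# Distinct publication years of a subset of the citing works listed in this post:
# Blaas (1978) plus the fourteen journal articles.
citing_years = [1978, 1979, 1981, 1985, 1989, 1990, 1994, 1997, 2003, 2004, 2008]


def within_window(year: int, published: int = PUBLISHED, window: int = WINDOW_YEARS) -> bool:
    """True if a citing work appeared inside the citation window."""
    return published <= year <= published + window


in_window = [y for y in citing_years if within_window(y)]
earliest = min(citing_years)

print(f"citations inside the window: {len(in_window)}")
print(f"earliest citation: {earliest}, {earliest - PUBLISHED} years after publication")
```

On these years the window catches nothing, and the earliest citation arrives 11 years after publication, so a window-based count would record the article as uncited.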
 
All of this suggests that, for the humanities, citation statistics need to be taken with a very large dose of salt. There is no universal database from which such metrics can be extracted. The best we have, Google Scholar, is drastically inadequate. Those who argue for extremely low humanities citation rates are guilty of an unfounded reversal of the onus of proof. Overconfident of the completeness of their data, they mistake an absence of citation evidence for proof of non-citation. The resulting willingness to believe that a wide discrepancy between humanities and non-humanities metrics reflects a problem with how the humanities is carried out rather than with the methods of comparison seems to reveal an implicit (and sometimes explicit) belief that most humanities research is a trivial, unnecessary luxury anyway. The possibility that humanities scholars might be doing their jobs perfectly well, according to the norms and standards of their respective disciplines, seems too often not to enter into the equation. Such perspectives do the humanities a grave disservice.
 

List of citations

Those entries that appear in Google Scholar as of March 2016 are marked with an asterisk.
 
Journal articles
1)    *Barryte, Bernard. ‘History and Legend in T. J. Barker’s The Studio of Salvator Rosa in the Mountains of the Abruzzi, 1865’, The Art Bulletin 71, no. 4 (1989), p. 669.
2)    *Clive, John. ‘The Use of the Past in Victorian England’, Salmagundi 68/69 (1985), p. 49.
3)    *Cunningham, Hugh. ‘The Language of Patriotism, 1750–1914’, History Workshop 12 (1981), p. 29.
4)    *Gunn, Ann V. ‘Sir George Hayter, Victorian History Painting, and a Religious Controversy’, Record of the Art Museum, Princeton University 53, no. 1 (1994), p. 29.
5)    *Isaacson, S. ‘Carlyle and Macaulay in the Journals: Toward a New Historiography’, The Carlyle Annual (1989), p. ?.
6)    *Ledger-Lomas, Michael. ‘The Character of Pitt the Younger and Party Politics, 1830–1860’, The Historical Journal 47, no. 3 (2004), p. 645.
7)    *Mandler, Peter. ‘Against “Englishness”: English Culture and the Limits to Rural Nostalgia, 1850–1940’, Transactions of the Royal Historical Society 7 (1997), p. 159.
8)    *McGowen, Randall E., and Walter L. Arnstein. ‘The Mid-Victorians and the Two-Party System’, Albion 11, no. 3 (1979), p. 257.
9)    *McLaughlin, M. ‘Adam Bede: History, Narrative, Culture’, Victorians Institute Journal (1994), p. ?.
10) *Pasamar Azuria, Gonzalo. ‘Los historiadores y el «uso público de la historia»: viejo problema y desafío reciente’, Ayer 49 (2003), p. 223.
11) *Pickering, Paul A. ‘“The Hearts of the Millions”: Chartism and Popular Monarchism in the 1840s’, History 88, no. 290 (2003), p. 237.
12) *Rich, Paul B. ‘Social Darwinism, Anthropology and English Perspectives of the Irish, 1867–1900’, History of European Ideas 19, no. 4–6 (1994), p. 784.
13) *Weinstein, Ben. ‘“Local Self-Government Is True Socialism”: Joshua Toulmin Smith, the State and Character Formation’, English Historical Review 123, no. 504 (2008), p. 1195.
14) *Weisbrod, Bernd. ‘Der englische “Sonderweg” in der neueren Geschichte’, Geschichte und Gesellschaft 16, no. 2 (1990), p. 250.
 
Books and book chapters
15) Allsobrook, David. Schools for the Shires: The Reform of Middle-class Education in Mid-Victorian England (Manchester, 1986), p. 276.
16) Björk, Ragnar. ‘Scholarship and Politics: History in Politics and Historians as Politicians, 1850–1940’, in Rolf Torstendahl and Irmline Veit-Brause, eds., History-making: The Intellectual and Social Formation of a Discipline: Proceedings of an International Conference, Uppsala, September 1994 (Stockholm, 1996), p. 138.
17) Blaas, P. B. M. Continuity and Anachronism: Parliamentary and Constitutional Development in Whig Historiography and in the Anti-Whig Reaction between 1890 and 1930 (The Hague, 1978), pp. 91, 123.
18) *Black, Jeremy. Convergence or Divergence? Britain and the Continent (Basingstoke, 1992), in chap. 6.
19) Black, Jeremy. Using History (London, 2005), p. 86.
20) Brundage, Anthony, and Richard A. Cosgrove, The Great Tradition: Constitutional History and National Identity in Britain and the United States, 1870–1960 (Stanford, 2007), p. 319.
21) Clark, J. C. D. ‘Introduction’, in J. C. D. Clark, ed., The Memoirs and Speeches of James, 2nd Earl Waldegrave 1742–1763 (Cambridge, 1988), p. 120.
22) Cosgrove, Richard A. ‘A Usable Past: History and the Politics of National Identity in Late Victorian England’, in Nancy LoPatin-Lummis, ed., Public Life and Public Lives: Essays in Honour of Richard W. Davis (Malden, 2008), p. 31.
23) *Cramer, Kevin. The Thirty Years’ War and German Memory in the Nineteenth Century (Lincoln, 2007), p. 337.
24) Dellheim, Charles. The Face of the Past: The Preservation of the Medieval Inheritance in Victorian England (Cambridge, 1982), p. 203.
25) *Garcia, Enrique Moradiellos. Las caras de Clío: Una introducción a la Historia, 2nd ed. (Madrid, 2009), p. 172.
26) Hawkins, Angus. Parliament, Party, and the Art of Politics in Britain, 1855–59 (Stanford, 1987), p. 402.
27) *Howell, Roger. Puritans and Radicals in North England: Essays on the English Revolution (Lanham, 1984), p. 211.
28) Hunt, Tristram. Building Jerusalem: The Rise and Fall of the Victorian City (London, 2004), p. 547.
29) Jefferies, Matthew. ‘The Age of Historism’, in Stefan Berger, ed., A Companion to Nineteenth-Century Europe, 1789–1914 (Malden, 2006), p. 331.
30) Kammen, Michael. Mystic Chords of Memory: The Transformation of Tradition in American Culture (New York, 1991), p. 717.
31) Lee, Yoon Sun. Nationalism and Irony: Burke, Scott, Carlyle (Oxford, 2004), p. 191.
32) Löffler, Marion. The Literary and Historical Legacy of Iolo Morganwg, 1826–1926 (Cardiff, 2007), p. 9.
33) Mandler, Peter. ‘“In the Olden Time”: Romantic History and English National Identity, 1820–50’, in Laurence Brockliss and David Eastwood, eds., A Union of Multiple Identities: The British Isles, c.1750–c.1850 (Manchester, 1997), p. 90.
34) Mandler, Peter. History and National Life (London, 2002), p. 166.
35) Mandler, Peter. The Fall and Rise of the Stately Home (New Haven, 1997), p. 426.
36) McNulty, Eugene. The Ulster Literary Theatre and the Northern Revival (Cork, 2008), p. 241.
37) *Melman, Billie. Women’s Orients: English Women and the Middle East, 1718–1918 (Basingstoke, 1992); p. 369 of 2nd ed. (1995).
38) *Metzler, Gabriele. Großbritannien – Weltmacht in Europa: Handelspolitik im Wandel des europäischen Staatensystems 1856 bis 1871 (Berlin, 1997), p. 314.
39) Otte, T. G. The Foreign Office Mind: The Making of British Foreign Policy, 1865–1914 (Cambridge, 2011), p. 16.
40) Pionke, Albert D. ‘A Ritual Failure: The Eglinton Tournament, the Victorian Medieval Revival, and Victorian Ritual Culture’, in Karl Fugelso and Carol L. Robinson, eds., Medievalism in Technology Old and New, Studies in Medievalism XVI (Cambridge, 2008), p. 41.
41) *Pionke, Albert D. The Ritual Culture of Victorian Professionals: Competing for Ceremonial Status, 1838–1877 (Abingdon, 2013), p. 197.
42) *Ryan, Martin J. ‘“Charters in Plenty, If Only they Were Good for Anything”: The Problem of Bookland and Folkland in Pre-Viking England’, in Jonathan Jarrett and Allan Scott McKinley, eds., Problems and Possibilities of Early Medieval Charters (Turnhout, 2013), p. ?.
43) Sassoon, Donald. The Culture of the Europeans: From 1800 to the Present (London, 2006), p. 1465.
44) *Seaman, John T. Citizen of the World: The Life of James Bryce (London, 2006), p. 268.
45) *Slee, P. H. R. Learning and a Liberal Education: The Study of Modern History in the Universities of Oxford, Cambridge, and Manchester, 1800–1914 (Manchester, 1986), p. 22.
46) *Sylvest, Casper. British Liberal Internationalism, 1880–1930: Making Progress? (Manchester, 2009), p. 185.
47) *Treloar, G. R. Lightfoot the Historian: The Nature and Role of History in the Life and Thought of J.B. Lightfoot (1828–1889) as Churchman and Scholar (Tübingen, 1998), p. 436.
48) Wawn, Andrew. The Vikings and the Victorians: Inventing the Old North in Nineteenth-century Britain (Cambridge, 2000), p. 375.
49) *Williams, Kevin. ‘Flattened Visions from Timeless Machines: History in the Mass Media’, in Siân H. Nicholas, Tom O’Malley and Kevin Williams, eds., Reconstructing the Past: History in the Mass Media 1890–2005 (Oxford, 2008), p. 25.