Every once in a while, someone invents a new way to describe what scholars do. The results are typically quantitative. Sometimes they’re also quite useful, in the sense of providing a “thin description” (as a counterpart to Clifford Geertz’s meaning-full thick description) of whatever it is you want to know. I’ve even done this myself, in trying to get a better handle on — or, if you prefer, a broader perspective on — what it is that historians of psychology are interested in.
But as useful as such a summary can be, you can’t just trust the numbers. And you certainly can’t let them decide for you. Indeed, this is how Ted Porter put it:
Quantification is a way of making decisions without seeming to decide. Objectivity lends authority to officials who have very little of their own.
Porter, T. M. (1995). Trust in numbers: The pursuit of objectivity in science and public life (p. 8). Princeton University Press.
He has also written about thin description from the perspective of Science Studies, calling it “the aspiration to superficiality.” (He can certainly turn a phrase!)
Unfortunately, what should have been understood as a warning about naïve credulity has instead become naïve dismissal. Thus we see funding agencies turning away entirely from quantitative metrics in grant applications and replacing them with “narrative descriptions.”
That, though, is a mistake, because it allows other undesirable factors to creep into decision-making, including racism and sexism. There’s also just no space in a grant application for a properly “thick” description of one’s research plans: reviewers are too busy to assess a hundred or more pages of detailed technical prose, including attachments, and anyway — with rejection rates regularly rising above 90% — they wouldn’t bother.
In other words, there’s a place for quantitative summaries. And there’s a place for thin description. (Superficiality is very efficient!) We just have to be able to read these things for what they say: we have to use them to augment our thinking, not replace it.
While… numbers and systems of quantification can be very powerful, the drive to supplant personal judgment by quantitative rules reflects weakness and vulnerability. I interpret it as a response to conditions of distrust attending the absence of a secure and autonomous community.
Porter, T. M. (1995). Trust in numbers: The pursuit of objectivity in science and public life (p. xi). Princeton University Press.
The latest such tool gaining traction comes from Google. It’s called Google Scholar Metrics.
Google has for many years provided the useful Google Scholar search engine for free. (Many of us, including me, have profiles there that summarize our work in a superficial way.) This new offering is then a generalization of that tool: it provides access in a new way to whatever it is that the search engine captures, but at the journal level rather than the author level or the article level.
The result is akin to providing Scholar profiles for journals instead of for individuals. So it’s also a bit like the Journal Citation Reports, by Clarivate Analytics (acquired from Thomson Reuters), that provide the evidentiary basis for the controversial but often-used Journal Impact Factor metric. Except it’s even thinner.
Google Scholar Metrics provide an easy way for authors to quickly gauge the visibility and influence of recent articles in scholarly publications.
(Google Scholar Metrics: Overview)
Google Scholar Metrics’ reports are limited to academic journals publishing digital editions that the Google search engine can examine. And that can obviously be problematic: not everything that matters — which, in this case, is a larger set than “everything that counts” — is published in journals. Nevertheless, this is a very common narrowing. So we need to keep it in mind when operating in book-publishing disciplines (like history and philosophy). And also in multilingual publishing environments, like Europe, because separations and rankings are made by language rather than by discipline.
So, in short, what Google is doing is making lists with a twist: the greater the influence of the journal, the greater the number of articles listed. This is operationalized using the h-index metric, which allows for the presentation of weighted lists of the top articles published in the top journals.
The h-index of a publication is the largest number h such that at least h articles in that publication were cited at least h times each.
(Google Scholar Metrics: Metrics)
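That definition is easy to operationalize. As a minimal sketch, both numbers Google reports for a journal (the h5-index and the h5-median, i.e. the median citations among the articles that make up the h-core) can be computed from a list of per-article citation counts. The counts below are hypothetical, not taken from any real journal report:

```python
import statistics

def h_index(citations):
    """Largest h such that at least h items have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def h_median(citations):
    """Median citation count of the h 'core' articles (Google's h5-median)."""
    counts = sorted(citations, reverse=True)
    h = h_index(counts)
    return statistics.median(counts[:h]) if h else 0

# Hypothetical per-article citation counts for one journal:
counts = [69, 33, 30, 17, 4, 1]
print(h_index(counts))   # 4 (four articles each cited at least 4 times)
print(h_median(counts))  # 31.5 (median of the top 4: 69, 33, 30, 17)
```

Note that the h-index depends only on the top of the distribution: the long tail of rarely cited articles never affects it, which is part of what makes it such a thin description.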
Their most current data, from June 2020, cover journal articles published in 2015-2019. Digging into the data, I was of course delighted to find several of my own articles included among the top-cited works for the journals that published them. That also enabled me to use material I know well to engage critically with what it is that these thin descriptions are describing.
Let’s start with my highest-impact article that’s included in this dataset:
On the Meanings of Self‐Regulation: Digital Humanities in Service of Conceptual Clarity
(Google Scholar Metrics: Child Development)
JT Burman, CD Green, S Shanker
Child Development 86 (5), 1507-1521
69 citations within the study period
On the basis solely of a superficial reading of the raw numbers, this seems quite good. It’s a top article published in a top journal. Indeed, I’ve always been quite proud of this piece.
I’m not really a developmental or child psychologist, though. (That’s my primary subject area, but it’s not my method: I don’t work directly with children.) So it doesn’t really make sense to describe me by only referring to this piece, in that journal, having the greatest number of citations. Of course this isn’t surprising either. We’re working with an ‘n’ of 1.
Before considering more data, though, let’s first read this description more closely. That’s what needs to happen anyway, if such metrics are to be made useful as an aid to decision-making.
There’s enough information here that we can place the article’s impact in context, at least minimally, by comparing it to the impact of the journal as a whole. And that, I think, is enough to make it useful. (Or it’s at least as useful as knowing that, at the time of writing, the journal’s five-year impact factor is “5.605.”)
My article was included because it was among the journal’s most frequently cited articles for the study period: it met the criteria for inclusion by h-index. It is also included in the calculation of the five-year impact factor. So part of that metric is about what I contributed. When looking at the median number of citations (h5-median) for all of the articles included by that criterion, however, we also see that — although this was a top-performing article — it was not among the very best in that sub-set: it received ~16% fewer citations than the median article in this top group. So although it contributed, it did not pull substantially more than its weight.
Reflexivity is a virtue in my specialty, so I feel duty-bound to highlight that weakness. But it ultimately doesn’t hurt me to do so, in part because this wasn’t my only top-performing article included in the reports. And also because I have tenure now. So let’s use it.
We can compare this to my work published in History of Psychology. There, we see that three of my articles are among the journal’s top 16. Two of these were cited more often than the median: one 50% higher, and the other nearly as high. That’s clearly better than the previous article’s ~16% below the median, despite their having fewer citations in total (even when combined!).
History of Psychology
Searching for the structure of early American psychology: Networking Psychological Review, 1894–1908.
CD Green, I Feinerer, JT Burman
History of Psychology 18 (1), 15
33 citations within the study period
Searching for the structure of early American psychology: Networking Psychological Review, 1909–1923.
CD Green, I Feinerer, JT Burman
History of Psychology 18 (2), 196
30 citations within the study period
Neglect of the foreign invisible: Historiography and the navigation of conflicting sensibilities.
(Google Scholar Metrics: History of Psychology)
JT Burman
History of Psychology 18 (2), 146
17 citations within the study period
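The kind of comparison being made here (an article’s citations measured against its journal’s h5-median) is simple arithmetic. In this sketch the h5-median values (82 and 22) are hypothetical stand-ins, chosen only to reproduce the relative figures reported in the text:

```python
def relative_to_median(citations, h5_median):
    """Percent difference from the journal's h5-median
    (negative means the article sits below the median)."""
    return 100.0 * (citations - h5_median) / h5_median

# Hypothetical h5-median values, chosen only so that the output
# matches the relative figures discussed in the text:
print(round(relative_to_median(69, 82)))  # -16 (about 16% below the median)
print(round(relative_to_median(33, 22)))  # 50 (50% above the median)
```

The point of normalizing this way is that raw counts aren’t comparable across journals: 33 citations in one venue can represent more relative impact than 69 in another.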
That the journal itself has a lower five-year impact factor (“0.954”) doesn’t tell us as much as knowing each top article’s position relative to the others. (Because they belong to different sub-disciplines, and we don’t know if there’s a difference in citation behaviour between them.) I suppose we could also make the same sort of move by comparing journals, and in the process we discover that History of Psychology is ranked 10 of 34 by Clarivate for “History of Social Sciences.” But that’s a very broad and multidisciplinary category. In my judgment, it’s not a good representation of what’s actually done in the field.
In any case, one could conclude from these data that my impact has been made primarily in History of Psychology and secondarily in Developmental Psychology. And this actually does seem like a reasonably good superficial description of my interests.
It’s not a full or even sufficient description, of course, but it’s somewhere to start. And it’s not wrong. So we’ve very quickly been able to derive something useful that can be built upon. As a criterion for a fast first pass, in what is usually a months-long grant application process, that’s a good thing. It could even be done automatically.
Imagine if everyone’s work were deposited to a central repository, with assessments like this performed in their proper contexts, and a percentage of the usual research funding allocated by invitation rather than by application. To wit: “We’ve noticed that you are among the top scholars in your field, and we’d like to ask if you are working on anything especially exciting that would benefit from our support.” The subsequent grant applications could then be fast-tracked.
Numbers create and can be compared with norms, which are among the gentlest and yet most pervasive forms of power in modern democracies.
Porter, T. M. (1995). Trust in numbers: The pursuit of objectivity in science and public life (p. 45). Princeton University Press.
By using numbers to make invitations, rather than to perform normative assessments, we could save a lot of time and energy. That increase in efficiency would then make more research possible for the same investment. Still, though, it’s clear that there’s a limit to what these numbers can say.
Reading more closely still, and moving into a more narrative mode, we see that these articles actually present a mix of new digital methods (which I often co-author) and more traditional historical scholarship (which I usually solo-author). We also see the same co-author — Christopher Green — in the by-line of my first-authored essay in Child Development and in those of my co-authored essays in History of Psychology.
This makes sense: not only were these all digital projects, but he was also my doctoral supervisor. We worked together a lot. Furthermore, the following year I defended the dissertation he supervised (which included a revision of the solo-authored article). So we can see a reflection here of my growing independence: my two captured co-authored articles had greater impact, when considered in their context, than my two first-authored works (which were cited more often).
Those first-authored essays were good enough to be counted among the highest-cited in their respective journals, but they weren’t “top of the top” (i.e. they both received fewer than the median number of citations when compared to their journal’s other most influential articles).* Not like the co-authored essays. But that’s okay: I was a PhD student! And having four articles recognized is obviously wonderful.
We can, however, continue with the critical reflection. In particular, I would mention that it’s noteworthy that these four articles were all published in 2015: the start of the study window. They therefore had the greatest opportunity to collect citations.
This highlights a common objection to the use of h-index: it prefers older material, all else being equal, and thereby reinforces the Matthew Effect when algorithms rely on it to determine relevance in a search. The use of h-index, especially without context, is therefore also deeply problematic for assessing the contributions of early career researchers with different publishing histories.
Next year, of course, will afford an entirely different picture. These articles won’t be included in the study window. That will shift to 2016-2020. So too, then, will the assessment of my impact: I will be a quantitatively-different person! A truly useful superficial assessment would therefore reflect several years of such reports. (That’s not unreasonable; they’re automatically generated.)
From this perspective, it will be interesting to see if my contributions to Theory & Psychology end up being highlighted next year. My beloved (to me) but much-abridged “Piaget’s neo-Gödelian turn” article, for example, will be among the oldest in the study window. And it seems to have enough citations to be a contender.
It’s not a big deal if it’s not included: most authors won’t be recognized in this way every year anyway, if only because the best research takes time (and many publications are “supporting” works). Nor does “well-cited” always mean “good.”
So does all of this mean that quantitative tools can’t be used? Not at all. They simply need to be used thoughtfully: they are a beginning, not an end. (Can we get to the cool stuff now?)
* Honestly, I think my very best independent scholarship is represented by much more recent work. Depending on whether you value traditional methods or innovation, my judgment of “my best” splits between the quite long archival history of Piaget’s genetic epistemology (recently published by Oxford University Press) and the shorter but still-quite-long digital-supported philosophical argument that used comparisons between translations of Piaget to make a point about the formal meaning of meaning that’s required by historians interested in change (now in press at the Review of General Psychology). But aside from observing that the former isn’t a journal article, and therefore that it could never be captured by the Google Scholar Metric reports, this — here — is a partial digression. Hence the footnote.
Author note. Thank you to Eric Fath-Kolmes and Christopher Green for their comments. (Also: Green would want me to point out that “thin description” originates with Gilbert Ryle, even though most everyone now cites Geertz.)
J. T. Burman
JEREMY TREVELYAN BURMAN, PhD, is tenured Senior Assistant Professor (UD1 with indefinite contract) of Theory and History of Psychology at the University of Groningen in the Netherlands. The primary focus of his research is Jean Piaget, but he is also interested more generally in the formalization and movement of scientific meaning—over time, across disciplines, between languages, and internationally. To pursue these interests, he uses methods borrowed from the history and philosophy of science (esp. archival study) and the digital humanities (esp. network analysis).
Selected recent major works
Burman, J. T. (in press). The genetic epistemology of Jean Piaget. In W. Pickren (Ed.), The Oxford Research Encyclopedia of the History of Psychology. Oxford University Press.
Burman, J. T. (2020). On Kuhn’s case, and Piaget’s: A critical two-sited hauntology (or, on impact without reference). History of the Human Sciences, 33(3-4), 129-159. doi:10.1177/0952695120911576
Burman, J. T. (2019). Development. In R. J. Sternberg & W. Pickren (Eds.), The Cambridge Handbook of the Intellectual History of Psychology (pp. 287-317). Cambridge University Press.