In basketball, Regularized Adjusted Plus/Minus (RAPM) measures how much a player raises or lowers their team's scoring margin, controlling for teammates and opponents. Each possession is an observation, each player on the court is a dummy variable, and ridge regression estimates each player's isolated impact.
Bibliographic Plus/Minus applies the exact same logic to academic publishing. Each paper is a possession. The outcome is citations in years 1–3 after publication, normalized within field and year. Each author is a dummy variable.
Ridge regression (L2 penalty) estimates one coefficient per author simultaneously. Ridge is used because co-authors are correlated: researchers who always (or, in practice, nearly always) collaborate cannot be separated by OLS. The penalty shrinks uncertain estimates toward zero, analogous to how RAPM handles players with limited minutes or players who share the court with the same teammates most of the time.
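To make the collinearity point concrete, here is a minimal sketch of closed-form ridge on a toy design matrix. The three authors, the five papers, the outcome values, and the penalty strength `alpha=1.0` are all made up for illustration, and the sketch fits no intercept (an assumption; the outcome is taken to be already centered). It is not the production pipeline.

```python
import numpy as np

# Toy corpus (hypothetical): each row is a paper, each column an author dummy.
# Authors A and B always co-author; C appears with and without them.
authors = ["A", "B", "C"]
X = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)

# Normalized citation outcome per paper (made-up values).
y = np.array([1.2, 0.8, 1.5, -0.3, 0.1])


def ridge_fit(X, y, alpha):
    """Closed-form ridge: solve (X^T X + alpha * I) beta = X^T y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ y)


# Columns A and B are identical, so X^T X is rank-deficient and OLS has no
# unique way to split credit between the two authors.
rank = np.linalg.matrix_rank(X.T @ X)
print(rank)  # 2, not 3

# The ridge penalty makes the system solvable and splits credit evenly
# between authors the data cannot tell apart.
beta = ridge_fit(X, y, alpha=1.0)
for name, b in zip(authors, beta):
    print(f"{name}: {b:+.3f}")
```

Because A and B are indistinguishable in this toy data, ridge assigns them the same coefficient, and a larger `alpha` pulls all three coefficients closer to zero.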
Authors are split into two groups. Those with ≥ 2 papers in the corpus receive an individual BPM coefficient. Authors with only 1 paper cannot be estimated reliably on their own. Instead, they are pooled into a single baseline author variable equal to the count of such authors on each paper (that is the baseline_count(p) in the equation above). The model estimates one shared coefficient for this group, representing the average citation contribution of an author who appears only once in the corpus. This baseline BPM is empirically estimated rather than assumed to be zero or negative.
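The split can be sketched as follows. The paper and author identifiers are hypothetical, and the helper name `design_row` is an illustrative choice rather than anything taken from the actual codebase; only the threshold of 2 papers and the idea of a pooled count column come from the description above.

```python
from collections import Counter

# Hypothetical toy corpus: paper id -> list of author ids.
papers = {
    "p1": ["ana", "ben"],
    "p2": ["ana", "cho"],
    "p3": ["ben", "dee", "eli"],
}

# Count how many papers each author appears on.
paper_counts = Counter(a for authors in papers.values() for a in set(authors))

# Authors with >= 2 papers get their own column; everyone else is pooled.
MIN_PAPERS = 2
individual = sorted(a for a, n in paper_counts.items() if n >= MIN_PAPERS)
column = {a: j for j, a in enumerate(individual)}


def design_row(authors):
    """Return author dummies followed by one final baseline-count entry."""
    row = [0] * (len(individual) + 1)
    for a in set(authors):
        if a in column:
            row[column[a]] = 1
        else:
            row[-1] += 1  # baseline_count(p): one-paper authors on this paper
    return row


rows = {pid: design_row(authors) for pid, authors in papers.items()}
for pid, row in rows.items():
    print(pid, row)
```

Here `ana` and `ben` receive individual columns, while `cho`, `dee`, and `eli` only contribute to the last entry of each row, so paper `p3` carries a baseline count of 2.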
Citations are z-scored within each year cohort, so BPM is in units of standard deviations above or below the field-year average.
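A minimal sketch of the per-year normalization, using made-up records. The use of the population standard deviation and the fallback to 0.0 for a cohort with no variation are assumptions of this sketch, not details confirmed by the description above.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records: (paper id, publication year, citations in years 1-3).
records = [
    ("p1", 2018, 4), ("p2", 2018, 10), ("p3", 2018, 1),
    ("p4", 2021, 0), ("p5", 2021, 2), ("p6", 2021, 7),
]


def zscore_by_year(records):
    """Z-score each paper's citation count against its own year cohort."""
    by_year = defaultdict(list)
    for _, year, cites in records:
        by_year[year].append(cites)

    # Mean and (population) standard deviation per cohort.
    stats = {year: (mean(v), pstdev(v)) for year, v in by_year.items()}

    scores = {}
    for pid, year, cites in records:
        mu, sigma = stats[year]
        # Assumption: a cohort with zero spread maps to 0.0 rather than failing.
        scores[pid] = (cites - mu) / sigma if sigma > 0 else 0.0
    return scores


z = zscore_by_year(records)
for pid, score in z.items():
    print(pid, round(score, 3))
```

After this step a 2018 paper and a 2021 paper are compared on the same scale, even though older cohorts have had more time to accumulate citations overall.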
BPM = +1.0 → papers with this author get ~1 SD more citations than expected, controlling for co-authors
BPM = 0 → papers perform at the field average
BPM = −1.0 → papers underperform the average
A researcher who always publishes with superstars gets a discounted BPM. A researcher who consistently elevates modest teams ranks higher than raw citation counts suggest.
The Kendall rank correlation between BPM and raw citation rankings is 0.68. Absent ties, Kendall's τ is the share of author pairs the two metrics order the same way minus the share they order differently, so for any two randomly chosen authors the metrics agree on who ranks higher about 84% of the time, that is, (1 + 0.68) / 2. The Spearman rank correlation is 0.83, so the two rankings share about 69% of their variance (0.83² ≈ 0.69), and the remaining 31% is not explained by raw citation rank. Nearly half of all ranked authors move more than 500 positions between the two metrics. That gap is what makes the adjustment worthwhile.
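The link between Kendall's tau and pairwise agreement can be checked directly: absent ties, tau equals the share of concordant pairs minus the share of discordant pairs, so the agreeing share is (1 + tau) / 2. The sketch below verifies this on five hypothetical authors (the ranks are invented) and then applies the same conversion to the reported 0.68, alongside the squared Spearman value.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau for two equal-length rank lists, assuming no ties.

    Returns (tau, share of pairs that both lists order the same way).
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = concordant + discordant
    return (concordant - discordant) / n_pairs, concordant / n_pairs


# Hypothetical ranks of five authors under the two metrics.
bpm_rank = [1, 2, 3, 4, 5]
raw_rank = [2, 1, 3, 5, 4]

tau, agree = kendall_tau(bpm_rank, raw_rank)
print(tau, agree)

# Converting the reported figures.
pair_agreement = (1 + 0.68) / 2   # share of author pairs ordered the same way
shared_variance = 0.83 ** 2       # squared Spearman rank correlation
print(round(pair_agreement, 2), round(shared_variance, 2))
```

In the toy data 8 of the 10 author pairs are ordered the same way, which gives tau = 0.6 and an agreement share of 0.8, matching (1 + 0.6) / 2.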
Data are sourced from OpenAlex (open, free API) using the Sports Analytics and Performance topic (T11674), filtered to remove pure CS/AI papers with no sports content (e.g., theoretical game theory). The corpus covers 38,609 papers from 2015–2024.
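For readers who want to pull a similar corpus, the sketch below builds one page request against the OpenAlex works endpoint. It only constructs the URL. The choice of the `topics.id` filter (rather than, say, the primary topic only), the date-range filters, and the page size are assumptions of this sketch and may not match how the corpus above was actually retrieved, so check them against the OpenAlex documentation. The follow-up filtering of pure CS/AI papers is not reproduced here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def works_url(topic_id="T11674", start="2015-01-01", end="2024-12-31",
              cursor="*", per_page=200):
    """Build one page request for works tagged with a topic in a date range."""
    # Assumed filter names; verify against the OpenAlex API documentation.
    filters = ",".join([
        f"topics.id:{topic_id}",
        f"from_publication_date:{start}",
        f"to_publication_date:{end}",
    ])
    params = {"filter": filters, "per-page": per_page, "cursor": cursor}
    # Keep ':', ',' and '*' unescaped so the URL stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=':,*')}"


url = works_url()
print(url)
```

With cursor paging, each response is expected to carry the cursor for the next page in its metadata; a full download would loop, passing that value back in as `cursor`, until no further cursor is returned.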
Authors with ≥ 2 papers in the corpus receive individual BPM estimates (5,167 authors). Authors with only 1 paper are pooled into a single baseline author variable. The methodology is field-agnostic and can be applied to any discipline covered by OpenAlex (250M+ papers).