An Experimental Author Quality Metric · Sports Analytics · 2015–2024
Proof of Concept
About this metric: a citation-based author quality metric inspired by basketball's Adjusted Plus/Minus. It controls for co-authors to isolate individual contribution to citation impact.
The Core Idea

In basketball, Regularized Adjusted Plus/Minus (RAPM) measures how much a player raises or lowers their team's scoring margin, controlling for teammates and opponents. Each possession is an observation, each player on the court is a dummy variable, and ridge regression estimates each player's isolated impact.

Bibliographic Plus/Minus applies the exact same logic to academic publishing. Each paper is a possession. The outcome is citations in years 1–3 after publication, normalized within field and year. Each author is a dummy variable.
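The paper-as-possession encoding can be illustrated with a minimal sketch. The toy corpus, the field names, and the helper author_dummies are hypothetical and are not taken from the project's code; the sketch only shows how each paper becomes one row of 0/1 author indicators.

```python
# Hypothetical toy corpus: each paper is one observation ("possession").
papers = [
    {"id": "p1", "authors": ["A", "B"]},
    {"id": "p2", "authors": ["A", "C"]},
    {"id": "p3", "authors": ["B", "C"]},
]

def author_dummies(papers):
    """Return (sorted author names, one 0/1 indicator row per paper)."""
    authors = sorted({a for p in papers for a in p["authors"]})
    col = {a: j for j, a in enumerate(authors)}
    rows = []
    for p in papers:
        row = [0] * len(authors)
        for a in p["authors"]:
            row[col[a]] = 1  # author appears on this paper
        rows.append(row)
    return authors, rows

authors, X = author_dummies(papers)
```

Each column of X corresponds to one author, and the regression outcome for row p would be that paper's normalized citation count.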

The Model
citations_p ~ Σ β_i · author_i(p) + β_baseline · baseline_count(p) + ε  |  BPM_i = β_i

Ridge regression (L2 penalty) estimates one coefficient per author simultaneously. Ridge is used because co-authors are correlated: OLS cannot separate researchers who always collaborate, and its estimates become unstable for those who collaborate very frequently. The penalty shrinks uncertain estimates toward zero, analogous to how RAPM handles players with limited minutes or players who spend most of their time with the same teammates.
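A small sketch of why the penalty matters, assuming the closed-form ridge solution with no intercept and an arbitrary penalty of alpha = 1.0 (the penalty actually used for BPM is not stated here). The data and the helper ridge_fit are hypothetical. Authors 0 and 1 always publish together, so their columns are identical and OLS has no unique solution, while ridge splits the shared credit evenly.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge without intercept: solve (X'X + alpha*I) beta = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Hypothetical toy data: authors 0 and 1 co-author every paper; author 2 works alone.
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1],
              [0, 0, 1]], dtype=float)
y = np.array([1.0, 1.0, -0.5, -0.5])  # z-scored citations (made up)

gram_rank = np.linalg.matrix_rank(X.T @ X)  # below 3, so OLS is not identified
beta = ridge_fit(X, y, alpha=1.0)           # ridge still returns a unique answer
```

On this toy input the two inseparable authors receive the same coefficient (0.4 each), and every coefficient is pulled toward zero relative to the raw group means.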

Authors are split into two groups. Those with ≥ 2 papers in the corpus receive an individual BPM coefficient. Authors with only 1 paper cannot be estimated reliably on their own. Instead, they are pooled into a single baseline author variable equal to the count of such authors on each paper (that is the baseline_count(p) in the equation above). The model estimates one shared coefficient for this group, representing the average citation contribution of an author who appears only once in the corpus. This baseline BPM is empirically estimated rather than assumed to be zero or negative.
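The split into ranked authors and a pooled baseline can be sketched as follows. The toy papers and the helpers ranked_authors and design_rows are hypothetical; only the ≥ 2 paper threshold and the idea of a per-paper count of single-paper authors come from the description above.

```python
from collections import Counter

# Hypothetical toy corpus.
papers = [
    {"id": "p1", "authors": ["A", "B", "X"]},
    {"id": "p2", "authors": ["A", "B"]},
    {"id": "p3", "authors": ["A", "Y", "Z"]},
]

def ranked_authors(papers, min_papers=2):
    """Authors with at least min_papers papers get their own coefficient."""
    counts = Counter(a for p in papers for a in set(p["authors"]))
    return sorted(a for a, n in counts.items() if n >= min_papers)

def design_rows(papers, ranked):
    """One row per paper: 0/1 per ranked author, then baseline_count(p) last."""
    col = {a: j for j, a in enumerate(ranked)}
    rows = []
    for p in papers:
        row = [0] * (len(ranked) + 1)
        for a in set(p["authors"]):
            if a in col:
                row[col[a]] = 1
            else:
                row[-1] += 1  # single-paper author: add to the pooled count
        rows.append(row)
    return rows

ranked = ranked_authors(papers)
X = design_rows(papers, ranked)
```

Here A and B are ranked individually, while X, Y, and Z only contribute to the last column, whose single coefficient plays the role of the baseline BPM.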

Citations are z-scored within each year cohort, so BPM is in units of standard deviations above or below the field-year average.
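A minimal sketch of the within-year z-scoring, using made-up citation counts. The helper zscore_by_year is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant is used.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical papers with raw citation counts.
papers = [
    {"year": 2015, "citations": 0},
    {"year": 2015, "citations": 10},
    {"year": 2015, "citations": 20},
    {"year": 2020, "citations": 2},
    {"year": 2020, "citations": 4},
]

def zscore_by_year(papers):
    """Standardize citations against the mean and SD of the same publication year."""
    by_year = defaultdict(list)
    for p in papers:
        by_year[p["year"]].append(p["citations"])
    stats = {yr: (mean(v), pstdev(v)) for yr, v in by_year.items()}
    z = []
    for p in papers:
        mu, sd = stats[p["year"]]
        z.append((p["citations"] - mu) / sd if sd > 0 else 0.0)
    return z

z = zscore_by_year(papers)
```

Standardizing per cohort means a 2020 paper with 4 citations can score as high as a 2015 paper with many more, because each is compared only with papers of the same age.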

How to Read BPM

BPM = +1.0 → papers with this author get ~1 SD more citations than expected, controlling for co-authors

BPM = 0 → papers perform at the field average

BPM = −1.0 → papers underperform the average

A researcher who always publishes with superstars gets a discounted BPM. A researcher who consistently elevates modest teams ranks higher than raw citation counts suggest.

The Kendall rank correlation between BPM and raw citation rankings is 0.68. Because Kendall's τ is the share of concordant pairs minus the share of discordant pairs, this means that for any two randomly chosen authors, the two metrics agree on who ranks higher about 84% of the time (ignoring ties). The Spearman rank correlation is 0.83, so the two rankings share about 69% of their variance, leaving 31% as independent signal. Nearly half of all ranked authors move more than 500 positions between the two metrics. That gap is what makes the adjustment worthwhile.
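The link between Kendall's τ and pairwise agreement can be checked on a toy example. For rankings without ties, the share of concordant pairs equals (1 + τ) / 2. The rankings and the helper kendall_tau_a below are hypothetical and only illustrate that identity.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Return (tau-a, share of concordant pairs) for two equal-length score lists."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs, concordant / n_pairs

# Hypothetical ranks of five authors under two metrics (no ties).
rank_by_citations = [1, 2, 3, 4, 5]
rank_by_bpm = [1, 3, 2, 5, 4]

tau, agree_share = kendall_tau_a(rank_by_citations, rank_by_bpm)
```

In this toy case 8 of 10 pairs are concordant, giving τ = 0.6 and an agreement share of 0.8. Applying the same identity to τ = 0.68 gives (1 + 0.68) / 2 = 0.84, and squaring the Spearman correlation gives 0.83² ≈ 0.69.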

Data & Corpus

Data sourced from OpenAlex (open, free API) using the Sports Analytics and Performance topic (T11674), filtered to remove pure CS/AI papers with no sports content (e.g., theoretical game theory). Covers 38,609 papers from 2015–2024.
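A hedged sketch of how such a corpus request might be built against the OpenAlex works endpoint. The exact query used for this corpus is not given; the filter keys topics.id, from_publication_date, and to_publication_date, the per-page and cursor parameters, and the helper works_url are assumptions about a reasonable way to express "topic T11674, 2015–2024". The sketch only builds the URL and makes no network call, and the additional removal of pure CS/AI papers is not shown.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"

def works_url(topic_id, first_year, last_year, cursor="*", per_page=200):
    """Build one page request; filter keys are assumed, not confirmed by the source."""
    filters = ",".join([
        f"topics.id:{topic_id}",
        f"from_publication_date:{first_year}-01-01",
        f"to_publication_date:{last_year}-12-31",
    ])
    query = urlencode({"filter": filters, "per-page": per_page, "cursor": cursor})
    return f"{BASE_URL}?{query}"

url = works_url("T11674", 2015, 2024)
```

A full download would repeat the request, passing the cursor value returned with each page until no further cursor is provided.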

Authors with ≥ 2 papers in the corpus receive individual BPM estimates (5,167 authors). Authors with only 1 paper are pooled into a single baseline author variable. The methodology is field-agnostic and can be applied to any discipline covered by OpenAlex (250M+ papers).

Baseline Author BPM
+0.0469
The average citation contribution of a single-paper author, estimated empirically by the regression rather than assumed to be zero. The slightly positive value reflects a selection effect: one-time collaborators tend to be brought into projects selectively, for specific expertise or data access, rather than at random. With only one paper in the corpus, the regression cannot reliably separate their individual contribution from the overall quality of the project, which is why they are pooled rather than ranked individually.
Proof of concept. These results should not be taken at face value. Many conference papers, working papers, and practitioner publications are missing or misclassified in OpenAlex. The rankings reflect what is in the database, not the full landscape of the field.