Stat Lab  ·  Est. 2024
A statistical analysis of NBA player performance before and after hip-hop mentions. Real game logs. Real lyrics. Real math.
Run notebooks 01–05 to generate data/processed/results.json
01 Key Findings
02 Performance Delta by Mention Type
Avg Δ Points — by type
Avg Δ Assists — by type
Avg Δ Rebounds — by type
03 Top 10 Biggest Impacts
Composite performance shift — pace-adjusted, 30-game window
Statistical Tests
Paired t-tests — 30 games before vs. after (pace-adjusted)
Correlations
Pearson r — with composite performance delta (30g, pace-adjusted)
Artist Tier × Performance Change
Scatter — artist tier vs composite delta
Era Breakdown
Mean composite delta by era (pace-adjusted)
Mention Explorer
Player | Artist | Song | Year | Type | ΔPTS | ΔAST | ΔREB | Bar
Methodology

This project treats each hip-hop mention as a natural experiment — a discrete, externally-generated cultural event that lets us measure before/after NBA performance without any experimental manipulation. Each mention is a data point; the song release date is the event marker.

Data sources

NBA game logs: pulled via nba_api for the season before the release, the season of the release, and the season after, so that off-season releases always have a complete post-window.

Lyrics & release dates — verified via the Genius API. Release date is used as the event date.

Sentiment scoring — VADER (Valence Aware Dictionary and sEntiment Reasoner) applied to each lyric snippet, producing a continuous score from −1 (most negative) to +1 (most positive). This is tested as a predictor alongside the manual compliment / diss label.

Statistical windows

Baseline (before) — the 30 games immediately before the release date. For off-season drops this is the last 30 games of the previous season.

After windows: next 1 game, next 10 games (~3 weeks), next 30 games (~2 months), rest of season. For off-season releases, after windows begin with the first game of the following season, so the full post-mention response is captured.
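
The window logic above can be sketched in a few lines of Python. This is a minimal illustration, not the project's notebook code: the function name `split_windows`, the `(game_date, stat_value)` tuple layout, and the choice to count a game played on the release date as "after" are all assumptions. The rest-of-season window is left out because it needs season boundaries that are not described here.

```python
from datetime import date, timedelta

def split_windows(games, release, baseline_n=30, after_sizes=(1, 10, 30)):
    """Split a chronologically sorted game log around a release date.

    `games` is a list of (game_date, stat_value) tuples sorted by date.
    Assumption: a game played on the release date counts as "after".
    """
    before = [g for g in games if g[0] < release]
    after = [g for g in games if g[0] >= release]
    baseline = before[-baseline_n:]                # last N games before release
    windows = {n: after[:n] for n in after_sizes}  # next 1, 10, 30 games
    return baseline, windows

# Toy log: 40 games on consecutive days, stat value = game index.
start = date(2020, 1, 1)
games = [(start + timedelta(days=i), i) for i in range(40)]
baseline, windows = split_windows(games, release=start + timedelta(days=35))
```

Because the log is sliced by position rather than by calendar distance, an off-season release falls out naturally when the log spans several seasons: the baseline ends with the last games of the previous season and the after windows start with the first game of the next one. A window is simply shorter when fewer games are available, as with the 10- and 30-game windows in the toy log.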

Normalization

Pace adjustment — NBA pace has varied from ~89 possessions/game in the 2000s to ~100 in the 2020s. All stats are scaled to a 2010s reference pace so Shaq (1994) and Giannis (2018) are directly comparable.

Era z-scores — the primary 30-game delta is also z-scored within era groups to produce standardized effect sizes for cross-era comparison.
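
As a rough sketch of the two normalization steps, assuming a simple linear pace scaling and a sample standard deviation within each era group: the reference pace value, the function names, and the era labels below are hypothetical, and the exact formulas used in the notebooks may differ.

```python
from statistics import mean, stdev

# Hypothetical 2010s reference pace (possessions per 48 minutes).
REFERENCE_PACE = 95.0

def pace_adjust(stat, season_pace, reference_pace=REFERENCE_PACE):
    """Scale a per-game stat from its season's league pace to the reference pace."""
    return stat * reference_pace / season_pace

def era_z_scores(deltas_by_era):
    """Z-score each delta against the mean and sample SD of its own era group.

    Each era needs at least two deltas that are not all identical,
    otherwise the standard deviation is undefined or zero.
    """
    z = {}
    for era, deltas in deltas_by_era.items():
        mu, sd = mean(deltas), stdev(deltas)
        z[era] = [(d - mu) / sd for d in deltas]
    return z

adjusted = pace_adjust(20.0, season_pace=100.0)
z = era_z_scores({"2000s": [1.0, 2.0, 3.0]})
```

Under this scaling, 20 points per game in a 100-possession season maps to 19 at a reference pace of 95, and the z-scores express each delta in standard deviations from its own era's mean.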

Statistical tests

Paired t-tests: test whether the mean before/after difference across all mentions differs from zero. Null hypothesis: the mean difference is zero.

Mann-Whitney U: non-parametric comparison of compliment vs. diss group distributions. Used because it does not assume normally distributed deltas and it tolerates unequal group sizes.

Pearson r: linear association of artist tier, and separately of VADER sentiment score, with the composite performance delta.

OLS regression — models composite delta as a function of artist tier, mention type dummies, and sentiment score. Reports R², F-statistic, and 95% confidence intervals per coefficient.
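
To make the first two tests concrete, here is a standard-library sketch of the statistics behind them. It computes only the paired t statistic and the Mann-Whitney U statistic, not p-values; which library the notebooks actually use is not stated here, though SciPy's `scipy.stats.ttest_rel` and `scipy.stats.mannwhitneyu` would be a common choice for both. The function names and sample numbers are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees_of_freedom) for matched before/after values."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

def mann_whitney_u(x, y):
    """Return the U statistic for sample x; ties contribute 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

t_stat, df = paired_t([10, 12, 14, 16], [11, 14, 15, 18])
u_stat = mann_whitney_u([1, 2, 3], [2, 3, 4])
```

The U statistic counts, over all cross-group pairs, how often a value from the first group exceeds a value from the second, which is why it needs no normality assumption and works with groups of different sizes.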

Composite impact score

Weighted average of pace-adjusted 30-game deltas: PTS (40%) + AST (30%) + REB (30%). Multiplied by artist tier for the weighted version shown in the explorer. This is a descriptive ranking tool, not a statistical test.
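
The weighting can be written out directly. In this sketch the function names are hypothetical, and the artist tier is assumed to be a plain numeric value, since its scale is not specified here.

```python
# Weights from the composite definition: PTS 40%, AST 30%, REB 30%.
WEIGHTS = {"pts": 0.40, "ast": 0.30, "reb": 0.30}

def composite_delta(d_pts, d_ast, d_reb):
    """Weighted average of pace-adjusted 30-game deltas."""
    return (WEIGHTS["pts"] * d_pts
            + WEIGHTS["ast"] * d_ast
            + WEIGHTS["reb"] * d_reb)

def weighted_composite(d_pts, d_ast, d_reb, artist_tier):
    """Composite delta multiplied by a numeric artist tier (assumed scale)."""
    return composite_delta(d_pts, d_ast, d_reb) * artist_tier

base = composite_delta(2.0, 1.0, -1.0)
weighted = weighted_composite(2.0, 1.0, -1.0, artist_tier=3)
```

Because the assist and rebound weights are equal, a gain of one assist and a loss of one rebound cancel out, leaving only the points term in this example.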

Limitations & honest caveats

This is a correlational study. Hip-hop mentions cannot be shown to cause performance changes — basketball is driven by injury status, opponent quality, rest, team context, and dozens of other variables. We report p-values and effect sizes honestly and note where results do not reach significance.

Selection bias — only famous, well-documented mentions are captured. Unknown references from smaller artists are underrepresented, which likely inflates the apparent artist tier effect.

Sample size — ~100 complete mention windows is sufficient for exploratory analysis and correlation testing, but underpowered for detecting small effects in subgroup analyses.