Milestone

347 features across 12 layers — the pipeline that feeds the model.

May 2026 · v26.0-pre
The foundation

Building the prediction engine required solving a fundamental data problem first: raw NBA statistics are almost useless as predictors on their own. A player who scored 18 points per game last season will score approximately 18 points per game next season — but that tells us almost nothing about whether they'll score 15 or 21. The signal is in the context.

So before training a single model, we built a 12-layer feature engineering pipeline. It produces 347 features per player-season, capturing everything we believe actually drives year-to-year performance changes: momentum, opportunity, context, age, environment, matchups, and archetype. The 150-player integration test ran with zero DB errors, and 9 of 10 pipeline checks pass.

The 12 layers

What goes into every projection.

Each layer contributes a distinct signal class. Together they give the model enough context to understand not just what a player did, but why — and whether those conditions will persist.

Layer 01
Player Talent Baseline
Per-game stats, percentages, and volume metrics from the current and prior seasons. The raw signal the model builds on.
pts_per_game, reb_per_36, usage_rate, true_shooting_pct
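To make two of these baseline features concrete, here is a minimal sketch of a per-36 rate and true shooting percentage. It assumes the widely used definition TS% = PTS / (2 × (FGA + 0.44 × FTA)); the function names and the zero-attempt handling are illustrative choices, not necessarily what the pipeline does.

```python
def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)); returns 0.0 when there are no attempts."""
    attempts = 2.0 * (fga + 0.44 * fta)
    return points / attempts if attempts > 0 else 0.0


def per_36(stat_total: float, minutes_total: float) -> float:
    """Scale a stat total to a per-36-minute rate; returns 0.0 for zero minutes."""
    return 36.0 * stat_total / minutes_total if minutes_total > 0 else 0.0
```

For example, 400 rebounds in 1,800 minutes gives `per_36(400, 1800) == 8.0`.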
Layer 02
Rolling Momentum Windows
3-, 5-, 10-, and 20-game rolling averages for every core stat. Captures hot/cold streaks and multi-scale momentum signals.
pts_rolling_10g, reb_trend_5g, fg_pct_rolling_20g, ast_momentum
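A rough sketch of how trailing averages over the four window sizes could be computed from a chronological game log. The function name and the choice to emit `None` until a window is full are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

WINDOWS = (3, 5, 10, 20)  # window sizes named in the layer description


def rolling_means(
    values: Sequence[float], windows: Sequence[int] = WINDOWS
) -> Dict[int, List[Optional[float]]]:
    """Trailing mean per window size; None until the window has enough games."""
    out: Dict[int, List[Optional[float]]] = {}
    for w in windows:
        series: List[Optional[float]] = []
        running = 0.0
        for i, v in enumerate(values):
            running += v
            if i >= w:
                running -= values[i - w]  # drop the game that left the window
            series.append(running / w if i >= w - 1 else None)
        out[w] = series
    return out
```

Each value here includes the current game. When the feature is used to predict that same game, the series would need to be shifted by one so the target does not leak into its own input.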
Layer 03
Opportunity Signals
Projected minutes, usage share, shot attempts, pace context. A player with the same talent at higher opportunity projects better.
projected_mpg, shot_opportunity_score, team_pace, usage_delta_yoy
Layer 04
Lineup Context
Role in starting vs bench units, lineup net rating with the player on/off, positional scarcity within the team roster.
on_court_net_rtg, lineup_usage_share, starter_probability, positional_scarcity
Layer 05
Injury Ripple Effects
How teammate injuries affect each player's opportunity. Built with a separate injury ripple model — when a star is injured, role players see opportunity gains.
teammate_injury_score, opportunity_bump, injury_ripple_pts, star_absence_days
Layer 06
Team & Coach Context
Team pace, offensive system, coach tendencies, three-point rate, playoff pressure. Some systems produce more points; others produce more assists.
team_off_rating, coach_3pt_emphasis, system_archetype, playoff_urgency_score
Layer 07
Opponent Matchups
Projected opponent difficulty, opponent defensive rating by position, strength of schedule for the projection window.
opp_def_rtg, matchup_difficulty, sos_next_30d, position_def_rank
Layer 08
Schedule Effects
Back-to-backs, rest days, travel load, home/away split, density of the next 30-game window.
rest_days, back_to_back_pct, travel_miles, home_game_pct
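As an illustration, several of these schedule features can be derived from nothing more than a list of game dates and home/away flags. The sketch below is one possible formulation: `avg_rest_days` is a hypothetical aggregate of the per-game rest value, a back-to-back is assumed to mean zero full days off, and travel load is left out because it needs arena locations.

```python
from datetime import date
from typing import Dict, List


def schedule_features(game_dates: List[date], is_home: List[bool]) -> Dict[str, float]:
    """Summarise rest and home/away balance for a window of scheduled games."""
    if not game_dates or len(game_dates) != len(is_home):
        raise ValueError("game_dates and is_home must be non-empty and equal length")
    # Pair each date with its home flag before sorting so the two stay aligned.
    games = sorted(zip(game_dates, is_home))
    dates = [d for d, _ in games]
    # Rest days = full days off between consecutive games (0 means back-to-back).
    rest = [(b - a).days - 1 for a, b in zip(dates, dates[1:])]
    back_to_backs = sum(1 for r in rest if r == 0)
    return {
        "avg_rest_days": sum(rest) / len(rest) if rest else 0.0,
        "back_to_back_pct": back_to_backs / len(dates),
        "home_game_pct": sum(1 for _, home in games if home) / len(games),
    }
```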
Layer 09
Age Curves
Player age relative to position-specific peak years, career trajectory stage, age-adjusted decay rates. Separate curves per position.
age_vs_position_peak, career_stage, age_decay_rate, prime_years_remaining
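A minimal sketch of how age could be expressed relative to a position-specific peak. The peak ages are passed in by the caller because the fitted curves are not given here; the three stage labels and the `prime_half_width` parameter are hypothetical.

```python
from typing import Dict, Mapping, Union


def age_features(
    age: float,
    position: str,
    peak_age_by_position: Mapping[str, float],
    prime_half_width: float = 3.0,  # assumed: prime spans peak +/- this many years
) -> Dict[str, Union[float, str]]:
    """Express a player's age relative to the peak age supplied for their position."""
    peak = peak_age_by_position[position]
    delta = age - peak
    if delta < -prime_half_width:
        stage = "developing"
    elif delta <= prime_half_width:
        stage = "prime"
    else:
        stage = "declining"
    return {
        "age_vs_position_peak": delta,
        "career_stage": stage,
        "prime_years_remaining": max(0.0, peak + prime_half_width - age),
    }
```

With a made-up guard peak of 27, `age_features(24, "G", {"G": 27.0})` places the player at the start of the prime band with six prime years remaining.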
Layer 10
Volatility Modelling
Historical stat variance, game-to-game consistency scores, boom/bust probability. High-variance players get wider confidence intervals.
pts_std_dev_10g, consistency_score, boom_bust_flag, volatility_tier
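One way to picture these features is through the coefficient of variation over a recent window. This is a sketch under assumptions: the `1 / (1 + CV)` consistency score and the `cv_cutoff` threshold for the boom/bust flag are illustrative definitions, not the pipeline's documented formulas.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Union


def volatility_features(
    points: Sequence[float], window: int = 10, cv_cutoff: float = 0.35
) -> Dict[str, Union[float, bool]]:
    """Spread of a player's scoring over the most recent `window` games."""
    recent = list(points[-window:])
    if len(recent) < 2:
        raise ValueError("need at least two games to measure spread")
    mu = mean(recent)
    sd = stdev(recent)  # sample standard deviation
    cv = sd / mu if mu > 0 else float("inf")  # coefficient of variation
    return {
        f"pts_std_dev_{window}g": sd,
        "consistency_score": 1.0 / (1.0 + cv),  # 1.0 = perfectly steady
        "boom_bust_flag": cv > cv_cutoff,
    }
```

A player alternating between 10 and 30 points is flagged under these defaults, while one who scores 20 every night gets a consistency score of 1.0.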
Layer 11
Market Signals
ADP, expert consensus ranks, waiver wire velocity, ownership trends. Stubbed for Season 1; these features will be incorporated once market data is live.
adp_rank, our_rank_vs_adp, waiver_add_velocity, expert_consensus
Layer 12
Archetype Embeddings
Player archetype from clustering: Heliocentric Creator, Elite Wing Defender, 3-and-D Specialist, etc. Peer cohort averages used as priors.
archetype_label, archetype_peer_avg_pts, archetype_peer_avg_age, archetype_cluster_id
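To illustrate how a peer cohort average can act as a prior, the sketch below computes a per-archetype mean and then shrinks a player's observed average toward it. The shrinkage formula and the `prior_weight` value are assumptions: this is one common way to apply a prior, and the pipeline may use the cohort averages differently.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def peer_averages(players: Iterable[Mapping[str, object]]) -> Dict[str, float]:
    """Mean points per archetype label; each player needs 'archetype_label' and 'pts'."""
    totals = defaultdict(lambda: [0.0, 0])  # label -> [sum of pts, player count]
    for p in players:
        bucket = totals[str(p["archetype_label"])]
        bucket[0] += float(p["pts"])
        bucket[1] += 1
    return {label: total / count for label, (total, count) in totals.items()}


def shrink_to_prior(
    observed: float, games_played: int, prior: float, prior_weight: float = 20.0
) -> float:
    """Blend an observed average with the cohort prior, weighted by sample size."""
    return (games_played * observed + prior_weight * prior) / (games_played + prior_weight)
```

A player averaging 20 points over 20 games with a cohort average of 12 lands at 16.0 under the default weight; with more games played, the estimate moves toward the player's own average.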
Integration results

The pipeline is working.

347 total features (across 12 layers)
9/10 checks passing (integration test)
0 DB errors (150-player test)
What failed the one check: the market signals layer (Layer 11) is stubbed. ADP, consensus ranks, and waiver velocity all return null because live market data isn't connected yet. This is expected and intentional, and the model trains cleanly without it. Layer 11 will go live once market integrations are connected in a future task.
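For illustration, a stubbed layer and the kind of check it would trip might look like the following. The function names and the "null in every row" rule are hypothetical; only the four market feature names come from the Layer 11 description.

```python
from typing import Dict, List, Mapping, Optional, Sequence

MARKET_FEATURES = ("adp_rank", "our_rank_vs_adp", "waiver_add_velocity", "expert_consensus")


def market_signals_stub(player_id: str) -> Dict[str, Optional[float]]:
    """Placeholder layer: every market feature is None until live data is connected."""
    return {name: None for name in MARKET_FEATURES}


def all_null_features(
    rows: Sequence[Mapping[str, object]], feature_names: Sequence[str]
) -> List[str]:
    """Return the features that are None (or missing) in every row."""
    return [f for f in feature_names if all(row.get(f) is None for row in rows)]
```

Running `all_null_features` over rows that include the stub output lists exactly the four market features, which makes it easy to confirm that a failing check comes from the stub rather than from a real data gap.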