School Achievement Bins: Model-Based vs Raw Thresholds

Two classification approaches compared:
  • Raw approach (used on the landing page): 3-year weighted average L3/L4% vs the province benchmark, binned at ±5 pp (within = "similar") and ±15 pp (5–15 pp = "somewhat" above/below). Simple and transparent, but it treats all schools equally regardless of enrolment size: a 15-student school and a 300-student school get the same confidence.
  • Model approach (Bayesian partial pooling): Posterior means from a two-stage GLMM with measurement-error correction (me()) in Stage 2. Small schools shrink toward the grand mean; G3 intake quality is separated from G6 value-added. Quintile bins (1 = bottom 20%, 5 = top 20%) for fitted G6 achievement; tertile bins for value-added.
The cross-tabs below show where the two approaches agree and where they diverge — and whether disagreement concentrates in small schools.
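The two binning rules can be sketched as follows. This is an illustrative Python/pandas sketch, not the production code: the thresholds (±5/±15 pp, equal-count quantiles) come from the text above, while the function names and bin labels are placeholders.

```python
import pandas as pd

def raw_level_bin(school_pct: float, province_pct: float) -> str:
    """Raw landing-page bin from the gap to the province benchmark.
    Thresholds (±5 / ±15 pp) are from the text; labels are illustrative."""
    gap = school_pct - province_pct
    if gap > 15:
        return "well above"
    if gap > 5:
        return "somewhat above"
    if gap >= -5:
        return "similar"
    if gap >= -15:
        return "somewhat below"
    return "well below"

def quantile_bin(posterior_means: pd.Series, k: int) -> pd.Series:
    """Model-side bin: equal-count quantiles over posterior means
    (k=5 quintiles for achievement, k=3 tertiles for value-added)."""
    return pd.qcut(posterior_means, q=k, labels=range(1, k + 1)).astype(int)
```

Note the asymmetry this encodes: the raw bin width is fixed in percentage points regardless of how schools are distributed, while the model bins always contain equal numbers of schools.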

Frequency Distributions

G6 Achievement Level

How many schools fall into each bin? Raw approach uses ±5/15 pp province thresholds; model uses equal-count quintiles across 2,938 English G6 schools.

G3→G6 Pattern vs Value-Added

Raw approach compares each school's G3→G6 change to the provincial G3→G6 change (±5 pp). Model approach ranks schools on the posterior VA mean into equal-count tertiles.
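The raw change comparison can be sketched in the same style; the ±5 pp threshold is from the text, the bin labels are illustrative.

```python
def change_vs_province_bin(school_change: float, province_change: float) -> str:
    """Raw G3->G6 change bin: compare the school's G3->G6 change (pp) to
    the provincial change, with a ±5 pp band. Labels are illustrative."""
    diff = school_change - province_change
    if diff > 5:
        return "gained more than province"
    if diff < -5:
        return "gained less than province"
    return "similar change"
```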


Cross-tab: Model Quintile × Raw Level Bin

How do the two G6 achievement classifications compare school-by-school? Rows = model quintile (1 = bottom 20%, 5 = top 20%); columns = raw bin relative to province. Values are row percentages (% of schools in that model quintile that fall in each raw bin); N = schools in that model quintile. A school classified identically by both approaches falls on the diagonal.
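The row-percentage layout described above amounts to a normalized cross-tab. A minimal sketch, assuming hypothetical column names `model_quintile` and `raw_bin`:

```python
import pandas as pd

def row_pct_crosstab(df: pd.DataFrame) -> pd.DataFrame:
    """Row-percentage cross-tab: for each model quintile (rows), the % of
    its schools falling in each raw bin (columns). normalize='index'
    converts counts to row proportions; x100 gives row percentages."""
    return (pd.crosstab(df["model_quintile"], df["raw_bin"],
                        normalize="index") * 100).round(1)
```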


Cross-tab: Model VA Tertile × Raw G3→G6 Change Bin

Rows = model VA tertile (equal-count bins on the posterior VA mean); columns = raw G3→G6 change bin relative to the provincial change (±5 pp).

Cross-tab: Model G6 Quintile × Raw G3 Level Bin

The model's fitted G6 achievement (g6_fitted = g3_coef × g3_school_re + va_mean) combines G3 intake and G6 value-added, so it should correlate with raw G3 level — but not perfectly. Rows = model G6 quintile; columns = raw G3 level bin. A school in the top model quintile despite a below-average raw G3 level has a high VA component lifting it.
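The fitted-G6 construction in the paragraph above can be written out directly. A sketch using the names from the text (`g3_coef`, `g3_school_re`, `va_mean`); the numeric values in the test are made up.

```python
import numpy as np

def g6_fitted_logit(g3_coef: float, g3_school_re: np.ndarray,
                    va_mean: np.ndarray) -> np.ndarray:
    """Fitted G6 quality on the logit scale: the G3 intake component
    (g3_coef * g3_school_re) plus the school's value-added mean."""
    return g3_coef * g3_school_re + va_mean

def inv_logit(x: np.ndarray) -> np.ndarray:
    """Back-transform a logit-scale value to a 0-1 proportion scale."""
    return 1.0 / (1.0 + np.exp(-x))
```

Because the fitted value is a sum of two components, two schools can land in the same quintile by very different routes: strong intake with flat value-added, or weak intake lifted by high value-added.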


Cross-tab: Model VA Tertile × Raw G6 Level Bin

Value-added measures G6 performance conditional on G3 intake, so it need not correlate strongly with raw G6 standing — a "Below typical" VA school can still have high raw G6 achievement if it started from a strong G3 base. Rows = model VA tertile; columns = raw G6 level bin.


Scatter: Model vs Raw G6 Achievement

Each point is a school. X-axis = model fitted G6 quality (logit scale, school-specific component g3_coef × g3_school_re + va_mean); Y-axis = raw 3-year weighted avg Math L3/L4%. Colour shows whether the two approaches agree on rank direction: grey = same rank order; blue = model ranks higher than raw; red = model ranks lower than raw. Disagreement typically means the raw rank was inflated or deflated by noise that the model shrinks away.
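The rank-direction colouring can be sketched as a comparison of the two rankings; the text labels below are illustrative stand-ins for the grey/blue/red colours.

```python
import numpy as np
import pandas as pd

def rank_direction(model_score: pd.Series, raw_pct: pd.Series) -> pd.Series:
    """Classify each school by whether the model ranks it higher than,
    lower than, or the same as the raw 3-year L3/L4% ranking."""
    mr = model_score.rank(method="first")
    rr = raw_pct.rank(method="first")
    label = np.select([mr > rr, mr < rr],
                      ["model higher", "model lower"], default="same")
    return pd.Series(label, index=model_score.index)
```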


School Size and Bin Agreement

Does model vs raw disagreement concentrate in small schools? Schools split into quartiles by mean annual G6 enrolment (mean_n_g6). "Exact agreement" = model quintile equals raw Math rank (both on a 1–5 ordinal scale); "Within 1" = ranks differ by at most 1 step.
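The agreement summary just described reduces to a groupby over size quartiles. A sketch assuming hypothetical columns `model_q` and `raw_q` (both 1–5 ranks) and `mean_n_g6`:

```python
import pandas as pd

def agreement_by_size(df: pd.DataFrame) -> pd.DataFrame:
    """Exact and within-1 agreement rates (%) by school-size quartile,
    where quartiles are equal-count cuts on mean annual G6 enrolment."""
    out = df.assign(
        size_quartile=pd.qcut(df["mean_n_g6"], 4, labels=[1, 2, 3, 4]),
        exact=df["model_q"] == df["raw_q"],
        within1=(df["model_q"] - df["raw_q"]).abs() <= 1,
    )
    return out.groupby("size_quartile", observed=True)[["exact", "within1"]].mean() * 100
```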

Key findings: Overall, % of schools land in the exact same bin under both approaches, and % are within one step. The smallest-school quartile has exact agreement vs for the largest schools — confirming that partial pooling reshuffles mainly small schools, where raw percentages are most noise-prone.

N = schools (English G6 schools with Bayesian VA estimates, matched to 3-year 2022–23 through 2024–25 achievement data). Province benchmarks (3-year weighted avg): Math G6 = %, R&W G6 = %. Size quartile cut points: Q1 ≤ , Q2 ≤ , Q3 ≤ students/year.


Schools Excluded from the Model

The two-stage Bayesian model requires at least 2 observations (across three years, 2022–23 through 2024–25) with ≥ 70% participation, plus a G3 school quality estimate from Stage 1. Individual observations with < 50% participation are dropped before counting. English schools that don't meet these criteria receive a raw classification but no model-based bin.
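The eligibility rule above can be sketched as a filter over school-year observations. Input shapes are assumptions: `obs` with one row per school-year (`school_id`, `participation` as a 0–1 proportion) and a set of school ids that have a Stage-1 G3 estimate.

```python
import pandas as pd

def model_eligible(obs: pd.DataFrame, g3_schools: set) -> set:
    """Schools meeting the inclusion rule: after dropping observations
    below 50% participation, at least 2 remaining observations with
    >= 70% participation, plus a Stage-1 G3 quality estimate."""
    usable = obs[obs["participation"] >= 0.50]       # drop < 50% obs first
    n_hi = (usable[usable["participation"] >= 0.70]  # count >= 70% obs
            .groupby("school_id").size())
    return set(n_hi[n_hi >= 2].index) & g3_schools
```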

Coverage

Of English schools with G6 data in 2022–23 through 2024–25, (%) received a model-based bin and (%) did not.

Exclusion Reasons

Schools can be flagged for multiple reasons; counts below sum to more than .

Raw Achievement Bin Distribution: Excluded vs Included

How do excluded schools distribute across raw G6 achievement bins? If excluded schools cluster at extremes (suppressed = mostly small schools with volatile %) or below average (low-participation schools often have lower outcomes), the model's coverage is non-random.

Participant Count Distribution in Excluded Schools

How many students are in the schools that couldn't be modelled? Values are mean annual fully-participating Math students per school across years present (null = all data suppressed). Included school sizes are shown for comparison.

excluded schools have entirely suppressed participation data (all years below the EQAO n < 15 reporting threshold) and are not shown in the histogram or percentile table above. These are the smallest schools in the system.