Mean Level Score
A continuous alternative to the binary Level 3/4 measure: weighted average of level percentages, range [1, 4].
EQAO reports four performance levels. The mean level score summarises a school's full achievement distribution as a single number by assigning an ordinal score (1–4) to each level:
mean_level = (1·L1% + 2·L2% + 3·L3% + 4·L4%) / 100
The result lies in [1, 4]. Unlike L34% (the share of students at Level 3 or 4), which collapses the four bins to a binary threshold, the mean level captures:
- how concentrated the below-standard population is at L1 vs L2, and
- how many above-standard students reach the highest level (L4 vs L3).
Two schools with the same L34% will differ in mean level when their within-half distributions differ.
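As a quick illustration of the formula, the sketch below computes the mean level for two hypothetical schools that share the same L34% but differ within each half. The function name and the example percentages are invented for illustration and are not taken from the EQAO data.

```python
def mean_level(l1, l2, l3, l4):
    """Weighted average of level percentages; inputs are percentages summing to 100."""
    total = l1 + l2 + l3 + l4
    if abs(total - 100) > 1e-6:
        raise ValueError(f"level percentages must sum to 100, got {total}")
    return (1 * l1 + 2 * l2 + 3 * l3 + 4 * l4) / 100


# Two hypothetical schools, both with L34% = 60.
school_a = mean_level(l1=30, l2=10, l3=50, l4=10)  # lower half mostly L1, few L4
school_b = mean_level(l1=5, l2=35, l3=30, l4=30)   # lower half mostly L2, many L4

print(school_a, school_b)
```

Both schools would look identical on L34%, yet school_b scores higher on mean level because more of its students sit at the upper end of each half.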
Section 1: Distribution of Mean Level
Summary by year:
Section 2: Mean Level vs L3/L4%
L34% and mean level are highly correlated, but not equivalent. The scatter below shows all school-years; the OLS regression line and R² quantify how much distributional information L34% alone captures.
Theoretical basis. With levels scored 1–4, mean_level decomposes as:
mean_level = 1 + L₂/100 + 2·(L₃+L₄)/100 + L₄/100
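where L₂, L₃, L₄ are percentages and L₁ has been eliminated using L₁ = 100 − L₂ − L₃ − L₄. Because L₃+L₄ = L34%, the direct contribution of L34% to mean_level is 0.02 per percentage point. The fitted OLS slope equals 0.02 plus 0.01 times the OLS slope of L₂ on L34% plus 0.01 times the OLS slope of L₄ on L34%, so it departs from 0.02 according to how L₂ and L₄ co-vary with L34% in the data.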
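The decomposition and the slope argument can be checked numerically. The sketch below uses a small set of made-up level percentages (not dashboard data) and a hand-written least-squares helper, `ols_slope`, which is a hypothetical name introduced here. It confirms that the decomposition matches the direct formula and that the slope of mean level on L34% equals 0.02 plus 0.01 times the slopes of L₂ and L₄ on L34%.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical (L1, L2, L3, L4) percentages for five school-years.
schools = [
    (30, 30, 35, 5),
    (20, 30, 40, 10),
    (10, 25, 50, 15),
    (5, 20, 50, 25),
    (5, 10, 50, 35),
]

l2 = [s[1] for s in schools]
l4 = [s[3] for s in schools]
l34 = [s[2] + s[3] for s in schools]
mean = [(s[0] + 2 * s[1] + 3 * s[2] + 4 * s[3]) / 100 for s in schools]

# The decomposition reproduces the direct formula for every row.
decomposed = [1 + a / 100 + 2 * b / 100 + c / 100 for a, b, c in zip(l2, l34, l4)]
assert all(abs(m - d) < 1e-12 for m, d in zip(mean, decomposed))

# OLS is linear in y, so the slope splits into 0.02 plus two adjustment terms.
slope = ols_slope(l34, mean)
adjusted = 0.02 + 0.01 * ols_slope(l34, l2) + 0.01 * ols_slope(l34, l4)
print(slope, adjusted)
```

The two printed values agree up to floating-point error. How far they sit from 0.02 depends entirely on the made-up data, so no particular value should be read into them.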
Section 3: What Drives the Residual?
After removing the L34% trend, the residual reflects two independent within-half effects:
- Above-standard quality — share of L3+L4 students at L4 vs L3 (L4 share).
- Below-standard severity — share of L1+L2 students at L2 vs L1 (L2 share).
Both push the residual positive when students concentrate at the upper end of their respective half.
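The two within-half shares can be sketched as follows. The function names are hypothetical, and the handling of an empty half (returning NaN) is an assumption rather than the dashboard's documented behaviour. The second function rebuilds mean level from L34% and the two shares, which makes it easy to see that raising either share at fixed L34% raises the mean level.

```python
def within_half_shares(l1, l2, l3, l4):
    """Return (l4_share, l2_share): the upper-end share within each half."""
    upper, lower = l3 + l4, l1 + l2
    # Assumption: an empty half has no defined share, so report NaN.
    l4_share = l4 / upper if upper > 0 else float("nan")
    l2_share = l2 / lower if lower > 0 else float("nan")
    return l4_share, l2_share


def mean_level_from_shares(l34, l4_share, l2_share):
    """Rebuild mean level from L34% (0-100) and the two within-half shares."""
    l4 = l34 * l4_share
    l2 = (100 - l34) * l2_share
    return 1 + l2 / 100 + 2 * l34 / 100 + l4 / 100


# Hold L34% fixed at 60 and raise each share in turn.
base = mean_level_from_shares(60, l4_share=0.2, l2_share=0.5)
more_l4 = mean_level_from_shares(60, l4_share=0.4, l2_share=0.5)
more_l2 = mean_level_from_shares(60, l4_share=0.2, l2_share=0.7)

print(base, more_l4, more_l2)
```

With L34% held constant, both `more_l4` and `more_l2` exceed `base`, which is the positive push on the residual described above.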
Section 4: Residual Distribution and Outliers (2025)
Highest positive residuals: mean level well above the L34%-based prediction, driven by high L4 concentration among above-standard students and/or high L2 concentration among below-standard students.
Largest negative residuals: mean level below the L34%-based prediction, driven by high L1 concentration among below-standard students and/or high L3 (rather than L4) concentration among above-standard students.
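One way to produce such a ranking is to fit the line of mean level on L34%, subtract the prediction, and sort. The sketch below does this for five hypothetical schools; the labels and values are invented, and whether the dashboard fits the line on 2025 alone or on all school-years is not specified here, so the fitting sample is an assumption.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


# Hypothetical rows: (school label, L34%, mean level).
rows = [
    ("A", 40, 2.30),
    ("B", 50, 2.55),
    ("C", 60, 2.40),
    ("D", 60, 2.85),
    ("E", 75, 3.00),
]

x = [r[1] for r in rows]
y = [r[2] for r in rows]
slope, intercept = fit_line(x, y)

# Residual = observed mean level minus the L34%-based prediction.
residuals = [(r[0], r[2] - (intercept + slope * r[1])) for r in rows]
ranked = sorted(residuals, key=lambda item: item[1], reverse=True)

print("largest positive:", ranked[0])
print("largest negative:", ranked[-1])
```

Schools C and D share the same L34% here, so the gap between their residuals comes purely from their within-half distributions.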
Methodology Notes
- Score assignment: Ordinal scores 1–4 assume equal spacing between levels. This is the simplest defensible choice given that we observe counts per bin rather than underlying continuous scores. Converting to a 0–100 scale via (mean_level − 1) / 3 × 100 gives an equivalent metric compatible with the noise model used in Metric Stability.
- Data availability: All four level columns (L1–L4) are present in schools_g{3,6}.parquet and are computed from the same assessed-student denominator as L34%. No additional ETL is required.
- Using mean level in model validation: The metric is included in Metric Stability Section 5b as "Mean level (0–100)", the normalized version on a 0–100 scale, so RMSE is directly comparable to L34% results.
- Noise model: On the normalized 0–100 scale, a uniform distribution across the four equally spaced levels gives a per-student score SD of ≈37, vs 50 for a Bernoulli at p = 0.5. The extreme case for the level score (students split evenly between L1 and L4) also has an SD of 50, so the conservative 100/√n noise ceiling used throughout the dashboard remains an upper bound.
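The normalization and the SD figures quoted in these notes can be reproduced with a few lines. This is a minimal sketch: the function names are hypothetical, and the SD is the per-student standard deviation of the level score on the 0–100 scale for a given set of level shares.

```python
import math

# Levels 1-4 mapped onto the 0-100 scale: 0, 33.3, 66.7, 100.
LEVELS_0_100 = [(level - 1) / 3 * 100 for level in (1, 2, 3, 4)]


def normalize(mean_level):
    """Map a mean level in [1, 4] onto the 0-100 scale."""
    return (mean_level - 1) / 3 * 100


def score_sd(shares):
    """Per-student SD on the 0-100 scale for level shares (fractions summing to 1)."""
    mu = sum(p * v for p, v in zip(shares, LEVELS_0_100))
    var = sum(p * (v - mu) ** 2 for p, v in zip(shares, LEVELS_0_100))
    return math.sqrt(var)


def noise_ceiling(n):
    """The conservative 100/sqrt(n) ceiling for n assessed students."""
    return 100 / math.sqrt(n)


uniform_sd = score_sd([0.25, 0.25, 0.25, 0.25])  # about 37.3
split_sd = score_sd([0.5, 0.0, 0.0, 0.5])        # students split between L1 and L4

print(normalize(2.5), uniform_sd, split_sd, noise_ceiling(25))
```

Since the standard error of a school mean is the per-student SD divided by √n, and the SD cannot exceed 50 on a 0–100 scale, 100/√n stays above the standard error for any level distribution.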