The normr package provides tools to calculate continuous
norms for psychological test scores, where the norms depend on a
continuous variable, denoted as the norm predictor. Typically, the norm
predictor is age. The package allows for a single norm predictor.
Applying the norms results in standardized scores such as percentiles,
T-scores, or IQ scores. Such standardized scores allow for
norm-referenced interpretation, that is, comparing an individual’s
performance to that of a relevant reference population.
For example, an IQ score expresses an individual’s performance
relative to the general population of the same age. To estimate such
norms, normr requires raw scores from a representative
normative sample.
This vignette illustrates how to derive norms from such a sample,
following the approach described in a tutorial (Timmerman, Voncken, and Albers 2021).
The main workflow in normr consists of:
1. shape_data()
2. fb_select()
3. normtable_create()
4. plot_normtable()

Advanced functionality includes norming of composite scores, norms for new data, user-defined distributions, and parallelization.
We illustrate the process using IDS data (Grob
and Hagmann-von Arx 2018; Grob et al. 2018). An anonymized and
perturbed version of this data is included in normr.
We focus on Test 14, with \(n=1660\) children and raw scores ranging from 2 to 34.
Three candidate distributions are considered: the beta-binomial (BB), the Box-Cox power exponential (BCPE), and the normal (NO) distribution.
To compare the candidate models, we use the free order model selection procedure with the Bayesian Information Criterion (BIC) as the selection criterion, an approach that has been shown to be effective for model selection (Voncken, Albers, and Timmerman 2019).
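For reference, the BIC weighs model fit against model complexity: for a fitted model with maximized log-likelihood \(\hat{L}\), \(k\) estimated parameters, and sample size \(n\),

\[
\mathrm{BIC} = -2\log \hat{L} + k\log n,
\]

so lower values indicate a more favorable trade-off between fit and parsimony.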
We begin by fitting the BB distribution to model the raw scores of
Test 14 as a function of age. Before doing so, we must ensure that the
data are in the correct format. This is handled by the function
shape_data.
The shape_data function verifies whether the response
variable is properly formatted and, if necessary, reshapes it. For
binomial-type distributions like the BB, gamlss requires the response to
be a two-column matrix: one column for the number of correct items and
one for the number of incorrect items. For each individual, these
columns sum to the maximum possible raw score on the test.
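To illustrate this format (a sketch only; shape_data() constructs the matrix for you), consider a few hypothetical raw scores with a maximum score of 34:

raw <- c(10, 25, 34)                        # example raw scores
cbind(correct = raw, incorrect = 34 - raw)  # each row sums to the maximum score of 34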
By default, shape_data uses the highest observed raw
score in the data as the maximum. This can be overridden with the
argument max_score. The function also removes any cases with missing
values on Test 14 or age.
Applying shape_data to our data produces a new
dataframe, mydata_BB, which includes the additional column
"shaped_score" containing the scores in the correct format
for BB models:
mydata_BB <- shape_data(data = ids_data,
age_name = "age",
score_name = "y14",
family = "BB",
max_score = 34)
#> Binomial family: converted response to matrix with two columns:
#> col1 = number correct, col2 = number incorrect.

We can also apply shape_data() to obtain scores in the correct format for the BCPE model, which requires strictly positive scores.
mydata_BCPE <- shape_data(data = ids_data,
age_name = "age",
score_name = "y14",
family = "BCPE")
#> Response values are valid and were not transformed.

Because Test 14 has no zero scores, the scores were not
transformed. A new column "shaped_score" was created, but
it is identical to "y14".
To fit the candidate models, we use the model selection function
fb_select(), which implements the free order model
selection procedure (Voncken, Albers, and
Timmerman 2019). We first apply this procedure to the BB
distribution. Note that we pass "shaped_score" as the
score_name, because this column contains the reshaped
response variable created by shape_data().
mod_BB_y14 <- fb_select(data = mydata_BB,
age_name = "age",
score_name = "shaped_score",
family = "BB",
selcrit = "BIC")
#> FB-select iteration 0: Polynomial degrees = 2,1 : BIC = 8469.301
#> FB-select iteration 1: Polynomial degrees = 2,2 : BIC = 8462.629

The free order model selection procedure selects a second-degree polynomial of age for both the \(\mu\) and \(\sigma\) parameters of the BB model.
As alternative models, we also fit the BCPE and NO models.
mod_BCPE_y14 <- fb_select(data = mydata_BCPE,
age_name = "age",
score_name = "shaped_score",
family = "BCPE",
selcrit = "BIC")
#> FB-select iteration 0: Polynomial degrees = 2,1,0,0 : BIC = 8404.609
mod_NO_y14 <- fb_select(data = ids_data,
age_name = "age",
score_name = "y14",
family = "NO",
selcrit = "BIC")
#> FB-select iteration 0: Polynomial degrees = 2,1 : BIC = 8456.394

After fitting the BB, BCPE, and NO models, we first compare their BIC values. Lower BIC values indicate better model fit.
mod_BB_y14$selcrit
#> [1] 8462.629
mod_BCPE_y14$selcrit
#> [1] 8404.609
mod_NO_y14$selcrit
#> [1] 8456.394

The results show that the BCPE model provides the best fit, as it has the lowest BIC. The NO model follows at some distance, while the BB model performs worst according to BIC.
To further evaluate model fit, we inspect worm plots (Buuren and Fredriks 2001). Worm plots display the relationship between empirical quantiles and model-implied quantiles.
Ideally, the points (the “worm”) are close to the red horizontal line,
and most values fall inside the funnel-shaped confidence bands (dashed lines).
For Test 14, the worm plot confirms that the BCPE model has the best fit: the worm is relatively flat, and nearly all values lie within the confidence bands. By contrast, the BB and NO models show some deviations from the expected pattern.
gamlss::wp(mod_BCPE_y14, ylim.all = 2) # with about 5% of the points outside the bands, and no substantially deviating patterns
gamlss::wp(mod_BB_y14, ylim.all = 2)   # substantial and clear deviations in the interval [-4, -2]

Because both BIC values and worm plots point to the BCPE as the superior model, we select this model for further inspection and prediction.
*Tip: Worm plots can also be created per age interval, which is highly recommended to check fit across the full age range. This can be done with the arguments xvar (age variable) and n.inter (number of age intervals). For example, for the BCPE model: `gamlss::wp(mod_BCPE_y14, xvar = age, ylim.worm = 1.5, n.inter = 3)`*
A sample size of at least 200 per worm plot is recommended for stable results (Buuren and Fredriks 2001).
Once the best-fitting model has been selected (BCPE in this case), we can visually inspect its fit using centile curves. These curves show the predicted centiles of test scores as a function of age, allowing us to see how scores are distributed across the age range.
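For the BCPE model, the centile curves can be drawn with gamlss::centiles(); the call below is a sketch, with xvar and cent chosen to match the output shown:

gamlss::centiles(mod_BCPE_y14, xvar = mydata_BCPE$age, cent = c(5, 25, 50, 75, 95))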
#> % of cases below 5 centile is 5.481928
#> % of cases below 25 centile is 25.12048
#> % of cases below 50 centile is 50.42169
#> % of cases below 75 centile is 75.06024
#> % of cases below 95 centile is 95.36145
For most distributions, the centiles function in GAMLSS works well. However, for binomial-type distributions such as the BB distribution, the standard centiles function does not always render correctly. For these cases, centiles_bin provides a reliable alternative:
centiles_bin(mod_BB_y14, xvar = age, cent = c(5,25,50,75,95))
#> % of cases below 5 centile is 7.29
#> % of cases below 25 centile is 27.65
#> % of cases below 50 centile is 55.06
#> % of cases below 75 centile is 80.72
#> % of cases below 95 centile is 97.65

These plots complement worm plots and BIC comparisons by providing a direct, visual summary of the distribution of scores across ages, making it easier to assess model fit and to interpret age-dependent norms.
After selecting the best-fitting model, we can generate norm tables
using the function normtable_create(). By default, the
function produces z-scores (i.e., \(\mu_{age} = 0, \sigma_{age} = 1\)), but other norm types can also be requested, such as T-scores (\(\mu_{age} = 50, \sigma_{age} = 10\)) and IQ scores (\(\mu_{age} = 100, \sigma_{age} = 15\)).
The function returns a list containing:

- cdf_matrix: the percentiles for each raw score and age
- norm_matrix: the corresponding normed scores in the requested norm type
- cdf_sample and norm_sample: the percentiles and normed scores for the observed sample

Optionally, an Excel file containing these tables can be generated by specifying excel_name and setting excel = TRUE.
For example, we create an IQ norm table for the BCPE model and inspect the age column together with one raw-score column of the resulting norm matrix:
norm_mod_BCPE_y14 <- normtable_create(model = mod_BCPE_y14,
data = mydata_BCPE,
age_name = "age",
score_name = "shaped_score",
normtype = "IQ",
excel_name = "../../norms_y14_1.xlsx",
excel = TRUE)
#> Norm table written to: ../../norms_y14_1.xlsx
head(norm_mod_BCPE_y14[["norm_matrix"]][, c(1, 11)],
n = 15)
#> age 11
#> 1 4.000000 129.11870
#> 2 4.181818 125.16161
#> 3 4.363636 121.54584
#> 4 4.545455 118.22558
#> 5 4.727273 115.16269
#> 6 4.909091 112.32525
#> 7 5.090909 109.68653
#> 8 5.272727 107.22416
#> 9 5.454545 104.91960
#> 10 5.636364 102.75783
#> 11 5.818182 100.72752
#> 12 6.000000 98.82354
#> 13 6.181818 97.04659
#> 14 6.363636 95.38812
#> 15 6.545455 93.83693

We can visualize the generated norm tables using the function
plot_normtable(). This function plots the percentiles as a
function of age for each raw score. Each line represents a specific raw
score.
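A minimal sketch of such a call, assuming the list returned by normtable_create() can be passed directly:

plot_normtable(norm_mod_BCPE_y14)  # one line per raw score, percentiles plotted against age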
This visualization allows us to inspect how raw scores translate to normed scores across ages and to see whether the norms behave as expected.
By default, normtable_create() returns point estimates
of the norms. Optionally, you can also calculate confidence intervals
(CIs) for the normed scores. The default confidence level is 0.95 (95%
interval).
The CIs are estimated using an extension of the group model from classical test theory, adapted for age-dependent scores (Heister et al. 2024). To compute the upper and lower boundaries, you need a dataframe containing two vectors:
- age: the ages at which the norms are evaluated
- rel: the estimated test reliability at each age

The resulting CIs are stored in the output tables with the suffix
_lower and _upper (e.g.,
norm_sample_lower and norm_sample_upper).
For distributions without a closed-form mean and standard deviation, such as the BCPE distribution, computing the CIs can be slower because these quantities need to be estimated via resampling.
In the example below, we use the fictitious age-dependent
reliabilities provided in ids_rel_data. Later, in the
section Estimating test reliability dependent on age, we show
how such reliabilities can be derived from individual item scores using
a sliding window approach. Note that for Test 14, we cannot estimate
these reliabilities from raw scores, as the individual items are not
available.
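A sketch of such a call, assuming that supplying the reliability dataframe via datarel (with its age and rel vectors) is sufficient to request the CIs at the default 0.95 level:

norm_BCPE_y14_ci <- normtable_create(model = mod_BCPE_y14,
                                     data = mydata_BCPE,
                                     age_name = "age",
                                     score_name = "shaped_score",
                                     normtype = "IQ",
                                     datarel = ids_rel_data)
head(norm_BCPE_y14_ci[["norm_sample_lower"]])  # lower CI boundaries for the sample
head(norm_BCPE_y14_ci[["norm_sample_upper"]])  # upper CI boundaries for the sample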
Composite scores are calculated by combining multiple subtest scores, typically as the sum of their standardized scores (z-scores). To illustrate this process, we first fit a BCPE model to Test 7. Based on the worm plot inspection (not shown), the model fits the data well. Note that fitting BCPE models can be computationally intensive.
mydata_BCPE_y7 <- shape_data(data = ids_data,
age_name = "age",
score_name = "y7",
family = "BCPE")
#> Response values are valid and were not transformed.
mod_BCPE_y7 <- fb_select(data = mydata_BCPE_y7,
age_name = "age",
score_name = "shaped_score",
family = "BCPE",
selcrit = "BIC")
#> FB-select iteration 0: Polynomial degrees = 2,1,0,0 : BIC = 9116.086
#> FB-select iteration 1: Polynomial degrees = 2,1,0,1 : BIC = 9096.959
#> FB-select iteration 2: Polynomial degrees = 3,1,0,1 : BIC = 9084.672
#> FB-select iteration 3: Polynomial degrees = 4,1,0,1 : BIC = 9076.861
#> FB-select iteration 4: Polynomial degrees = 4,1,1,1 : BIC = 9067.153
#> FB-select iteration 5: Polynomial degrees = 4,1,2,1 : BIC = 9063.883
norm_mod_BCPE_y7 <- normtable_create(model = mod_BCPE_y7,
data = mydata_BCPE_y7,
age_name = "age",
score_name = "shaped_score")Next, we calculate composite scores based on the norm tables of Test 7 and Test 14. The workflow is as follows:
- Use composite_shape() to compute the composite scores as the sum of z-scores across the tests.
- Fit a model to the composite scores and create norm tables with fb_select() and normtable_create().

composite <- list(norm_mod_BCPE_y7,
norm_mod_BCPE_y14)
composite_data <- composite_shape(normtables = composite)
modNO_comp <- fb_select(
data = composite_data,
age_name = "age",
score_name = "z_sum",
family = "NO",
selcrit = "BIC"
)
#> FB-select iteration 0: Polynomial degrees = 2,1 : BIC = 6630.911
#> FB-select iteration 1: Polynomial degrees = 2,0 : BIC = 6623.52
#> FB-select iteration 2: Polynomial degrees = 1,0 : BIC = 6618.417
#> FB-select iteration 3: Polynomial degrees = 0,0 : BIC = 6611.19
norm_modNO_comp <- normtable_create(modNO_comp,
composite_data,
age_name = "age",
score_name = "z_sum",
cont_cor = TRUE)

It is possible to calculate normed scores directly for a new sample, using their raw scores and ages.
In this example, we select the raw scores and ages of the first five individuals in our dataset. We then use the previously estimated model to obtain their normed scores and display the results.
newdata <- ids_data[1:5,c("age","y14")]
norm_mod_BCPE_newdata <- normtable_create(model = mod_BCPE_y14,
data = newdata,
age_name = "age",
score_name = "y14",
new_data = TRUE,
datarel = ids_rel_data)
norm_mod_BCPE_newdata[["norm_sample"]]
#> age Z
#> [1,] 10.97146 -0.2890158
#> [2,] 11.67420 0.1106675
#> [3,] 11.44422 1.6213568
#> [4,] 10.04932 -0.8806376
#> [5,] 14.18207 -0.6811148

Free order model selection with fb_select() is also
available for distributions that are defined by the user, for example
truncated distributions generated with the gamlss.tr
package. Using truncated distributions can improve model fit when
continuous distributions are applied to scores that have natural bounds.
By explicitly accounting for minimum and maximum possible values,
truncation prevents unrealistic predictions outside the valid score
range and often leads to more accurate norm estimates. We illustrate the
use of a truncated BCPE distribution for the ids_kn_data
data, which consists of 34 binary item scores.
To fit a truncated BCPE model, we first create a new distribution
using the gen.trun() function. We specify truncation at
both ends of the distribution (type="both") with a minimum
score of 0 and a maximum score of 34 (par=c(0,34)). We
apply this to the BCPE family (family="BCPE") and name the
new distribution tr34, resulting in a distribution called
BCPEtr34.
Next, we shape the data appropriately for the truncated distribution and fit the model.
gamlss.tr::gen.trun(par=c(0,34), family="BCPE", name="tr34", type="both")
#> A truncated family of distributions from BCPE has been generated
#> and saved under the names:
#> dBCPEtr34 pBCPEtr34 qBCPEtr34 rBCPEtr34 BCPEtr34
#> The type of truncation is both
#> and the truncation parameter is 0 34
mydata_BCPEtr34 <- shape_data(data = ids_kn_data,
age_name = "age_years",
score_name = "rawscore",
family = "BCPEtr34")
#> Response values are valid and were not transformed.
#mod_BCPEtr34_KN <- fb_select(data = mydata_BCPEtr34,
# age_name = "age_years",
# score_name = "shaped_score",
# family = "BCPEtr34")To calculate confidence intervals for the normed scores, we first
estimate the reliability of the test as a function of age using a
sliding window approach. This approach is implemented in the function
reliability_window(), which computes Cronbach’s alpha
within age-specific groups (Heister et al.
2024). We again use the ids_kn_data dataset.
To estimate the reliability of the test, item-level information is needed. First, we identify the columns in the dataset corresponding to the items.
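A sketch of this step, assuming (hypothetically) that the item columns in ids_kn_data share the prefix "item"; adjust the pattern to the actual column names:

item_variables_kn <- grep("^item", names(ids_kn_data), value = TRUE)  # hypothetical name pattern
length(item_variables_kn)  # should equal the number of items (34)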
To choose appropriate parameters for estimating age-dependent
reliabilities, we first explore a range of window widths and step sizes.
This helps ensure that the estimated reliabilities are stable and vary
smoothly with age. The function different_rel() allows us
to compare the reliability estimates across different settings:
rel_int <- different_rel(data = ids_kn_data,
item_variables = item_variables_kn,
step_window = c(0.5, 1 , 2, 5, 10, 20),
age_name = "age_years",
min_agegroup = 5,
max_agegroup = 20,
step_agegroup = c(0.5, 1, 1.5, 2))

We then visualize the reliability estimates across age for the different window widths and step sizes to guide our selection. Ideally, the step size should be small enough to capture smooth changes in reliability, and the window width should be wide enough to assume approximately equal true scores within the window.
For this example, we select a step size of 2 and a window width of
2.
rel_est <- reliability_window(data = ids_kn_data,
age_name = "age_years",
item_variables = item_variables_kn,
window_width = 2)

Now, when estimating normed scores with the truncated BCPE model, we can construct reliability-based CIs for the normed scores.
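A sketch of this final step, assuming the commented-out fb_select() call above has been run to obtain mod_BCPEtr34_KN, and that reliability_window() returns a dataframe with the age and rel vectors that datarel expects:

norm_BCPEtr34_kn <- normtable_create(model = mod_BCPEtr34_KN,
                                     data = mydata_BCPEtr34,
                                     age_name = "age_years",
                                     score_name = "shaped_score",
                                     datarel = rel_est)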