Please choose a data set for your calculations. You can use a built-in example or load your own file:
HINT: Once you load a file from your own directory, the built-in example can no longer be used during the session!
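The same step can be performed in the R console (a minimal sketch; the built-in 'elfe' example and the file name are assumptions):

    # Sketch: use a built-in example or load your own file
    library(cNORM)
    data(elfe)    # built-in example with a grouping variable and raw scores (assumed columns)
    head(elfe)
    # myData <- read.csv("my_norm_sample.csv")   # hypothetical file from your own directory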
The built-in example data:
Variable used to divide the observations into groups, for example age rounded to half or full years.
By default, the explanatory variable is set to the grouping variable. If available, use a continuous explanatory variable such as age.
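For example, a continuous age variable can be rounded to half-year groups for ranking, while age itself remains available as a continuous explanatory variable (a minimal sketch in base R; the variable names are assumptions):

    # Sketch: derive a half-year grouping variable from a continuous age variable
    # (variable names are hypothetical)
    age   <- c(6.2, 6.7, 7.1, 7.4, 7.9)   # age in years
    group <- floor(age * 2) / 2           # half-year groups: 6.0, 6.5, 7.0, 7.0, 7.5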
This variable specifies the power parameter of the Taylor polynomial. By default, the power is set to 4. Higher values may lead to a closer fit but carry the risk of overfitting.
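In the cNORM R API, this setting corresponds to the power parameter k (a minimal sketch; the exact call and the column names of the built-in example are assumptions, and defaults may differ between package versions):

    # Sketch: power parameter k of the Taylor polynomial (assumed default k = 4)
    library(cNORM)
    data(elfe)   # built-in example with columns 'group' and 'raw' (assumed)
    model <- cnorm(raw = elfe$raw, group = elfe$group, k = 4)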
Here, you can calculate a regression model that fits the original data as closely as possible while smoothing the curves and eliminating noise. The plots display the percentile curves and the information criteria for the different models, beginning with the model with one term and going up to the maximum number of terms. A high R2 with as few terms as possible is preferable.
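A comparable console workflow might look like this (a minimal sketch; it assumes a fitted cnorm model and that plotSubset displays the fit indices per number of terms):

    # Sketch: fit the model and inspect information criteria per number of terms
    library(cNORM)
    data(elfe)
    model <- cnorm(raw = elfe$raw, group = elfe$group)
    plotSubset(model)   # R2 / information criteria for models with 1 .. max terms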
This function helps in selecting the number of terms for the model by repeated cross-validation, with 80 percent of the data used as training data and 20 percent as validation data. The cases are drawn randomly but stratified by norm group. Successive models with an increasing number of terms are fitted, and the RMSE of the raw scores (fitted by the regression model) is plotted for the training, validation and complete dataset. In addition to this analysis at the raw score level, it is possible (default) to estimate the mean norm score reliability and crossfit measures.
HINT: The function has a high computational load when computing norm scores and takes some time to finish. The time increases with the maximum number of terms, the sample size and the number of repetitions.
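From the console, this cross-validation can be sketched as follows (the function name cnorm.cv and its repetitions argument are assumed from the cNORM package; the computation may take a while):

    # Sketch: repeated 80/20 cross-validation, stratified by norm group
    # (argument names are assumptions; norm score reliability is estimated by default)
    cnorm.cv(model, repetitions = 2)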
The chart shows how well the model generally fits the observed data. The observed percentiles are represented as dots, the continuous norm curves as lines. If the norm curves intersect, the model is inconsistent. Please change the number of terms in the 'Best Model' tab to find a consistent model. You can use the 'Series' option to search for suitable parameters.
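The equivalent plot can be produced in the console (a sketch, assuming a fitted cnorm model):

    # Sketch: observed percentiles (dots) versus modeled percentile curves (lines)
    plotPercentiles(model)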
Please separate the values with a comma or space.
In order to facilitate model selection, the chart displays the percentile curves of the different models.
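A console sketch of such a series of models (the start and end arguments of plotPercentileSeries are assumptions):

    # Sketch: percentile curves for models with 1 to 8 terms
    plotPercentileSeries(model, start = 1, end = 8)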
Use the slider to change the number of terms in the model and select a model with non-intersecting percentile curves. Avoid undulating curves, as these indicate overfitting.
The chart is comparable to the percentile plot, but it only shows the norm curves for selected norm scores.
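In the console, such a plot can be sketched as follows (the normList argument and the T-score values are assumptions):

    # Sketch: norm curves for selected norm scores, e.g. T scores 30 to 70
    plotNormCurves(model, normList = c(30, 40, 50, 60, 70))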
Please separate the values with a comma or space. The percentile values are automatically transformed to the norm scale used in the data preparation. In order to get curves for specific z values, you can use the following percentiles:
The plot shows the probability density function of the raw scores based on the regression model. Like the 'Derivative Plot', it can be used to identify violations of model validity or to better visualize deviations of the test results from the normal distribution. By default, the lowest, the highest and a medium group are shown.
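A console sketch (assuming the plotDensity function of cNORM, which by default selects the lowest, the highest and a medium group):

    # Sketch: modeled probability density of the raw scores per group
    plotDensity(model)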
To check whether the mapping between the latent person variable and the test scores is biunique, the regression function can be searched numerically within each group for bijectivity violations using the 'checkConsistency' function. In addition, it is possible to plot the first partial derivative of the regression function with respect to the person location l and search for negative values. Look out for values lower than 0, as these indicate violations of the model.
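Both checks can be sketched in the console as follows (checkConsistency is named above; plotDerivative is assumed to be the corresponding plotting function in cNORM):

    # Sketch: numerical consistency check and first partial derivative
    checkConsistency(model)   # reports violations of monotonicity / bijectivity
    plotDerivative(model)     # look for regions where the derivative falls below 0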
The plot shows the observed and predicted norm scores. You can see how well the model is able to predict the norm scores of the dataset. The duration of the computation increases with the size of the dataset.
The plot shows the observed and predicted raw scores. You can see how well the model is able to predict the raw scores of the original dataset.
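Both plots can be produced in the console as well (a sketch; plotNorm and plotRaw are assumed to be the corresponding cNORM functions):

    # Sketch: observed versus fitted scores
    plotNorm(model)   # norm score level; may take a while on large datasets
    plotRaw(model)    # raw score level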