Ceiling and floor effects are common in data. They occur when a test or scale is so easy or so difficult that a substantial proportion of individuals obtain the maximum or minimum score, so the true extent of their abilities cannot be determined.
Ceiling and floor effects, in turn, cause problems in data analysis. A ceiling effect alone attenuates the mean estimate and a floor effect alone inflates it, while both attenuate the variance estimate. This poses challenges for data analytic methods that are based on means and variances.
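The direction of these biases can be checked with a small base R simulation. The sketch below is only an illustration under stated assumptions: true scores are standard normal, the ceiling sits at the 80th percentile of the true distribution, and the object names (true_scores, ceiling_value, observed, expected_mean) are made up for this example.

```r
# Illustration only: standard normal true scores with a ceiling that
# hides the top 20% (all object names here are hypothetical).
set.seed(123)
true_scores <- rnorm(100000)
ceiling_value <- qnorm(0.80)

# Scores above the ceiling are recorded as the ceiling value itself.
observed <- pmin(true_scores, ceiling_value)

# Closed-form mean of a normal variable censored from above at c:
# E[min(X, c)] = mu * pnorm(a) - sigma * dnorm(a) + c * (1 - pnorm(a)),
# with a = (c - mu) / sigma; here mu = 0 and sigma = 1.
expected_mean <- -dnorm(ceiling_value) +
  ceiling_value * (1 - pnorm(ceiling_value))

# The observed mean and variance both fall below the true values.
c(true_mean = mean(true_scores), observed_mean = mean(observed),
  expected_mean = expected_mean)
c(true_var = var(true_scores), observed_var = var(observed))
```

A floor mirrors this: replacing pmin with pmax at a low cut-off raises the observed mean while still shrinking the variance.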
This package implements methods to deal with challenges associated with ceiling/floor effects in the data using parametric methods that assume normality for the true scores. The current version supports mean and variance recovery for data with ceiling/floor effects, as well as mean comparison tests such as the t-test and ANOVA for such data.
The package contains a helper function, threeganova.sim, that generates
three-group ANOVA data with a standard normal control group and positive
and negative treatment groups whose effects have the same magnitude. In
addition, one can specify the standard deviation of the positive
treatment group. For details, enter ?threeganova.sim in the R console.
Another helper function included in the package is induce.cfe, which
lets the user manually induce ceiling and floor effects in healthy data.
For details, enter ?induce.cfe in the R console.
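To make "inducing" ceiling and floor effects concrete, here is a minimal base R sketch. It is not the implementation of induce.cfe; the function censor_scores and its arguments are hypothetical, and it assumes that censoring means replacing scores beyond the chosen sample quantiles with the cut-off values.

```r
# Hypothetical helper, not part of the package: censor a numeric vector
# so that roughly floor_prop of the scores sit at the minimum and
# roughly ceiling_prop sit at the maximum.
censor_scores <- function(y, floor_prop = 0, ceiling_prop = 0) {
  stopifnot(floor_prop >= 0, ceiling_prop >= 0,
            floor_prop + ceiling_prop < 1)
  lower <- quantile(y, probs = floor_prop, names = FALSE)
  upper <- quantile(y, probs = 1 - ceiling_prop, names = FALSE)
  pmin(pmax(y, lower), upper)
}

set.seed(42)
healthy <- rnorm(1000, mean = 2, sd = 4)
censored <- censor_scores(healthy, floor_prop = 0.10, ceiling_prop = 0.20)

# Share of scores sitting at each bound after censoring
mean(censored == min(censored))
mean(censored == max(censored))
```

With both proportions left at zero the cut-offs are the sample minimum and maximum, so the data pass through unchanged.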
Moreover, the function F.star.test allows the user to conduct a
Brown-Forsythe F star test. This is a variant of the commonly used F
test. The F star test is robust against violations of the homogeneity of
variance (HOV) assumption of the F test.
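For readers who want to see what the statistic looks like, the following base R sketch computes the Brown-Forsythe F star statistic from its textbook definition. It is not the code behind F.star.test; the function name bf_f_star, its arguments, and the simulated data are assumptions made for this illustration.

```r
# Hypothetical helper: Brown-Forsythe F star for a one-way layout.
# y is a numeric response, g a grouping vector of the same length.
bf_f_star <- function(y, g) {
  g <- factor(g)
  n <- tapply(y, g, length)   # group sizes
  m <- tapply(y, g, mean)     # group means
  v <- tapply(y, g, var)      # group variances
  N <- sum(n)
  w <- (1 - n / N) * v        # variance terms weighted by group size
  f_star <- sum(n * (m - mean(y))^2) / sum(w)
  cj <- w / sum(w)
  df1 <- length(n) - 1
  df2 <- 1 / sum(cj^2 / (n - 1))  # Satterthwaite-type approximation
  list(statistic = f_star, df1 = df1, df2 = df2,
       p.value = pf(f_star, df1, df2, lower.tail = FALSE))
}

# Three balanced groups, one of them with a much larger spread
set.seed(1)
g <- rep(1:3, each = 50)
y <- rnorm(150, mean = c(0, 0.5, -0.5)[g], sd = c(1, 1, 3)[g])
bf_f_star(y, g)
```

In a balanced design such as this one, the statistic equals the ordinary one-way F statistic; the difference then lies in the denominator degrees of freedom.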
The current version of the package includes three functions that help
the user analyze data with ceiling/floor effects. rec.mean.var estimates
the true mean and variance of data with ceiling/floor effects. As
mentioned in the summary, the observed mean and variance of such data
are often biased; rec.mean.var aims to recover the mean and variance
that would have been observed had ceiling/floor effects been absent.
lw.t.test conducts a t test that adjusts for ceiling/floor effects in
the data. Because lw.t.test also uses Welch’s t test, the adjusted t
test is robust against HOV violation. lw.f.star conducts an F star test
for one-way ANOVA that adjusts for ceiling/floor effects in the data and
is likewise robust against HOV violation. For both lw.f.star and
lw.t.test, method a is a liberal approach that yields accurate effect
size estimates but has mildly inflated type I error rates, whereas
method b is a conservative approach with well-controlled type I error
rates and good, though less accurate than a, effect size estimates.
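Since the adjusted t test builds on Welch's t test, it may help to recall how the unadjusted Welch statistic is formed. The sketch below uses only base R and made-up data; welch_by_hand is a hypothetical name, and no ceiling/floor adjustment is applied here.

```r
# Hypothetical helper: Welch's two-sample t test computed step by step.
welch_by_hand <- function(x, y) {
  vx <- var(x) / length(x)   # squared standard error of mean(x)
  vy <- var(y) / length(y)   # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
  list(statistic = t_stat, df = df, p.value = p)
}

# Two groups with unequal sizes and unequal variances
set.seed(7)
a <- rnorm(40, mean = 0, sd = 1)
b <- rnorm(60, mean = 0.5, sd = 3)
welch_by_hand(a, b)

# The built-in test uses the Welch version by default
t.test(a, b)
```

Because the group variances are never pooled, the statistic does not rely on the HOV assumption.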
Imagine a scenario where we wish to test the difference in cognitive
ability among people of different age groups. In this toy example, we
have 1000 participants in each of three age groups: the younger-aged
group has a true mean and variance of 30 and 25, respectively, the
middle-aged group 20 and 25, and the older-aged group 10 and 100. The
higher the score, the higher the cognitive ability. We can check the
group means and variances on the data composed of true scores, ca.true.
## Group.1 x
## 1 1 29.953915
## 2 2 20.205565
## 3 3 9.706224
## Group.1 x
## 1 1 23.37425
## 2 2 22.52663
## 3 3 93.25280
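The two tables above have the layout that aggregate() produces for per-group summaries. As a self-contained sketch (the data frame ca.demo and its column names are invented here, and its simulated values will not match the output above), group means and variances can be computed like this:

```r
# Hypothetical stand-in for a score/group data set:
# 1000 true scores per group, with sd 5, 5 and 10 (variance 25, 25, 100)
set.seed(2024)
ca.demo <- data.frame(
  score = c(rnorm(1000, 30, 5), rnorm(1000, 20, 5), rnorm(1000, 10, 10)),
  group = rep(1:3, each = 1000)
)

# Per-group means, then per-group variances
aggregate(ca.demo$score, list(ca.demo$group), mean)
aggregate(ca.demo$score, list(ca.demo$group), var)
```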
Now suppose that a substantial proportion of the younger-aged group
obtains the maximum score on the cognitive ability test and a
substantial proportion of the older-aged group obtains the minimum
score. Letting both the ceiling and the floor proportions be 15%, we
have the dataset ca.cf.
## Group.1 x
## 1 1 29.03192
## 2 2 20.20557
## 3 3 11.54578
## Group.1 x
## 1 1 12.73413
## 2 2 22.52663
## 3 3 53.76783
We can see that both the mean and the variance estimates for the
younger-aged and the older-aged groups are biased. The function
rec.mean.var can help recover the mean and variance. For the
younger-aged group, we first select all of the group's scores, store
them in a new variable young, and then use rec.mean.var to recover the
mean and variance. We can do the same for the older-aged group.
# younger-aged group
young=ca.cf[ca.cf[,2]==1,1]
rec.mean.var(young) # true mean and variance are 30 and 25
## $ceiling.percentage
## [1] 0.295
##
## $floor.percentage
## [1] 0.001
##
## $est.mean
## [1] 29.88923
##
## $est.var
## [1] 22.21391
# the estimated floor and ceiling percentages and the recovered mean and variance estimates are displayed above
# older-aged group
old=ca.cf[ca.cf[,2]==3,1]
rec.mean.var(old) # true mean and variance are 10 and 100
## $ceiling.percentage
## [1] 0.001
##
## $floor.percentage
## [1] 0.317
##
## $est.mean
## [1] 9.610808
##
## $est.var
## [1] 97.39203
# the estimated floor and ceiling percentages and the recovered mean and variance estimates are displayed above
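One way to see why recovery is possible at all is to fit a normal model that treats scores at the ceiling as censored rather than exact. The sketch below is a generic censored-normal maximum likelihood fit in base R; it is offered as an illustration of the idea and is not claimed to be the algorithm used by rec.mean.var. The function fit_ceiling_normal, its arguments and the simulated data are all assumptions.

```r
# Hypothetical helper: maximum likelihood for normal scores with a ceiling.
# By default the largest observed score is taken to be the ceiling.
fit_ceiling_normal <- function(x, ceiling_value = max(x)) {
  at_ceiling <- x >= ceiling_value
  neg_loglik <- function(par) {
    mu <- par[1]
    sigma <- exp(par[2])  # optimise log(sigma) so that sigma stays positive
    # exact scores contribute a density, ceiling scores a tail probability
    -sum(dnorm(x[!at_ceiling], mu, sigma, log = TRUE)) -
      sum(at_ceiling) * pnorm(ceiling_value, mu, sigma,
                              lower.tail = FALSE, log.p = TRUE)
  }
  fit <- optim(c(mean(x), log(sd(x))), neg_loglik)
  c(est.mean = fit$par[1], est.var = exp(2 * fit$par[2]))
}

# Simulated scores with true mean 30 and variance 25, top 15% at the ceiling
set.seed(99)
true_scores <- rnorm(5000, mean = 30, sd = 5)
observed <- pmin(true_scores, quantile(true_scores, 0.85, names = FALSE))

c(observed.mean = mean(observed), observed.var = var(observed))
fit_ceiling_normal(observed)
```

A floor can be handled in the same way by adding a lower-tail probability term for the scores at the minimum.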
Now we wish to conduct an ANOVA on the data with floor and ceiling
effects. We can use the function lw.f.star. We can also conduct a t-test
between the older-aged and the younger-aged groups using the function
lw.t.test. Both methods a and b are used for illustration purposes.
## $statistic
## [1] 2170.703
##
## $p.value
## [1] 0
##
## $est.f.squared
## [1] 1.447135
## $statistic
## [1] 1449.992
##
## $p.value
## [1] 0
##
## $est.f.squared
## [1] 1.216436
## $statistic
## [1] 56.47165
##
## $p.value
## [1] 2.387715e-274
##
## $est.d
## [1] 2.489837
##
## $conf.int
## [1] 19.57350 20.98335
## $statistic
## [1] 58.63512
##
## $p.value
## [1] 2.208567e-275
##
## $est.d
## [1] 2.622242
##
## $conf.int
## [1] 19.59944 20.95741
Both the ANOVA and the t-tests returned significant results.
The following example provides an overview of the helper functions in the package that can aid in simulations and further demonstrates data analytic functions in the package.
# Simulate healthy data for two groups
x.1=rnorm(300,2,4)
x.2=rnorm(300,3,5)
# check mean and variance for simulated healthy data
mean(x.1);var(x.1)
## [1] 2.295143
## [1] 15.11878
mean(x.2);var(x.2)
## [1] 2.788979
## [1] 23.11086
# induce floor effects of 20% in group 1 (first argument: floor proportion)
x.1.cf=induce.cfe(.2,0,x.1)
# induce ceiling effects of 10% in group 2 (second argument: ceiling proportion)
x.2.cf=induce.cfe(0,.1,x.2)
# recover the mean and variance for ceiling/floor data
rec.mean.var(x.1.cf)
## $ceiling.percentage
## [1] 0.003333333
##
## $floor.percentage
## [1] 0.18
##
## $est.mean
## [1] 2.378155
##
## $est.var
## [1] 13.92959
rec.mean.var(x.2.cf)
## $ceiling.percentage
## [1] 0.09666667
##
## $floor.percentage
## [1] 0.003333333
##
## $est.mean
## [1] 2.836167
##
## $est.var
## [1] 23.73713
# Welch t-test on the healthy data
t.test(x.1,x.2)
##
## Welch Two Sample t-test
##
## data: x.1 and x.2
## t = -1.3834, df = 572.96, p-value = 0.1671
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.1949782 0.2073058
## sample estimates:
## mean of x mean of y
## 2.295143 2.788979
# Welch t-test on the ceiling/floor data, without adjustment
t.test(x.1.cf,x.2.cf)
##
## Welch Two Sample t-test
##
## data: x.1.cf and x.2.cf
## t = 0.53385, df = 540.14, p-value = 0.5937
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.4515919 0.7886511
## sample estimates:
## mean of x mean of y
## 2.757459 2.588929
# adjusted t-tests on the ceiling/floor data with lw.t.test (methods a and b)
## $statistic
## [1] -1.205103
##
## $p.value
## [1] 0.2293059
##
## $est.d
## [1] -0.1048535
##
## $conf.int
## [1] -1.2065509 0.2905258
## $statistic
## [1] -1.292585
##
## $p.value
## [1] 0.1972241
##
## $est.d
## [1] -0.1055391
##
## $conf.int
## [1] -1.1555287 0.2395037
# generate a dataframe for ANOVA demo
testdat=threeganova.sim(10000,.0625,1)
# induce floor effects of 20% in group 2
testdat.cf=testdat
testdat.cf[testdat.cf$group==2,]$y=induce.cfe(.2,0,testdat.cf[testdat.cf$group==2,]$y)
# conduct an adjusted F star test on ceiling/floor data
lw.f.star(testdat.cf,y~group,"a")
## $statistic
## [1] 900.5565
##
## $p.value
## [1] 0
##
## $est.f.squared
## [1] 0.0600371
lw.f.star(testdat.cf,y~group,"b")
## $statistic
## [1] 810.9701
##
## $p.value
## [1] 0
##
## $est.f.squared
## [1] 0.05795541