McGill, R. J., & Busse, R. T. (2016). When Theory Trumps Science: A Critique of the PSW Model for SLD Identification. Contemporary School Psychology, 1-9.
In this paper, McGill & Busse criticize the use of PSW operational
definitions of learning disabilities (such as the Flanagan/CHC operational
definition).
An operational definition does not define the concept itself (an operational definition of learning disability (LD) does not define the essence of LD). Rather, it describes how the concept is measured (what we should actually do in order to determine whether a child is learning disabled or not).
Each operational definition of learning disability has advantages and disadvantages, and we can find papers criticizing each definition. It is important to be familiar with this criticism. The mere existence of criticism does not make the use of an operational definition wrong. Rather, we should weigh the criticism of each definition and choose the definition that draws the least weighty criticism.
Now we turn to the paper, with remarks and explanations by me in green.
Within
the professional literature, there is growing support for educational agencies
to adopt an approach to SLD identification that emphasizes the
importance
of an individual’s pattern of cognitive and achievement strengths and
weaknesses (PSW). The Flanagan/CHC definition of learning
disability is one of these operational definitions. Cognitive strengths and weaknesses can manifest in CHC abilities such as Fluid Ability, Short Term Memory, Long Term Storage and Retrieval, Visual Processing, Auditory Processing, Processing Speed, and Comprehension Knowledge. Achievement strengths and weaknesses can manifest in tests measuring Reading Decoding, Reading Comprehension, Writing/Written Expression, Math Calculations, and Math Reasoning.
The Flanagan definition is based on five major criteria:
A. Significantly poor performance on achievement tests.
B. One or more of the CHC cognitive abilities is significantly below average.
C. Concordance/linkage between the poor area of achievement and the low cognitive ability: the low cognitive ability can explain/be the cause of the poor achievement area.
D. Other cognitive abilities are intact.
E. Exclusionary factors are not the primary reason for the poor performance on the achievement tests.
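To make the logic of these criteria concrete, here is a minimal sketch of how such an operational definition could be turned into a decision procedure. The cutoffs, the ability-achievement linkage table, and the function names below are illustrative assumptions of mine, not Flanagan's actual scoring rules.

```python
# Hypothetical sketch of the five Flanagan-style criteria as a decision procedure.
# Cutoffs (e.g., standard score < 85) and the ability-achievement links are
# illustrative assumptions, not the actual Flanagan/CHC operational rules.

LOW_CUTOFF = 85      # "significantly below average" (assumed threshold)
AVERAGE_CUTOFF = 90  # "intact" abilities (assumed threshold)

# Assumed linkage table: which CHC abilities are held to explain which achievement areas.
LINKS = {
    "Reading Decoding": ["Auditory Processing", "Processing Speed"],
    "Math Reasoning": ["Fluid Ability", "Comprehension Knowledge"],
}

def meets_psw_criteria(achievement, cognitive, exclusionary_factors_primary):
    """achievement / cognitive: dicts mapping area/ability name -> standard score."""
    # A. Significantly poor performance in at least one achievement area.
    weak_achievement = [a for a, s in achievement.items() if s < LOW_CUTOFF]
    if not weak_achievement:
        return False
    # B. At least one CHC cognitive ability significantly below average.
    weak_cognitive = [c for c, s in cognitive.items() if s < LOW_CUTOFF]
    if not weak_cognitive:
        return False
    # C. Concordance: a weak cognitive ability is linked to a weak achievement area.
    linked = any(c in LINKS.get(a, []) for a in weak_achievement for c in weak_cognitive)
    if not linked:
        return False
    # D. The remaining cognitive abilities are intact (at least average).
    others_intact = all(s >= AVERAGE_CUTOFF for c, s in cognitive.items() if c not in weak_cognitive)
    if not others_intact:
        return False
    # E. Exclusionary factors are not the primary cause of the poor achievement.
    return not exclusionary_factors_primary

# Example with hypothetical scores:
# meets_psw_criteria({"Reading Decoding": 78, "Math Reasoning": 102},
#                    {"Auditory Processing": 80, "Fluid Ability": 98,
#                     "Comprehension Knowledge": 105, "Processing Speed": 95},
#                    exclusionary_factors_primary=False)  # -> True
```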
In 2014, the California Association of School
Psychologists released a position paper
endorsing this approach. As a vehicle for examining the
PSW model, the authors respond critically to three fundamental positions taken
in the position paper: (a) diagnostic validity for the model has been
established; (b) cognitive profile analysis is valid and reliable; and (c) PSW
data have adequate treatment utility. The authors conclude that at the present
time there is insufficient support within the empirical literature to support
adoption of the PSW method for SLD identification.
Prior to
IDEA (Individuals with Disabilities Education Act) 2004, federal regulations
emphasized the primacy of the discrepancy model, wherein Specific Learning Disability was
operationalized as a significant discrepancy between an individual’s
achievement and their cognitive ability (Full Scale IQ score). This model was heavily
criticized. First, it creates a "wait to fail" situation, because such a discrepancy can be demonstrated only in third grade, when the child finally "achieves" a two-year discrepancy between his IQ score and his performance in reading/writing/math. Second, it leads to under-identification of learning disabilities in adolescence. LD affects the Full Scale IQ (for instance, a learning disabled child reads less, so his comprehension knowledge is less developed, which lowers his IQ). Thus, in adolescence there is a lower chance that a learning disabled child will show a discrepancy between his IQ score and his achievement scores, which can make it impossible to identify him as learning disabled even though he is.
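To make the discrepancy operationalization concrete, here is a minimal, hypothetical sketch. The standard-score metric and the 22-point (roughly 1.5 SD) cutoff are illustrative assumptions of mine; actual regulations and formulas varied considerably across states.

```python
# Hypothetical sketch of an IQ-achievement discrepancy rule.
# The 22-point cutoff (~1.5 SD on a mean-100, SD-15 metric) is an assumption
# chosen for illustration; real implementations varied by state and by formula.
DISCREPANCY_CUTOFF = 22

def meets_discrepancy_criterion(fsiq: float, achievement_score: float) -> bool:
    """Both arguments are standard scores (mean 100, SD 15)."""
    return (fsiq - achievement_score) >= DISCREPANCY_CUTOFF

# Example: FSIQ of 105 and a reading standard score of 80 -> 25-point gap -> True
print(meets_discrepancy_criterion(105, 80))
```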
In
contrast to previous legislation, IDEA 2004 permitted local educational
agencies the option of selecting between the discrepancy method and
alternatives such as response-to-intervention (RTI). The RTI model defined learning disability as persistent low achievement (in reading/writing/math) despite adequate intervention. This model permits, but does not require, a psychological assessment for a differential diagnosis ruling out intellectual disability, language disorders, and emotional/behavioral disorders. This point (the lack of a required differential diagnosis) has drawn considerable criticism. Over the last decade, RTI has been widely embraced within
the technical literature and adopted as an SLD classification model by many
educational agencies across the country, resulting in renewed concern regarding
the validity of identification approaches that deemphasize the role of
cognitive testing.
PSW
models were developed in response to these problems. There are several such models which are quite similar to each
other: (a) the concordance/discordance model (C/DM; Hale and
Fiorello 2004), (b) the
Cattell-Horn-Carroll
operational model (CHC; Flanagan et al. 2011), and (c) the
discrepancy/consistency model (D/CM; Naglieri 2011). It
is noteworthy that, although the models differ with respect to their
theoretical orientations
and
the statistical formulae used to identify patterns of strengths and weaknesses,
all three PSW models share at least three core assumptions as related to the
diagnosis of Specific LD: (a) evidence of cognitive weaknesses must be present,
(b) an academic weakness must also be established, and (c) there must be
evidence of spared (i.e., not indicative of a
weakness)
cognitive-achievement
abilities. The authors go on to briefly discuss each model; I will not do so here.
Now the authors criticize
PSW models on three points:
Critical Assumption One: Diagnostic Validity for the
Model Has Been Established
Stuebing et al. (2012) investigated the diagnostic accuracy of several PSW models and reported high diagnostic specificity (a high percentage of the children who do not have LD were correctly identified as not having LD) across all models. However, the models had low to moderate sensitivity (only a low to moderate percentage of the children who do have LD were identified as having LD). Only a very small percentage of the
population (1%-2%) met criteria for specific learning disabilities using these
models. Clinically, it does feel like the CHC definition lowers the percentage of children identified as LD, whereas other definitions, like the DSM-5 definition, inflate it. We have no way of knowing what the "real" percentage of LD in the population is, because every study that estimates this percentage is conducted in light of some operational definition, usually a relatively "inflating" one.
Since there are no objective criteria by which we can know who is really learning disabled, Stuebing et al. cannot argue that "these models had low to moderate sensitivity". The only thing Stuebing et al. can say is that PSW models identify fewer children as learning disabled than other models do. But we cannot know whether this fact makes PSW models better or worse at identifying LD.
Kranzler
et al. (2016) examined
the broad cognitive abilities of the Cattell-Horn-Carroll theory held to be
meaningfully related to basic reading, reading comprehension, mathematics
calculation, and mathematics reasoning across age groups. Results of analyses
of 300 participants in three age groups (6–8, 9–13, and 14–19 years) indicated
that the XBA method (a method for implementing the PSW approach) has high specificity, i.e., it is accurate in detecting true negatives (children who are not LD and are correctly identified as not having LD): the model identified 92% of children who were not LD as not having LD. However, Kranzler et al. found the model to have quite low sensitivity, indicating that it is very poor at identifying the children who do have LD: only 21% of children who were LD were identified as having LD according to this model.
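For readers who want the arithmetic behind these two terms: sensitivity and specificity come from a 2x2 confusion matrix. The sketch below uses made-up counts, chosen only so that the resulting rates match the 92% specificity and 21% sensitivity reported above; they are not Kranzler et al.'s actual data.

```python
# Sensitivity and specificity from a 2x2 confusion matrix.
# The counts are hypothetical, for illustration only.
def sensitivity(true_pos, false_neg):
    # Proportion of truly LD children the model identifies as LD.
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    # Proportion of non-LD children the model identifies as not LD.
    return true_neg / (true_neg + false_pos)

# Hypothetical sample: 100 LD and 900 non-LD children (per some reference standard).
print(sensitivity(true_pos=21, false_neg=79))   # 0.21 -> low sensitivity
print(specificity(true_neg=828, false_pos=72))  # 0.92 -> high specificity
```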
A brief peek at Kranzler et al.'s study makes me wonder about the way Kranzler et al. determined which abilities were related to which areas of achievement. For example, at age 6-8, Kranzler
et al considered the broad abilities Comprehension Knowledge, Long Term Storage
and Retrieval, Processing Speed and Short Term Memory as related to basic
reading skills. Kranzler et al. say they took these links from McGrew and Wendling's 2010 study. But that study (presented on slide no.
11 in the second presentation of the Intelligence and Cognitive Abilities
presentation series on the right hand column of this blog) found that the narrow
ability Phonological Coding is also related to basic reading skills at age 6-8!
Generally, McGrew and Wendling recommend using combinations of broad and
narrow abilities to predict performance in different areas of
achievement. Kranzler et al used only
broad abilities.
Thus it may be that the low sensitivity found in Kranzler et al.'s study results from the fact that Kranzler et al. did not implement McGrew and Wendling's findings accurately (I say this very cautiously, since I have not read the entire Kranzler paper).
Critical Assumption Two: Cognitive Profile Analysis Is
Reliable and Valid
In order to identify LD according to the Flanagan
method, we have to use cognitive ability/index scores. If these scores are not reliable, the identification of LD will not be reliable either.
Significant questions have been raised about the long-term stability and the structural and incremental validity of factor-level measures from intelligence tests. Structural validity investigations using exploratory factor analysis have revealed factor structures that conflict with those reported in the technical manuals of contemporary cognitive measures, which indicates that these instruments may be overfactored (Frazier and Youngstrom 2007). Additionally, the long-term
stability and diagnostic utility of these indices has been found wanting. The authors cite a study by Watkins and Smith
(2013) who investigated the long-term
stability of the WISC-IV with a sample of 344 students twice evaluated
for special education eligibility at an average interval of 2.84 years.
Test-retest reliability coefficients for the Verbal Comprehension Index (VCI),
Perceptual Reasoning Index (PRI), Working Memory Index (WMI), Processing Speed
Index (PSI), and the Full Scale IQ (FSIQ) were .72, .76, .66, .65, and .82,
respectively. As far as I know, good reliability is usually considered to be above 0.7. Thus the WMI and the PSI fell below that threshold in this study, and the VCI and PRI were only marginally above it. Moreover,
25% of the students earned FSIQ scores that differed by 10 or more points, and
29%, 39%, 37%, and 44% of the students earned VCI, PRI, WMI, and PSI scores,
respectively, that varied by 10 or more points. Given this variability, Watkins
and Smith argue that it cannot be assumed that
WISC-IV scores will be consistent across long test-retest intervals for
individual students.
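To clarify what these two stability statistics mean, here is a small simulation (the data are randomly generated under assumed parameters, not Watkins and Smith's data) that computes a test-retest correlation and the proportion of examinees whose scores change by 10 or more points:

```python
# Illustration of the two stability statistics reported by Watkins and Smith:
# (1) the test-retest correlation and (2) the proportion of examinees whose
# scores change by 10 or more points. The data here are simulated, not theirs.
import random

random.seed(0)
time1 = [random.gauss(100, 15) for _ in range(344)]
# Retest scores: correlated with time 1 but with added noise (an assumed model).
time2 = [0.75 * (s - 100) + 100 + random.gauss(0, 10) for s in time1]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(pearson_r(time1, time2), 2))  # test-retest reliability coefficient
print(sum(abs(a - b) >= 10 for a, b in zip(time1, time2)) / len(time1))  # share changing >= 10 points
```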
In light of this study, perhaps we ought to give up using index scores and use only the FSIQ score as the lesser evil. In the context of LD, this brings us back to the discrepancy model.
However, it is possible that the children's cognitive abilities improved during the 2.84 years in which they received special education services. This could explain the instability of the indices in this study. Does this instability exist in the general population? In other intelligence tests?
Critical Assumption Three: PSW Methods Have Adequate
Treatment Utility
Despite many attempts to validate group-by-treatment interactions, the efficacy of interventions focused on cognitive deficits remains speculative and unproven. Particularly noteworthy are the findings
obtained from a recent meta-analysis of the efficacy of academic interventions derived
from neuropsychological assessment data by Burns et al. (2016). In contrast to
the effects attributed to more direct measures of academic skill, it was found
that the effects of interventions developed from cognitive data were
consistently
small
(g=0.17). As a result, Burns et al. (2016) concluded, "the current
and previous data indicate that measures of cognitive abilities have little
to no [emphasis added] utility in screening or planning interventions for
reading and mathematics".
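For context on the g statistic: assuming the effect size reported by Burns et al. is the usual Hedges' g (a standardized mean difference with a small-sample correction), a g of 0.17 means the intervention groups outperformed the comparison groups by about one sixth of a pooled standard deviation. A minimal sketch of the computation, with made-up numbers:

```python
# Hedges' g: standardized mean difference with a small-sample bias correction.
# (Assuming the g reported by Burns et al. is the standard Hedges' g; the
# numbers below are made up for illustration.)
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    # Pooled standard deviation of the two groups.
    s_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    d = (mean_t - mean_c) / s_pooled            # Cohen's d
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    return d * correction

# Hypothetical example: a 2.6-point gain on a measure with SD ~15 in both groups.
print(round(hedges_g(102.6, 100.0, 15, 15, 30, 30), 2))  # ~0.17
```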
Do other
operational definitions of learning disability lead to more effective interventions?
To summarize, McGill & Busse criticize the CHC/PSW LD definition on three grounds:
A. The diagnostic validity of the model is weak.
B. Index scores are not sufficiently reliable or valid.
C. The model does not lead to effective interventions.
Are you convinced?