It is my honor to publish Prof. McGill's response to a post I wrote about a paper of his. The post was written on June 11 and can be read below or by clicking here.
Prof. McGill's response to my June 11th post: "When Theory Trumps Science: A Critique of the PSW Model for SLD Identification"
I have the honor to present Prof. McGill's comments on this post:
Thank you for your interest in our paper and for your overview of it. I
have been following your blog for several years now and always enjoy your
thoughtful posts on intelligence and cognition.
Our commentary was somewhat brief; it is conceptually similar to a more substantive paper recently published in Learning Disability Quarterly that focuses on potential measurement issues with the PSW model:
McGill, R. J., Styck, K. S., Palomares, R. S., & Hass,
M. R. (2015). Critical issues in specific learning disability identification:
What we need to know about the PSW model. Learning
Disability Quarterly. Advance online publication. doi:
10.1177/0731948715618504
The point you raise about diagnostic validity (DV) studies and LD is a good one, and an issue we raised in the LDQ paper: there is no “gold standard” for SLD diagnosis, so we have no way of knowing who truly has SLD, and the results of SLD DV studies will always carry this limitation. While I still think they are of some value, as they give us some estimate of the potential DV of identification models, this limitation must always be considered when interpreting those results.
I also concur that the simulation studies conducted by
Steubing et al. and the Kranzler study have limitations (as all studies do).
Most germane is the fidelity with which the authors attempted to model various PSW implementations. With all the potential permutations, these models are incredibly complex, which renders them difficult to conceptualize without access to substantial clinical assessment data (which is very hard to obtain). Even if you have the data, simulating the multi-step decision-making that these models require of clinicians is the biggest hurdle a researcher faces, and one that has not yet been overcome. I am hopeful that with the advent of machine learning algorithms in advanced statistical software such as R, one day we may be able to model these processes better.
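A minimal sketch of what such a simulation might look like, reduced to a single simplified decision rule. All parameters below (five factors, inter-factor r = .60, an r = .65 achievement correlation, and the 85/90 cutoffs) are illustrative assumptions, not values from any published battery or any specific PSW method; a faithful simulation would need the multi-step clinical logic described above:

```python
import numpy as np

# Illustrative parameters only: five correlated cognitive factor scores
# (mean 100, SD 15, inter-factor r = .60) and one achievement score.
rng = np.random.default_rng(42)
n, k, r = 100_000, 5, 0.60

cov = np.full((k, k), r * 15.0**2)
np.fill_diagonal(cov, 15.0**2)
factors = rng.multivariate_normal([100.0] * k, cov, size=n)

# Achievement correlated ~.65 with a simple g proxy (the factor mean).
g_proxy = factors.mean(axis=1)
z_g = (g_proxy - g_proxy.mean()) / g_proxy.std()
ach = 100 + 15 * (0.65 * z_g + np.sqrt(1 - 0.65**2) * rng.standard_normal(n))

def psw_flag(factors, ach, weak_cut):
    """One highly simplified PSW-style rule: at least one cognitive
    weakness, at least one cognitive strength, and a concordant
    achievement weakness. Real PSW methods involve many more steps."""
    weakness = (factors < weak_cut).any(axis=1)
    strength = (factors > 90).any(axis=1)
    return weakness & strength & (ach < weak_cut)

flag_85 = psw_flag(factors, ach, weak_cut=85)
flag_90 = psw_flag(factors, ach, weak_cut=90)  # a slightly laxer cutoff

print(f"identified at cut 85: {flag_85.mean():.1%}")
print(f"identified at cut 90: {flag_90.mean():.1%}")
print(f"flagged by both rules, among those flagged by either: "
      f"{(flag_85 & flag_90).sum() / (flag_85 | flag_90).sum():.1%}")
```

Even this toy version shows how sensitive identification rates and case-level agreement are to a single cutoff choice; multiplying the decision steps multiplies that sensitivity.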
What I always want to stress when discussing these things is that I am not anti-PSW; I actually think it makes a lot of conceptual sense. I
don’t really have a dog in this fight. My major concern with the model has to
do with the tools that we are using to make decisions within these models
(i.e., IQ tests). While I love IQ tests and think they are all very good at
estimating overall cognitive ability, I do think that they have significant
limitations when we try to get more than that out of them. As an example, the
results from numerous independent factor analytic studies raise questions about
the viability of publisher-suggested measurement models. Most pertinent: do these instruments measure lower-order abilities well, if at all? What we consistently find is that g dominates all levels of IQ tests and the scores those instruments provide; when this source of variance is accounted for, there is often only a small proportion of reliable variance attributable to the lower-order abilities (e.g., auditory processing, visual processing) that are of most interest to clinicians and the focus of clinical interpretation in PSW models. These are significant confounds when we use these measurement models as the basis for clinical interpretation of scores. In my opinion, there has been insufficient discussion of these issues and their potential impact on clinical decision-making, especially among those using and advocating the PSW model.
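A minimal sketch of the variance decomposition behind this argument, using model-based reliability (omega) coefficients from a hypothetical bifactor structure; the loadings are invented for illustration and are not from any published instrument:

```python
import numpy as np

# Hypothetical bifactor loadings for 8 subtests: a general factor (g)
# plus two group factors, each marked by 4 subtests. Illustrative only.
g    = np.array([0.70, 0.75, 0.65, 0.72, 0.68, 0.74, 0.66, 0.71])
grp1 = np.array([0.35, 0.30, 0.40, 0.32, 0.00, 0.00, 0.00, 0.00])
grp2 = np.array([0.00, 0.00, 0.00, 0.00, 0.30, 0.25, 0.35, 0.28])

uniq = 1 - (g**2 + grp1**2 + grp2**2)  # unique (error + specific) variances
total_var = g.sum()**2 + grp1.sum()**2 + grp2.sum()**2 + uniq.sum()

omega_total = (g.sum()**2 + grp1.sum()**2 + grp2.sum()**2) / total_var
omega_h = g.sum()**2 / total_var       # reliable variance due to g alone

print(f"omega total: {omega_total:.2f}")
print(f"omega hierarchical (g): {omega_h:.2f}")
print(f"reliable variance left for group factors: {omega_total - omega_h:.2f}")

# Omega-hierarchical-subscale for group factor 1 (its 4 indicators only):
# the share of that composite's variance due to the group factor itself.
idx = slice(0, 4)
sub_total = g[idx].sum()**2 + grp1[idx].sum()**2 + uniq[idx].sum()
print(f"omega-hs, group factor 1: {grp1[idx].sum()**2 / sub_total:.2f}")
```

With loadings of this general shape, the total composite is highly reliable but almost entirely g, and only a small fraction of each group-factor composite's variance reflects the group factor itself, which is the pattern the factor-analytic literature cited above repeatedly reports.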
My opinion is that we are not measuring these constructs
very well (not that they don’t exist) and that is why we have the issues with
long-term stability and incremental prediction of achievement. Of course, the
issues with cognitive profile analysis (regardless of the level of the scores)
have long been known (see Canivez, 2013; Glutting, Watkins, & Youngstrom, 2003; Watkins, 2000), and that is all PSW really is: profile analysis at the factor-score level rather than the subtest level. Scatter and variability are endemic in the population. As an example, my analyses indicate that over 30% of the KABC-II normative sample have at least a 23-point difference between their highest and lowest factor scores. That's a lot of noise that PSW models will
have to sift through in order to find the “signal.” To be fair, this is
something that Flanagan and colleagues have repeatedly discussed in their
writings.
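A minimal sketch of why large highest-minus-lowest spreads are common even in a perfectly normal population, assuming five correlated factor scores (mean 100, SD 15, inter-factor r = .60); these are illustrative values, not KABC-II parameters:

```python
import numpy as np

# Simulate factor-score profiles for a hypothetical normative sample.
rng = np.random.default_rng(0)
k, r, n = 5, 0.60, 200_000

cov = np.full((k, k), r * 15.0**2)
np.fill_diagonal(cov, 15.0**2)
scores = rng.multivariate_normal([100.0] * k, cov, size=n)

# Highest minus lowest factor score within each simulated profile.
spread = scores.max(axis=1) - scores.min(axis=1)
print(f"median spread: {np.median(spread):.1f} points")
print(f"P(spread >= 23): {(spread >= 23).mean():.1%}")
```

Under these assumptions, a spread of 23 or more points is not a rare event but something close to the typical case, which is exactly the signal-versus-noise problem described above.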
In sum, these are complex issues that we think clinicians need to be aware of as they relate to the PSW model and SLD identification in general. Perhaps these limitations will be overcome in the future; presently, however, my opinion is that we need to know more before we use these models to make important diagnostic and treatment decisions in practice.
As you rightly note, in general I advocate more circumspect
interpretation of IQ tests. Whereas the corpus of the empirical literature
indicates that one can interpret FSIQ with confidence, significant questions
remain as one moves to lower levels of dimensionality. As we indicate in our
paper, if one uses these scores within a diagnostic decision-making model
(i.e., PSW), their shortcomings will be encapsulated in those models and render
consistent and defensible decision-making very difficult.
In spite of this, I do not advocate a return to the flawed
discrepancy model. I advocate using FSIQ as a rule-out element within the broader conceptual definition of LD (i.e., unexpected underachievement). If a kiddo has low-average or higher ability, then I can deduce that low ability is probably not the reason for their underachievement, and thus LD or some other condition is a more viable explanation. This requires thinking about performance on IQ tests from a more criterion-based perspective, which, when you think about it, reflects the level of precision with which these instruments actually measure functioning (think confidence intervals); that is really all we can get out of them anyway. As a colleague of
mine says, we are trying to measure really complex aspects of cognition with
what are virtually stone tools. When it comes to additional assessment, I
stipulate that we need more than CBM and other related achievement data, but it
remains to be seen whether the use of multi-factored cognitive batteries and
hours and hours of additional assessment is indeed the answer. Nevertheless, I
think certain dimensions are more important than others (e.g., working memory
and processing speed).
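A minimal sketch of the confidence-interval arithmetic behind this criterion-based view, using the standard error of measurement; the reliability values are illustrative, chosen to contrast a highly reliable composite with a typical factor score:

```python
import math

# SEM = SD * sqrt(1 - r_xx); a 95% CI spans roughly +/- 1.96 SEM.
sd = 15.0  # the usual standard-score metric
for label, rxx in [("FSIQ-like composite", 0.97), ("factor score", 0.85)]:
    sem = sd * math.sqrt(1 - rxx)   # standard error of measurement
    half = 1.96 * sem               # half-width of a 95% CI
    print(f"{label}: reliability {rxx:.2f}, SEM {sem:.1f}, "
          f"95% CI = observed score +/- {half:.0f} points")
```

Even under the generous reliability assumption, an observed composite is only pinned down to a band of roughly ten points, and a factor score to a band of more than twenty, which is why broad, criterion-level statements (e.g., "low average or higher") are about the level of precision these instruments support.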
We have been trying to figure out how to validly diagnose LD for a long time now. I don't profess to have the answer. As previously mentioned, this is a complex issue that the field has long been attempting to adjudicate. Unfortunately, proposed remedies have consistently been found wanting once they have been implemented. Perhaps we would be better served if we quit this quixotic quest to diagnose an elusive construct and just figured out which kids need help and got on with helping them.