Sunday, May 29, 2016

Your words may predict your future mental health


Can we predict the emergence of mental illness or dementia from oral or written language samples?  How can we use these samples to study the development of concepts throughout history?

Dr. Mariano Sigman attempts to answer these questions in this fascinating TED talk, which was brought to my attention by Michelle Lisses Topaz. 

Here are excerpts from the transcript:

Can the theory that introspection emerged in human history only about 3,000 years ago be examined in a quantitative and objective manner?
  
The space of words (built with Latent Semantic Analysis, or LSA) is a computer-generated space that contains all words in such a way that the distance between any two of them is indicative of how closely related they are. So for instance, the words "dog" and "cat" are very close together, but the words "grapefruit" and "logarithm" are very far apart. And this has to be true for any two words within the space.

When two words are related, they tend to appear in the same sentences, in the same paragraphs, in the same documents, more often than would be expected just by pure chance. This simple method, with some computational tricks that have to do with the fact that this is a very complex and high-dimensional space, turns out to be quite effective.
Words automatically organize into semantic neighborhoods. So you get the fruits, the body parts, the computer parts, the scientific terms and so on. The algorithm also identifies that we organize concepts in a hierarchy.
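To make the co-occurrence idea concrete, here is a minimal sketch in Python. It is not full LSA: it skips the dimensionality-reduction step (the "computational tricks" mentioned above) and simply represents each word by how often it appears in each document. The toy corpus and the function names are my own assumptions for illustration, not anything from the talk.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each string stands in for one document.
DOCS = [
    "dog cat pet animal",
    "dog cat fur pet",
    "grapefruit fruit citrus juice",
    "logarithm number function math",
]

def word_vectors(docs):
    """Map each word to its vector of per-document counts."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted({w for c in counts for w in c})
    return {w: [c[w] for c in counts] for w in vocab}

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vectors = word_vectors(DOCS)
# Words that share documents end up close; words that never co-occur do not.
near = cosine(vectors["dog"], vectors["cat"])
far = cosine(vectors["grapefruit"], vectors["logarithm"])
print("dog/cat:", near, "grapefruit/logarithm:", far)
```

In this toy corpus "dog" and "cat" always appear together, so they come out as near neighbors, while "grapefruit" and "logarithm" never share a document. A real word space would be built from a very large corpus and compressed to a few hundred dimensions.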

Once we've built the space, the question of the history of introspection, or of the history of any concept which before could seem abstract and somehow vague, becomes concrete -- becomes amenable to quantitative science.

All that we have to do is take the books, digitize them, treat this stream of words as a trajectory and project it into the space, and then ask whether this trajectory spends significant time circling closely to the concept of introspection.

And with this, we could analyze the history of introspection in the ancient Greek tradition, for which we have the best available written record. So what we did is we took all the books -- we just ordered them by time -- for each book we take the words and we project them to the space, and then we ask for each word how close it is to introspection, and we just average that. And then we ask whether, as time goes on and on, these books get closer, and closer and closer to the concept of introspection.
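The averaging step described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the two-dimensional word vectors, the three tiny "books" and their dates, and the function name `closeness_to_concept`. The real analysis used a full word space and actual digitized texts.

```python
import math

# Hypothetical 2-D word space; a real space has hundreds of dimensions.
SPACE = {
    "introspection": (1.0, 0.0),
    "self": (0.9, 0.1),
    "thought": (0.8, 0.2),
    "spear": (0.0, 1.0),
    "ship": (0.1, 0.9),
}

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closeness_to_concept(text, concept, space):
    """Average similarity between each known word of the text and the concept."""
    target = space[concept]
    sims = [cosine(space[w], target) for w in text.split() if w in space]
    return sum(sims) / len(sims) if sims else 0.0

# Hypothetical (year, text) pairs; negative years stand for BCE.
BOOKS = [
    (-800, "spear ship ship spear"),
    (-400, "self thought ship"),
    (-300, "self thought thought"),
]

# Order the books by time, then score each one against the concept.
scores = [
    (year, closeness_to_concept(text, "introspection", SPACE))
    for year, text in sorted(BOOKS)
]
for year, score in scores:
    print(year, round(score, 3))
```

If the scores rise as the dates advance, the books are drifting toward the concept, which is the pattern the talk goes on to describe.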

And this is exactly what happens in the ancient Greek tradition. So you can see that for the oldest books in the Homeric tradition, there is a small increase with books getting closer to introspection. But about four centuries before Christ, this starts ramping up very rapidly to an almost five-fold increase of books getting closer, and closer and closer to the concept of introspection.

We ran this same analysis on the Judeo-Christian tradition, and we got virtually the same pattern.

Can the words we say today tell us something of where our minds will be in a few days, in a few months or a few years from now?

We can ask whether monitoring and analyzing the words we speak, we tweet, we email, we write, can tell us ahead of time whether something may go wrong with our minds. And with Guillermo Cecchi, who has been my brother in this adventure, we took on this task. And we did so by analyzing the recorded speech of 34 young people who were at a high risk of developing schizophrenia.

And so we measured speech at day one, and then we asked whether the properties of the speech could predict, within a window of almost three years, the future development of psychosis. But despite our hopes, we got failure after failure. There was just not enough information in semantics to predict the future organization of the mind. It was good enough to distinguish between a group of schizophrenics and a control group, a bit like we had done for the ancient texts, but not to predict the future onset of psychosis.

But then we realized that maybe the most important thing was not so much what they were saying, but how they were saying it. More specifically, it was not in which semantic neighborhoods the words were, but how far and fast they jumped from one semantic neighborhood to the other one. And so we came up with this measure, which we termed semantic coherence, which essentially measures the persistence of speech within one semantic topic, within one semantic category.
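One simple way to picture a coherence measure like this is to average the similarity between each sentence and the one that follows it. The sketch below does that on a made-up two-dimensional word space. It is only an illustration under my own assumptions (the vectors, the sample sentences and the function names are hypothetical) and not necessarily the exact measure used in the study.

```python
import math

# Hypothetical 2-D word space with two neighborhoods: animals and math.
SPACE = {
    "dog": (1.0, 0.0),
    "cat": (0.9, 0.1),
    "logarithm": (0.0, 1.0),
    "number": (0.1, 0.9),
}

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(sentence, space):
    """Average the vectors of the known words; None if there are none."""
    vecs = [space[w] for w in sentence.split() if w in space]
    if not vecs:
        return None
    return tuple(sum(axis) / len(vecs) for axis in zip(*vecs))

def coherence(sentences, space):
    """Mean similarity between each sentence vector and the next one."""
    vecs = [v for v in (mean_vector(s, space) for s in sentences) if v is not None]
    sims = [cosine(a, b) for a, b in zip(vecs, vecs[1:])]
    return sum(sims) / len(sims) if sims else 0.0

# Speech that stays in one neighborhood versus speech that jumps between two.
on_topic = coherence(["dog cat", "cat dog", "dog"], SPACE)
jumpy = coherence(["dog cat", "logarithm number", "cat"], SPACE)
print("on topic:", round(on_topic, 3), "jumpy:", round(jumpy, 3))
```

Speech that stays within one neighborhood scores close to 1, while speech that hops between neighborhoods scores much lower, regardless of which topics are actually being discussed.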

And it turned out that for this group of 34 people, the algorithm based on semantic coherence could predict, with 100 percent accuracy, who would develop psychosis and who would not. And this was something that could not be achieved -- not even close -- with all the other existing clinical measures.

We may be seeing in the future a very different form of mental health, based on objective, quantitative and automated analysis of the words we write, of the words we say.

Want to read more?
    
Bedi, G., Carrillo, F., Cecchi, G. A., Slezak, D. F., Sigman, M., Mota, N. B., ... & Corcoran, C. M. (2015). Automated analysis of free speech predicts psychosis onset in high-risk youths. npj Schizophrenia, 1. https://neuro.org.ar/sites/neuro.org.ar/files/Automated%20analysis%20of%20free%20speech%20predicts%20psychosis%20onset%20in.pdf

This also reminds me of this post: Detecting signs of Alzheimer's disease through written language analysis – and what does it have to do with Iris Murdoch?


