Relation of LAS score to our sample of children with ASD



Background

Standard assessments of autism spectrum disorders (ASD) rely primarily on negative markers for detection of the disorders. Research has indicated that certain vocal characteristics of children diagnosed with ASD may differ consistently from those of typically developing children, suggesting the presence of a positive marker for ASD in child vocalization activity. However, investigations into the potential clinical utility of such a marker have been limited by two key challenges: 1) the difficulty of obtaining sample data of sufficient quantity and quality; and 2) the identification of consistent discriminative vocalization patterns. The LENA System addresses both challenges and provides an automated approach to the detection of ASD; its classification accuracy is reported in the Performance section.

Development

The LENA System comprises two distinct components: recording hardware and processing software. The LENA Digital Language Processor (DLP) is a small, lightweight digital recorder that fits into the front pocket of specially designed children’s clothing and records up to 16 hours of continuous, high-quality audio. Recordings include all vocalizations produced by the key child (i.e., the child wearing the DLP) and all externally sourced sounds and speech activity within an approximate 4-6 foot radius. This unobtrusive approach to data sampling permits the collection of naturalistic full-day recordings from a child’s home language environment with relative ease, rendering negligible the limitations arising from the first challenge, obtaining adequate child vocalization data. The second challenge, identifying consistent patterns in child vocalizations that can be utilized to discriminate a child with ASD from a typically developing child, is addressed by the processing software as described below.

The LENA System software processes the audio recording into segments from several seconds to several minutes in duration, assigning a sound category (e.g., key child vocalizations, adult male speech, TV/electronic sound, silence) to each segment based on previously developed acoustic models. Key child vocalization segments are further processed to determine the probability that the child’s vocal output is consistent with a pre-defined classification model for ASD. We have developed two complementary methods for detecting unique and discriminating patterns in the vocalizations produced by children with ASD and deriving these classification probabilities.

The first method, here called phone-based (PB), defines a unique acoustic feature set using a quantitative approach that incorporates modified components of the open-source Sphinx automatic speech recognition (ASR) software. Child vocalization segment data are processed by this software into 46 unique categories that include 39 “phone” and 7 “nonphone” categories. Note that these “phones” are more broadly defined acoustic approximations of commonly accepted phoneme categories. Sequential pairs of these “phones” are grouped into “biphones” that are then linearly recombined and reduced to 50 dimensions following a previously derived principal components analysis. For a more detailed description of the phone-based approach described here, please see LENA Technical Report LTR-08-1, "The LENA™ Automatic Vocalization Assessment" (http://www.lenafoundation.org/TechReport.aspx/AVA/LTR-08-1).
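As a rough illustration of the biphone step only, the sketch below counts sequential pairs of category labels and normalizes them into a fixed-length frequency vector. The helper name `biphone_frequencies` and the three toy labels are hypothetical; the actual Sphinx-derived categories and the principal components projection to 50 dimensions are not reproduced here.

```python
from collections import Counter
from itertools import product

def biphone_frequencies(phones, inventory):
    """Return the relative frequency of each ordered pair of categories.

    `phones` is a list of category labels for one child's vocalizations;
    `inventory` is the full list of categories (46 in the text above).
    """
    pairs = Counter(zip(phones, phones[1:]))  # sequential pairs = "biphones"
    total = sum(pairs.values())
    # Fixed ordering so every child yields a vector of the same length.
    return [pairs[p] / total if total else 0.0
            for p in product(inventory, repeat=2)]

# Toy inventory of three hypothetical labels (the real system uses 46).
inventory = ["AA", "B", "SIL"]
vec = biphone_frequencies(["B", "AA", "B", "AA", "SIL"], inventory)
```

If every ordered pair were kept, as this sketch assumes, 46 categories would give 46 × 46 = 2,116 entries before any dimensionality reduction.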

The second method, here called cluster-based (CB), utilizes an unsupervised k-means clustering routine applied directly to child vocalization segment data. This self-organized approach utilizes 64 phone-like clusters generated from mel-frequency cepstrum (MFC) acoustic features. For this method, as for the phone-based method, the goal is not to recognize or translate speech, so the resulting clusters or dimensions need not be identifiable as specific phones; the processing need only provide reliable and consistent results.
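To make the clustering idea concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) followed by a per-cluster usage histogram. It assumes that the feature of interest is the fraction of frames falling into each cluster, which the text above does not state; the function names, the toy two-dimensional frames, and the choice of the first k points as starting centroids are all illustrative.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; initial centroids are the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    return centroids

def cluster_histogram(points, centroids):
    """Fraction of frames assigned to each cluster."""
    counts = [0] * len(centroids)
    for p in points:
        counts[nearest(p, centroids)] += 1
    return [c / len(points) for c in counts]

# Toy 2-D "frames"; real input would be MFC coefficient vectors, with k = 64.
frames = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
centroids = kmeans(frames, k=2)
hist = cluster_histogram(frames, centroids)
```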

Ultimately, a previously derived linear discriminant analysis (LDA) function is applied to the combined PB and CB feature sets to determine the probability of classification to the ASD pattern. For convenience and to enhance interpretability, LDA classification probabilities are reduced to seven ordinal categories using a variable threshold based on sensitivity and specificity for our development data.
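The reduction from a probability to seven ordinal categories can be sketched as a simple threshold lookup. The six cut points below are invented for illustration; the thresholds actually used were derived from sensitivity and specificity on the development data and are not given here.

```python
from bisect import bisect_right

# Hypothetical cut points (assumption); six cuts yield seven categories.
CUTS = [0.05, 0.15, 0.35, 0.65, 0.85, 0.95]

def ordinal_category(probability, cuts=CUTS):
    """Map a probability in [0, 1] to a category from 1 to len(cuts) + 1."""
    return bisect_right(cuts, probability) + 1

categories = [ordinal_category(p) for p in (0.01, 0.50, 0.99)]
```

Unevenly spaced cut points, as in this sketch, allow finer resolution at the extremes of the probability range than in the middle.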

Performance

Classification performance was assessed for a sample of 190 children ages 24–48 months based on each child’s first recording after age 24 months and employing the method of Leave-One-Out Cross Validation (LOOCV) to maximize data usage and enhance generalizability. The sample included 75 children diagnosed with ASD, 34 children diagnosed with a language delay (LD), and 81 typically developing children (TD). The ASD sample was recruited nationwide, and families were required to provide documented confirmation of the ASD diagnosis from a professional or team of professionals. In addition, parents completed two symptom questionnaires, the Modified Checklist for Autism in Toddlers (M-CHAT) and the Social Communication Questionnaire (SCQ); the average parent score was 9.5 for the M-CHAT (SD=4.8; range 0-19) and 18.7 for the SCQ (SD=5.7; range 7-32). The performance metric presented here is based on the Equal Error Rate (EER; see note 1). The following table summarizes EER performance across three comparisons: ASD vs. non-ASD (TD & LD); ASD vs. LD; and ASD vs. TD.
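The LOOCV procedure mentioned above can be sketched generically: each child is scored by a model trained on all the other children. The stand-in classifier below (distance to class means on a single toy feature) is an assumption chosen for brevity, not the LDA function used for the screen, and the data are invented.

```python
def loocv_scores(samples, fit, score):
    """Score each sample with a model trained on all the other samples."""
    held_out = []
    for i, (features, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        model = fit(training)
        held_out.append((score(model, features), label))
    return held_out

# Stand-in classifier (assumption): compare distances to the two class means.
def fit_means(training):
    pos = [x for x, y in training if y == 1]
    neg = [x for x, y in training if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def score_means(model, x):
    pos_mean, neg_mean = model
    # Positive when x lies closer to the target-group mean.
    return abs(x - neg_mean) - abs(x - pos_mean)

# Toy one-dimensional data: (feature, label), label 1 = target group.
data = [(0.9, 1), (0.8, 1), (0.7, 1), (0.2, 0), (0.1, 0), (0.3, 0)]
results = loocv_scores(data, fit_means, score_means)
correct = sum((s > 0) == (y == 1) for s, y in results)
```

Because every held-out score comes from a model that never saw that sample, the pooled scores can then be fed to a threshold-based metric such as the EER.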


EER performance chart

The LENA Automatic Autism Screen compares favorably with widely used non-automatic measures. Shown below are some of the reliability statistics reported for other well-known measures.

LLAS comparison to non-automatic measures

Summary

The combined phone-based and cluster-based detection method detailed above demonstrates relatively low classification error rates, reinforcing the viability of an automated detector for ASD based on child vocalization activity. The DLP provides researchers the means to collect comprehensive naturalistic language environment data in a simple and unobtrusive manner, and the automated processing software enables the assessment of ASD-specific vocal characteristics using completely objective measures.

The LENA Foundation is exploring other approaches that incorporate additional information derived from the recording data, which may improve accuracy. In addition, the Foundation is seeking to increase the sample size and to include a more diverse sample, including younger children with ASD, in the hope that the screen can be extended down to 18 months and perhaps even younger.



1 In any classification problem, it is necessary to set a threshold probability value for detection of the target group of interest. This threshold value determines not only the number of correct detections but also the number of false acceptances (false positives) and false rejections (false negatives). There is a trade-off between these two types of error; for example, as the false positive rate decreases the false negative rate increases. A generally accepted measure of classification performance is the EER, or the classification error using a threshold at which the false positive rate equals the false negative rate. The lower the EER, the fewer classification errors overall.
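A minimal sketch of how an EER might be estimated from classifier scores follows. It scans every observed score as a candidate threshold and reports the point where the false positive and false negative rates are closest; the function name, the toy scores, and the convention of averaging the two rates at that point are assumptions of this sketch, not details taken from the LENA analysis.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by trying every observed score as a threshold.

    A sample is flagged as the target group when score >= threshold.
    Returns (threshold, error) where the false positive and false negative
    rates are closest; error is the average of the two rates there.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        fp = sum(s >= t and y == 0 for s, y in zip(scores, labels))
        fn = sum(s < t and y == 1 for s, y in zip(scores, labels))
        fpr, fnr = fp / negatives, fn / positives
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, t, (fpr + fnr) / 2)
    return best[1], best[2]

# Toy scores: label 1 = target group, label 0 = comparison group.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, eer = equal_error_rate(scores, labels)
```

In this toy case a threshold of 0.6 misclassifies one of four target samples and one of four comparison samples, so both error rates equal 0.25.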