Quantifying the Impact of Data Characteristics on the Transferability of Sleep Stage Scoring Models: From Sleep Detection Models to Future Wearable Devices

After the publication of “TinySleepNet,” a milestone in sleep stage scoring from a single EEG signal recorded at the front of the head, the model achieved significant success and was extended by many developers. Asst. Prof. Dr. Akara Supratak, an instructor in the Computer Science Academic Group at ICT Mahidol and the original developer of the model, has now revisited and expanded this work. How and why? Let’s explore below.

The work, “Quantifying the Impact of Data Characteristics on the Transferability of Sleep Stage Scoring Models,” focuses on developing a model to analyze and score sleep stages using EEG data. It builds upon the 2020 research, “TinySleepNet: An Efficient Deep Learning Model for Sleep Stage Scoring Based on Raw Single-Channel EEG.” While the TinySleepNet model has been widely extended since its publication, its application in wearable devices remains unexplored.

“I have been following how my research, TinySleepNet, has been used, and I have seen that many people have developed models from it, but no one has examined how these models perform when implemented in wearable devices. They all assume that the frontal brain area is the best position for measuring signals, without investigating whether the model works well in real wearable applications. This question led me to conduct this research,” explained Asst. Prof. Dr. Akara.

The core question of this study is: “If we need to extend the model for wearable devices at an affordable price, how much will its performance degrade?”

Overall architectures of TinySleepNet and U-Time.

“This research answers the question of how well the model will perform if installed in wearable devices, or how much its performance will decrease, providing interested parties with a rough idea before designing future devices, saving production and testing costs. This is the origin of this research,” said Asst. Prof. Dr. Akara.

He started by identifying factors of wearable devices that might affect model performance. This study examines three factors:

  • Recording Channels (or Positions): The positions on the head from which brainwave (EEG) signals are measured.
  • Recording Environments: The environment and equipment used for measurement and data collection.
  • Subject Conditions: Whether the patient has normal or abnormal sleep patterns.

Transferability matrices from TinySleepNet and U-Time models.

After identifying these factors, Asst. Prof. Dr. Akara tested the model on different datasets and summarized the results. The study found that differences in recording environment had the largest impact, degrading model performance by up to 14%.

“I used publicly available datasets, one from Hospital A and another from Hospital B. I trained the model on data from Hospital A and applied the trained model, without any adjustments, to the dataset from Hospital B. This resulted in a maximum performance degradation of 14%. For example, if the model performs well at 90% in its original setting, then in a different recording environment with different equipment its performance is predicted to drop by as much as 14 points, from 90% to roughly 76%,” explained Asst. Prof. Dr. Akara.
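
The train-on-one, test-on-another evaluation described above can be summarized as a small table of scores, in the spirit of a transferability matrix. The Python sketch below is an illustration only, not the study's code: the function name `transfer_degradation`, the dataset labels, and the entire Hospital B row are hypothetical placeholders, and only the 90% and 76% figures come from the example in the quote. It assumes degradation is measured as the in-domain score minus the cross-domain score.

```python
from typing import Dict

# scores[train_set][test_set] = performance (in %) of a model trained on
# train_set and evaluated, without any adjustment, on test_set.
Scores = Dict[str, Dict[str, float]]


def transfer_degradation(scores: Scores) -> Scores:
    """Return, for every train/test pair, the drop relative to the in-domain score."""
    result: Scores = {}
    for train_set, row in scores.items():
        in_domain = row[train_set]  # trained and tested in the same setting
        result[train_set] = {
            test_set: round(in_domain - score, 2)
            for test_set, score in row.items()
            if test_set != train_set
        }
    return result


scores: Scores = {
    # 90 and 76 are the figures from the example quoted in the article.
    "hospital_a": {"hospital_a": 90.0, "hospital_b": 76.0},
    # Hypothetical placeholder values, for illustration only.
    "hospital_b": {"hospital_a": 80.0, "hospital_b": 88.0},
}

degradation = transfer_degradation(scores)

# Find the transfer direction with the largest drop.
worst_pair = max(
    (
        (train_set, test_set, drop)
        for train_set, row in degradation.items()
        for test_set, drop in row.items()
    ),
    key=lambda item: item[2],
)
print(worst_pair)
```

With these placeholder inputs, the largest drop is the 14-point transfer from Hospital A to Hospital B used in the example above.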

During the study, Asst. Prof. Dr. Akara encountered two main challenges: the public datasets available could not cover every variation, and model training took a long time because of the many variables involved. He also described how this work could be extended: predicting a model’s performance degradation without any additional model training.

“If we can collaborate with hospitals in Thailand and succeed in predictive research, I believe we will get closer to having devices that measure sleep quality more accurately than wristwatches. Wristwatches measure sleep efficiency using heart rate and movement from the wrist, but the most accurate measurement is from the front of the brain. Ultimately, I hope that we will have devices that come as close as possible to medical-device grade. Such devices could provide an early warning if someone’s sleep habits are problematic. They can quantify behavior better than a watch, for example by providing a sleep score that correlates with how refreshed someone feels upon waking. This helps people decide whether to see a doctor or change their behavior. I hope that with more comprehensive data and complete research, we will take another step forward,” concluded Asst. Prof. Dr. Akara.

For more details and to access the published work: https://doi.org/10.1016/j.artmed.2023.102540
