Mental health issues are among the most prevalent health problems worldwide. Behavioral disorders such as depression and anxiety are reported to be leading drivers of disability, with severe consequences including suicide and heart disease. Many of these conditions can benefit greatly from prompt diagnosis and from personalized, timely intervention. Unfortunately, due to the rising number of affected people, there are long waiting times for mental health screening and subsequent treatment, which causes undesirable effects. Hence governments and health facilities are exploring technological methods, such as automatic diagnosis or monitoring, to aid this process.
This research will focus on the problem of mood assessment and analysis for detecting mood disorders such as depression from video and audio data in natural environments. The outcomes of the proposed research can be applied to delivering mental health care through automated patient-monitoring or therapy-delivery platforms.
Digital mental health is a growing area of research that integrates advancements in digital technologies into mental health care, such as e-Health and mHealth [1], which use the Internet and mobile phone software to deliver mental health services. Mental health care currently benefits from mobile phone or Internet-based software in various stages of clinical care, such as symptom assessment, patient engagement and psycho-education, and tracking and monitoring of treatment progress. Many evidence-based apps and technologies are currently available to deliver assessment, monitoring and interventions for mental health conditions such as schizophrenia, substance abuse, eating disorders, sleeping disorders, and mood disorders including bipolar, anxiety and depressive disorders [1].
An accurate characterization of facial expressions that can assess mood in real time can serve as a reliable sensor in mental health technologies for managing mood disorders [2]. This would open more opportunities to deliver behavioral interventions based on audio-visual data, enabling seamless user engagement through video sessions. A related study from MIT [3] used self-reported mood and behavioral data collected from users daily through a custom-built app. They used a recurrent neural network to predict depression from self-reported data collected over 22 months from 2,362 users. This study is a good indication that deep learning methods applied to large-scale data can build models with high predictive power for mood disorders. In this scenario, we propose to use image, video and speech sensing to predict mood patterns indicative of mood disorders and to validate the predictions using self-reported scores.
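To illustrate the kind of sequence model used in [3], the following is a minimal sketch of an LSTM-based classifier over daily self-reported mood features. The feature dimension, sequence length and network sizes here are illustrative assumptions, not the configuration of the original study.

```python
import torch
import torch.nn as nn

class MoodSequenceClassifier(nn.Module):
    """Sketch: recurrent model over daily self-report features (assumed 8 features per day)."""
    def __init__(self, n_features=8, hidden_size=64, n_classes=2):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, days, n_features) -- one vector of self-reports per day
        _, (h_n, _) = self.rnn(x)
        return self.head(h_n[-1])  # logits over depression labels

# Example usage: a batch of 4 users with 30 days of reports each
model = MoodSequenceClassifier()
logits = model(torch.randn(4, 30, 8))
```

In the proposed research, the same sequence-modelling idea would be applied to features extracted from video and speech rather than to self-reported scores alone.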
Many published studies addressing the problem of mood analysis for mental health disorders point out the difficulty of obtaining labelled data at a large scale [1]. As a result, most studies resort to collecting their own datasets in the lab and rely on self-reported scores for depression estimation. The majority of the available data take the form of clinical interviews with a limited number of subjects under restricted clinical settings. This makes it difficult to address the detection of mood disorders from facial expressions in previously unseen or in-the-wild environments.
In order to build the system, this PhD will investigate state-of-the-art techniques based on deep learning. Given an appropriate training dataset, such techniques have been shown to produce highly accurate results for a number of classification and regression tasks. However, as seen above, for the problem of depression detection only very few datasets are available, and these are collected under very specific settings and within the lab. A machine learning method trained on such datasets is unlikely to generalize to new data collected in a different environment and under a different context. Moreover, from a practical viewpoint, it is important to develop a system that is able to assess depression in natural environments. Hence this PhD will investigate so-called domain adaptation techniques [6] to address the lack of real-world data for training a depression recognition system. This research will also explore methods to augment the available datasets using synthesized facial expression data [7].
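As a concrete illustration of one family of domain adaptation techniques, the sketch below follows an adversarial, gradient-reversal style approach (in the spirit of DANN), in which labelled clinical data act as the source domain and unlabelled in-the-wild data act as the target domain. The feature dimensions, network sizes and loss weighting are assumptions for illustration only, not a specification of the final system.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DANN(nn.Module):
    """Minimal domain-adversarial model: shared features, a label head and a domain head."""
    def __init__(self, n_features=512, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(n_features, 256), nn.ReLU())
        self.label_head = nn.Linear(256, n_classes)   # depression label (source data only)
        self.domain_head = nn.Linear(256, 2)          # source vs. target domain

    def forward(self, x, lam=1.0):
        f = self.features(x)
        return self.label_head(f), self.domain_head(GradReverse.apply(f, lam))

# One illustrative training step: label loss on labelled clinical (source) data,
# domain loss on both source and unlabelled in-the-wild (target) data.
model, ce = DANN(), nn.CrossEntropyLoss()
x_src, y_src = torch.randn(8, 512), torch.randint(0, 2, (8,))
x_tgt = torch.randn(8, 512)
y_pred, d_src = model(x_src)
_, d_tgt = model(x_tgt)
loss = ce(y_pred, y_src) \
     + ce(d_src, torch.zeros(8, dtype=torch.long)) \
     + ce(d_tgt, torch.ones(8, dtype=torch.long))
loss.backward()
```

The intended effect is that the shared features become predictive of the depression label while remaining indistinguishable across clinical and in-the-wild recording conditions, which is the generalization property the proposed system requires.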
Can an individual's mood be assessed reliably based on facial expressions and speech using deep learning for detecting mood disorders?
How can we successfully leverage the use of big-data based deep learning methods despite the scarcity of labelled data for identification of mood disorders in natural environments?
Can models learned from specific datasets be generalized to detect depression in data collected from different domains?
How can we evaluate the prediction models using clinically validated measures or methods?
From an application perspective, an approach with successful cross-domain prediction ability can be deployed in virtual therapists and interactive screening platforms to assess patient mood quickly and provide prompt therapy.
From the technical contribution side, the proposed system is among the first to use state-of-the-art domain adaptation methods to learn robust models for mood prediction and the assessment of mood disorders, despite the scarcity of labelled datasets.
A data collection platform that facilitates the collection of naturalistic data and provides useful mood feedback to its users is another positive outcome of the research.
The scale of personal and societal damage caused by mental health issues is large, and any successful step towards alleviating its effects has a huge impact on the welfare of society. Research in mental health care technologies is progressing to address the rapidly growing treatment gap. Projects for virtual therapists and affective agents that monitor and deliver therapy are already in place. Quick mood assessment is a critical component of such systems. The ability to visually infer mood and predict mood disorder symptoms will contribute towards the rapid deployment of such systems.
The author is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (RCUK Grant No. EP/L015463/1) and the Biomedical Research Centre.