Who am I? One’s identity is a puzzling thing, something we learn about only slowly over the course of our lives. What exactly constitutes one’s identity is perhaps not entirely known, but we do know that personality is part of it. In the past decades, personality has been conceptualized from various theoretical perspectives and at different levels of breadth or abstraction [1, 2]. Traditionally, evaluating a person’s personality requires extensive involvement from experienced psychologists, together with an understanding of the individual’s psychological testing records, history, self-reports, and assessments made during interviews. This is often a lengthy procedure, and the relevant data or experts may not always be accessible. As a result, there is an increasing demand for shorter and simpler personality measurements. Nowadays, the growing number of video channels on the internet makes it possible to capture a myriad of spontaneous non-verbal cues from people’s physical appearance. It is therefore interesting to investigate whether an automatic system can be built that incrementally learns non-verbal cues from facial expressions and audio signals and predicts the different traits of a person’s personality.
The main research question of this project is how non-verbal information from video and audio can be used to predict personality using only machine learning-based systems. To be more specific, we want to explore:
In this project, several state-of-the-art machine learning techniques will be used, including Deep Learning, Cooperative Learning, Bi-directional Long Short-Term Memory Neural Networks, and Probabilistic Graphical Networks, among others. In addition, feature selection methods will be used to select the audio and video features that correlate with each personality trait and with depression.
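As a rough illustration of the feature selection step, the sketch below ranks candidate features by the absolute value of their Pearson correlation with a single trait score and keeps the strongest ones. This is only one simple filter-style approach and is not necessarily the method the project will adopt. The feature names, the toy values, and the helper functions `pearson` and `select_features` are all hypothetical and chosen purely for illustration.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(
    features: Dict[str, List[float]],
    trait_scores: List[float],
    top_k: int,
) -> List[str]:
    """Rank features by |correlation| with the trait and keep the top_k names."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], trait_scores)),
        reverse=True,
    )
    return ranked[:top_k]


# Toy, made-up data: one value per video clip (not real measurements).
features = {
    "au12_mean_intensity": [0.1, 0.4, 0.5, 0.9],
    "pitch_variance": [0.9, 0.2, 0.7, 0.3],
    "constant_feature": [1.0, 1.0, 1.0, 1.0],
}
extraversion = [1.0, 2.0, 3.0, 4.0]  # hypothetical trait scores for the same clips

selected = select_features(features, extraversion, top_k=2)
print(selected)
```

Ranking by absolute correlation keeps features that are strongly negatively related to a trait as well as positively related ones. A filter of this kind treats each feature independently, so it would not capture the effect of feature combinations.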
The expected contributions of this PhD project can be summarized as follows:
1) Explore the relationship between each personality trait and each combination of facial action units (AUs).
2) Create an algorithm to predict people’s personalities from facial expressions and speech signals using state-of-the-art machine learning technologies.
3) Explore the relationship between depression and each combination of AUs.
4) Create an algorithm to diagnose depression from facial expressions and speech signals using state-of-the-art machine learning technologies.
5) Collect a video and audio database for further research on automatic personality prediction.
[1] O. P. John, S. E. Hampson, and L. R. Goldberg, "The basic level in personality-trait hierarchies: studies of trait use and accessibility in different contexts," Journal of Personality and Social Psychology, vol. 60, p. 348, 1991.
[2] P. S. Macadam and K. A. Dettwyler, Breastfeeding: biocultural perspectives: Transaction Publishers, 1995.
[3] T. Yingthawornsuk, H. K. Keskinpala, D. M. Wilkes, R. G. Shiavi, and R. M. Salomon, "Direct acoustic feature using iterative EM algorithm and spectral energy for classifying suicidal speech," in INTERSPEECH, 2007, pp. 766-769.
[4] B. Rammstedt and O. P. John, "Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German," Journal of Research in Personality, vol. 41, pp. 203-212, 2007.
[5] L. E. Buffardi and W. K. Campbell, "Narcissism and social networking web sites," Personality and Social Psychology Bulletin, vol. 34, pp. 1303-1314, 2008.
[6] S. Jaiswal and M. Valstar, "Deep learning the dynamic appearance and shape of facial action units," in Applications of Computer Vision (WACV), 2016 IEEE Winter Conference on, 2016, pp. 1-8.
[7] Z. Zhang et al., "Cooperative learning and its application to emotion recognition from speech," IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 23, no. 1, pp. 115-126, 2015.
[8] A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5, pp. 602-610, 2005.
Song, S., Shen, L. and Valstar, M., 2018, May. Human behaviour-based automatic depression analysis using hand-crafted statistics and deep learned spectral features. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018) (pp. 158-165). IEEE.
Song, S., Zhang, S., Schuller, B.W., Shen, L. and Valstar, M., 2018, July. Noise invariant frame selection: a simple method to address the background noise problem for text-independent speaker verification. In 2018 International Joint Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.
This author is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (RCUK Grant No. EP/L015463/1) and Shenzhen University/Nottingham Biomedical Research Centre (BRC).