Horizon CDT Research Highlights

The Application of Social Data in Assessing Creditworthiness

  Sam Doehren (2014 cohort)   www.sdoehren.com

There are few people who will never seek some form of credit; it will be rarer for some than for others, but the majority of people will attempt to acquire credit at some point during their lifetime. For the potential lender, a decision needs to be made on whether an individual will, or will not, succeed in repaying a debt: that is, on the individual’s creditworthiness, or likelihood of default.

The decision of whether or not to lend to an individual is currently based on a variety of factors, but primarily on their current levels of debt and whether they have previously failed to repay debt. As debt has been shown to be linked to suicidal ideation, it is important to highlight individuals at risk of overburdening themselves (1). To date there have been few attempts to identify those who are likely to find themselves with problematic debt before the debt becomes problematic. This project attempts to identify new ways of highlighting individuals at risk of acquiring debt that they are unable to manage.

One approach to identifying individuals at risk is to look at non-credit-file information. For example, social media has been highlighted as a potential way of estimating a user’s credit behaviour, though this has been done through peer comparison (2). Social media has also been shown to offer an insight into the personality of an individual (3). Separately, impulsivity has been found to be linked to the levels of debt an individual has (4–7). Despite regularly being linked to levels of debt, impulsivity has not been investigated as a factor that influences whether those debts are then defaulted upon.

One of the issues with producing models that focus purely on one social networking site is their limited ability to generalise. Research often focuses on a single aspect of a site that is applicable only to that site, such as Facebook likes or Twitter’s uni-directional networks (2,3). This type of research can only be applied to the site under investigation and becomes somewhat obsolete if the site significantly changes focus or closes, as was the case with MySpace and Friends Reunited, respectively.

This project aims to investigate a possible link between impulsivity and default whilst also producing a new method for detecting those at high risk of acquiring debt they are unable to manage.

Primarily, the use of language in text form is to be investigated to discover whether an individual’s impulsivity can be detected. Initially this is to be done via Tweets; if a link is found, the project will then investigate how generalisable the link between impulsivity and the use of language is to other forms of informal writing.

As discussed, the link between impulsivity and levels of debt has been well established. The project will investigate whether this relationship can be extended to impulsivity and default.

Beyond the theoretical research gap, this project will also present new ways of performing social media research in an ethically sound manner. The public data produced by social networking sites are easily available through ‘profile harvesting’ and ‘web scraping’, and offer great opportunities for the analysis of content and relationships. The process of data acquisition via profile harvesting and scraping on social networking sites raises a number of ethical questions. The main one is “is consent required to collect this data?”
The two main points of discussion regarding the ethics of profile harvesting and scraping concern where the ‘public/private boundary’ lies and what is considered acceptable use of publicly available personal information (8–11). Ongoing work discusses the potential of an addition to person-centric ethical frameworks that would allow research to be performed without ethical concern whilst retaining the strengths of modern data collection methods.

Overall, this project aims to produce a model in which an individual’s risk of acquiring debt they are unable to manage can be detected from everyday social media use, allowing for intervention at an early stage, whilst also producing an ethically sound framework within which the research can take place.


  1. Meltzer H, Bebbington P, Brugha T, Jenkins R, McManus S, Dennis MS. Personal debt and suicidal ideation. Psychol Med. 2011 Apr;41(04):771–8.
  2. Danyllo WA, Alisson VB, Alexandre ND, Moacir LMJ, Jansepetrus BP, Oliveira RF. Identifying Relevant Users and Groups in the Context of Credit Analysis Based on Data from Twitter. In: 2013 Third International Conference on Cloud and Green Computing (CGC). 2013. p. 587–92.
  3. Kosinski M, Stillwell D, Graepel T. Private traits and attributes are predictable from digital records of human behavior. Proc Natl Acad Sci. 2013 Apr 9;110(15):5802–5.
  4. Ottaviani C, Vandone D. Impulsivity and household indebtedness: Evidence from real life. J Econ Psychol. 2011 Oct;32(5):754–61.
  5. Gathergood J. Self-control, financial literacy and consumer over-indebtedness. J Econ Psychol. 2012 Jun;33(3):590–602.
  6. Norvilitis JM, Merwin MM, Osberg TM, Roehling PV, Young P, Kamas MM. Personality Factors, Money Attitudes, Financial Knowledge, and Credit-Card Debt in College Students. J Appl Soc Psychol. 2006 Jun 1;36(6):1395–413.
  7. Garibaldi J, Ferguson E, Aickelin U. A Data Mining Framework to Model Consumer Indebtedness with Psychological Factors. In: 2014 IEEE International Conference on Data Mining Workshop (ICDMW). 2014. p. 150–7.
  8. Koene A, Vallejos EP, Carter CJ, Statache R, Adolphs S, O’Malley C, et al. Investigating conditions for consent to analyze social media data. In: Peres P, Mesquita A, editors. ECSM2015: Proceedings of the 2nd European Conference on Social Media. Academic Conferences Limited; 2015. p. 634–7.
  9. Krotoski AK. Introduction to the Special Issue: Research ethics in online communities. Int J Internet Res Ethics. 2010;3(1):1–5.
  10. Krotoski AK. Data-driven research: open data opportunities for growing knowledge, and ethical issues that arise. Insights UKSG J. 2012 Mar 1;25(1):28–32. 
  11. Rooke B. Four Pillars of Internet Research Ethics with Web 2.0. J Acad Ethics. 2013 Dec;11(4):265–8.


Gygax, P. M., Garnham, A. & Doehren, S. (2016). What Do True Gender Ratios and Stereotype Norms Really Tell Us? Frontiers in Psychology, 7, 1036. https://doi.org/10.3389/fpsyg.2016.01036

This author is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (RCUK Grant No. EP/L015463/1) and Experian.