Horizon CDT Research Highlights

Evolution of interaction networks in developing countries for social intervention.

  Maddy Ellis (2016 cohort)   www.nottingham.ac.uk/~psxme6

The latest comprehensive data on global poverty, for 2013, show that an estimated 767 million people live below the poverty line. That is 10.7% of the world's population, or almost 11 people in every 100. [1] Although the number of people in poverty fell globally between 2012 and 2013, poverty in Africa remains widespread, and poverty levels there are high relative to all other regions of the world. [1] Poverty is threatening the lives and well-being of an unacceptable proportion of the world's population.

Data on poverty levels in Africa have historically been poor: in 1990 only 20 countries had data that allowed poverty to be measured. [2] Household surveys initially provided some insight into wealth distribution, but these surveys omit a significant proportion of the poorest people, making them a poor indicator of poverty. [3] Since then, DHS (Demographic and Health Surveys) and income and expenditure surveys have been introduced, which has drastically improved the data situation in these developing countries. However, there are still massive deficiencies in this data: such surveys are often too infrequent and take too long to complete to have much value. [4] Survey data is labour and cost intensive to obtain and is therefore scarce. [5] The lack of reliable data describing local poverty in developing countries restrains the impact of local policy makers, governments and aid organizations. [4], [5] Accurate estimates of population characteristics such as poverty are critical to development. [6] There have also been serious concerns among researchers about the reliability of quantitative data in developing countries; national statistics on economic production, for example, may be off by as much as 50% in Africa. [7]

Satellite data provides a more time-efficient approach to investigating poverty in different areas than traditional surveying. [8] High-resolution satellite imagery is now increasingly inexpensive and reliable. [5] The increase in satellite data availability has contributed to the study of geospatial information, with broad applications across many areas including the distribution of poverty. [4] [9] However, poverty-stricken regions are also the ones more likely to have less internal funding, more civil wars, poor infrastructure and inadequate government resourcing for research such as surveys and satellite data collection. Hence there are still vast gaps in the collection of reliable data that could be used to describe poverty. [4]

There are, however, increasingly many new sources of data on individuals, such as mobile phone and internet records, which are enabling new approaches to demographic profiling and opening an exciting field of potential analysis. [10] For example, data from a communication network of mobile phones and business landlines were used to show that communication diversity is a strong indicator of the economic health of communities in the UK. [11] In developing countries there are fewer sources of big data, but mobile phone use is becoming increasingly ubiquitous in these regions and is providing a fruitful source of data. [6] In regions where resources such as time, labour and money are scarce for such research, this approach offers a way of gathering information on individuals at a fraction of the cost of traditional methods such as surveys and satellite images. [6] The diversity of individuals' relationships is a key indicator of social and economic life. Until recently it was not widely quantifiable, but we now have data on networks of people's behaviours that allow us to draw conclusions at a population level. [11]
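As a rough illustration of how the diversity of an individual's relationships might be quantified from call records, the sketch below computes a normalised Shannon entropy over one subscriber's call volumes. This is an assumed, simplified measure chosen for illustration, not necessarily the exact definition used in [11]; the function name `contact_diversity` and the input format (a mapping from contact identifier to number of calls) are hypothetical.

```python
import math


def contact_diversity(call_counts):
    """Normalised Shannon entropy of one subscriber's calls across contacts.

    call_counts: dict mapping a contact id to the number of calls made to it
    (hypothetical input format). Returns a value between 0 and 1, where 1
    means calls are spread evenly over all contacts and values near 0 mean
    almost all calls go to a single contact.
    """
    # Ignore contacts with no recorded calls.
    volumes = [v for v in call_counts.values() if v > 0]
    total = sum(volumes)
    if len(volumes) < 2:
        # With fewer than two contacts there is no diversity to measure.
        return 0.0
    entropy = -sum((v / total) * math.log(v / total) for v in volumes)
    # Divide by the maximum possible entropy for this number of contacts.
    return entropy / math.log(len(volumes))


even_caller = {"a": 10, "b": 10, "c": 10, "d": 10}
focused_caller = {"a": 98, "b": 1, "c": 1}
print(contact_diversity(even_caller), contact_diversity(focused_caller))
```

Averaging such a score over the subscribers in a region would give one candidate population-level indicator, under the assumption that call records can be attributed to regions.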

The ubiquitous use of mobile phones in developing countries creates data with a lot of potential to help organizations that are currently struggling to identify the poorest parts of regions. However, as this is a new form of data, there is a lot of work to be done in finding different ways to assess it for useful results. [12] Remarkably little is known about the demographics of CDR (call detail record) data in developing countries. [14] Such data will serve as a basis for this study of evolving interactions across underprivileged communities. The geographic distribution of poverty and wealth can be used to make decisions about resource allocation and has a high potential impact. [13]

This research will review, and aim to develop and extend, existing methods for analysing interaction network data, such as block models and point processes. Initially, it aims to address demographic studies in Africa using CDR data available from the University of Nottingham N-LAB, which has obtained samples of all customer records from the second largest cell phone company in Tanzania. The stochastic block model is a useful tool for exploring the community structure within network data. The model produces subsets, or communities, which are defined by the pattern of connections between the nodes in the network. [16] It has previously been applied in fields such as disease transmission and demographics. [15] The block model thus offers a platform for learning more about the state and demographics of social network structures.
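To make the idea of communities "defined by the pattern of connections" concrete, the following minimal sketch samples a small undirected network from a two-block stochastic block model and then recovers the block connection probabilities by simple counting. It assumes the community labels are already known, which is a simplification: the estimation approach in [16] infers the labels as well. The function names, block sizes and probabilities are illustrative assumptions, not values from the Tanzanian data.

```python
import random
from itertools import combinations


def sample_sbm(labels, block_prob, rng):
    """Draw an undirected graph from a stochastic block model.

    An edge between nodes i and j appears independently with probability
    block_prob[labels[i]][labels[j]], so connectivity depends only on the
    blocks the two nodes belong to.
    """
    edges = set()
    for i, j in combinations(range(len(labels)), 2):
        if rng.random() < block_prob[labels[i]][labels[j]]:
            edges.add((i, j))
    return edges


def estimate_block_prob(labels, edges, n_blocks):
    """Estimate block connection probabilities given known labels.

    For each pair of blocks, the estimate is the number of observed edges
    divided by the number of node pairs that could have been connected.
    """
    possible = [[0] * n_blocks for _ in range(n_blocks)]
    observed = [[0] * n_blocks for _ in range(n_blocks)]
    for i, j in combinations(range(len(labels)), 2):
        a, b = sorted((labels[i], labels[j]))
        possible[a][b] += 1
        if (i, j) in edges:
            observed[a][b] += 1
    estimate = [[0.0] * n_blocks for _ in range(n_blocks)]
    for a in range(n_blocks):
        for b in range(a, n_blocks):
            p = observed[a][b] / possible[a][b] if possible[a][b] else 0.0
            estimate[a][b] = estimate[b][a] = p
    return estimate


# Hypothetical example: two communities of 60 subscribers each, with dense
# contact inside a community and sparse contact between communities.
labels = [0] * 60 + [1] * 60
true_prob = [[0.30, 0.02], [0.02, 0.30]]
edges = sample_sbm(labels, true_prob, random.Random(0))
print(estimate_block_prob(labels, edges, n_blocks=2))
```

In practice the labels are unknown and have to be inferred jointly with the block probabilities, which is the harder problem the block model literature addresses.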

In addition, point process modelling uses mathematical objects to represent temporally structured events. [19] Temporally structured phenomena analysed in this way can refer to interactions across communities, individuals or entities. [17] Event series are continuous, irregular and often highly sparse, and application areas of such modelling are diverse, including earthquake forecasting and health and financial event prediction. [18] Events such as the arrival of aid resources and finances can be modelled in this way to learn more about the network of communities in developing regions and countries.
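As a small illustration of a point process for irregular event series, the sketch below simulates a non-homogeneous Poisson process using the standard thinning technique: candidate events are drawn at a constant upper-bound rate and each is kept with probability proportional to the intensity at that time. The daily intensity function, its parameters and the function names are hypothetical choices for illustration, not taken from [18] or from any real call data.

```python
import math
import random


def simulate_nhpp(rate, rate_max, t_end, rng):
    """Simulate event times on [0, t_end) by thinning.

    rate: function giving the event intensity at time t (must satisfy
          0 <= rate(t) <= rate_max for all t in the interval).
    rate_max: constant upper bound on the intensity.
    """
    events = []
    t = 0.0
    while True:
        # Candidate events arrive as a homogeneous Poisson process.
        t += rng.expovariate(rate_max)
        if t >= t_end:
            return events
        # Keep the candidate with probability rate(t) / rate_max.
        if rng.random() < rate(t) / rate_max:
            events.append(t)


def daily_rate(t):
    """Hypothetical intensity in events per hour with a 24-hour cycle."""
    return 2.0 + 1.5 * math.sin(2.0 * math.pi * t / 24.0)


# Ten simulated days of events; rate_max = 3.5 bounds daily_rate from above.
events = simulate_nhpp(daily_rate, rate_max=3.5, t_end=240.0, rng=random.Random(1))
print(len(events), events[:3])
```

Fitting such a model to observed event times, rather than simulating from a known intensity, would be the corresponding inference task.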

Work on the stochastic block model and point processes could shed some light on the problem, and could be expanded and combined to analyse the dynamic data of the social network.

Multidisciplinary Statement.

A mathematical and analytical approach to humanitarian and socio-technical issues with the use of digital technologies. This topic involves the combination of multiple disciplines including mathematics, computing, computational sociology, geospatial and human factors, and the digital economy.

Horizon Relevance.

This PhD will link into some of Horizon's key focus areas: Global Impacts, Data Science, Large Data, Digital Economy and Public Engagement.

References

[1] Timm Bönke, Soumya Chattopadhyay, Shaohua Chen, Will Durbin, María Eugenia Genoni, Aparajita Goyal, Christoph Lakner, Terra Lawson-Remer, Maura K. Leary, Renzo Massari, Jose Montes, David Newhouse, Stace Nicholson, Espen Beer Prydz, Maika Schmidt, José Cuesta, Mario Negre, and Ani Silwal. Poverty and shared prosperity 2016: Taking on inequality. Technical report, World Bank, 2016

[2] Channing Arndt, Andy McKay, and Finn Tarp. Growth and poverty in Sub-Saharan Africa. Oxford University Press, United States of America, 198 Madison Avenue, New York, NY 10016, 2016

[3] Roy Carr-Hill. Improving population and poverty estimates with citizen surveys: Evidence from East Africa. World Development, 93:249–259, 2017

[4] Anthony Perez, Swetava Ganguli, Stefano Ermon, George Azzari, Marshall Burke, and David Lobell. Semi-supervised multitask learning on multispectral satellite images using Wasserstein generative adversarial networks (GANs) for predicting poverty. Technical report, Stanford University, 2016.

[5] Michael Xie, Neal Jean, Marshall Burke, David B. Lobell, and Stefano Ermon. Transfer learning from deep features for remote sensing and poverty mapping. CoRR, abs/1510.00098, 2015

[6] Joshua Blumenstock, Gabriel Cadamuro, and Robert On. Predicting poverty and wealth from mobile phone metadata. Science, 350(6264):1073–1076, 2015

[7] M. Jerven. Poor Numbers: How We Are Misled by African Development Statistics and What to Do About It. Cornell Univ. Press, 2013.

[8] Gary R. Watmough, Peter M. Atkinson, Arupjyoti Saikia, and Craig W. Hutton. Understanding the evidence base for poverty environment relationships using remotely sensed satellite data: An example from Assam, India. World Development, 78:188–203, 2016

[9] Neal Jean, Marshall Burke, Michael Xie, W. Matthew Davis, David B. Lobell, and Stefano Ermon. Combining satellite imagery and machine learning to predict poverty. Science, 353(6301):790–794, 2016.

[10] Gary King. Ensuring the data-rich future of the social sciences. Science, 331(6018):719–721, 2011.

[11] Nathan Eagle, Michael Macy, and Rob Claxton. Network diversity and economic development. Science, 328(5981):1029–1031, 2010

[12] Chris Smith-Clarke and Licia Capra. Beyond the baseline: Establishing the value in mobile phone based poverty estimates. In Proceedings of the 25th International Conference on World Wide Web, WWW ’16, pages 425–434, Republic and Canton of Geneva, Switzerland, 2016. International World Wide Web Conferences Steering Committee.

[13] Gary S. Fields. Changes in poverty and inequality in developing countries. The World Bank Research Observer, 4(2):167–185, 1989.

[14] J. E. Blumenstock, D. Gillick, N. (2010). Who's calling? Demographics of mobile phone use in Rwanda.

[15] Carrington, P. J., Scott, J., and Wasserman, S. (2005). Models and methods in social network analysis, volume 28. Cambridge University Press.

[16] Nowicki, K. and Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96(455):1077–1087

[17] Kingman, J. F. C. (1993). Poisson processes. Wiley Online Library.

[18] Goulding, J., Preston, S., and Smith, G. (2016). Event series prediction via non-homogeneous Poisson process modelling. In Data Mining (ICDM), 2016 IEEE 16th International Conference on, pages 161–170. IEEE.

[19] Doob, J. L. (1953). Stochastic processes, volume 7. Wiley New York.

This author is supported by the Horizon Centre for Doctoral Training at the University of Nottingham (RCUK Grant No. EP/L015463/1) and Humanitarian OpenStreet Mapping.