Big data for gender


Twitter posts, credit card purchases, phone calls, and satellites are all part of our day-to-day digital landscape.

Detailed data, known broadly as “big data” because of the massive amounts of passively collected, high-frequency information that such interactions generate, are produced every time we use one of these technologies. These digital traces have great potential and have already developed a track record for application in global development and humanitarian response.

Data2X has focused particularly on what big data can tell us about the lives of women and girls in resource-poor settings. In Big Data and the Well-Being of Women and Girls, we demonstrate how four big data sources can be harnessed to fill gender data gaps and inform policy aimed at mitigating global gender inequality. Big data can complement traditional surveys, offering a glimpse into dimensions of girls’ and women’s lives that have otherwise been overlooked, as well as providing a level of precision and timeliness that policymakers need to make actionable decisions.

Below we summarize the findings from our recent report, and share an exciting new opportunity to further this work: a Big Data for Gender Challenge.

Social media data can improve understanding of the mental health of girls and women.

Mental health conditions, from anxiety to depression, are a significant contributor to the global burden of disease, particularly for young women. But precise data on mental health is sparse in most countries. Research by Georgia Tech, commissioned by Data2X, finds that social media provides an accurate barometer of mental health status.

Algorithms can not only detect genuine self-disclosures of mental illness on Twitter, but also disaggregate these tweets by sex and gauge characteristics like tone and affect to track positive or negative expressions. Across the world, these tools can serve as a first step in assessing the prevalence of mental health conditions. And for women and girls themselves, they may be used to direct information on treatment and resources to groups with high prevalence levels.

Image: UN Global Pulse, 'Sex-Disaggregation of Social Media Posts,' Big Data Tools Series, no. 3, 2016. For more information, also access http://post2015.unglobalpulse.net/

These methodologies still have significant limitations, including bias toward literate (and tech-literate) women and girls, dominant-language Twitter users, and those with access to the internet. However, as more women, and particularly young women, come online, these methodologies are likely to be increasingly valuable, especially given the severity of mental health issues and the challenges associated with collecting mental health information through other means.

Cell phone and credit card records can illustrate women’s economic and social patterns – and track impacts of shocks in the economy.

Our spending priorities and social habits often indicate economic status, and these activities can also expose economic disparities between women and men.

By compiling cell phone and credit card records, our research partners at MIT traced patterns of women’s expenditures, spending priorities, and physical mobility. Their research found that women have less mobility diversity than men, live farther away from city centers, and report less total expenditure per capita.

Since this data is continuously generated, this type of analysis can be performed over longer time spans to capture impacts of economic and environmental shocks, stressors, and policy changes on women’s lives in real time.

It is critical to note that, despite this promise, data access and privacy remain key challenges for the institutionalization of these real-time surveillance systems in country statistical offices. And, as with social media information, any analysis performed on cell phone and credit card data must be complemented with ‘ground truthing’ surveys so that researchers know which types of women are included in – and left out of – the dataset for reasons of access, affordability, literacy, and other barriers.

Satellite imagery can map rivers and roads, but it can also measure gender inequality.

Satellite imagery has the power to capture high-resolution, real-time data on everything from natural landscape features, like vegetation and river flows, to human infrastructure, like roads and schools. Research by our partners at the Flowminder Foundation finds that it can also be used to measure gender inequality.

Satellite imagery can fill gaps in traditional surveys by providing more frequent and higher resolution information about girls’ and women’s lives, including in areas where surveys have not been conducted. Our research piloted methods of correlating geospatial variables (like distance to roads) with well-being indicators (like literacy) to infer patterns of social and health phenomena.
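To make the idea of correlating a geospatial variable with a well-being indicator concrete, here is a minimal Python sketch. It is not the method used in the Flowminder research. It assumes a simple Pearson correlation and a straight-line fit, and every number in it is a made-up illustrative value rather than a survey result. The names `distance_km`, `female_literacy`, `pearson`, and `fit_line` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# ASSUMPTION: invented values for five surveyed locations.
# distance_km: distance from each location to the nearest road.
# female_literacy: share of women who are literate at that location.
distance_km = [0.5, 2.0, 4.0, 7.5, 12.0]
female_literacy = [0.82, 0.78, 0.69, 0.55, 0.41]

r = pearson(distance_km, female_literacy)
slope, intercept = fit_line(distance_km, female_literacy)

# Use the fitted line to infer the indicator for a location with no survey,
# using only a distance that could be measured from imagery.
unsurveyed_distance_km = 9.0
predicted_literacy = intercept + slope * unsurveyed_distance_km

print(f"correlation: {r:.2f}")
print(f"predicted literacy at {unsurveyed_distance_km} km: {predicted_literacy:.2f}")
```

In practice an analysis like this would draw on many geospatial variables and a more careful statistical model, and any inferred value would need checking against survey data where it exists.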

Mapping these phenomena using this method can reveal pockets of gender inequality that are typically masked by averages at the country or district level. Using big data to obtain more frequent, higher-resolution information on the well-being of women and girls offers huge potential for helping policymakers direct resources more effectively to where they are needed most.

Image: Bosco et al. 2017. WorldPop Program, Flowminder Foundation, University of Southampton.

Next steps

Building on the release of this report, Data2X is excited to announce a Big Data for Gender Challenge, which aims to attract data scientists interested in gender questions. The Challenge is seeking submissions that use a combination of digital and conventional data sources to answer gender-related research questions and/or build practical tools to monitor the well-being of women and girls over time. Applications are due July 7.

While the first set of pilot projects showed the potential of big data for gender analysis, the Challenge seeks to link these research questions more directly to policy that will drive better outcomes in women’s and girls’ lives. We look forward to seeing the blend of creativity and practical solutions that we hope this Challenge will spur.

Read the full Big Data and the Well-Being of Women and Girls report here.

Image: Zach Stern.