Documentation

Overview

Thank you for using the harmonized ATAI Data Portal. This page is intended to provide all the information you need to use the data found on the portal responsibly. If you have any questions, or feel that important information is missing from our documentation, please email atai@povertyactionlab.org.

Our intent with this endeavor is to give you access to data from ATAI projects that have been cleaned using a common method. Although many ATAI projects publish the data and code behind their papers on sites like the Harvard Dataverse, the way each dataset was prepared for those sites can produce inconsistencies when comparing data across studies. The most straightforward example of an inconsistency is a unit difference, where one study defines “Yield” as kg/acre while another uses tonnes/hectare. Our team has done the work of understanding these differences, by referencing questionnaires and raw data, and of guaranteeing common preparation across a select set of variables on your behalf.
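To make the unit example concrete, here is a minimal sketch of the arithmetic involved in putting two yield measures on the same footing. The function names are hypothetical, and this page does not state which unit the harmonized yield variables use, so treat this as an illustration of the conversion rather than a description of our cleaning code.

```python
# Conversion constants (both exact by definition).
KG_PER_TONNE = 1000.0
HECTARES_PER_ACRE = 0.40468564224  # international acre


def kg_per_acre_to_tonnes_per_hectare(yield_kg_per_acre: float) -> float:
    """Convert a yield reported in kg/acre to tonnes/hectare."""
    tonnes_per_acre = yield_kg_per_acre / KG_PER_TONNE
    return tonnes_per_acre / HECTARES_PER_ACRE


def tonnes_per_hectare_to_kg_per_acre(yield_t_per_ha: float) -> float:
    """Convert a yield reported in tonnes/hectare to kg/acre."""
    kg_per_hectare = yield_t_per_ha * KG_PER_TONNE
    return kg_per_hectare * HECTARES_PER_ACRE


# A hypothetical study reporting 800 kg/acre is reporting roughly 1.977 t/ha.
print(round(kg_per_acre_to_tonnes_per_hectare(800.0), 3))
```

Comparing the raw numbers 800 and 1.977 across two studies without this step would suggest a difference of several hundredfold where none exists.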

The full suite of variables can be found in matrix format on the ATAI Data Portal. The variables fall into twelve categories:

Identifiers

Administrative Information

Assets

Consumption

Agricultural Inputs

Crop Production

Livestock

Revenue

Shocks

HH Characteristics

Plot Characteristics

Environment

The last category, Environment, is the only category that does not come directly from projects’ survey data. Please see the “Complementing Environment Data” section below for how these data were matched to project survey data and prepared.

Harmonization Process

After ATAI project Principal Investigators (PIs) provide their raw data, our team reviews the questionnaires to determine which indicators map to the variables in our harmonization effort. We then transform and clean the raw data, resolving any inconsistencies with our harmonized format.

Some projects included data at more granular levels than others. For example, some projects asked only whether any of the harvest was sold, while others asked about marketed surplus for each crop-cycle combination. We have chosen not to remove this information; instead, we created multiple variable sets denoting the different levels at which each study collected information. Following this example, you will find both m_surplus_any (an indicator for any marketed surplus) and m_surplus_crop1_qn, m_surplus_crop1_val, and m_surplus_crop1_share (marketed surplus for crop 1 by quantity, value, and share of total harvest) in the ATAI Data Portal.
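If you want to pool a study that only has the coarse indicator with a study that has crop-level detail, one option is to collapse the detailed variables down to the coarse level yourself. The sketch below assumes that a positive crop-level quantity implies some marketed surplus. That rule and the function name are our illustration here, not a description of how m_surplus_any was constructed, so please check the project-specific ReadMe files before relying on it.

```python
from typing import Optional, Sequence


def any_marketed_surplus(crop_quantities: Sequence[Optional[float]]) -> Optional[int]:
    """Collapse crop-level sold quantities into a 0/1 'any surplus' indicator.

    Returns None when every crop-level value is missing, so that households
    with no information are not silently recorded as having sold nothing.
    Crops with a missing quantity are ignored when at least one is observed.
    """
    observed = [q for q in crop_quantities if q is not None]
    if not observed:
        return None
    return int(any(q > 0 for q in observed))


# Hypothetical household with quantities sold for three crops (None = missing).
print(any_marketed_surplus([0.0, 12.5, None]))
```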

To facilitate easy use of these data we provide you with scripts to attach variable and factor labels to the data you’ve downloaded. The two scripts, “99 ATAIVarLabel Stata.do” and “99 ATAIVarLabel R.R” are equivalent and perform this function for your preferred statistical software. While data downloads may vary based on filters like country and date selected by the user, these two scripts are static and flexible so that they can work with whatever subset of the full harmonized data that you have loaded in memory. Please make these scripts a part of your workflow, or copy and paste the relevant code into your own scripts to take advantage of the labels we’ve created for you.

Employing These Data Responsibly

Please think carefully about the populations from which the underlying projects sampled households for their studies. Even if data were collected from the same area at the same time, a project with interventions aimed at people who tend livestock and a project introducing new seed varieties likely did not consider the same set of households for their samples. There are likely very few statements one can make from the harmonized project data that are representative of populations larger than the projects’ target populations. This variation in target populations is itself a valuable source of heterogeneity for your work and could lead to interesting insights, so we urge you to keep it top of mind as you employ these data.

Because data from different projects come from different places, different times, and different institutional contexts, project fixed effects must be used in any regression analysis whose sample contains data from more than one project.
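As an illustration of why this matters, the sketch below computes a regression slope with and without project fixed effects using only the Python standard library. It uses the within transformation (subtracting project means), which gives the same slope as adding one dummy variable per project. The data and project labels are made up, and in practice you would use your statistical package's own fixed-effects routines.

```python
from collections import defaultdict


def within_slope(y, x, project):
    """Slope of y on x after demeaning both variables within each project."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # project -> [sum_y, sum_x, n]
    for yi, xi, p in zip(y, x, project):
        totals[p][0] += yi
        totals[p][1] += xi
        totals[p][2] += 1
    means = {p: (sy / n, sx / n) for p, (sy, sx, n) in totals.items()}

    num = den = 0.0
    for yi, xi, p in zip(y, x, project):
        mean_y, mean_x = means[p]
        num += (xi - mean_x) * (yi - mean_y)
        den += (xi - mean_x) ** 2
    if den == 0.0:
        raise ValueError("x does not vary within any project")
    return num / den


# Hypothetical data: within each project y rises by 2 per unit of x,
# but project B has both higher x and a much lower baseline level of y.
x = [1, 2, 3, 11, 12, 13]
y = [12, 14, 16, 2, 4, 6]
project = ["A", "A", "A", "B", "B", "B"]

pooled = within_slope(y, x, ["all"] * len(x))  # one group = plain OLS slope
fixed_effects = within_slope(y, x, project)    # project fixed effects
print(round(pooled, 3), round(fixed_effects, 3))  # -0.922 2.0
```

Without fixed effects, the difference in levels between the two projects is mistaken for an effect of x, and the estimated slope even has the wrong sign.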

Missing values will be present whenever a project’s questionnaire did not contain the indicator of interest. Conceptually, this is not the same as a value that is missing because a question was deemed irrelevant for a respondent or because an individual refused to respond, but both situations appear as missing values in the harmonized data. For example, if the questionnaire of Study A contains a module asking about child education expenditures, but a respondent answers that they do not have children and the survey team skipped that module for brevity, the resulting variables for that observation will still be missing values.
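One practical way to tell these situations apart is to look at missingness project by project: a variable that is missing for every observation in a project was most likely never collected by that project, whereas partial missingness points to skips or refusals. The sketch below assumes your data are loaded as a list of dictionaries with missing values stored as None. The project_id key and the variable name are hypothetical placeholders for whatever identifiers appear in your download.

```python
def classify_missing(rows, variable, project_key="project_id"):
    """Label each project as 'all missing', 'partly missing' or 'complete'
    for the given variable."""
    counts = {}  # project -> (total rows, rows with an observed value)
    for row in rows:
        total, observed = counts.get(row[project_key], (0, 0))
        is_observed = int(row.get(variable) is not None)
        counts[row[project_key]] = (total + 1, observed + is_observed)

    labels = {}
    for pid, (total, observed) in counts.items():
        if observed == 0:
            labels[pid] = "all missing"      # likely not in the questionnaire
        elif observed < total:
            labels[pid] = "partly missing"   # skips, refusals, not applicable
        else:
            labels[pid] = "complete"
    return labels


# Hypothetical rows from two studies.
rows = [
    {"project_id": "A", "educ_exp": 40.0},
    {"project_id": "A", "educ_exp": None},   # module skipped: no children
    {"project_id": "B", "educ_exp": None},
    {"project_id": "B", "educ_exp": None},
]
print(classify_missing(rows, "educ_exp"))
```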

Complementing Environment Data

A set of six environmental variables was collected for the portal using Google Earth Engine:

1. Luminosity; DMSP OLS: Nighttime Lights Time Series Version 4, Defense Meteorological Program Operational Linescan System

https://developers.google.com/earth-engine/datasets/catalog/NOAA_DMSP-OLS_NIGHTTIME_LIGHTS
Annual Level, from 1992-2013

2. Forest Cover; Hansen Global Forest Change v1.8 (2000-2020)

https://developers.google.com/earth-engine/datasets/catalog/UMD_hansen_global_forest_change_2020_v1
Annual Level, from 2000-2019

3. Soil Properties; SoilGrids250m 2.0 (Bulk density, Cation exchange capacity at pH7, Coarse fragments, Clay, Total Nitrogen, Organic carbon density, Organic carbon stock, pH in H2O, Sand, Silt, Soil organic carbon)

https://git.wur.nl/isric/soilgrids/soilgrids.notebooks/-/blob/master/markdown/access_on_gee.md

Cross-Sectional Level

4. NDVI; USGS Landsat 8 Level 2, Collection 2, Tier 1

https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2

Annual Level, from 2013-2022
Matched via GAUL 2015 Boundaries

5. Precipitation; CHIRPS Daily: Climate Hazards Group InfraRed Precipitation With Station Data (Version 2.0 Final)

https://developers.google.com/earth-engine/datasets/catalog/UCSB-CHG_CHIRPS_DAILY

Monthly Level, from January 1981-August 2022

6. Temperature; ERA5 Daily Aggregates – Latest Climate Reanalysis Produced by ECMWF / Copernicus Climate Change Service

https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_DAILY
Monthly Level, from January 1979-June 2020

The process to match these data to project survey data involved a few steps. In most instances, GPS points for each sampled community were provided by the underlying project, which we then jittered according to the best practices put forth in the DHS and LSMS guidelines.
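For readers unfamiliar with jittering, the sketch below shows the general idea: each point is moved a random distance in a random direction, up to some maximum. This is not the code we ran. The max_km parameter, the function names, and the example coordinates are all hypothetical. DHS guidance, for reference, displaces urban clusters by up to 2 km and rural clusters by up to 5 km, with a small share of rural clusters displaced further.

```python
import math
import random

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def jitter_point(lat, lon, max_km, rng):
    """Displace a point by a random distance (0..max_km) in a random direction."""
    distance = rng.uniform(0.0, max_km)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance * math.cos(bearing) / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = distance * math.sin(bearing) / (
        KM_PER_DEGREE_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, used here only to check the displacement."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


rng = random.Random(42)  # fixed seed so the example is reproducible
original = (9.03, 38.74)  # hypothetical community location
jittered = jitter_point(*original, max_km=5.0, rng=rng)
print(jittered, haversine_km(*original, *jittered))
```

Because of this displacement, the coordinates on the portal should be treated as approximate locations, accurate to within a few kilometers at best.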

The jittered GPS points are provided to you as the variables location_lat and location_lon. When GPS coordinates were not available from the underlying project, we used the coordinates of the centroid of the lowest-level administrative region available, identified using GADM region names, to match to the environmental data. If neither GPS information nor administrative region labels were provided, then unfortunately we had no way to spatially connect the project data to the environmental data, and all of these variables will contain missing values for that project. Whether or not spatial identifiers were available for a project is noted in the project-specific ReadMe files.
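The fallback order described above can be summarized in a short sketch. The function, its arguments, and the centroid lookup table are hypothetical and only illustrate the decision logic, not the Google Earth Engine code we actually used.

```python
from typing import Dict, Optional, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)


def resolve_location(
    gps: Optional[Coordinate],
    admin_region: Optional[str],
    centroids: Dict[str, Coordinate],
) -> Tuple[Optional[Coordinate], str]:
    """Pick the coordinate used for matching, and record where it came from."""
    if gps is not None:
        return gps, "gps"                       # project-provided (then jittered)
    if admin_region is not None and admin_region in centroids:
        return centroids[admin_region], "admin_centroid"
    return None, "none"                         # environmental variables stay missing


# Hypothetical lookup of region centroids.
centroids = {"Region X": (9.5, 39.0)}
print(resolve_location((9.03, 38.74), "Region X", centroids))
print(resolve_location(None, "Region X", centroids))
print(resolve_location(None, None, centroids))
```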

Two sets of named regional border definitions were employed. Most project region names mapped to those found in the GADM data. In some cases, these regions overlapped perfectly with the borders in FAO GAUL 2015, which Google Earth Engine treats as authoritative and uses for the environmental data series above. Because the GPS centroids of GADM regions were used, there may be some cases where the administrative region names under the two naming conventions do not match. Both names are provided in the location_adm and location_GAUL variable sets.

All of the matching and data collection described above was done using Google Earth Engine. The exact code used can be found on our GitHub: @agricultural-technology-adoption.

Merging With Other Datasets

If you would like to merge data found on this portal with other data, you have a number of ways to do so. The jittered GPS coordinates will be useful when joining with any spatial data, and having both the GADM and GAUL administrative region labels available should help when merging with spatial layers that are not based on GPS.

Some projects have also published their data on the Harvard Dataverse, and it is feasible to merge data from the portal with those author-published datasets. If you would like to do this, please note that we required the obs_id variable to be numeric in its harmonized form, so it will likely not match the ID used in the original study. In particular, when the underlying study used a string as a household identifier, we encoded it as a number, and it can no longer be used to merge back to data from the original study.

Citing the Data Portal

Any research output that uses the ATAI Data Portal must cite both our DOI and the original source of the data. The ATAI Data Portal relies on the efforts of the original authors and researchers who collected and shared their data. Because the data from all underlying studies have been transformed from their original form, the ATAI Data Portal has its own unique DOI.

To cite the ATAI Data Portal please use the following citation:

Agricultural Technology Adoption Initiative (2023): ATAI Data Portal. Agricultural Technology Adoption Initiative. Dataset. https://doi.org/10.26085/C3N306

Citations for the original study data are included below:

Study ID: BD201300075
Project Name: Reducing Imbalanced Fertilizer Use Through Instructions in Bangladesh
Citation: Islam, Mahnaz, and Sabrin Beg. “‘Rule-of-Thumb’ Instructions to Improve Fertilizer Management: Experimental Evidence from Bangladesh.” Economic Development and Cultural Change. https://doi.org/10.1086/711174

Study ID: ET201100020
Project Name: Linking Weather Index Insurance and Credit to Improve Agricultural Productivity in Rural Ethiopia
Citation: Ahmed, Shukri, Craig McIntosh, and Alexandros Sarris. 2020. “The Impact of Commercial Rainfall Index Insurance: Experimental Evidence from Ethiopia.” American Journal of Agricultural Economics 102 (4): 1154–76. https://doi.org/10.1002/ajae.12029.

Study ID: GH201000027
Project Name: Examining Underinvestment in Agriculture: Returns to Capital and Insurance Among Farmers in Ghana
Citation: Karlan, Dean, Robert Osei, Isaac Osei-Akoto and Christopher Udry. “Agricultural Decisions after Relaxing Credit and Risk Constraints.” The Quarterly Journal of Economics Volume 129, Issue 2, May 2014, Pages 597–652. https://doi.org/10.1093/qje/qju002

Study ID: GH201400039
Project Name: Disseminating Innovative Resources and Technologies to Smallholders (DIRTS) in Ghana
Citation: Fosu, Mathias, Dean Karlan, Shashidhara Kolavalli, Christopher Udry. “Disseminating Innovative Resources and Technologies to Smallholder Farmers in Ghana.” Innovations for Poverty Action. https://poverty-action.org/node/7351/pdf

Study ID: IN201100048
Project Name: Mobile Phone-Based Extension Services and Agricultural Advice for Cotton Farmers in Gujarat, India
Citation: Cole, Shawn and Nilesh Fernando. “Mobile’izing Agricultural Advice Technology Adoption Diffusion and Sustainability.” The Economic Journal Volume 131, Issue 633, January 2021, Pages 192–219. https://doi.org/10.1093/ej/ueaa084

Study ID: IN201200029
Project Name: Adoption of Improved Fertilizer Management Practices Under Risk Reduction Due to Submergence-tolerant Rice
Citation: Emerick, Kyle, Alain de Janvry, Elisabeth Sadoulet, and Manzoor H. Dar. 2016. “Technological Innovations, Downside Risk, and the Modernization of Agriculture.” American Economic Review 106 (6): 1537-61. https://doi.org/10.1257/aer.20150474

Study ID: IN201400022
Project Name: The Impact of Financial Access and Social Networks on Agriculture in Rural Tamil Nadu, India
Citation: Binzel, Christine, Erica Field, Rohini Pande, and Louise Paul-Delvaux. “Does the arrival of a formal financial institution alter community networks? Experimental evidence from India.” Working Paper, October 2019.

Study ID: IN201400033
Project Name: Group Incentives, Hygiene, and Milk Quality Among Dairy Cooperatives in Karnataka, India
Citation: Rao, Manaswini, and Ashish Shenoy. 2023. “Got (Clean) Milk? Organization, Incentives, and Management in Indian Dairy Cooperatives.” Journal of Economic Behavior & Organization 212 (August): 708–22. https://doi.org/10.1016/j.jebo.2023.06.002.

Study ID: IN201500086
Project Name: Farmer Field Days and Demonstrator Selection for Increasing Technology Adoption
Citation: Emerick, Kyle, and Manzoor H. Dar. 2021. “Farmer Field Days and Demonstrator Selection for Increasing Technology Adoption.” The Review of Economics and Statistics 2021, 680-693. https://doi.org/10.1162/rest_a_00917.

Study ID: IN201800125
Project Name: Overcoming Barriers to Soil Fertility Management
Citation: Cole, Shawn, et al. “Using Satellites and Phones to Evaluate and Promote Agricultural Technology Adoption: Evidence from Smallholder Farms in India.” Precision Agriculture for Development Working Paper, 2021.

Study ID: KE201100005
Project Name: Contract Farming, Technology Adoption, and Agricultural Productivity: Evidence from Smallholder Farmers in Western Kenya
Citation: Casaburi, Lorenzo, Michael Kremer, Sendhil Mullainathan, and Ravindra Ramrattan. 2019. “Harnessing ICT to Increase Agricultural Production: Evidence From Kenya.” Working Paper, September 2019.

Study ID: KE201200016
Project Name: Innovative Finance and Grain Storage among Farmers and Traders in Kenya
Citation: Burke, Marshall, Lauren Falcao Bergquist, and Edward Miguel. 2018. “Sell Low and Buy High: Arbitrage and Local Price Effects in Kenyan Markets.” The Quarterly Journal of Economics 134 (2): 785–842. https://doi.org/10.1093/qje/qjy034.

Study ID: KE201400062
Project Name: Selective Trials for Agricultural Technology Adoption
Citation: Chassang, Sylvain, Pascaline Dupas, Catlan Reardon, Erik Snowberg. “Selective Trials for Agricultural Technology Adoption.” ATAI Research. https://www.atai-research.org/project/selective-trials-for-agricultural-technology-adoption/

Study ID: MZ201300040
Project Name: Mobile Money and Agricultural Investment in Mozambique
Citation: Batista, Catia, and Vicente, Pedro C. 2020. “Improving Access to Savings through Mobile Money: Experimental Evidence from African Smallholder Farmers.” World Development 129. https://doi.org/10.1016/j.worlddev.2020.104905

Study ID: UG201300076
Project Name: Does Poor-Quality Hinder Agricultural Technology Adoption? Evidence from the Market for Fertilizers in Uganda
Citation: Bold, Tessa, Kayuki C. Kaizzi, Jakob Svensson, and David Yanagizawa-Drott. 2017. “Lemon Technologies and Adoption: Measurement, Theory, and Evidence from Agricultural Markets in Uganda.” The Quarterly Journal of Economics 132(3): 1055–1100. https://doi.org/10.1093/qje/qjx009.

Study ID: KE201500110
Project Name: Agricultural Subsidies, Savings, and Farm Reinvestment
Citation: Aggarwal, Shilpa, Eilin Francis, and Jonathan Robinson. 2018. “Grain Today, Gain Tomorrow: Evidence from a Storage Experiment with Savings Clubs in Kenya.” Journal of Development Economics 134(2018): 1-15. https://doi.org/10.1016/j.jdeveco.2018.04.001

Study ID: ZM201300052
Project Name: Providing Foodstuffs and Cash Loans to Improve Smallholder Farming in Zambia
Citation: Fink, Gunther, Kelsey Jack, and Felix Masiye. “Seasonal Liquidity, Rural Labor Markets, and Agricultural Production.” American Economic Review 2020, 110(11): 3351–3392. https://doi.org/10.1257/aer.20180607

Study ID: IN201500101
Project Name: Long-term Diffusion and Impact of Flood-tolerant Rice
Citation: Janvry, Alain de, Manaswini Rao, Elisabeth Sadoulet. “Seeding the Seeds: Role of Social Structure in Agricultural Technology Diffusion.” Working Paper, August 2022.

Study ID: IN201400094
Project Name: The Impacts of Drought Tolerance on Local Labor Markets in India
Citation: Baysan, Ceren, Manzoor H. Dar, Kyle Emerick, and Elisabeth Sadoulet. “The Agricultural Wage Gap within Rural Villages.” Working Paper, September 2022.

Study ID: IN201600121
Project Name: Targeted Information for The Adoption of Flood-Tolerant Rice in India
Citation: Dar, Manzoor, Alain de Janvry, Kyle Emerick, Elisabeth Sadoulet, and Eleanor Wiseman. “Private Input Suppliers as Information Agents for Technology Adoption in Agriculture.” Working paper, November 2022.

Study ID: PK201300092
Project Name: Coordinating Farmers with Cell Phones: Technology Innovation in Livestock Extension Services in Pakistan
Citation: Hasanain, Syed Ali, Muhammad Yasir Khan, and Arman Rezaee. 2023. “No Bulls: Experimental Evidence on the Impact of Veterinarian Ratings in Pakistan.” Journal of Development Economics 161 (March): 102999. https://doi.org/10.1016/j.jdeveco.2022.102999.

Study ID: KE201600115
Project Name: Local Agricultural Information Delivery in Western Kenya
Citation: Fabregas, Raissa, Michael Kremer, Matthew Lowes, Robert On, and Giulia Zane. “SMS-extension and Farmer Behavior: Lessons from Six RCTs in East Africa.” Working Paper, September 2019.

Frequently Asked Questions

1. Something is broken, or I have a suggestion or comment.

If you have anything you’d like to share with the ATAI team, please email us at atai@povertyactionlab.org.