Modeling Urban/Rural Fractions in Low- and Middle-Income Countries

21 Sep 2022  ·  Yunhan Wu, Jon Wakefield

In low- and middle-income countries, household surveys are the most reliable data source for examining health and demographic indicators at the subnational level, an exercise in small area estimation. Unit-level models are favored for producing subnational estimates at fine spatial scales, such as the admin-2 level. Typically, the surveys employ stratified two-stage cluster sampling, with strata defined by an urban/rural designation crossed with administrative regions. To avoid bias and increase predictive precision, the stratification should be acknowledged in the analysis. Moving from the cluster level to the area level requires an aggregation step in which the prevalence surface is averaged with respect to population density. This in turn requires estimating a partition of the study area into its urban and rural components. To do this, we experiment with a variety of classification algorithms, including logistic regression, Bayesian additive regression trees, and gradient boosted trees, using pixel-level covariate surfaces to improve prediction. We use the proposed stratification/aggregation method to estimate spatial HIV prevalence among women aged 15-49 in Malawi.
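
The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes a pixel grid where each pixel has a population count, an urban/rural label (as might come from one of the classifiers mentioned), and stratum-specific prevalence estimates. The function name `aggregate_prevalence`, its arguments, and all numbers are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from typing import Sequence, Tuple


def aggregate_prevalence(
    population: Sequence[float],
    is_urban: Sequence[bool],
    prev_urban: Sequence[float],
    prev_rural: Sequence[float],
) -> Tuple[float, float]:
    """Return (area-level prevalence, urban population fraction).

    Each pixel contributes its urban or rural prevalence, depending on
    its label, weighted by its population count.
    """
    total_pop = sum(population)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    # Population living in pixels labeled urban.
    urban_pop = sum(n for n, u in zip(population, is_urban) if u)
    # Population-weighted sum of the stratum-appropriate prevalence.
    weighted = sum(
        n * (pu if u else pr)
        for n, u, pu, pr in zip(population, is_urban, prev_urban, prev_rural)
    )
    return weighted / total_pop, urban_pop / total_pop


# Toy area with four pixels (all values are made up for illustration).
population = [100.0, 300.0, 50.0, 50.0]
is_urban = [False, True, False, False]
prev_urban = [0.20, 0.20, 0.20, 0.20]
prev_rural = [0.10, 0.10, 0.10, 0.10]

area_prev, urban_frac = aggregate_prevalence(
    population, is_urban, prev_urban, prev_rural
)
print(f"area prevalence: {area_prev:.3f}, urban fraction: {urban_frac:.3f}")
```

In this toy area the single urban pixel holds most of the population, so the area-level estimate sits closer to the urban prevalence than an unweighted pixel average would. This is why a misestimated urban/rural partition can bias the aggregate.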
