Spatio-Causal Patterns of Sample Growth

Andre F. Ribeiro

Abstract

Different statistical samples (e.g., from different locations) offer populations and learning systems observations with distinct statistical properties. Samples under (1) "Unconfounded" growth preserve systems' ability to determine their variables' effects on outcomes of interest (and therefore lead to interpretable black-box predictions). Samples under (2) "Externally-Valid" growth preserve systems' ability to make predictions that generalize across out-of-sample variation. The first generates predictions that generalize over sample populations, the second over their common unobserved factors. We illustrate these theoretical patterns in the full American census from 1840 to 1940, in samples ranging from the street level to the national level. This reveals new conditions for the generalizability of samples over space and time, and connections among the Shapley value, counterfactual statistics, and hyperbolic geometry.
