Geographically Explicit Synthetic Population Dataset with Networks for the US

Geo-simulation research, including agent-based modeling, requires high-quality synthetic population datasets that capture individuals' demographic characteristics, spatial distribution and social connections. In this paper, published in Nature - Scientific Data, Na (Richard) Jiang, Boyu Wang, Andrew Crooks, and I introduced a Python-based workflow that uses the US Census 2020 dataset to generate a large-scale, geographically explicit synthetic population for America's 50 states and Washington, D.C. The synthetic population is generated at the individual level, and its aggregated demographic attributes, including the age and gender distributions, match the US Census data at the census tract level. In addition to carrying demographic attributes such as age, gender, household and urban/rural status, our synthetic population is geographically explicit: individuals are assigned home locations using road networks as a proxy. We also i...
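To make the idea of using road networks as a proxy for home locations more concrete, here is a minimal sketch. It is not the workflow from the paper. It assumes that each census tract's roads are available as straight line segments in a projected coordinate system, and that homes can be scattered along those segments in proportion to segment length. The function name `sample_home_locations` and the toy `tract_roads` data are hypothetical and exist only for illustration.

```python
import math
import random


def sample_home_locations(road_segments, n_individuals, seed=0):
    """Place n_individuals at random points along road segments.

    road_segments: list of ((x1, y1), (x2, y2)) tuples in projected units.
    Assumption (not from the paper): longer segments receive
    proportionally more homes.
    """
    rng = random.Random(seed)
    # Weight each segment by its Euclidean length.
    lengths = [math.dist(start, end) for start, end in road_segments]
    chosen = rng.choices(road_segments, weights=lengths, k=n_individuals)

    homes = []
    for (x1, y1), (x2, y2) in chosen:
        t = rng.random()  # relative position along the chosen segment
        homes.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return homes


# Toy tract with two road segments (hypothetical coordinates in meters).
tract_roads = [((0.0, 0.0), (100.0, 0.0)), ((100.0, 0.0), (100.0, 50.0))]
homes = sample_home_locations(tract_roads, n_individuals=1000)
```

With real data, the segments would more likely come from a road shapefile clipped to each tract, and the number of individuals per tract from the census counts. The sketch only shows the general sampling idea.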