Synthetic Population
Last updated
Last updated
Our goal was to create simple one-level synthetic population with travel demands for an average day to showcase the possibilities of the project. In the future, we would like to incorporate household information (unavailable atm) and people coming from outside of the city (work traveleres and tourism).
This chapter contains all the steps we went through for creating a synthetic population for Prague's digital twin.
In short the steps are:
Research and analysis of available case-studies.
Derive data specification.
Find data sources.
Population synthesis pipeline (incl. data and survey synthesis, spatial distribution and activity chain validation).
Export of synthetic population with daily activity chains for MATSim.
A is a simplified microscopic representation of the actual population. Simplified, because not all attributes are included (for example hair color is deemed to be irrelevant for land use modeling, thus it is ignored). It is microscopic because every household and every person is represented individually. A synthetic population is not identical to the actual population, i.e. most reside nts will find a record in the synthetic population that matches themselves. Instead, the synthetic population matches various statistical distributions of the actual population, and therefore, is close enough to the true population to be used in modeling
More information:
(outdated) Possible input for generating synthetic population of Prague and Central Bohemian Region.
Comprehensive tutorial, something of a MasterDoc: and earlier version
SynthPop for MATSim:
Simulation Note
MATSim is an activity-based model, which requires more detailed information on household demographics. The collection of individual micro data (data that can be associated with single buildings, individuals) is neither allowed nor desirable for privacy reasons.
A model is needed to generate synthetic micro data from general accessible aggregated data.
data and attributes of a modeled agent are synthesized by integrating a diverse set of data sources and using models for interpolation and extrapolation of data.
Syntetic Agent is similar to real individuals but is not identical to any individual in the population.
Individual attributes of agents and their links are based on real-world collected data.
Privacy of individuals is protected.
The correlations between the data sets agree with the measured correlations of data in the real world.
A representation of elements’ states and interactions that is not intended to match any snapshot of the system, but to provide a statistically accurate overall picture. [AAMAS16]
A synthetic population represents a set of synthetic agents that share a common geographic, social or biological characteristic (e.g. people in an rural or urban region, individuals from a given tribe, etc) A synthetic society is a synthetic population that is endowed with other social and behavioral attributes that allow the individuals within a population to interact by following these social, behavioral norms and rules.[AAMAS16]
A synthetic network is composed of synthetic agents combined with links that capture interactions among the agents.[AAMAS16]
Although microdata are commonly made publicly available as census samples [8] or household survey samples [9], full census microdata are almost never publicly released to protect the privacy of respondents. A more realistic option for researchers to obtain a dataset of all household observations and associated characteristics in a population is to simulate it, and recent advances in generating synthetic populations have made this approach a viable alternative [10]. Well generated synthetic population may also represent the key attributes of the real population ina generalized state, leading to lower data complexity. Synthetic population datasets also have the advantage over actual census data that multiple scenarios can be generated to test outcomes in potential future populations.