Data Specification

Definition and preprocessing of available and necessary data to create a synthetic population.

Available data (as of 7. 12. 2021):

SHP files (2021)

Population address points - residences in Prague with the average number of residents
Amenities - commercial, civic, recreational POI - [x, y] coordinates
Zones in Prague - ZSJ (základní sídelní jednotka, basic settlement unit) - 687 zones

Additional files

Clean census -> one census person = individual of the synthetic population
Clean travel diary -> travelers with travel day plan and district of residence
Statistical match of a census person with a traveler (based on age, sex, employment, and district of residence) -> individual has an activity chain (travel plan for a day) copied from the matched traveler
Assign home coordinates to each individual based on zone of residence -> individual has a home location from which the coordinates of all other destinations are determined
Assign coordinates to destinations (work, education, shop, other, leisure) for each individual based on:
1. Amenities
2. Travel mode
3. Trip duration
4. Beeline distance
5. Probability of commute between zones to work or education
6. Home coordinates
Convert to MATSim XML population file with travel demands
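
The statistical match described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the field names (`person_id`, `traveler_id`, `district`), the 10-year age bands in `match_key`, and the uniform random draw among candidates are all assumptions made for the example.

```python
import random
from collections import defaultdict


def match_key(person):
    # Assumption: ages are compared in 10-year bands, with 80+ pooled.
    age_band = min(person["age"] // 10, 8)
    return (age_band, person["sex"], person["employment"], person["district"])


def match_census_to_travelers(census, travelers, seed=0):
    """Return {person_id: traveler_id} by drawing a traveler with the same key.

    Census persons with no candidate traveler are left unmatched.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for traveler in travelers:
        pool[match_key(traveler)].append(traveler["traveler_id"])

    matches = {}
    for person in census:
        candidates = pool.get(match_key(person))
        if candidates:
            matches[person["person_id"]] = rng.choice(candidates)
    return matches


census = [
    {"person_id": 1, "age": 34, "sex": "F", "employment": "employed", "district": "Praha 6"},
    {"person_id": 2, "age": 71, "sex": "M", "employment": "retired", "district": "Praha 4"},
]
travelers = [
    {"traveler_id": "t1", "age": 38, "sex": "F", "employment": "employed", "district": "Praha 6"},
    {"traveler_id": "t2", "age": 15, "sex": "M", "employment": "student", "district": "Praha 4"},
]
matches = match_census_to_travelers(census, travelers)
print(matches)
```

Each matched census person would then take over the activity chain of the drawn traveler. Persons left unmatched need a fallback, for example a coarser key, which this sketch does not cover.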

Entries: 2551962
Prague entries: 1268463

Steps:

Sum the number of cars (private, company, utility, other) into car_number

Steps:

Drop NA in ['sex', 'employment']
Fill NA in age with the median age grouped by ['sex', 'employment']
Fill NA using sociodemographics
- If age > 18 and driving_license is NA -> driving_license = True
- If age < 18 and driving_license is NA -> driving_license = False
- If age < 16 or age > 64 -> pt_avail = True
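
The person-level steps above map closely onto pandas operations. The sketch below assumes a DataFrame with the columns named in the steps (`sex`, `employment`, `age`, `driving_license`, `pt_avail`); the function name `clean_persons` and the sample rows are hypothetical.

```python
import pandas as pd


def clean_persons(df: pd.DataFrame) -> pd.DataFrame:
    # Drop rows where sex or employment is missing.
    df = df.dropna(subset=["sex", "employment"]).copy()

    # Fill missing age with the median age of the same (sex, employment) group.
    group_median = df.groupby(["sex", "employment"])["age"].transform("median")
    df["age"] = df["age"].fillna(group_median)

    # Fill missing driving_license from age. Age exactly 18 stays NA,
    # because the rules as written only cover > 18 and < 18.
    missing = df["driving_license"].isna()
    df.loc[missing & (df["age"] > 18), "driving_license"] = True
    df.loc[missing & (df["age"] < 18), "driving_license"] = False

    # Persons under 16 or over 64 get pt_avail = True.
    df.loc[(df["age"] < 16) | (df["age"] > 64), "pt_avail"] = True
    return df


persons = pd.DataFrame({
    "sex": ["F", "F", "M", None, "F"],
    "employment": ["employed", "employed", "student", "employed", "retired"],
    "age": [30.0, None, 12.0, 40.0, 70.0],
    "driving_license": [None, None, None, True, False],
    "pt_avail": [None, None, None, None, None],
})
clean = clean_persons(persons)
print(clean)
```

Note that a group in which every age is missing has no median, so those ages would remain NA and need separate handling.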

Steps:

Drop NA in [traveler_id]
Drop NA in [departure_h, departure_m, arrival_h, arrival_m, origin_purpose, destination_purpose, traveling_mode]
Drop travelers who have no trip with destination ‘home’
Keep only travelers where ‘_purpose’ == ‘home’ and ‘_code’ == 1000
Fill NA in ‘*_code’ for missing commute home-work, home-education (and vice versa)
Impute traveling mode "other" with the most frequent mode among trips grouped by traveling speed.
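
The imputation of mode "other" could look roughly like the sketch below. It assumes a precomputed `speed_kmh` column and speed bins chosen purely for illustration (`SPEED_BINS`); the source does not state how speed is derived or grouped, and `impute_other_mode` is a hypothetical name.

```python
import pandas as pd

# Assumption: bin edges in km/h, picked for illustration only.
SPEED_BINS = [0, 7, 20, 40, 200]


def impute_other_mode(trips: pd.DataFrame) -> pd.DataFrame:
    trips = trips.copy()
    # Integer bin index per trip: (0, 7] -> 0, (7, 20] -> 1, and so on.
    trips["speed_bin"] = pd.cut(trips["speed_kmh"], bins=SPEED_BINS, labels=False)

    # Most frequent known mode per speed bin (ties resolved alphabetically).
    known = trips[trips["traveling_mode"] != "other"]
    bin_mode = known.groupby("speed_bin")["traveling_mode"].agg(
        lambda s: s.mode().iloc[0]
    )

    is_other = trips["traveling_mode"] == "other"
    imputed = trips.loc[is_other, "speed_bin"].map(bin_mode)
    # Keep "other" where a bin has no known trips to learn from.
    trips.loc[is_other, "traveling_mode"] = imputed.fillna("other")
    return trips.drop(columns="speed_bin")


trips = pd.DataFrame({
    "traveling_mode": ["walk", "walk", "car", "car", "pt", "other", "other", "other"],
    "speed_kmh": [4.0, 5.0, 30.0, 35.0, 25.0, 4.5, 33.0, 100.0],
})
result = impute_other_mode(trips)
print(result)
```

In this example the slow "other" trip becomes `walk`, the 33 km/h trip becomes `car`, and the 100 km/h trip stays "other" because no known trip falls into its bin.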
