Product
Synthetic Australian population data
Privacy-safe synthetic individuals you can slice, model, simulate, or join to your own data. CSV / Excel / Parquet. 15,343 suburbs. 27.5 million synthetic persons.
For ML / AI teams
AUSynth is privacy-preserving synthetic Australian Census data, ideal for: training ML models without privacy risk, bias testing across demographic profiles, realistic test data for development / staging.
For large training runs, the 5,000-credit bundle covers 2.5M synthetic individuals at the standard rate of 500 records per credit.
How it works
Pick a geography (suburb / postcode / LGA / state / national), pick a dataset type (persons / families / dwellings), pick a format, and download. Each record is statistically representative of the ABS 2021 Census, but does not correspond to any real Australian.
Pricing
1 credit per 500 records (fractional, so 100 records = 0.2 credits). Hierarchical-tier downloads (persons within families, etc.) cost 1.5× the base. See bundles →
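To budget a download before you start, the pricing rule above can be sketched in a few lines of Python. This is an illustrative calculator, not an official AUSynth tool: the function names `credits_for` and `records_covered` are made up for this example, and it assumes the 1.5× hierarchical multiplier applies to the whole download.

```python
from fractions import Fraction

RECORDS_PER_CREDIT = 500                   # from the pricing rule: 1 credit per 500 records
HIERARCHICAL_MULTIPLIER = Fraction(3, 2)   # hierarchical-tier downloads cost 1.5x the base


def credits_for(records: int, hierarchical: bool = False) -> float:
    """Credits needed for a download (hypothetical helper, fractional credits allowed)."""
    if records < 0:
        raise ValueError("records must be non-negative")
    credits = Fraction(records, RECORDS_PER_CREDIT)
    if hierarchical:
        # Assumption: the multiplier applies to the full base cost.
        credits *= HIERARCHICAL_MULTIPLIER
    return float(credits)


def records_covered(credits: int, hierarchical: bool = False) -> int:
    """Whole records a credit balance covers (hypothetical helper)."""
    multiplier = HIERARCHICAL_MULTIPLIER if hierarchical else Fraction(1)
    return int(Fraction(credits) * RECORDS_PER_CREDIT / multiplier)


print(credits_for(100))                     # 0.2
print(credits_for(100, hierarchical=True))  # 0.3
print(records_covered(5000))                # 2500000
```

Exact fractions are used internally so that small downloads do not pick up floating-point rounding before the final conversion.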
What's in each record
47 Census variables, including age, sex, income, education, occupation, industry, household composition, dwelling type, and more. See the data dictionary for the full schema.
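As a rough idea of what slicing the data and joining it to your own looks like, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption: the column names `postcode` and `age`, the three placeholder rows, and the `OWN_DATA` store list are invented for illustration. Check the data dictionary for the real column names before adapting it.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Placeholder rows standing in for a downloaded persons CSV.
# Column names and values are hypothetical, not taken from the real schema.
SAMPLE_PERSONS_CSV = """postcode,age
2000,34
2000,51
3000,29
"""

# Hypothetical "own data": one row per store, keyed by postcode.
OWN_DATA = [
    {"store": "A", "postcode": "2000"},
    {"store": "B", "postcode": "3000"},
]


def median_age_by_postcode(csv_text: str) -> dict:
    """Slice the synthetic persons by postcode and summarise age."""
    ages = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ages[row["postcode"]].append(int(row["age"]))
    return {postcode: median(values) for postcode, values in ages.items()}


def join_to_own_data(own_rows: list, summary: dict) -> list:
    """Left-join the per-postcode summary onto your own rows."""
    return [{**row, "median_age": summary.get(row["postcode"])} for row in own_rows]


summary = median_age_by_postcode(SAMPLE_PERSONS_CSV)
for row in join_to_own_data(OWN_DATA, summary):
    print(row)
```

For a real download you would read the file from disk instead of the inline string; the slicing and joining steps stay the same.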
Worked examples
Three published reports built on the same data. Skim them to see what the variables look like when you actually ask a question with them.
- The Gender Pay Gap in Australia
How occupation and within-occupation pay combine to produce the gap. Mediation report.
Read the report →
- The Immigrant Pay Gap in Australia
Two opposing forces shape the gap. Mediation report.
Read the report →
- Single-Parent Family Income in Australia
Structural patterns behind the income gap. Mediation report.
Read the report →
Who uses this
- Realistic Australian data for student tutorials.
- Synthetic cohorts for simulation studies.
- Train models on realistic data, zero privacy risk.
- Realistic test data for staging environments.
- Customer demos without exposing real records.
- Demographic mix for grant applications.
- Income distributions for risk modelling.