
AI-created family trees confirm class divisions in Finland in the 18th and 19th centuries

The genealogy algorithm AncestryAI efficiently combines huge amounts of birth data.
A small section of a family tree covering 13 generations that was derived by the algorithm. The colours show the socio-economic status of the individual. Image: Eric Malmi.

Working at a rate of one person per minute, a genealogist would need 100 person-years to map and find all the parents for five million people. The AncestryAI algorithm can do the same work in an hour using 50 parallel computers, with a success rate of 65 per cent. The algorithm can also measure the level of uncertainty for each connection so that unreliable results can be ignored. Genealogists and demographers can use the algorithm to shed light on societal change and history.
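
As a rough illustration of how a per-connection uncertainty estimate lets unreliable results be ignored, the following minimal sketch filters suggested links by a confidence threshold. The `Link` structure, the probability values and the 0.9 threshold are all assumptions made for this example; the article does not describe AncestryAI's actual data format or cut-off.

```python
from typing import NamedTuple


class Link(NamedTuple):
    child_id: str
    parent_id: str
    probability: float  # hypothetical confidence estimate between 0 and 1


def keep_reliable(links, threshold=0.9):
    """Return only the links whose estimated probability meets the threshold."""
    return [link for link in links if link.probability >= threshold]


# Toy data, invented for illustration only.
links = [
    Link("child-1", "parent-a", 0.97),
    Link("child-2", "parent-b", 0.55),
    Link("child-3", "parent-c", 0.92),
    Link("child-4", "parent-d", 0.30),
]

reliable = keep_reliable(links, threshold=0.9)
coverage = len(reliable) / len(links)  # share of links that survive the filter
```

Raising the threshold trades coverage for precision: fewer connections are kept, but those that remain are more likely to be correct.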

‘The algorithm does not replace the work of genealogists; it is simply a tool for helping them in their work. The genealogy algorithm can suggest connections which are probably correct, but on its own it is not as precise as a careful genealogist. The algorithm can also search for parents from nation-wide data, while a genealogist may need to limit their search to just one parish,’ explains Eric Malmi, doctoral student at Aalto University who currently works for Google in Zürich.

Using AncestryAI, launched in 2017, genealogists have indeed succeeded in finding new ancestors, such as familial ties between individuals, some of whom had relocated to different regions of Finland. Currently, AncestryAI is being used to derive the genealogical relationships for people who died in the Finnish Civil War in 1918 to give, for instance, a more precise estimate of the number of war orphans.

Class division in Finland remained unchanged for 150 years

The genealogy algorithm makes it possible to examine huge amounts of data and analyse social change over long periods of time, rather than only within particular, narrow timeframes. Malmi’s work has confirmed, for example, that class division in Finland remained virtually unchanged between 1735 and 1885.

‘We studied the effect of socio-economic status on the choice of spouse and found that they are clearly connected. Against our expectations, however, the strength of the connection did not decrease over time, but rather stayed the same,’ explains Malmi.

Each spouse’s socio-economic status was deduced from their father’s profession. Farmhands and other landless peasants represented the lowest class, and the rest were divided into tenant farmers, farmers, middle class and upper class.
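
The classification above can be sketched as a lookup from a father's occupation to an ordered class rank, followed by a simple per-period measure of how often spouses come from the same class. This is a minimal sketch under stated assumptions: the occupation labels "merchant" and "nobleman", the 50-year periods, the same-class share as the measure, and the toy marriages are all hypothetical. The article does not say which statistic the researchers used.

```python
from collections import defaultdict

# Five ordered classes named in the article; the example occupations for the
# middle and upper class are hypothetical placeholders.
CLASS_RANK = {
    "farmhand": 0,       # landless peasants, the lowest class
    "tenant farmer": 1,
    "farmer": 2,
    "merchant": 3,       # hypothetical middle-class occupation
    "nobleman": 4,       # hypothetical upper-class occupation
}


def same_class_share(marriages, period_length=50):
    """Share of marriages per period where both fathers belong to the same class.

    marriages: iterable of (year, groom_father_occupation, bride_father_occupation)
    """
    counts = defaultdict(lambda: [0, 0])  # period start -> [same-class, total]
    for year, groom_father, bride_father in marriages:
        period = year // period_length * period_length
        counts[period][1] += 1
        if CLASS_RANK[groom_father] == CLASS_RANK[bride_father]:
            counts[period][0] += 1
    return {p: same / total for p, (same, total) in sorted(counts.items())}


# Toy data, invented for illustration only (not results from the study).
marriages = [
    (1740, "farmer", "farmer"),
    (1745, "farmhand", "tenant farmer"),
    (1860, "farmer", "farmer"),
    (1870, "merchant", "nobleman"),
]

shares = same_class_share(marriages)
```

A share that stays roughly constant from one period to the next would correspond to the kind of stable connection described in the article.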

AncestryAI makes use of statistical deduction and machine learning procedures developed for genealogical use. The basic algorithm seeks to separately deduce the mother and father for each individual based on their name, locality and date of birth. A supplementary algorithm then improves the accuracy of the basic algorithm by taking into account other factors, such as the fact that people usually have children with the same spouse.
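
To make the basic idea concrete, here is a minimal sketch of scoring candidate fathers for one birth record by name, locality and date of birth. It is not AncestryAI's actual model: the `BirthRecord` fields, the string-similarity measure, the age limits of 15 to 70, the 0.3 weight for a different parish and the sample records are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class BirthRecord:
    child_name: str   # first name of the child being baptised
    father_name: str  # father's first name as written in the record
    parish: str
    year: int


def name_similarity(a, b):
    """Similarity ratio in [0, 1]; tolerates spelling variants."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score_father(child, candidate):
    """Score how plausibly `candidate`'s own birth record is the child's father."""
    age_at_birth = child.year - candidate.year
    if not 15 <= age_at_birth <= 70:  # hypothetical plausibility limits
        return 0.0
    name_score = name_similarity(child.father_name, candidate.child_name)
    parish_score = 1.0 if child.parish == candidate.parish else 0.3
    return name_score * parish_score


def best_father(child, candidates):
    """Return the top candidate and its share of the total score."""
    scores = [score_father(child, c) for c in candidates]
    total = sum(scores)
    if total == 0:
        return None, 0.0
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best] / total


# Toy records, invented for illustration only.
child = BirthRecord("Anna", "Matti", "Kuopio", 1800)
candidates = [
    BirthRecord("Matti", "Juho", "Kuopio", 1770),
    BirthRecord("Matts", "Erik", "Turku", 1768),
    BirthRecord("Matti", "Antti", "Kuopio", 1795),  # too young to be the father
]
father, confidence = best_father(child, candidates)
```

The normalised score plays the role of a rough confidence value. A supplementary step of the kind described above could then adjust these scores jointly, for example by favouring mother and father candidates who already appear together as parents of other children.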

AncestryAI makes use of data in the HisKi database maintained by the Genealogical Society of Finland. The data consist of a total of 5 million births and 3.3 million deaths during 1648–1918. The algorithm has made a total of 7.3 million connections between children and their parents.

The research was published at the International Web Conference in Lyon. It also won a best paper award in Prague.

Further information:

Eric Malmi, Doctoral Student, Aalto University
eric.malmi@aalto.fi
tel. +358 44 047 8010

Arno Solin, Academy Research Fellow, Aalto University
arno.solin@aalto.fi
tel. +358 40 5776226
