The data consist of genome-wide SNPs obtained from 5 different sources, keeping only the SNPs common to all 5 sources. The combined dataset consisted of 3146 autosomal SNPs for 4025 individuals from 167 populations. It was analyzed using FRAPPE with default settings and K=9. The resulting ancestry profiles were averaged within each population, and a chi-squared distance between the average ancestry profiles was used to compute a distance tree using FastME. This tree was computed using all sampled populations. The colour of each leaf is chosen according to the linguistic family of the corresponding population.
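
The averaging and distance steps can be illustrated with a minimal Python sketch. The text does not give the exact form of the chi-squared distance, so the symmetric form d(p, q) = sum_k (p_k - q_k)^2 / (p_k + q_k) is an assumption here. The function names and the toy data are hypothetical as well. The resulting matrix is the kind of input a distance-based tree builder such as FastME would take.

```python
import numpy as np

def population_means(Q, labels):
    """Average the per-individual ancestry profiles (rows of Q) within each population."""
    labels = np.asarray(labels)
    pops = sorted(set(labels.tolist()))
    means = np.vstack([Q[labels == p].mean(axis=0) for p in pops])
    return pops, means

def chi2_distance_matrix(P, eps=1e-12):
    """Pairwise chi-squared distance between the rows of P.

    Assumed form (not confirmed by the text):
    d(p, q) = sum_k (p_k - q_k)^2 / (p_k + q_k)
    """
    diff = P[:, None, :] - P[None, :, :]
    total = P[:, None, :] + P[None, :, :]
    # eps avoids division by zero when both profiles have zero weight on a component
    return (diff ** 2 / np.maximum(total, eps)).sum(axis=2)

# Toy data (made up): 4 individuals, K=2 ancestry components, 3 populations.
Q = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
labels = ["A", "A", "B", "C"]

pops, means = population_means(Q, labels)
D = chi2_distance_matrix(means)
```

Here `D` is a symmetric matrix with a zero diagonal, with rows and columns ordered as in `pops`.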