# Archives

Date

## Everybody is your 16th cousin

In my article two weeks ago about the odds of knowing a cousin I puzzled over the question of how many 3rd cousins a person might have. This is hard to answer, because it depends on figuring out how many successful offspring per generation the various levels of your family (and related families) have. Successful means that they also create a tree of descendants. This number varies a lot among families, it varies a lot among regions and it has varied a great deal over time. An Icelandic study found a number of around 2.8 but it’s hard to conclude a general rule. I’ve used 3 (81 great-great-grandchildren per couple) as a rough number.

There is something, however, that we can calculate without knowing how many children each couple has. That’s because we know, pretty accurately, how many ancestors you have. Our number gets less accurate over time because ancestors start duplicating — people appear multiple times in your family tree. And in fact by the time you go back large numbers of generations, say 600 years, the duplication is massive; all your ancestors appear many times.

To answer the question of “How likely is it that somebody is your 16th cousin” we can just look at how many ancestors you have back there. 16th cousins share with you a couple 17 generations ago. (You can share just one ancestor which makes you a half-cousin.) So your ancestor set from 17 generations ago will be 65,536 different couples. Actually less than that due to duplication, but at this level in a large population the duplication isn’t as big a factor as it becomes later, and if it does it’s because of a closer community which means you are even more related.

So you have 65K couples and so does your potential cousin. The next question is, what is the size of the population in which they lived? Well, back then the whole world had about 600 million people, so that’s an upper bound. So we can ask, if you take two random sets of 65,000 couples from a population of 300M couples, what are the odds that none of them match? With your 65,000 ancestors being just 0.02% of the world’s couples, and your potential cousin’s ancestors also being that set, you would think it likely they don’t match.

Turns out that’s almost nil. Like the famous birthday paradox, where a room of 30 people usually has 2 who share a birthday, the probability there is no intersection in these large groups is quite low. it is 99.9999% likely from these numbers that any given person is at least a 16th cousin. And 97.2% likely that they are a 15th cousin — but only 1.4% likely that they are an 11th cousin. It’s a double exponential explosion. The rough formula used is that the probability of no match will be (1-2^C/P)^(2^C) where C is the cousin number and P is the total source population. To be strict this should be done with factorials but the numbers are large enough that pure exponentials work.

Now, of course, the couples are not selected at random, and nor are they selected from the whole world. For many people, their ancestors would have all lived on the same continent, perhaps even in the same country. They might all come from the same ethnic group. For example, if you think that all the ancestors of the two people came from the half million or so Ashkenazi Jews of the 18th century then everybody is a 10th cousin.

Many populations did not interbreed much, and in some cases of strong ethnic or geographic isolation, barely at all. There are definitely silos, and they sometimes existed in the same town, where there might be far less interbreeding between races than among races. Over time, however, the numbers overwhelm even this. Within the close knit communities, like say a city of 50,000 couples who bred mostly with each other, everybody will be a 9th cousin.

These numbers provide upper bounds. Due to the double exponential, even when you start reducing the population numbers due to out-breeding and expansion, it still catches up within a few generations. This is just another measure of how we are all related, and also how meaningless very distant cousin relationships, like 10th cousins, are. As I’ve noted in other places, if you leave aside the geographic isolation that some populations lived in, you don’t have to go back more more than a couple of thousand years to reach the point where we are not just all related, but we all have the same set of ancestors (ie. everybody who procreated) just arranged in a different mix.

The upshot of all this: If you discover that you share a common ancestor with somebody from the 17th century, or even the 18th, it is completely unremarkable. The only thing remarkable about it is that you happened to know the path.