Everybody is your 16th cousin

In my article two weeks ago about the odds of knowing a cousin I puzzled over the question of how many 3rd cousins a person might have. This is hard to answer, because it depends on figuring out how many successful offspring per generation the various levels of your family (and related families) have. Successful means that they also create a tree of descendants. This number varies a lot among families, it varies a lot among regions and it has varied a great deal over time. An Icelandic study found a number of around 2.8 but it's hard to conclude a general rule. I've used 3 (81 great-great-grandchildren per couple) as a rough number.

There is something, however, that we can calculate without knowing how many children each couple has. That's because we know, pretty accurately, how many ancestors you have. Our number gets less accurate over time because ancestors start duplicating -- people appear multiple times in your family tree. And in fact by the time you go back large numbers of generations, say 600 years, the duplication is massive; all your ancestors appear many times.

To answer the question of "How likely is it that somebody is your 16th cousin" we can just look at how many ancestors you have back there. 16th cousins share with you a couple 17 generations ago. (You can share just one ancestor which makes you a half-cousin.) So your ancestor set from 17 generations ago will be 65,536 different couples. Actually fewer than that due to duplication, but at this level in a large population the duplication isn't as big a factor as it becomes later, and where it is a big factor, that's because you come from a closer community, which means you are even more related.

So you have 65K couples and so does your potential cousin. The next question is, what is the size of the population in which they lived? Well, back then the whole world had about 600 million people, so that's an upper bound. So we can ask, if you take two random sets of 65,000 couples from a population of 300M couples, what are the odds that none of them match? With your 65,000 ancestor couples being just 0.02% of the world's couples, and your potential cousin's ancestors being a set of the same size, you would think it likely they don't match.

Turns out that's almost nil. Like the famous birthday paradox, where a room of 30 people usually has 2 who share a birthday, the probability there is no intersection in these large groups is quite low. It is 99.9999% likely from these numbers that any given person is at least a 16th cousin. And 97.2% likely that they are a 15th cousin -- but only 1.4% likely that they are an 11th cousin. It's a double exponential explosion. The rough formula used is that the probability of no match will be (1-2^C/P)^(2^C) where C is the cousin number (so 2^C is the number of ancestor couples each person has) and P is the total source population, counted in couples. To be strict this should be done with factorials but the numbers are large enough that pure exponentials work.
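For readers who want to check the arithmetic, here is a minimal Python sketch of the rough formula above. The function name and the constant are my own labels, not anything from a library, and the 300 million couples figure is the same upper bound used in the text.

```python
def prob_at_least_cousin(c, population_couples):
    """Chance that two people share at least one ancestor couple.

    Uses the rough formula from the text: each person has 2**c ancestor
    couples, drawn at random from population_couples couples.
    """
    couples = 2 ** c
    p_no_match = (1 - couples / population_couples) ** couples
    return 1 - p_no_match


# Assumption from the text: about 600 million people, so 300 million couples.
WORLD_COUPLES = 300_000_000

for c in (11, 15, 16):
    p = prob_at_least_cousin(c, WORLD_COUPLES)
    print(f"{c}th cousin: {p:.4%}")
```

The printed values should land close to the 1.4%, 97.2% and 99.9999% figures quoted above. Like the formula itself, this treats ancestors as random draws from the whole world, which real family trees are not.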

Now, of course, the couples are not selected at random, and nor are they selected from the whole world. For many people, their ancestors would have all lived on the same continent, perhaps even in the same country. They might all come from the same ethnic group. For example, if you think that all the ancestors of the two people came from the half million or so Ashkenazi Jews of the 18th century then everybody is a 10th cousin.

Many populations did not interbreed much, and in some cases of strong ethnic or geographic isolation, barely at all. There are definitely silos, and they sometimes existed in the same town, where there might be far less interbreeding between races than within them. Over time, however, the numbers overwhelm even this. Within the close knit communities, like say a city of 50,000 couples who bred mostly with each other, everybody will be a 9th cousin.
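The two small-population examples (the Ashkenazi population and the city of 50,000 couples) can be checked with the same rough formula. This is only a sketch: the helper name is made up, and treating half a million people as 250,000 couples is my assumption.

```python
def prob_shared_couple(c, population_couples):
    """Rough chance that two people share one of their 2**c ancestor couples."""
    couples = 2 ** c
    return 1 - (1 - couples / population_couples) ** couples


# Assumption: 500,000 people counted as 250,000 couples.
ashkenazi_10th = prob_shared_couple(10, 250_000)

# The city of 50,000 couples from the paragraph above.
city_9th = prob_shared_couple(9, 50_000)

print(f"10th cousin in a population of 250,000 couples: {ashkenazi_10th:.1%}")
print(f"9th cousin in a city of 50,000 couples: {city_9th:.1%}")
```

Both come out above 98%, which is what "everybody is a 10th cousin" and "everybody will be a 9th cousin" are shorthand for.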

These numbers provide upper bounds. Due to the double exponential, even when you start reducing the population numbers due to out-breeding and expansion, it still catches up within a few generations. This is just another measure of how we are all related, and also how meaningless very distant cousin relationships, like 10th cousins, are. As I've noted in other places, if you leave aside the geographic isolation that some populations lived in, you don't have to go back more than a couple of thousand years to reach the point where we are not just all related, but we all have the same set of ancestors (i.e. everybody who procreated) just arranged in a different mix.

The upshot of all this: If you discover that you share a common ancestor with somebody from the 17th century, or even the 18th, it is completely unremarkable. The only thing remarkable about it is that you happened to know the path.

Comments

Everybody is your (genetic) 16th cousin, but not an actual 16th cousin.

First, the math is wrong. The chance that two people in a room of 30 share a birthday is 70.6%, not an almost sure event. http://en.wikipedia.org/wiki/Birthday_problem. Generalization to pedigrees needs to be even more careful.
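The 70.6% figure is easy to reproduce. A small sketch, assuming 365 equally likely birthdays and no leap years (the function name is just for illustration):

```python
def prob_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct


print(f"30 people: {prob_shared_birthday(30):.1%}")
print(f"23 people: {prob_shared_birthday(23):.1%}")
```

With 23 people the chance already passes one half, which is the usual statement of the paradox.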

When the subject goes as far as 16th cousins, you'll likely have no more genetic segments in common with a 16th cousin than with an arbitrary other northern European person. At that point, identical-by-state is more common than identical-by-descent. Divide up the half million SNPs that 23andMe tests by 2^16, and you would get an average of 7 from each 16th-great-grandparent (assuming no pedigree collapse). On the extreme off-chance that you match those with another 16th cousin not through random chance, then those are identical-by-descent. On the other hand, on average, a northern European will match 75% of their genome with another northern European. Just by flipping a 75-25 lopsided coin a half-million times, you'll find segments shared in common of lengths at least 40 (but in reality shared DNA segments will be much longer). Such is identical by state. It is in this sense that "everybody is your (genetic) 16th cousin".
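The coin-flip claim can be sanity-checked. The sketch below treats each SNP as an independent match with probability 0.75, which is a simplification of my own (real SNPs are linked), and uses the common rule of thumb that the longest run in n flips is about log base 1/p of n(1-p). The function names are hypothetical.

```python
import math
import random


def estimated_longest_run(n, p):
    """Rule-of-thumb estimate of the longest run of matches in n flips."""
    return math.log(n * (1 - p)) / math.log(1 / p)


def simulated_longest_run(n, p, seed=0):
    """Flip a p-weighted coin n times and return the longest run of matches."""
    rng = random.Random(seed)
    longest = current = 0
    for _ in range(n):
        if rng.random() < p:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


print(estimated_longest_run(500_000, 0.75))   # estimate, a little over 40
print(simulated_longest_run(500_000, 0.75))   # one simulated genome comparison
```

The estimate comes out a little over 40, in line with the "lengths at least 40" figure above; a single simulated run will wander a few SNPs either side of that.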

Now for the issue of actual 16th-cousins. Pedigree collapse has to be factored in. Pre-1800, outside of the main cities many villages were much smaller than today. There may have been 100 or fewer houses in each village. It wouldn't take more than 5 generations to have most houses related. Now, what if you take someone else whose ancestors 400 years ago are from a distant village in a separate country, also with many of the relatives contained within their village? Travel was rare. There's not likely to be family-tree overlap.

A better guess of the Ashkenazi population in the mid 18th century is about 400,000. http://www.statemaster.com/encyclopedia/Historical-Jewish-population-comparisons. But more significantly, much of that population arose largely from a severe population bottleneck of 1000-1400AD. Two arbitrary people selected from this population are likely to be close cousins by descent, but much more assuredly by state.

I agree that "almost surely" is incorrect and modified it. I merely want to point out that people have bad intuitions about intersections of sets and coincidence.

I also agree that the amount of shared DNA becomes nada at that point.

As for pedigree collapse, I think you agree that this actually makes us more related to random people from our geographic area, not less, which is why it doesn't alter the main point.

If the Ashkenazi were only 400,000 (I looked a bit, but not too much for that reference, so thanks) then everybody is a 10th cousin, and of course, probably closer.

My main point was to show the math that even if you took the whole population of the world as your source, we all were sure to have a common ancestor after what some people think is a relatively small amount of time (300 to 400 years.) In fact it's usually much sooner. Yet we regularly see people saying, "Isn't that amazing, we have a common ancestor from the 17th century!"

It is, in fact, completely unremarkable, and that is the main point.

I am an amateur genealogist and I know some basic statistics. But this is just confusing. Does this mean I could be descended from and/or related to, oh, let's say a Japanese person?! What does this mean? Please clarify this soon.

We are all related within 300- 400 years ago?

I'm an amateur genealogist and I find this confusing. Does this mean that 400 years ago I am related to and/or descended from, oh, let's say a Japanese person!? Please clarify this for me.

As noted in the article, there are populations that had geographic isolation and so had minimal interbreeding. However, there was some with most of the populations, though the Americas were isolated from the rest of the world from about 10K years ago to 500 years ago, and Australia from about 8K years ago.

So sixteen generations ago we are all related, that is fascinating. But as the person above has said, does this apply to Japanese, Australian Aborigines, Alaskan Inuits or some other far-flung peoples? I wonder how a degree of relationship would apply to those nationalities.
As for the question about being related to or descended from a Japanese person, I think it would vary from person to person who lived 400 years ago. What do any of you think?

The main point is that if you come from populations that did intermingle, you don't have to go very far to get to a point where it gets extremely unlikely that your set of ancestors doesn't overlap with all the other such people.

As the person above has said, someone could have a distant 400 years removed Japanese ancestor, even though they could be African American or Caucasian American? Hmm, what does anyone think.

Everyone living in Japan back then would have had different numbers of descendants; some had many, others didn't or none at all. Still, many lived long enough to procreate. And this applies to all countries around the world.

I read this and I have a question: were these common ancestors scattered around the globe 300 to 400 years ago or were they all from a specific region? For example, I am African-American; would my common ancestor with a Chinese person be a Chinese person, a European person or a black person? What is your take on this?

What do any of you think about the odds of being descended from anyone in any given point in time? I mean, how likely is it being descended from a person in the 16th century?

But to go back 500 years to 1511, say, that's going to be 20 or so generations. Your family tree 20 generations back has one million slots, but due to re-forking (distant inbreeding) many of those ancestors will appear in many slots, especially if you came from closely knit tribes or villages in some sections of your tree.

So if you imagine you have 250,000 unique ancestors from that period -- and I don't really know what the right re-blending number is -- you would want to look at the populations of the regions and ethnic groups from which you came, and you could calculate the odds of any one member of such a group being an ancestor. But many of us are mixes of ethnic groups, so only some of your tree is going to qualify.

If you are truly "purebred" and your ethnic group had a population of 10 million 500 years ago, and you have 250K ancestors within it, then the odds are 1 in 40 that any given one is your ancestor. However, the more purebred you are, the more inbreeding, so in fact the odds are poorer. I may be underestimating the inbreeding by a lot here.

If there was no inbreeding, by the time you go back 28 generations (around 1,000 AD) you have 268,435,456 ancestors. World population was around 275,000,000.
My Mom's side & Dad's side of the tree meet with Alice Montagu, Joan Beaufort just to name 2 common ancestors of both trees & there are a lot more. I have SEVERAL places where an ancestor has 2 or more children - they split off but come back together in a couple generations. I can get back to Noah through Shem & Japheth from either side of the family - PROVING it is the hard part.

I would say that proving descent from fictitious people is going to be tricky.

Unless your family fell out of the sky, the chance of being descended from someone in the 16th century is 100%. If you mean a particular person, well, the maths gets a little more tricky. :)

So I'm related to {at least some} Japanese and other people of different ethnic groups 16 generations ago. That's kinda cool. It just goes to show how connected many of us really are but we tend to forget that.

Food for thought, the claim here is that we are all related by our ancestors from at most 400 years. I find this hard to believe, because 400 years ago do you think everyone then was first cousins or something? Like maybe they were all sitting around going, "man you think we're all related somehow?" And someone interjects... "duh, didn't you know every one here is no more than first cousins?" "Ya 400 years from now all our descendants are gonna figure it out." No, 400 years ago there were lines of genealogy just like ours today. There were only fewer of those lines of course.

Those people 400 years ago in their region were also all 16th cousins to each other (actually closer, populations were smaller and stayed closer -- but not 1st cousins.)

When I say that you and I are at least 16th cousins it means that you and I share an ancestor 17 generations ago. You also share a similarly distant ancestor with any other random person from your group of origin -- like President Obama who is in lots of ethnic groups so he's a likely choice. But you don't share the same ancestor with him as you do with me. Those two people are themselves related but in a different way, and in the distant past.

What does anybody think about the odds of being descended from any given person living generations back? Like, for example, what are the odds of the ancestor being a Lord, a Samurai, a peasant or a Tribal warrior? What does anyone think? My own thoughts would be to divide the population of the world by the number of ancestors to determine some basic odds. Well, please give me some feedback.

Once you go back far enough so you have millions of ancestors at that level, then you are going to be descended from every type of person in the regions your ancestors came from. Some lords, lots of peasants -- just as in the population then, there were a few lords and lots of peasants. You may get a slightly larger representation of lords and rich people because they could afford to have more families. Some lords were notorious for also siring lots of children outside wedlock, and thus are over-represented compared to the ordinary individual.

Since populations used to keep much more closely together in the middle ages (today we interbreed very freely by comparison) you will have fewer ancestors outside your ethnic groups, but it would be very rare to have none if you go back far enough.

And generally, within any given geographic region, go back 1K to 2K years and you are descended from everybody who lived then, unless that person's line died out quickly (i.e. they had no children, or grandchildren.) Once a person gets a line going it's pretty much impossible to keep it from mixing with all the other lines. The only thing that will do that is a huge geographic barrier. (For example, the population of the Americas was kept isolated from Eurasia for 10,000 years until 1492.)

I just had a curious thought. Could this counter-intuitive phenomenon, the Birthday Paradox, be used to estimate unsuspected ancestry? Like, could it be used to say that a European could have ancestors among the Medieval Chinese Population? How likely is it and how long in terms of generations into the past would it take? What do you think about that? Is there any merit to it?

Because as you go further back in time, people moved around less and bred around less, things don't branch out geographically as much as they would with today's patterns. Again, going back 500 years you have a million ancestor slots, and due to overlap, a smaller number of ancestors. There were Chinese who made it to Europe 500 years ago and who were breeding with Europeans. It is not certain you have some of them in your ancestry but certainly possible if your general ancestry is European.

Fewer Americans came to Europe, none before 500 years ago, and I would guess they bred less.

But if you share at least one 16th G-G-G grandparent with any Chinese person, wouldn't it seem likely that at least one of those common ancestors is Chinese?

I don't understand this question. Because of geographic isolation, Europeans may not share with Chinese, necessarily. They will share with other Europeans. The math in this post simply says that if I have a million ancestors from Europe and you have a million ancestors from Europe, the odds that there is nobody in both sets are very, very low. If I have a million ancestors from Europe and you have a million from China, it's more possible that the two sets don't overlap.

So only Europeans are 16th cousins?

Wasn't this article supposed to show how related everyone in the world is, or does this only include people of European descent?

Everybody in the world would be related if we interbred with people in other countries and tribes the way we do today. However, prior to 500 years ago, nobody interbred between the Americas and Eurasia/Africa, and Australia was also isolated. Travel was also slow within Eurasia/Africa.

However, there has been a lot of interbreeding in the past 5,000 years, so there are fewer and fewer "pureblood" people left on the planet. Soon they will all be gone unless they take tremendous effort.

I actually took this up with some experts. They concluded that finding common ancestors for people of European, Asian and African ancestry would be within 300-500 years. However, this does not apply to isolated Native American tribes, Oceanic peoples and Australian aborigines. In other words, finding ancestors within that time frame would strictly apply to people of Old World descent, not including the completely isolated peoples.

Let's say just Asians, Europeans, Africans and other non-isolated populations can be included. Judging from my own intuition and other people in the field I have contacted, could 16th cousins still hold? I mean, that far back, each couple could potentially have 4294967296 people descended from them. Of course, due to inbreeding, ancestors are shared, thus increasing the odds of anyone being descended from them. So let's just say Asians, Europeans and Africans. How would that turn out?

The numbers are huge, but the amount of travel and interbreeding of the regions was much less until the 19th century, and in the 20th century the barriers to interbreeding were vastly lowered. I say everybody is your cousin within the breeding community you came from but it takes a lot of detail to track the more dispersed groups.

Well, not *nobody*. Less prevalent than now, perhaps- but there were Chinese and Ethiopian legionaries in the Roman army.

I read the posts above. Here is something I got from a biostatistician. Note, it goes back 500 years, not 400 years.

"if we take 500 years as being 20 generations, you have 2 the power of 20 unrelated ancestors 500 years ago (barring any marriages between relatives). That is 1,048,576 people. So does any modern European, African or Asian. So the two of you have the square of that number (1.2 times 10 to the 13 ancestors between you. 12,000,000,000,000 or 12 thousand billion). Even today we have only 7 billion people alive. So you and any European, African or Asian have many ancestors in common 20 generations ago. You two are certainly related."

The probability of two people sharing ancestors 500 years ago is complicated, and that probability will be altered by ethnicity and family history {i.e. inbreeding} but it will not be zero.

No, the two people have the sum -- 2 million -- ancestors between them. Less than that because of duplication in the tree.

However, the point I make is if you take 1 million people (ancestors of person A) and another 1 million people (ancestors of person B) it's effectively impossible not to have some overlap if you pick them at random. However, they are not picked at random, and some migrations were limited, so it is possible not to have one in common -- but unlikely. And if you have any geographical similarity in ancestry it's extremely unlikely.

You mentioned yourself that even if you took the whole world as a source, there is probably some overlap. Are you saying there were *probably* some ancestors in common I share with Chinese or some other far-off group 400-500 years ago, but a lot more ancestors in common with another white person? Is that what you're saying? That we have more common ancestors with others of similar ancestry 400-500 years ago?

It depends on how much mixing there was between the source groups. Active mixing of Chinese and other cultures was slow until the 19th century, I think.

The main point is that if you take a large population, and pull one million people (the ancestors of person A) out of it, and you take another million (the ancestors of person B) out of it, it is essentially certain there will be overlap if the ancestors are drawn at random (i.e. from within a geographic region.) If you have regions with minimal interbreeding (for example, there was no interbreeding between native Australians and the rest of the world before 1600, and only minor interbreeding before euro-colonization in the 18th century), then for those populations the cousin probability is much lower. And it's lower between Asians and Europeans (though not as extreme as with Australians) and so on.

Starting around the 19th century, the world started doing really serious interbreeding and today it's huge. Going into the future there soon won't be anybody on the planet not sharing an ancestor a few centuries back.

Perhaps this is more applicable when talking about slightly more than 500 years back. But something that I think often gets overlooked is that people married within ~100-200 miles, not just within their villages; cumulatively, this diffused ancestry/ genetics farther and farther over the centuries. Even though travel was less common, people might easily move 100 miles during their lifetime; not to mention large migrations & diaspora. (Communities may have been more tight knit, but people liked sex just like now and people cheated etc, and there was more open space to go and do it.)

There was also shipping/ trade concentrated around the Mediterranean etc connecting Europe, Africa and the Middle East, and the Silk Road connected Asia, Europe and the Middle East. Another factor was things like Vikings kidnapping women from Ireland etc... that type of pillaging was not as rare 500+ years ago. While the rate of distant inter-breeding from all these might have been small, it only takes _one_ couple to bridge the ancestry from one group to another, which (my understanding is it's more than 50% likely) then diffuses through most of the population over several centuries (assuming there was a growing population, which it was). As mentioned in the blog, something like 10 generations later you are unlikely to have any of a given ancestor's DNA (simply because at a point in time with 1000 ancestors, and with DNA typically being passed in ~200 "chunks" to offspring, that is a 200/ 1000 or 20% chance that any chunks came from them). This is largely because the chunks of DNA are chopped at the same locations when passed to offspring.... if they were chopped up in random locations, pieces would linger around for longer.

Some surprising things I noticed related to this, in the pdf's on this webpage... the distinctions between countries/ groups are much more fuzzy than you might think.. for example check out the "Genetic Map of Interconnected World Regions", in the middle of the page -

http://www.dnatribes.com/sampleresults.html

[I didn't link to the pdf itself because it's a large file and seemed to be acting up...]

If there were random intermixing, then we would each have ~1 million ancestors living in 1500 AD, out of a world population of ~500 million.
So the fractional overlap between two people would be about 1/500th.
But the probability that two people share at least one common ancestor would be essentially 100%. Basically, you are choosing a random number between 1 and 500 a million times and you're asking whether you ever choose number 500. In a million trials, we expect this to happen 2000 times. So that it happens at least once is guaranteed.
If we get rid of the random intermixing, the fractional overlap will drop to much less than 1/500th. But I suspect that the probability of at least one overlap will remain very high.
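The numbers in this comment can be reproduced with a few lines of Python. This is a sketch of the random-mixing model only, with variable names of my own choosing: each of the million ancestors of one person is treated as an independent 1-in-500 chance of also being an ancestor of the other.

```python
import math

ancestors = 1_000_000        # ancestors per person around 1500 AD (from the comment)
world_pop = 500_000_000      # assumed world population around 1500 AD

# Expected number of shared ancestors under random mixing.
expected_shared = ancestors * ancestors / world_pop

# Probability of no overlap at all, reported as a power of ten because
# the number itself is far too small for ordinary floating point.
log10_p_none = ancestors * math.log10(1 - ancestors / world_pop)

print(f"expected shared ancestors: {expected_shared:.0f}")
print(f"P(no shared ancestor) is about 10^{log10_p_none:.0f}")
```

The expected overlap is the 2000 quoted above, and the chance of no overlap is so small that "guaranteed" is a fair description, as long as the random-mixing assumption holds.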

Note that 500 years ago any two people will share one common ancestor- this ancestor may vary between any two people and this ancestor is NOT the ancestor of everyone on Earth but of those two people.

Suppose the population in 1500 was 500 million, and it is 6 billion today (12x larger).
If the average generation length is 30 years, there are 17 generations in 500 years.
So the average population growth factor per generation is exp((log 12)/17) = 1.157
Since a child has two parents, the average number of surviving children per person is 2 * 1.157 = 2.315

So this is the average growth rate per generation for the descendants of a person in 1500.
2.315^17 = 1.575 million.

So an average person in 1500 has about 1.5 million offspring alive today. Sampling from the whole world, the probability that a random person from 1500 is an ancestor of a random person in 2000 would be 1.5 million / 6 billion = 0.025%.

If you were only considering people in a region like Europe, it would probably be something like 1.4 million / 700 million = 0.2%.
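Here is the same calculation as a short Python sketch, so the steps can be followed or re-run with other inputs. The variable names are mine, and like the quoted estimate it ignores any overlap among a person's descendants.

```python
pop_1500 = 500e6      # assumed world population in 1500
pop_today = 6e9       # assumed world population today
generations = 17      # 500 years at roughly 30 years per generation

# Population growth factor per generation: 12 ** (1/17).
growth_per_generation = (pop_today / pop_1500) ** (1 / generations)

# Every child has two parents, so the average person is a parent of
# twice that many surviving children.
children_per_person = 2 * growth_per_generation

# Descendants after 17 generations, with no overlap between lines.
descendants = children_per_person ** generations

# Chance that a random person today descends from a given person in 1500.
share_of_world = descendants / pop_today

print(f"growth per generation: {growth_per_generation:.3f}")
print(f"children per person:   {children_per_person:.3f}")
print(f"descendants today:     {descendants:,.0f}")
print(f"share of world:        {share_of_world:.3%}")
```

Algebraically the descendant count is just 2^17 x 12, so the 17th root cancels out and only the total growth and the number of generations matter.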

This is not my own quote. Just something I got from someone else- A computer scientist.

Why stop at 16th cousin? By the definition of a cousin being "someone you hold a common ancestor with," one could argue that all living beings on Earth are distant cousins, as we have all evolved from a common single-celled ancestor.

What is interesting to find out is how long till the stranger next to me will be descendants of the same person?

What I think is amazing is not a matter of shared DNA, but of shared destiny. If any one of my million ancestors, 20 or so generations back, had done something differently (like not had the exact child that they had when they did), I and all my 16th cousins would not exist. Now that's crazy: how many people in total had to do all of the things they did in order for me (or any of us) to exist at all? Amazing!

Excellent blog and additional comments from my cousins all around the world. :)
May I ask another question? Thank you. If we add the factor of names, how does this narrow the percentages? Say for instance, I know my relative's name is Baroni back to 1700's and that he was born in Livorno, Tuscany. If I go back to the late 1400's, I find a Baroni who was born not far from my relative and who shares some traits and an occupational description, as well as a similar socio-economic standing. I also know that he had at least one son. How likely is it that these are DIRECTLY related, that is, son to son to son, given that they inhabited the same region and had the same last name?

But remember, only a tiny fraction of your relatives will share your last name. And that was in the traditional "take dad's last name" world, where less than half your 1st cousins (on average) share your last name -- those from your father's brothers -- but when it comes to 2nd cousins, it's much less, and by 3rd cousins forget about it.

So there was a person 17 generations ago that was a common ancestor to both Native Americans and the indigenous peoples of Papua New Guinea? Where would this person have lived? And he evidently had one son who was the ancestor of ALL the Native Americans (the Native American "Adam"), and another who was the ancestor of ALL the indigenous peoples of Papua New Guinea (the Papua New Guinean "Adam"). Or would you like to assert that there was more than one founder? Let's say it was two guys who tie all the Native Americans to all the Papua New Guineans. So there were two pairs of "half-Adams". Or maybe there were more...maybe there were LOTS of people who had different lines go to both the New World, and to Papua New Guinea...in the 1500s. I think you see the problem. It is either ridiculous or ridiculous, take your pick.

Please stop polluting the internet with misinformation. If you want to write up about how interbreeding populations lead to closer cousins than one might expect, that's fine, but when you title it "EVERYBODY is your 16th cousin", then you are expanding the paradigm into territory where you know the summary is false. Please stop doing this.

If we go back to, let's say, 5 or 6 thousand years ago, aren't we getting at what most people are wanting to know? Are we all related? Going back that far it seems obvious that the entire population of the planet were related to one another and all had ancestral connections (even if very distant at that time), and thereby we ALL, being descendants of those very ancient peoples, are ALL VERY distantly related. Regardless of philosophical beliefs about the origins of humanity... if you go back far enough, we are all related, be it to Adam or the first single-celled organism. Both beliefs trace the entirety of humanity to a common ancestor. I guess the fascination is the idea of our most distant or tenuous connection to even the most seemingly unrelated person or racial group being no more than a few dozen generations apart.

https://www.youtube.com/watch?v=BhtgINeaJWg

All the presidents are related, 18th-generation cousins, including Obama, Trump and Hillary. What are the chances of that?

So everyone who lived 1000+ years ago is my great-great-great-...-grandparent?! And yours too?!

What are the chances that any one of us is a direct descendant of William the Conqueror? I would say that it’s 100%, especially if you are Caucasian and you descend from the British. William the Conqueror lived in the year 1066. Not everyone living at that time would have direct descendants living today, 1000 years later. Some would have children but no grandchildren. Some would have grandchildren, but no great-grandchildren. But we do know that William the Conqueror has direct descendants living on the earth today because the queen of England is a direct descendant of him.

Since I know that my direct ancestors also lived in the year 1066, then the question is, “what are the chances that William the Conqueror is one of my direct ancestors?” Since I would have over 4,000,000,000 31st great grandparents without pedigree collapse, I would say it’s a 100% certainty that all of us are direct descendants of William the Conqueror as well as the queen of England. This is especially true if you're Caucasian, descending from ancestors who came from England.

If we assume that each couple had two children who then in turn also had two children who in turn also had two children, then the numbers are exactly the same, just going in the opposite direction. So therefore one couple living 1000 years ago could have 4 billion descendants living today.

Would this not mean that any of the figures in history, such as Alfred the Great, William the Conqueror, Charlemagne, etc. would almost certainly have to be our ancestors today? This would only work as long as those people had direct descendants living today, and we know that the queen of England is descended from all of those people, so we know that they do have descendants still living today.

If some of the experts can confirm my math I would appreciate it. I find it very interesting to look at sculptures and paintings and very well marked tombs of figures in history and to know that I am their direct descendant.

If we went back another thousand years to the time of Christ, then would it not be even more certainly the case that famous people in the Bible and history from that time would be our direct ancestors today, providing of course that they have direct descendants living on the earth today as well?
