Everybody is your 16th cousin

In my article two weeks ago about the odds of knowing a cousin I puzzled over the question of how many 3rd cousins a person might have. This is hard to answer, because it depends on figuring out how many successful offspring per generation the various levels of your family (and related families) have. Successful means that they also create a tree of descendants. This number varies a lot among families and among regions, and it has varied a great deal over time. An Icelandic study found a number of around 2.8, but it’s hard to draw a general rule from that. I’ve used 3 (81 great-great-grandchildren per couple) as a rough number.

There is something, however, that we can calculate without knowing how many children each couple has. That’s because we know, pretty accurately, how many ancestors you have. Our number gets less accurate over time because ancestors start duplicating — people appear multiple times in your family tree. And in fact by the time you go back large numbers of generations, say 600 years, the duplication is massive; all your ancestors appear many times.

To answer the question “How likely is it that somebody is your 16th cousin?” we can just look at how many ancestors you have back there. 16th cousins share with you a couple 17 generations ago. (You can share just one ancestor, which makes you half-cousins.) So your ancestor set from 17 generations ago will be 65,536 different couples. Actually it is fewer than that due to duplication, but at this level in a large population the duplication isn’t as big a factor as it becomes later, and where it is a big factor, that’s because of a closer community, which means you are even more related.

So you have 65K couples and so does your potential cousin. The next question is, what is the size of the population in which they lived? Well, back then the whole world had about 600 million people, or roughly 300 million couples, so that’s an upper bound. So we can ask: if you take two random sets of 65,000 couples from a population of 300M couples, what are the odds that none of them match? With your 65,000 ancestor couples being just 0.02% of the world’s couples, and your potential cousin’s ancestors being a set of the same size, you would think it likely they don’t match.

It turns out the chance of no match is almost nil. As with the famous birthday paradox, where a room of 30 people usually has 2 who share a birthday, the probability that there is no intersection between these large groups is quite low. From these numbers it is 99.9999% likely that any given person is at least a 16th cousin (that is, a 16th cousin or closer), and 97.2% likely that they are at least a 15th cousin, but only 1.4% likely that they are at least an 11th cousin. It’s a double exponential explosion. The rough formula used is that the probability of no match is (1-2^C/P)^(2^C), where C is the cousin number and P is the total source population counted in couples. To be strict this should be done with factorials, but the numbers are large enough that pure exponentials work.
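The rough formula can be checked numerically. The following is a minimal Python sketch under the article's own simplifying assumption that each person's 2^C ancestor couples are a random draw from P couples; the function name prob_shared_couple and the constant WORLD_COUPLES are illustrative names, not part of the original calculation.

```python
import math

def prob_shared_couple(cousin_number: int, population_couples: float) -> float:
    """Probability that two people share at least one ancestor couple.

    Rough formula from the text: P(no match) = (1 - 2^C / P)^(2^C),
    where C is the cousin number and P is the population in couples.
    """
    couples = 2 ** cousin_number
    if couples >= population_couples:
        # More ancestor slots than couples: overlap is certain in this model.
        return 1.0
    # log1p keeps precision when 2^C / P is very small.
    log_no_match = couples * math.log1p(-couples / population_couples)
    return 1.0 - math.exp(log_no_match)

# About 600 million people, taken as 300 million couples (figure from the text).
WORLD_COUPLES = 300e6

for c in (11, 15, 16):
    p = prob_shared_couple(c, WORLD_COUPLES)
    print(f"{c}th cousin or closer: {p:.5%}")
```

Passing a smaller population, such as 250,000 couples with C = 10, shows how quickly the probability approaches certainty in a more closed community.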

Now, of course, the couples are not selected at random, and nor are they selected from the whole world. For many people, their ancestors would have all lived on the same continent, perhaps even in the same country. They might all come from the same ethnic group. For example, if you think that all the ancestors of the two people came from the half million or so Ashkenazi Jews of the 18th century then everybody is a 10th cousin.

Many populations did not interbreed much, and in some cases of strong ethnic or geographic isolation, barely at all. There are definitely silos, and they sometimes existed in the same town, where there might be far less interbreeding between races than within races. Over time, however, the numbers overwhelm even this. Within close-knit communities, like say a city of 50,000 couples who bred mostly with each other, everybody will be a 9th cousin.

These numbers provide upper bounds. Due to the double exponential, even when you start reducing the population numbers due to out-breeding and expansion, it still catches up within a few generations. This is just another measure of how we are all related, and also how meaningless very distant cousin relationships, like 10th cousins, are. As I’ve noted in other places, if you leave aside the geographic isolation that some populations lived in, you don’t have to go back more than a couple of thousand years to reach the point where we are not just all related, but we all have the same set of ancestors (i.e. everybody who procreated) just arranged in a different mix.

The upshot of all this: If you discover that you share a common ancestor with somebody from the 17th century, or even the 18th, it is completely unremarkable. The only thing remarkable about it is that you happened to know the path.

Everybody is your (genetic)

Everybody is your (genetic) 16th cousin, but not an actual 16th cousin.

First, the math is wrong. The chance that two people in a room of 30 share a birthday is 70.6%, not an almost sure event. See http://en.wikipedia.org/wiki/Birthday_problem. Generalization to pedigrees needs to be even more careful.
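The 70.6% figure can be reproduced with the standard exact birthday calculation. This is a small Python sketch assuming 365 equally likely birthdays and no leap years; the function name prob_shared_birthday is illustrative.

```python
def prob_shared_birthday(people: int, days: int = 365) -> float:
    """Exact probability that at least two of `people` share a birthday,
    assuming every day is equally likely."""
    prob_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        prob_all_distinct *= (days - i) / days
    return 1.0 - prob_all_distinct

print(f"30 people: {prob_shared_birthday(30):.1%}")
print(f"23 people: {prob_shared_birthday(23):.1%}")
```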

When the subject goes as far as 16th cousins, you'll likely have no more genetic segments in common with a 16th cousin than with an arbitrary other northern European person. At that point, identical-by-state is more common than identical-by-descent. Divide up the half million SNPs that 23andMe tests by 2^16, and you would get an average of 7 from each 16th-great-grandparent (assuming no pedigree collapse). On the extreme off-chance that you match those with another 16th cousin not through random chance, then those are identical-by-descent. On the other hand, on average, a northern European will match 75% of their genome with another northern European. Just by flipping a 75-25 lopsided coin a half-million times, you'll find segments shared in common of lengths at least 40 (but in reality shared DNA segments will be much longer). Such is identical-by-state. It is in this sense that "everybody is your (genetic) 16th cousin".

Now for the issue of actual 16th-cousins. Pedigree collapse has to be factored in. Pre-1800, outside of the main cities many villages were much smaller than today. There may have been 100 or fewer houses in each village. It wouldn't take more than 5 generations to have most houses related. Now, what if you take someone else whose ancestors 400 years ago are from a distant village in a separate country, also with many of the relatives contained within their village? Travel was rare. There's not likely to be family-tree overlap.

A better guess of the Ashkenazi population in the mid 18th century is about 400,000. http://www.statemaster.com/encyclopedia/Historical-Jewish-population-com.... But more significantly, much of that population arose largely from a severe population bottleneck of 1000-1400 AD. Two arbitrary people selected from this population are likely to be close cousins by descent, but much more assuredly by state.

How much DNA

I agree that “almost surely” is incorrect and modified it. I merely want to point out that people have bad intuitions about intersections of sets and coincidence.

I also agree that the amount of shared DNA becomes nada at that point.

As for pedigree collapse, I think you agree that this actually makes us more related to random people from our geographic area, not less, which is why it doesn’t alter the main point.

If the Ashkenazi were only 400,000 (I looked a bit, but not too much for that reference, so thanks) then everybody is a 10th cousin, and of course, probably closer.

My main point was to show the math: even if you took the whole population of the world as your source, we would all be sure to have a common ancestor after what some people think is a relatively small amount of time (300 to 400 years). In fact it’s usually much sooner. Yet we regularly see people saying, “Isn’t that amazing, we have a common ancestor from the 17th century!”

It is, in fact, completely unremarkable, and that is the main point.

I'm confused

I am an amateur genealogist and I know some basic statistics. But this is just confusing. Does this mean I could be descended from and/or related to, oh, let's say a Japanese person?! What does this mean? Please clarify this soon.

So the bottom line is...

We are all related within the last 300-400 years?

I'm confused

I'm an amateur genealogist and I find this confusing. Does this mean that 400 years ago I am related to and/or descended from, oh, let's say a Japanese person!? Please clarify this for me.

Possibly, possibly not

As noted in the article, there are populations that had geographic isolation and so had minimal interbreeding. However, there was some interbreeding among most populations, though the Americas were isolated from the rest of the world from about 10K years ago to 500 years ago, and Australia from about 8K years ago.

This is interesting but...

So sixteen generations ago we are all related, that is fascinating. But as the person above has said, does this apply to Japanese, Australian Aborigines, Alaskan Inuits or some other far-flung peoples? I wonder how a degree of relationship would apply to those nationalities.
As for the question about being related or descended from Japanese person, I think it would vary from person to person who lived 400 years ago. What do any of you think?

Yes it varies

The main point is that if you come from populations that did intermingle, you don’t have to go very far to get to a point where it gets extremely unlikely that your set of ancestors doesn’t overlap with all the other such people.

So in theory...

As the person above has said, someone could have a distant 400 years removed Japanese ancestor, even though they could be African American or Caucasian American? Hmm, what does anyone think.

Yeah, it varies because...

Everyone living in Japan back then would have had different numbers of descendants; some had many, others had few or none at all. Still, many lived long enough to procreate. And this applies to all countries around the world.

Where are these common ancestors from?

I read this and I have a question: were these common ancestors scattered around the globe 300 to 400 years ago, or were they all from a specific region? For example, I am African-American; would my common ancestor with a Chinese person be a Chinese person, a European person or a black person? What is your take on this?

What are the odds?

What do any of you think about the odds of being descended from anyone in any given point in time? I mean, how likely is it being descended from a person in the 16th century?

Depends on the person

But to go back 500 years to 1511, say, that’s going to be 20 or so generations. Your family tree 20 generations back has one million slots, but due to re-forking (distant inbreeding) many of those ancestors will appear in many slots, especially if you came from closely knit tribes or villages in some sections of your tree.

So if you imagine you have 250,000 unique ancestors from that period — and I don’t really know what the right re-blending number is — you would want to look at the populations of the regions and ethnic groups from which you came, and you could calculate the odds of any one member of such a group being an ancestor. But many of us are mixes of ethnic groups, so only some of your tree is going to qualify.

If you are truly “purebred” and your ethnic group had a population of 10 million 500 years ago, and you have 250K ancestors within it, then the odds are 1 in 40 that any given one is your ancestor. However, the more purebred you are, the more inbreeding, so in fact the odds are poorer. I may be underestimating the inbreeding by a lot here.
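The arithmetic in this reply can be sketched in a few lines of Python. The 250,000 unique ancestors and the 10 million population are the hypothetical figures from the text, not measured values, and the helper name ancestor_odds is illustrative.

```python
def ancestor_odds(unique_ancestors: int, group_population: int) -> float:
    """Chance that one given member of the group is among your ancestors,
    assuming your ancestors are spread evenly over the group."""
    return unique_ancestors / group_population

GENERATIONS = 20  # roughly 500 years
tree_slots = 2 ** GENERATIONS  # ancestor slots before removing duplicates

# Hypothetical figures from the text: 250K unique ancestors, 10M people.
odds = ancestor_odds(250_000, 10_000_000)

print(f"Tree slots {GENERATIONS} generations back: {tree_slots:,}")
print(f"Chance a given group member is an ancestor: {odds:.1%} (1 in {round(1 / odds)})")
```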

Family tree calculations

If there was no inbreeding, by the time you go back 28 generations (around 1000 AD) you would have 268,435,456 ancestors. World population was around 275,000,000.
My mom's side and dad's side of the tree meet with Alice Montagu and Joan Beaufort, just to name 2 common ancestors of both trees, and there are a lot more. I have SEVERAL places where an ancestor has 2 or more children whose lines split off but come back together in a couple of generations. I can get back to Noah through Shem and Japheth from either side of the family. PROVING it is the hard part.

I would say that proving

I would say that proving descent from fictitious people is going to be tricky.

Sweet stuff

So I'm related to {at least some} Japanese and other people of different ethnic groups 16 generations ago. That's kinda cool. It just goes to show how connected many of us really are but we tend to forget that.

Food for thought, the claim

Food for thought, the claim here is that we are all related by our ancestors from at most 400 years ago. I find this hard to believe, because 400 years ago do you think everyone then was first cousins or something? Like maybe they were all sitting around going, "man you think we're all related somehow?" And someone interjects... "duh, didn't you know everyone here is no more than first cousins?" "Ya 400 years from now all our descendants are gonna figure it out." No, 400 years ago there were lines of genealogy just like ours today. There were only fewer of those lines, of course.

That's not what it means

Those people 400 years ago in their region were also all 16th cousins to each other (actually closer, populations were smaller and stayed closer — but not 1st cousins.)

When I say that you and I are at least 16th cousins it means that you and I share an ancestor 17 generations ago. You also share a similarly distant ancestor with any other random person from your group of origin — like President Obama who is in lots of ethnic groups so he’s a likely choice. But you don’t share the same ancestor with him as you do with me. Those two people are themselves related but in a different way, and in the distant past.

What are the odds?

What does anybody think about the odds of being descended from any given person living generations back? Like, for example, what are the odds of an ancestor being a lord, a samurai, a peasant or a tribal warrior? What does anyone think? My own thought would be to divide the population of the world by the number of ancestors to determine some basic odds. Well, please give me some feedback.

How far back do you want to go

Once you go back far enough that you have millions of ancestors at that level, you are going to be descended from every type of person in the regions your ancestors came from: some lords and lots of peasants, just as the population then had a few lords and lots of peasants. You may get a slightly larger representation of lords and rich people because they could afford to have more families. Some lords were notorious for also siring lots of children outside wedlock, and thus are over-represented compared to the ordinary individual.

Since populations used to keep much more closely together in the middle ages (today we interbreed very freely by comparison) you will have fewer ancestors outside your ethnic groups, but it would be very rare to have none if you go back far enough.

And generally, within any given geographic region, go back 1K to 2K years and you are descended from everybody who lived then, unless that person’s line died out quickly (ie. they had no children, or grandchildren.) Once a person gets a line going it’s pretty much impossible to keep it from mixing with all the other lines. The only thing that will do that is a huge geographic barrier. (ie. the population of the Americas was kept isolated from Eurasia for 10,000 years until 1492.)

What about ancestors

I just had a curious thought. Could this counter-intuitive phenomenon, the Birthday Paradox, be used to estimate unsuspected ancestry? Like, could it be used to say that a European could have ancestors among the Medieval Chinese Population? How likely is it and how long in terms of generations into the past would it take? What do you think about that? Is there any merit to it?

Your ancestors

Because as you go further back in time, people moved around less and bred around less, things don’t branch out geographically as much as they would with today’s patterns. Again, going back 500 years you have a million ancestor slots, and due to overlap, a smaller number of ancestors. There were Chinese who made it to Europe 500 years ago and who were breeding with Europeans. It is not certain you have some of them in your ancestry but certainly possible if your general ancestry is European.

Fewer Americans came to Europe, none before 500 years ago, and I would guess they bred less.

My reply

But if you share at least one 16th G-G-G grandparent with any Chinese person, wouldn't it seem likely that at least one of those common ancestors is Chinese?

Sorry

I don’t understand this question. Because of geographic isolation, Europeans may not necessarily share ancestors with Chinese people. They will share with other Europeans. The math in this post simply says that if I have a million ancestors from Europe and you have a million ancestors from Europe, the odds that there is nobody in both sets are very, very low. If I have a million ancestors from Europe and you have a million from China, it’s more possible that the two sets don’t overlap.

I understand, sort of...

So only Europeans are 16th cousins?

I'm WAAAAY Confused!

Wasn't this article supposed to show how related everyone in the world is, or does this only include people of European descent?

The world

Everybody in the world would be related if we interbred with people in other countries and tribes the way we do today. However, prior to 500 years ago, nobody interbred between the Americas and Eurasia/Africa, and Australia was also isolated. Travel was also slow within Eurasia/Africa.

However, there has been a lot of interbreeding in the past 5,000 years, so there are fewer and fewer “pureblood” people left on the planet. Soon they will all be gone unless they make a tremendous effort.

Actually...

I actually took this up with some experts. They concluded that finding common ancestors for people of European, Asian and African ancestry would be within 300-500 years. However, this does not apply to isolated Native American tribes, Oceanic peoples and Australian aborigines. In other words, finding ancestors within that time frame would strictly apply to people of Old World descent, not including the completely isolated peoples.

Let's narrow things down...

Let's say just Asians, Europeans, Africans and other non-isolated populations can be included. Judging from my own intuition and from others in the field whom I have contacted, could 16th cousins still hold? I mean, that far back, each couple could potentially have 4294967296 people descended from them. Of course, due to inbreeding, ancestors are shared, thus increasing the odds of anyone being descended from them. So let's just say Asians, Europeans and Africans. How would that turn out?

It's harder over the long distances

The numbers are huge, but the amount of travel and interbreeding between regions was much less until the 19th century, and in the 20th century the barriers to interbreeding were vastly lowered. I say everybody is your cousin within the breeding community you came from, but it takes a lot of detail to track the more dispersed groups.

How About This....

I read the posts above. Here is something I got from a biostatistician. Note, it goes 500 years, not 400 years.

"if we take 500 years as being 20 generations, you have 2 to the power of 20 unrelated ancestors 500 years ago (barring any marriages between relatives). That is 1,048,576 people. So does any modern European, African or Asian. So the two of you have the square of that number (1.2 times 10 to the 13 ancestors between you. 12,000,000,000,000 or 12 thousand billion). Even today we have only 7 billion people alive. So you and any European, African or Asian have many ancestors in common 20 generations ago. You two are certainly related."

The probability of two people sharing ancestors 500 years ago is complicated, and that probability will be altered by ethnicity and family history {i.e. inbreeding} but it will not be zero.

Not the square

No, the two people have the sum — 2 million — ancestors between them. Less than that because of duplication in the tree.

However, the point I make is if you take 1 million people (ancestors of person A) and another 1 million people (ancestors of person B) it’s effectively impossible not to have some overlap if you pick them at random. However, they are not picked at random, and some migrations were limited, so it is possible not to have one in common — but unlikely. And if you have any geographical similarity in ancestry it’s extremely unlikely.

So You Mean....

You mentioned yourself that even if you took the whole world as a source, there is probably some overlap. Are you saying there were *probably* some ancestors in common I share with Chinese or some other far-off group 400-500 years ago, but a lot more ancestors in common with another white person? Is that what you're saying? That we have more common ancestors with others of similar ancestry 400-500 years ago?

Populations

It depends on how much mixing there was between the source groups. Active mixing of Chinese and other cultures was slow until the 19th century, I think.

The main point is that if you take a large population, and pull one million people (the ancestors of person A) out of it, and you take another million (the ancestors of person B) out of it, it is essentially certain there will be overlap if the ancestors are drawn at random (i.e. from within a geographic region). If you have regions with minimal interbreeding, the cousin probability between those populations is much lower. For example, there was no interbreeding between native Australians and the rest of the world before 1600, and only minor interbreeding before euro-colonization in the 18th century. It’s also lower between Asians and Europeans (though not as extreme as with Australians) and so on.

Starting around the 19th century, the world started doing really serious interbreeding and today it’s huge. Going into the future there soon won’t be anybody on the planet not sharing an ancestor a few centuries back.

Mixing of populations...

Perhaps this is more applicable when talking about slightly more than 500 years back. But something that I think often gets overlooked is that people married within ~100-200 miles, not just within their villages; cumulatively, this diffused ancestry/ genetics farther and farther over the centuries. Even though travel was less common, people might easily move 100 miles during their lifetime; not to mention large migrations & diaspora. (Communities may have been more tight knit, but people liked sex just like now and people cheated etc, and there was more open space to go and do it.)

There was also shipping/ trade concentrated around the Mediterranean etc connecting Europe, Africa and the Middle East, and the Silk Road connected Asia, Europe and the Middle East. Another factor was things like Vikings kidnapping women from Ireland etc... that type of pillaging was not as rare 500+ years ago. While the rate of distant inter-breeding from all these might have been small, it only takes _one_ couple to bridge the ancestry from one group to another, which (my understanding is it's more than 50% likely) then diffuses through most of the population over several centuries (assuming there was a growing population, which it was). As mentioned in the blog, something like 10 generations later you are unlikely to have any of a given ancestor's DNA (simply because at a point in time with 1000 ancestors, and with DNA typically being passed in ~200 "chunks" to offspring, that is a 200/ 1000 or 20% chance that any chunks came from them). This is largely because the chunks of DNA are chopped at the same locations when passed to offspring.... if they were chopped up in random locations, pieces would linger around for longer.

Some surprising things I noticed related to this, in the pdf's on this webpage... the distinctions between countries/ groups are much more fuzzy than you might think.. for example check out the "Genetic Map of Interconnected World Regions", in the middle of the page -

http://www.dnatribes.com/sampleresults.html

[I didn't link to the pdf itself because it's a large file and seemed to be acting up...]

Probability

If there were random intermixing, then we would each have ~1 million ancestors living in 1500 AD, out of a world population of ~500 million.
So the fractional overlap between two people would be about 1/500th.
But the probability that two people share at least one common ancestor would be essentially 100%. Basically, you are choosing a random number between 1 and 500 a million times and you're asking whether you ever choose number 500. In a million trials, we expect this to happen 2000 times. So that it happens at least once is guaranteed.
If we get rid of the random intermixing, the fractional overlap will drop to much less than 1/500th. But I suspect that the probability of at least one overlap will remain very high.
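The random-mixing case described above can be sketched numerically in Python. This assumes, as the comment does, that each person's ~1 million ancestors are an independent random draw from ~500 million people; the function names are illustrative.

```python
import math

def expected_shared_ancestors(a: int, b: int, population: int) -> float:
    """Expected size of the overlap between two random ancestor sets."""
    return a * b / population

def prob_any_shared(a: int, b: int, population: int) -> float:
    """Probability of at least one shared ancestor: each of person A's
    `a` ancestors independently misses person B's set with chance 1 - b/population."""
    return 1.0 - math.exp(a * math.log1p(-b / population))

ANCESTORS = 1_000_000     # ~2^20 slots, ignoring duplication
WORLD_1500 = 500_000_000  # world population around 1500 AD, per the comment

print(expected_shared_ancestors(ANCESTORS, ANCESTORS, WORLD_1500))
print(prob_any_shared(ANCESTORS, ANCESTORS, WORLD_1500))
```

Lowering the ancestor count or raising the population in this sketch shows how much random mixing could be removed before the chance of at least one overlap drops noticeably below 100%.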

Note that 500 years ago any two people will share one common ancestor- this ancestor may vary between any two people and this ancestor is NOT the ancestor of everyone on Earth but of those two people.

Suppose the population in 1500 was 500 million and it is 6 billion today (12x larger).
If the average generation length is 30 years, there are about 17 generations in 500 years.
So the average growth factor of the population per generation is exp((log 12)/17) = 1.157.
Since a child has two parents, the average number of surviving children per person is 2 * 1.157 = 2.315.

So this is the average growth rate per generation for the descendants of a person in 1500.
2.315^17 = 1.575 million.

So an average person in 1500 has about 1.5 million offspring alive today. Sampling from the whole world, the probability that a random person from 1500 is an ancestor of a random person in 2000 would be 1.5 million / 6 billion = 0.025%.

If you were only considering people in a region like Europe, it would probably be something like 1.4 million / 700 million = 0.2%.

This is not my own quote. Just something I got from someone else- A computer scientist.
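The growth calculation quoted in this comment can be reproduced with a short Python sketch. It follows the comment's own model, which assumes 17 generations, constant growth, and no overlap among a person's lines of descent; the variable names are illustrative.

```python
import math

POP_1500 = 500e6   # world population in 1500, per the comment
POP_TODAY = 6e9    # world population today, per the comment
GENERATIONS = 17   # 500 years at about 30 years per generation

# Population growth factor per generation: 12^(1/17).
growth_per_generation = math.exp(math.log(POP_TODAY / POP_1500) / GENERATIONS)

# Each child has two parents, so per person the number of surviving
# children is twice the population growth factor.
children_per_person = 2 * growth_per_generation

# Descendants after 17 generations, ignoring overlap between lines.
descendants = children_per_person ** GENERATIONS

print(f"growth per generation: {growth_per_generation:.3f}")
print(f"children per person:   {children_per_person:.3f}")
print(f"descendants today:     {descendants:,.0f}")
print(f"share of world today:  {descendants / POP_TODAY:.3%}")
```

Because the model ignores overlap between lines of descent and the many lines that die out, the result is an average under those assumptions and not an estimate for any particular person.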

Cousins

Why stop at 16th cousin? By the definition of a cousin being "someone you hold a common ancestor with," one could argue that all living beings on Earth are distant cousins, as we have all evolved from a common single-celled ancestor.

What is interesting to find

What is interesting to find out is how far back you have to go before the stranger next to me and I are descendants of the same person.

What I think is amazing is

What I think is amazing is not a matter of shared DNA, but of shared destiny. If any one of my million ancestors, 20 or so generations back, had done something differently (like not had the exact child that they had when they did), I and all my 16th cousins would not exist. Now that's crazy: how many people in total had to do all of the things they did in order for me (or any of us) to exist at all. Amazing!

Thanks for the info cuz :)

Excellent blog and additional comments from my cousins all around the world. :)
May I ask another question? Thank you. If we add the factor of names, how does this narrow the percentages? Say for instance, I know my relative's name is Baroni back to 1700's and that he was born in Livorno, Tuscany. If I go back to the late 1400's, I find a Baroni who was born not far from my relative and who shares some traits and an occupational description, as well as a similar socio-economic standing. I also know that he had at least one son. How likely is it that these are DIRECTLY related, that is, son to son to son, given that they inhabited the same region and had the same last name?

That would require a lot of math

But remember, only a tiny fraction of your relatives will share your last name. And that was in the traditional “take dad’s last name” world, where less than half your 1st cousins (on average) share your last name — those from your father’s brothers — but when it comes to 2nd cousins, it’s much less, and by 3rd cousins forget about it.

I know that's an absurd thought

So there was a person 17 generations ago that was a common ancestor to both Native Americans and the indigenous peoples of Papua New Guinea? Where would this person have lived? And he evidently had one son who was the ancestor of ALL the Native Americans (the Native American "Adam"), and another who was the ancestor of ALL the indigenous peoples of Papua New Guinea (the Papua New Guinean "Adam"). Or would you like to assert that there was more than one founder? Let's say it was two guys who tie all the Native Americans to all the Papua New Guineans. So there were two pairs of "half-Adams". Or maybe there were more...maybe there were LOTS of people who had different lines go to both the New World, and to Papua New Guinea...in the 1500s. I think you see the problem. It is either ridiculous or ridiculous, take your pick.

Please stop polluting the internet with misinformation. If you want to write up about how interbreeding populations lead to closer cousins than one might expect, that's fine, but when you title it "EVERYBODY is your 16th cousin", then you are expanding the paradigm into territory where you know the summary is false. Please stop doing this.

His name is Brad Templeton. You figure it out.