More on DNA

I am posting this in the hope that someone will either confirm my theories or tell me that I am completely wrong.

If I share a single 3 cM segment of DNA with someone, then there is a 97% probability that it is due to random factors. Turning it round, there is only a 3% chance of a genuine match. (Source: The DNA Geek, and others.) Unless there is other evidence, the sensible course of action is to give up and look elsewhere. At 4 cM, it is still overwhelmingly likely (81.5%) that the so-called match is due to chance. At 5 cM, the odds of a true match are roughly 50:50. One reason for caution: most identical twins do not have totally identical DNA. (Source: article published in Nature Genetics, 7 January 2021, quoted by Smithsonian Magazine. Only 38 out of 381 pairs had identical DNA.) Small mutations take place in the womb. These mutations accumulate with time, giving rise to false positives. So far, this is all very discouraging.

But what about multiple small matches? You might be aware of the example of the odds of children in a class sharing the same birthday. With only two children, the chance that they share a birthday is 1 - 364/365, or 0.27%. With 10 children, the chance that at least two share a birthday rises to 11.7%. (That is 1 minus the chance that all the birthdays are different: 1 - 364/365 x 363/365 etc., etc.) For 23 children, the odds that two of them share the same birthday rise to 50.7%. That is simple mathematics. Make your own spreadsheet and check it out!
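For anyone who would rather not build a spreadsheet, here is a minimal Python sketch of the same birthday calculation. The function name is my own, and it assumes 365 equally likely birthdays (no leap years).

```python
def shared_birthday_probability(n, days=365):
    """Chance that at least two of n people share a birthday."""
    # First work out the chance that all n birthdays are different:
    # 365/365 x 364/365 x 363/365 ... one factor per person.
    p_all_different = 1.0
    for k in range(n):
        p_all_different *= (days - k) / days
    # A shared birthday is simply the opposite outcome.
    return 1.0 - p_all_different


if __name__ == "__main__":
    for n in (2, 10, 23):
        print(n, f"{shared_birthday_probability(n):.2%}")
```

Run for 2, 10 and 23 children, this should agree with the 0.27%, 11.7% and 50.7% figures above, allowing for rounding.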

I am a member of the Roscommon Ireland DNA Facebook Group. All of the members have uploaded their DNA to GEDmatch. When carrying out a comparison between two samples, the default threshold for analysis is 7 cM. Anything below that does not show. The minimum that can be set (at least on the free version) is 3 cM. Given the likelihood of false positives at that level, this might seem to be a waste of computing power. I ran a comparison against a sample from another member and found nineteen matching segments across the chromosomes. The largest was only 4.6 cM, tailing all the way down to 3.0, but the total was 67.1 cM. Applying the mathematics suggests that the chance of all nineteen being false positives is only 7.7%, or 1 in 13.
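The kind of calculation I have in mind can be sketched in a few lines of Python. This is only a sketch under a big assumption, the very one I am asking about: that each segment is an independent event, so that the individual false-positive probabilities can simply be multiplied together. The function names are my own, and the input below is a hypothetical worst case (nineteen segments all treated as 3 cM, each with the 97% false-positive figure quoted earlier), not my actual segment list.

```python
import math


def chance_all_false(false_positive_probs):
    """Chance that every segment is a false positive.

    ASSUMPTION: the segments are independent of one another,
    so the probabilities can be multiplied together.
    """
    return math.prod(false_positive_probs)


def chance_at_least_one_true(false_positive_probs):
    """Chance that at least one segment is a genuine match."""
    return 1.0 - chance_all_false(false_positive_probs)


if __name__ == "__main__":
    # Hypothetical worst case: nineteen segments, each 97% likely to be false.
    worst_case = [0.97] * 19
    print(f"All false: {chance_all_false(worst_case):.1%}")
    print(f"At least one genuine: {chance_at_least_one_true(worst_case):.1%}")
```

Replacing the hypothetical list with a false-positive figure for each real segment (higher for the 3 cM segments, lower for those nearer 4.6 cM) gives the combined estimate. If the segments are not truly independent, the multiplication is not valid and the answer will be misleading.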

The big question remains: can I apply standard probability theory to DNA results in this way?

I appreciate that very low levels of match indicate that the common ancestor must be way back in time. But, picking up on my ‘looking for faint stars’ post, at least we will know that there is a connection, a star, to find. It is not due to metaphorical dust on the lens.