One of the interesting things about a project like the Italy DNA Project is that we occasionally turn up unusual data. Because of Italy's geographic and historical position, we sometimes find haplotypes and/or haplogroups that are not common in Europe as a whole. Many of these likely result from gene flow from Eastern Europe, the Middle East, and Central Asia in both historical and prehistoric times. The presence of two R1b1b members in our project, one from Lazio and one from Sicilia, is an example of this phenomenon.
R1b1b is a subclade of haplogroup R1b1, the most common haplogroup in Europe. Haplogroup R1b1 is defined by SNPs M343 and P25 (although P25 is not particularly reliable, as it is prone to back mutation), and contains four subclades:
- R1b1a, defined by the SNP M18
- R1b1b, defined by the SNP M73
- R1b1c, defined by the SNP M269
- R1b1d, defined by the SNP M335
The vast majority of R1b males in Europe are in subclade R1b1c (M269+), but the Italy DNA Project has two members (one confirmed and one pending confirmation) in subclade R1b1b (M73+). R1b1b is very rare in Europe (in fact, as far as I know these two Italians are the only M73+ samples discovered in Europe to date), but much more common in Central Asia.
M73 refers to the deletion of 2 basepairs (GT) at nucleotide position 260 on Y-chromosome locus G65537. The SNP was apparently discovered by P.J. Oefner and the first examples were reported in a paper by Underhill et al. (2000). Subsequent examples have been reported in papers by Cinnioglu et al. (2004) and Sengupta et al. (2006).
The Underhill paper reported six samples from Central Asia/Siberia. The Cinnioglu paper reported four samples from Turkey. The Sengupta paper reported ten samples from China, one from Japan, and eight samples from Pakistan. Combined with the two Italian samples from our project, we have identified a grand total of 31.
The six Underhill samples were never haplotyped (not publicly, anyway), but the Cinnioglu and Sengupta samples have all been haplotyped at 10 loci (DYS19, DYS388, DYS390, DYS391, DYS392, DYS393, DYS389I, DYS389II, DYS439, and DYSA7.2).
However, I believe (based on personal correspondence with a reliable source) that at least two of the samples identified as M73+ in the Sengupta paper are not actually R1b1b.
Additionally, two of the M73+ samples in the Cinnioglu paper (haplotypes 442 and 443) have interesting values for DYS390. These two samples have DYS390=19, which is far below the reported value for this marker in any other published R1b sample. DYS390 is a multi-segment locus, and in rare cases the normally invariant flanking segments (DYS390.1, DYS390.2, and DYS390.3) mutate or suffer a deletion (partial or complete). The phenomenon was described by Peter Forster in an excellent 1998 paper, "Phylogenetic Resolution of Complex Mutational Features at Y-STR DYS390 in Aboriginal Australians and Papuans."
I have excluded the Japanese sample from my analysis, but am including the remainder at the present time. Haplotype data for the 24 samples in this study are available here.
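For readers who want to check the bookkeeping, here is a small sketch of how the counts above add up. It assumes the 24 samples are what remains after dropping the six Underhill samples (which have no published haplotypes) and the excluded Japanese sample. The dictionary keys are just labels I made up for the illustration.

```python
# Sample counts as reported in the text above.
reported = {
    "Underhill 2000, Central Asia/Siberia": 6,
    "Cinnioglu 2004, Turkey": 4,
    "Sengupta 2006, China": 10,
    "Sengupta 2006, Japan": 1,
    "Sengupta 2006, Pakistan": 8,
    "Italy DNA Project": 2,
}

# Every M73+ sample identified so far.
total = sum(reported.values())

# Assumption: the study set drops the unhaplotyped Underhill samples
# and the excluded Japanese sample.
excluded = (
    reported["Underhill 2000, Central Asia/Siberia"]
    + reported["Sengupta 2006, Japan"]
)
in_study = total - excluded

print(total, in_study)
```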
One of the two Italian samples has been haplotyped at 25 loci and the other haplotyped at 37 loci, but testing is still in progress. Neither has been tested for DYSA7.2 (a.k.a. DYS461) yet. When both samples are done, each will be haplotyped at 38 loci including all ten loci for which the Cinnioglu and Sengupta samples were tested.
Based on the results we have accumulated (i.e. nine common markers for 21 M73+ samples), it looks like the age of the R1b1b subclade is approximately 10,000 to 12,500 years. I made that estimate based on the pairwise genetic distance of the samples and an effective mutation rate (0.00069) from Zhivotovsky (2004), but such an estimate should be treated with caution at this stage of the investigation.
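To give a feel for how this kind of estimate works, here is a minimal sketch of one common approach: take the average squared distance (in repeat counts) between each sample and a presumed founder haplotype, divide by the mutation rate to get generations, and multiply by a generation length. This is not necessarily the exact calculation I used. The sketch assumes the rate is per locus per 25-year generation, uses the per-locus median as a stand-in for the founder, and runs on invented toy haplotypes rather than the real M73+ data.

```python
from statistics import median

MUTATION_RATE = 0.00069    # effective rate quoted in the text
YEARS_PER_GENERATION = 25  # assumption: generation length is not stated in the post


def estimate_age_years(haplotypes, rate=MUTATION_RATE,
                       gen_years=YEARS_PER_GENERATION):
    """Estimate clade age from the average squared distance to a founder.

    haplotypes: list of equal-length lists of STR repeat counts.
    """
    n_loci = len(haplotypes[0])
    # Stand-in founder haplotype: the median repeat count at each locus.
    founder = [median(h[i] for h in haplotypes) for i in range(n_loci)]
    # Average squared difference from the founder, over samples and loci.
    squared = sum(
        (h[i] - founder[i]) ** 2 for h in haplotypes for i in range(n_loci)
    )
    asd = squared / (len(haplotypes) * n_loci)
    # Generations = ASD / rate; convert generations to years.
    return asd / rate * gen_years


# Hypothetical toy data (three loci, four samples), NOT real project results.
toy_haplotypes = [
    [14, 12, 24],
    [14, 12, 25],
    [15, 12, 24],
    [14, 13, 24],
]
age = estimate_age_years(toy_haplotypes)
print(round(age))
```

With so few samples and markers, a single extra mutation shifts the result by thousands of years, which is one reason to be cautious about any such estimate.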
Given the distribution of samples, it seems probable that R1b1b arose in Central Asia (possibly near Kazakhstan, Kyrgyzstan, or Uzbekistan, which lay along the Silk Road connecting China, Turkey, and Italy) and then spread east and west from there.
Based on the limited haplotype data we currently have, there appear to be two distinct branches within R1b1b. The Italian, Anatolian, and Pakistani samples cluster into one branch, while the Chinese samples cluster into another. You can view a preliminary phylogenetic tree showing these clusters. I excluded the Japanese sample from the Sengupta paper, mentioned earlier, but included all the Chinese samples since it is not yet clear which of them is suspect. The reader should be aware that because our sample size is so small, one or two mistaken inclusions/exclusions could have a dramatic impact on the shape of the phylogenetic tree.
At this point, there is no clear defining modal for the group. The modal values for the few samples we have collected are similar to the Atlantic Modal Haplotype (AMH), with one weak exception: DYS393. The R1b1b samples tend to have slightly higher values (e.g. 14) at DYS393 than the AMH variety of R1b1c which has a modal of 13. But this marker is highly variable in both R1b1b and R1b1c, so DYS393 is not predictive for R1b1b.
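For anyone unfamiliar with the term, a modal haplotype is simply the most common value at each marker across a set of samples. Here is a rough sketch of the calculation. The function name and the sample values are invented for illustration and are not the project's data.

```python
from collections import Counter


def modal_haplotype(samples):
    """Return the most frequent repeat count at each marker.

    samples: list of dicts mapping marker name -> repeat count.
    Ties go to whichever value appears first in the sample list.
    """
    markers = samples[0].keys()
    return {
        m: Counter(s[m] for s in samples).most_common(1)[0][0]
        for m in markers
    }


# Hypothetical toy samples, NOT real R1b1b results.
toy_samples = [
    {"DYS393": 14, "DYS390": 24},
    {"DYS393": 14, "DYS390": 24},
    {"DYS393": 13, "DYS390": 25},
]
modal = modal_haplotype(toy_samples)
print(modal)
```

A modal computed from a handful of samples can flip when one more result arrives, so a marker like DYS393 needs many more samples before its modal means much.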
Our hope is that more complete haplotyping of the existing samples combined with more systematic sampling of Central Asian populations (of the kind being done by the National Geographic Genographic Project) will allow us to refine our understanding of haplogroup R1b1b.