Identifying new sex-linked genes through BAC sequencing in the dioecious plant Silene latifolia
Background: Silene latifolia represents one of the best-studied plant sex chromosome systems. A new approach using RNA-seq data has recently identified hundreds of new sex-linked genes in this species. However, this approach is expected to miss genes that are either not expressed or are expressed at low levels in the tissue(s) used for RNA-seq. Therefore, other independent approaches are needed to discover such sex-linked genes.
Results: Here we used 10 well-characterized S. latifolia sex-linked genes and their homologs in Silene vulgaris, a species without sex chromosomes, to screen BAC libraries of both species. We isolated and sequenced 4 Mb of BAC clones of S. latifolia X and Y and S. vulgaris genomic regions, which yielded 59 new sex-linked genes (with S. vulgaris homologs for some of them). We assembled sequences that we believe represent the tip of the Xq arm. These sequences are clearly not pseudoautosomal, so we infer that the S. latifolia X has a single pseudoautosomal region (PAR) on the Xp arm. The estimated mean gene density in X BACs is 2.2 times lower than that in S. vulgaris BACs, in agreement with the genome size difference between these species. Gene density was estimated to be extremely low in the Y BAC clones. We compared our BAC-located genes with the sex-linked genes identified in previous RNA-seq studies, and found that about half of them (those with low expression in flower buds) had not been identified as sex-linked in those studies. We compiled a set of approximately 70 validated X/Y genes and X-hemizygous genes (without Y copies) from the literature, and used these genes to show that X-hemizygous genes have a higher probability of being undetected by the RNA-seq approach, compared with X/Y genes; we used this to estimate that about 30 % of our BAC-located genes must be X-hemizygous. The estimate is similar when we use only BAC-located genes that have S. vulgaris homologs, which excludes genes that were gained by the X chromosome.
Conclusions: Our BAC sequencing identified 59 new sex-linked genes, and our analysis of these BAC-located genes, in combination with RNA-seq data, suggests that gene losses from the S. latifolia Y chromosome could be as high as 30 %, higher than previous estimates of 10-20 %.
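The way an X-hemizygous fraction can be inferred from differing RNA-seq detection rates may be easier to follow as a small calculation. The sketch below assumes a simple two-class mixture: the fraction of BAC-located genes missed by RNA-seq is a weighted average of the miss rates of X/Y genes and X-hemizygous genes. This model, the function name `estimate_hemizygous_fraction`, and all numeric inputs are illustrative assumptions, not the procedure or the values used in the study.

```python
def estimate_hemizygous_fraction(missed_overall, missed_xy, missed_hemi):
    """Solve missed_overall = h * missed_hemi + (1 - h) * missed_xy for h.

    missed_overall: fraction of BAC-located genes not detected by RNA-seq
    missed_xy:      fraction of validated X/Y genes not detected by RNA-seq
    missed_hemi:    fraction of validated X-hemizygous genes not detected
    All three are proportions in [0, 1]. This is a hypothetical mixture
    model, not the authors' stated method.
    """
    rates = (missed_overall, missed_xy, missed_hemi)
    if not all(0.0 <= r <= 1.0 for r in rates):
        raise ValueError("all rates must be proportions between 0 and 1")
    if missed_hemi <= missed_xy:
        raise ValueError("model assumes X-hemizygous genes are missed more often")
    h = (missed_overall - missed_xy) / (missed_hemi - missed_xy)
    # Clamp to [0, 1] because sampling noise can push the estimate outside.
    return min(1.0, max(0.0, h))


# Placeholder inputs chosen only for illustration (NOT values from the paper):
# half of the BAC genes missed, 35 % of X/Y genes missed, 85 % of
# X-hemizygous genes missed.
h = estimate_hemizygous_fraction(0.5, 0.35, 0.85)
print(f"estimated X-hemizygous fraction: {h:.2f}")
```

Under this model the estimate depends strongly on the two class-specific miss rates, so uncertainty in a validation set of roughly 70 genes would carry through to the inferred fraction.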