Published as a preprint on bioRxiv, this study introduces Dog10K_Boxer_Tasha_1.0, a highly contiguous chromosome-level genome assembly produced using long-read sequencing. The domestic dog plays a crucial role in biomedical research because dogs naturally develop many diseases that also occur in humans. Improving the reference genome is therefore an essential step in advancing canine genetics, comparative genomics, and disease-mapping studies.
The new assembly substantially improves on the earlier CanFam3.1 assembly, closing more than 23,000 genomic gaps, many of which were located in GC-rich promoter regions and first exons, areas essential for accurate gene interpretation. The researchers report a greater than 100-fold increase in sequence contiguity, which gives clearer insight into structural variants and regulatory sequences that were previously unresolved.
A major achievement of Dog10K_Boxer_Tasha_1.0 is the identification of over 1,200 new protein-coding transcripts, increasing the resolution and accuracy of canine gene annotation. This is particularly relevant for understanding genetic mechanisms related to morphology, physiology, and behavior, as well as inherited diseases that parallel human medical conditions.
By combining long-read sequencing with current assembly pipelines, Jagannathan and colleagues provide a resource that addresses the limitations of earlier references. Their work underscores the importance of continually updating genomic resources as sequencing technologies evolve. The improved assembly is expected to accelerate both veterinary and human biomedical research, strengthen genotype–phenotype investigations, and provide a more robust platform for studying canine evolutionary history.
Source: Jagannathan, V., Hitte, C., et al. (2021). Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome. bioRxiv. Accession: GCF_000002285.5.







