Current position in this corpus

Population-scale long-read structural variant (SV) resources are shifting from "nice-to-have" discovery datasets toward clinically and biologically useful infrastructure.

What these atlases do

  • expand the catalog of real human SV diversity beyond short-read-era references
  • provide allele-frequency resources for filtering rare disease candidates
  • capture ancestry-specific and population-stratified variation
  • expose mobile-element and repeat-mediated events that short reads underrepresent
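
As an illustration of the allele-frequency use case above, here is a minimal Python sketch of filtering rare disease candidates against an atlas. Everything named here is hypothetical: the CandidateSV record, the filter_rare helper, the exact-match lookup key, and the 1% frequency cutoff are assumptions chosen for the example, not the interface of any particular atlas. Real pipelines usually match SVs by reciprocal overlap or breakpoint tolerance rather than by exact coordinates.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical lookup key: (chromosome, start, SV type, length).
SVKey = Tuple[str, int, str, int]


@dataclass(frozen=True)
class CandidateSV:
    """A candidate SV from a patient call set (illustrative fields only)."""
    chrom: str
    pos: int
    svtype: str
    svlen: int

    @property
    def key(self) -> SVKey:
        return (self.chrom, self.pos, self.svtype, self.svlen)


def filter_rare(
    candidates: List[CandidateSV],
    atlas_af: Dict[SVKey, float],
    max_af: float = 0.01,  # assumed cutoff, not a recommended value
) -> List[CandidateSV]:
    """Keep candidates that are absent from the atlas or at or below max_af."""
    kept = []
    for sv in candidates:
        af = atlas_af.get(sv.key)  # None means the atlas has no record
        if af is None or af <= max_af:
            kept.append(sv)
    return kept


if __name__ == "__main__":
    # Toy data, invented for the example.
    atlas = {
        ("chr1", 1000, "DEL", 500): 0.25,
        ("chr2", 2000, "INS", 300): 0.002,
    }
    patient = [
        CandidateSV("chr1", 1000, "DEL", 500),
        CandidateSV("chr2", 2000, "INS", 300),
        CandidateSV("chr3", 3000, "DUP", 1200),
    ]
    for sv in filter_rare(patient, atlas):
        print(sv)
```

In this sketch an SV missing from the atlas is kept as a candidate. That choice only makes sense when the atlas covers the patient's ancestry well, which is one reason the ancestry-specific resources listed above matter.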

Current takeaways

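  • Trio-aware and graph-aware designs improve trustworthiness: trios allow calls to be checked for Mendelian consistency, and graph-based genotyping reduces dependence on a single linear reference.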
  • Diverse-population resources matter because many insertions and other SVs remain absent from legacy databases.
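  • Functional follow-up becomes more informative once large cohort resources exist, because candidate SVs can be prioritized by frequency and population context before experimental work begins.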
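
One way a trio-aware design can support trustworthiness is a Mendelian consistency check on genotyped SVs. The Python sketch below is a simplified illustration, not the method of any specific atlas. It assumes diploid autosomal genotypes encoded as allele pairs such as (0, 1), and the function names are hypothetical. True de novo events would also be counted as inconsistencies here, so the rate is an upper bound on genotyping error.

```python
from itertools import product
from typing import Iterable, Tuple

Genotype = Tuple[int, int]  # diploid allele pair, e.g. (0, 1)


def is_mendelian_consistent(
    child: Genotype, father: Genotype, mother: Genotype
) -> bool:
    """True if the child could inherit one allele from each parent."""
    target = tuple(sorted(child))
    return any(
        tuple(sorted((f, m))) == target
        for f, m in product(father, mother)
    )


def mendelian_error_rate(
    trios: Iterable[Tuple[Genotype, Genotype, Genotype]]
) -> float:
    """Fraction of (child, father, mother) genotype triples that are inconsistent."""
    trios = list(trios)
    if not trios:
        raise ValueError("no trio genotypes supplied")
    errors = sum(
        not is_mendelian_consistent(child, father, mother)
        for child, father, mother in trios
    )
    return errors / len(trios)


if __name__ == "__main__":
    # Toy genotypes at four SV sites, invented for the example.
    sites = [
        ((0, 1), (0, 0), (1, 1)),  # consistent
        ((0, 0), (0, 1), (0, 1)),  # consistent
        ((1, 1), (0, 0), (0, 1)),  # inconsistent: father has no alt allele
        ((0, 1), (0, 1), (0, 0)),  # consistent
    ]
    print(mendelian_error_rate(sites))
```

A per-site or per-caller error rate computed this way can be used to compare call sets, under the assumption that sex chromosomes and sites with missing genotypes are handled separately.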

Open questions

  • How should intermediate-coverage and deep-coverage strategies be combined in future atlas building?
  • What is the minimum cohort size at which a new ancestry-specific long-read SV resource becomes clinically valuable?