|
Approaches to long-read sequencing in a clinical setting to improve diagnostic rate
The paper's central claim is pragmatic rather than triumphalist: long-read WGS is genuinely useful, but it should initially be deployed where it has the biggest technical advantage. It shows that long-read WGS covers about 98% of next-generation sequencing dead zones, performs well for small variants, outperforms short reads for structural variants, and adds native methylation information.
|
Sources |
2022 |
Ingested |
Rare Disease |
2 |
Clinical, Complex rearrangements and hard regions, Ingested, Long-read vs short-read WGS |
|
|
Comparative evaluation of SNVs, indels, and structural variations detected with short- and long-read sequencing data
This paper is important because it reduces vague claims like "long reads are better" to specific statements about where they are better. The strongest long-read advantage appears in insertions greater than 10 bp and in SV detection within repetitive regions, especially STR-rich sequence.
|
Sources |
2024 |
Ingested |
Clinical |
3 |
Benchmark, Clinical, Complex rearrangements and hard regions, Ingested |
|
|
compare retrieval strategies
How does persistent synthesis differ from plain RAG? Pending analysis.
|
Queries |
Unspecified |
Open |
Other Papers |
0 |
Open, Query |
|
|
Complex Rearrangements and Hard Regions
This concept covers repetitive regions, segmental duplications, sequencing dead zones, complex medically relevant loci, and rearrangements whose structure matters as much as their presence. This is one of the areas where the case for long-read WGS is strongest.
|
Concepts |
Unspecified |
Unspecified |
Clinical |
6 |
Clinical, Concept |
|
|
Construction of a trio-based structural variation panel utilizing activated T lymphocytes and long-read sequencing technology
The paper is as much about sample logistics as it is about variant discovery. It argues that long-read population studies need a reliable source of high-quality DNA and proposes activated T lymphocytes as a practical biobank resource.
|
Sources |
2022 |
Ingested |
Rare Disease |
4 |
Biobank, Clinical, Coculture, Ingested |
|
|
HiFi vs ONT
HiFi currently looks strongest for general-purpose variant-calling accuracy in this corpus. ONT looks strongest where native DNA properties matter, especially methylation-aware workflows and flexible large-scale deployment.
|
Concepts |
Unspecified |
Unspecified |
Rare Disease |
6 |
Benchmark, Clinical, Concept, HiFi |
|
|
Index
Overview: top-level summary of the current collection. Multi-platform discovery of haplotype-resolved structural variation in human genomes: deeply ingested benchmark source page.
|
Core |
Unspecified |
Unspecified |
Rare Disease |
10 |
Benchmark, Biobank, Clinical, Core |
|
|
Log
Seeded the repository structure for an LLM-maintained research wiki. Added AGENTS.md, README.md, and scripts/wiki.py.
|
Core |
Unspecified |
Unspecified |
Rare Disease |
0 |
Benchmark, Biobank, Clinical, Core |
|
|
Long read sequencing enhances pathogenic and novel variation discovery in patients with rare diseases
This paper argues most directly for long-read WGS as a unified clinical platform. The study reports detection of all known pathogenic SNV, SV, and methylation variants in its positive controls, then shows additional diagnoses in 10% of previously negative cases.
|
Sources |
2025 |
Ingested |
Rare Disease |
7 |
Benchmark, Clinical, HiFi, HiFi vs ONT |
|
|
Long Read WGS Starter Corpus
This corpus contains 10 deeply ingested papers on long-read whole-genome sequencing across human structural variation, clinical diagnostics, large-cohort benchmarking, rare-disease resolution, and population-scale studies. Multi-platform discovery of haplotype-resolved structural variation in human genomes
|
Syntheses |
Unspecified |
Active |
Rare Disease |
10 |
Active, Benchmark, Biobank, Clinical |
|
|
Long-read genome sequencing resolves complex genomic rearrangements in rare genetic syndromes
This paper is a clean demonstration of why long reads matter beyond simply finding "a CNV is present." In both patients, prior methods detected copy-number abnormalities, but long-read sequencing resolved the actual rearrangement architecture.
|
Sources |
2024 |
Ingested |
Rare Disease |
5 |
Clinical, Complex rearrangements and hard regions, HiFi, Ingested |
|
|
Long-read sequencing of 945 Han individuals identifies structural variants associated with phenotypic diversity and disease susceptibility
This paper moves beyond atlas building into trait interpretation. It reports 111,288 SVs, with 24.56% not previously reported, and uses phenotypic, multi-omics, and mouse-model follow-up to argue that selected SVs are causal rather than merely associated.
|
Sources |
2025 |
Ingested |
Kidney |
2 |
Clinical, Ingested, Kidney, Long-read vs short-read WGS |
|
|
Long-read vs Short-read WGS
Long-read WGS is not uniformly better for every task, but it is consistently better where read length, phasing, or native molecule information matters. insertions larger than roughly 10 bp
|
Concepts |
Unspecified |
Unspecified |
Clinical |
5 |
Clinical, Concept, Methylation, Population |
|
|
Long-read whole-genome analysis of human single cells
This paper is a frontier methods study rather than a diagnostic or population paper. It shows that single-cell long-read WGS can recover variant classes and genomic regions that short-read single-cell workflows miss, including repeat-rich and dark regions.
|
Sources |
2023 |
Ingested |
Rare Disease |
4 |
Assembloid, Clinical, HiFi, Ingested |
|
|
Multi-platform discovery of haplotype-resolved structural variation in human genomes
This paper establishes the benchmark logic for much of the later long-read WGS literature in this corpus: structural variation is systematically undercounted when sequencing and calling are dominated by short reads. By combining multiple orthogonal technologies, the authors report roughly 818,054 indels and 27,622 SVs per genome, plus about 156 inversions per genome.
|
Sources |
2019 |
Ingested |
Clinical |
5 |
Benchmark, Clinical, Complex rearrangements and hard regions, HiFi |
|
|
Overview
As of 2026-04-07, this wiki contains 10 deeply ingested long-read WGS papers spanning 2019-2025. Long-read WGS shows its clearest advantage in structural variation, insertions, repeat-rich loci, sequencing dead zones, complex rearrangements, phasing, and methylation-aware workflows.
|
Core |
Unspecified |
Unspecified |
Clinical |
0 |
Benchmark, Clinical, Core, HiFi |
|
|
Population-scale SV Atlases
Population-scale long-read SV resources are shifting from "nice to have discovery datasets" toward clinically and biologically useful infrastructure. Otsuki 2022
|
Concepts |
Unspecified |
Unspecified |
Rare Disease |
4 |
Concept, Population, Rare Disease, Structural Variation |
|
|
Rare Disease Diagnostics
Long-read WGS adds the most diagnostic value in rare disease when the likely mechanism involves structural complexity, dead zones, phasing problems, methylation, or hard-to-map medically relevant genes. Kobayashi 2022
|
Concepts |
Unspecified |
Unspecified |
Rare Disease |
4 |
Clinical, Concept, Methylation, Population |
|
|
Structural Variation
Structural variants in this corpus are genomic alterations typically 50 bp or larger, including deletions, insertions, duplications, inversions, translocations, mobile element insertions, and repeat-mediated events. Structural variation is the area where long-read WGS shows its clearest and most consistent advantage.
|
Concepts |
Unspecified |
Unspecified |
Clinical |
6 |
Clinical, Concept, Population, Structural Variation |
|
|
Structural variation in 1,019 diverse humans based on long-read sequencing
This paper is the most globally representative population resource in the corpus. It shows that intermediate-coverage long-read sequencing can still produce a highly useful SV atlas when paired with strong graph-aware analysis.
|
Sources |
2025 |
Ingested |
Rare Disease |
2 |
Clinical, Complex rearrangements and hard regions, Ingested, Paper |
|
|
Utility of long-read sequencing for All of Us
This paper asks a deployment question: if a program the size of All of Us considers long reads, what do they buy? The answer is that long reads materially improve coverage and variant recovery in medically relevant, technically challenging genes, and that HiFi gives the strongest overall variant-calling accuracy in this pilot.
|
Sources |
2024 |
Ingested |
Clinical |
6 |
Benchmark, Biobank, Clinical, Complex rearrangements and hard regions |
|