The Genome

Context on what your genome is, how scan types differ, and how Varia scans your file locally.

A human genome is a sequence of approximately 3 billion base pairs split across 23 pairs of chromosomes. Inside that sequence sit around 20,000 protein-coding genes, but those genes account for only about 1.5 percent of the total. The remaining 98.5 percent is regulatory sequence that controls when genes turn on, non-coding RNAs that perform their own functions, structural elements that organize the chromosome, repetitive sequence, and regions whose function the literature has not yet resolved. The genome is far more than the genes.

Within those 3 billion bases, roughly 10 million positions are commonly variable across the human population. A variant at one of these positions is what makes one person's biology slightly different from another's at the molecular level. Most variants are biologically silent. A small fraction sit in or near a gene in a way that materially changes how a protein is built, expressed, or regulated, and a smaller fraction still have been studied carefully enough that we can say what the variant means with clinical confidence.

That fraction is what Varia curates. Out of approximately 10 million common variants in the human genome, Varia covers 49 SNPs that we have evaluated against the peer-reviewed literature and judged interpretable enough to act on. Each variant carries genotype-specific findings across the 12 health domains Varia covers in V1.
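The proportions above are easier to hold in mind once they are worked through. The following is a minimal back-of-the-envelope sketch that uses only the approximate figures quoted in this section; the variable names are illustrative and the inputs are rounded estimates, not measured values.

```python
# Scale check using the approximate figures quoted in the text.
GENOME_BASES = 3_000_000_000   # ~3 billion base pairs
CODING_FRACTION = 0.015        # ~1.5 percent of bases are protein-coding
COMMON_VARIANTS = 10_000_000   # ~10 million commonly variable positions
CURATED_SNPS = 49              # SNPs curated in Varia V1

coding_bases = GENOME_BASES * CODING_FRACTION
variable_share = COMMON_VARIANTS / GENOME_BASES
curated_share = CURATED_SNPS / COMMON_VARIANTS
one_in = COMMON_VARIANTS // CURATED_SNPS

print(f"Protein-coding bases: ~{coding_bases:,.0f}")
print(f"Commonly variable positions: {variable_share:.2%} of the genome")
print(f"Curated share of common variants: {curated_share:.5%}")
print(f"That is roughly 1 in {one_in:,} common variants")
```

On these inputs, about 45 million bases are protein-coding, and the curated panel is roughly 1 in 204,000 of the commonly variable positions.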

What's in a genome

[Figure: scale at a glance. Full human genome: ~3 billion base pairs. Coding genes: ~1.5% of bases, encoding ~20,000 proteins. Common variation: ~10 million SNPs. Varia V1 curation: 49 SNPs.]
The genome is huge. The fraction that's interpretable for action is small. Varia curates the most interpretable subset.

Forty-nine is a deliberately small number. The reason is the gap between what genomics can read and what it can interpret.

In 2003, when the first human genome was completed, sequencing it had cost approximately one billion dollars and taken 13 years. By 2026, sequencing a human genome costs under 200 dollars and takes a few days. That is roughly a six-order-of-magnitude improvement in throughput per dollar over 23 years. Over the same period, the peer-reviewed clinical interpretation of what individual variants mean has grown by roughly three orders of magnitude. The two curves are not parallel: on a logarithmic scale, they now sit about three orders of magnitude apart. The gap between what we can sequence and what we can interpret has widened steadily since the first human genome was published, and there is no reason to expect it to close soon.
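The orders-of-magnitude comparison can be checked with a few lines of arithmetic. This sketch takes the quoted costs at face value and treats "under 200 dollars" as exactly 200, so the ratio it prints is a lower bound. The three-orders figure for interpretation is the estimate stated above, not something the code derives.

```python
import math

COST_2003_USD = 1_000_000_000  # ~one billion dollars, as quoted in the text
COST_2026_USD = 200            # "under 200 dollars", used here as an upper bound
INTERPRETATION_ORDERS = 3      # ~three orders of magnitude, as quoted in the text

cost_ratio = COST_2003_USD / COST_2026_USD
sequencing_orders = math.log10(cost_ratio)
gap_orders = sequencing_orders - INTERPRETATION_ORDERS

print(f"Cost ratio, 2003 vs 2026: {cost_ratio:,.0f}x")
print(f"Sequencing improvement: ~{sequencing_orders:.1f} orders of magnitude")
print(f"Gap over interpretation: ~{gap_orders:.1f} orders of magnitude")
```

With these inputs the cost ratio is 5,000,000, a little over six orders of magnitude, which leaves sequencing more than three orders of magnitude ahead of interpretation.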

[Figure: The interpretation gap, 2003 to 2026, logarithmic scale. Sequencing throughput per dollar grew by approximately 10^6; peer-reviewed clinical interpretation grew by approximately 10^3. The gap between the two curves keeps widening.]
We can read genomes faster than we can interpret them. Varia commits to the slower side of that gap.

Varia commits to the slower, harder side of that gap. The product is the editorial discipline that decides which findings cross the bar from "studied" to "actionable," not the count of variants surfaced. Promethease and similar consumer genomics products operate at the other end of the same gap, returning tens of thousands of raw associations with no curation between the user and the literature. Both approaches have their place. Varia's bet is that for most users, most of the time, the question is not "what does my genome say about every variant ever studied" but "what does my genome say that I and my physician can do something about."
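The difference between a curated panel and a raw association dump can be pictured as a lookup that stays silent about anything outside the panel. The sketch below is purely illustrative and is not Varia's implementation: the `Finding` type, the `CURATED_PANEL` table, the `curated_findings` function, and every rsID, genotype, and finding string are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    domain: str   # health domain label (placeholder value below)
    summary: str  # genotype-specific interpretation (placeholder text below)


# Hypothetical curated panel: rsID -> genotype -> finding.
# These identifiers and strings are invented for illustration only.
CURATED_PANEL: dict[str, dict[str, Finding]] = {
    "rs0000001": {
        "AA": Finding("example-domain", "placeholder finding for AA"),
        "AG": Finding("example-domain", "placeholder finding for AG"),
    },
}


def curated_findings(genotypes: dict[str, str]) -> list[tuple[str, Finding]]:
    """Return findings only for variants that appear on the curated panel."""
    results = []
    for rsid, genotype in genotypes.items():
        by_genotype = CURATED_PANEL.get(rsid)
        if by_genotype is None:
            continue  # not curated, so deliberately not reported
        finding = by_genotype.get(genotype)
        if finding is not None:
            results.append((rsid, finding))
    return results


# A user's file may contain many variants; only curated ones produce output.
user_genotypes = {"rs0000001": "AG", "rs0000002": "CC", "rs0000003": "TT"}
for rsid, finding in curated_findings(user_genotypes):
    print(rsid, finding.domain, finding.summary)
```

The design choice on display is the `continue` branch: a variant that is not on the panel yields no output at all, rather than a raw association the user has to interpret alone.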

The rest of the genome is real, meaningful, and worth study. Varia V2 and later versions will add curated domains as the literature matures. What Varia will not do is enumerate the unstudied.