Mutations in somatic cells generate a heterogeneous genomic population and may result in serious medical conditions, such as cancer. High-throughput sequencing technologies have driven an improved understanding of the interplay between somatic mutations and cancer. Nonetheless, somatic variants remain notoriously difficult to identify, because sequencing is prone to artifactual errors that occur at the same low allelic frequency as cancer mutations. Most sequencing errors are thought to result from polymerase chain reaction mistakes or sequencing miscalls. In this paper, the authors show that mutagenic damage accounts for the majority of erroneously identified variants of low to moderate frequency. To do so, they developed an algorithm that computes a Global Imbalance Value (GIV), an estimate of the amount of damage caused by sequencing. They found signatures of damage in most sequencing data sets in widely used resources, including the 1000 Genomes Project and The Cancer Genome Atlas, establishing damage as a pervasive cause of sequencing errors. The extent of this damage directly confounds the determination of somatic variants in these data sets.