Abstract: Cancer types (e.g., organ/tissue of origin and/or cancer subtype for an organ/tissue) can be distinguished by applying statistical methods to data samples consisting of counts of somatic single nucleotide variations (SNVs) across a tumor genome of a patient. For example, a factor loading matrix for each cancer type to be distinguished can be computed using a set of training data samples for which the cancer type is known. To determine the cancer type for a testing (or diagnostic) data sample, a regression analysis over the set of factor loading matrices yields statistical parameters that can be used to identify the cancer type.
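The two steps described above (a factor loading matrix per cancer type, then a regression of a test sample over those matrices) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the method as claimed: each sample is taken to be a vector of SNV counts over a fixed set of bins, loadings are extracted by principal-component factoring of the covariance matrix, the regression is ordinary least squares, and the coefficient of determination (R²) is the statistic used to pick a cancer type. The function names, the synthetic data, and the labels "A" and "B" are all hypothetical.

```python
import numpy as np


def factor_loadings(samples, n_factors):
    """Loading matrix (n_features x n_factors) for one cancer type.

    Assumption: principal-component extraction from the sample covariance;
    the source does not say which factor-analysis variant is used.
    """
    X = np.asarray(samples, dtype=float)
    cov = np.cov(X, rowvar=False)            # features are columns
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)


def regression_r2(loadings, sample):
    """R^2 from regressing one sample's counts on a type's factor loadings."""
    y = np.asarray(sample, dtype=float)
    A = np.column_stack([np.ones(len(y)), loadings])  # intercept + factors
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - float(resid @ resid) / ss_tot if ss_tot > 0 else 0.0


def classify(loadings_by_type, sample):
    """Return the type whose loadings explain the sample best, plus all scores."""
    scores = {t: regression_r2(L, sample) for t, L in loadings_by_type.items()}
    return max(scores, key=scores.get), scores


# Synthetic training data (hypothetical): 12 SNV-count bins, two cancer types
# whose counts vary along different bin patterns.
rng = np.random.default_rng(0)
pattern = {
    "A": np.array([1, 0] * 6, dtype=float),            # alternating bins
    "B": np.array([1] * 6 + [0] * 6, dtype=float),     # first half of the bins
}


def simulate(cancer_type, n):
    strength = rng.uniform(0.0, 60.0, size=n)
    rates = 20.0 + strength[:, None] * pattern[cancer_type]
    return rng.poisson(rates)


# Step 1: one factor loading matrix per cancer type, from labeled samples.
loadings = {t: factor_loadings(simulate(t, 200), n_factors=1) for t in pattern}

# Step 2: regress test samples over every loading matrix and compare R^2.
test_a = rng.poisson(20.0 + 50.0 * pattern["A"])
test_b = rng.poisson(20.0 + 50.0 * pattern["B"])
label_a, scores_a = classify(loadings, test_a)
label_b, scores_b = classify(loadings, test_b)
print(label_a, scores_a)
print(label_b, scores_b)
```

In this sketch the test sample is assigned to the type with the highest R². Other regression statistics, such as coefficient significance or residual variance, could serve the same role, and real SNV counts would likely need normalization before factoring.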