ASSAYS TO DETERMINE DNA METHYLATION AND DNA METHYLATION MARKERS OF CANCER
Methods are provided for determining a genomic methylation profile in a DNA sample. In certain aspects, the methods can be used to determine if a subject has, or is at risk for developing, a bladder cancer or other cancers of the urinary tract. Methods for treatment of such subjects are likewise provided.
The present application is a continuation of co-pending U.S. application Ser. No. 15/552,825, which was filed Aug. 23, 2017, which is a national phase application under 35 U.S.C. § 371 of International Application No. PCT/US2016/019310, filed Feb. 24, 2016, which claims the benefit of U.S. Provisional Patent Application No. 62/120,373, filed Feb. 24, 2015, both of which are incorporated herein by reference in their entirety.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to the fields of molecular biology, epigenetics, and predictive medicine. More particularly, it concerns methods for determining a genomic DNA methylation profile in a sample.
2. Description of Related Art
Cancers of the urinary tract include bladder, urethral, kidney and prostate cancers whose cells are detectable via urine or biopsy. Bladder cancer was one of the 10 most prevalent malignancies in males in 2011, ranking fourth and eighth in terms of new cases and deaths, respectively (Siegel et al., 2011; Morgan and Clark, 2010). Nonmuscle invasive bladder cancer (NMIBC) accounts for 80% of all cases, and can be further classified into mucosa only (Ta), carcinoma in situ (Tis), and lamina propria invading (T1) lesions (Babjuk et al., 2011; Sobin et al., 2009). The primary treatment for NMIBC is transurethral resection of bladder tumor (TURBT) with or without intravesical chemotherapy or immunotherapy; however, more than 50% of patients recur after the TURBT procedure, with the highest rate of recurrence occurring in patients with high-risk disease (Shelley et al., 2010; Millan-Rodriguez et al., 2000). As a result, patients require frequent and lifelong monitoring following TURBT, making bladder cancer one of the most costly types of cancer to manage.
The current standard for monitoring of bladder cancer recurrence involves the use of cystoscopy and cytology (Morgan and Clark, 2010; Babjuk et al., 2011). Disease surveillance is cumbersome because of the invasive nature of cystoscopic examination and the low sensitivity of urinary cytology in the detection of low-grade tumors (Lintula and Hotakainen, 2010). The addition of nuclear matrix protein 22 (NMP-22), bladder tumor antigen, or UroVysion FISH has been shown to help increase the sensitivity of cytology (Parker and Spiess, 2011). However, due to their inconsistent performance in terms of specificity or sensitivity, the markers proposed to date have not been widely adopted in routine clinical practice (Reinert, 2012). Therefore, there is a need to find reliable markers to monitor bladder cancer patients as well as to distinguish different cancers associated with the urinary tract. Moreover, there is a need for new methodologies for assessing genomic DNA methylation profiles in biological samples.
SUMMARY OF THE INVENTION
In a first embodiment, the invention provides a method for determining a genomic DNA methylation profile in a sample comprising: (a) obtaining a substantially purified test genomic DNA sample; (b) contacting a portion of the test genomic DNA of the sample with a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second and third different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label; wherein (I) the first genomic region is a cleavage control that is known to be unmethylated, (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture, and (III) the third genomic region is a test region having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; (c) subjecting the first reaction mixture to digestion and thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the sample in the first reaction mixture; and (d) using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample.
In further aspects, the method additionally comprises: (a) obtaining a substantially purified test genomic DNA sample; (b) contacting a portion of the test genomic DNA of the sample with a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second and third different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label; and a second reaction mixture, identical to the first reaction mixture, but lacking the at least two methylation sensitive restriction endonucleases, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third genomic region is a test region having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; (c) subjecting the first and second reaction mixtures to digestion and thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the samples in the first and second reaction mixtures; and (d) using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample.
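By way of illustration only, one common way to turn threshold cycle (Ct) values from the digested (first) and undigested (second) reaction mixtures into a relative methylation percentage is a delta-delta-Ct calculation normalized to the copy number control. The formula, the function name and the default per-cycle amplification factor of 2.0 are assumptions made for this sketch and are not specified in the text above.

```python
def relative_methylation_percent(ct_test_digest, ct_copy_digest,
                                 ct_test_mock, ct_copy_mock,
                                 amplification_factor=2.0):
    """Estimate the percent of test-region template that resists digestion.

    Methylated template is protected from methylation sensitive
    endonucleases, so the surviving fraction approximates the methylated
    fraction. All Ct inputs are hypothetical measured values.
    """
    # Normalize the test region to the copy number control in each mixture.
    delta_digest = ct_test_digest - ct_copy_digest
    delta_mock = ct_test_mock - ct_copy_mock
    # Each extra cycle in the digested mixture means less surviving template.
    surviving = amplification_factor ** -(delta_digest - delta_mock)
    # Cap at 100% to absorb measurement noise.
    return 100.0 * min(surviving, 1.0)


# Example: the test region comes up two cycles later after digestion.
print(relative_methylation_percent(27.0, 25.0, 25.0, 25.0))
```

In this sketch a two-cycle delay corresponds to roughly one quarter of the template surviving digestion.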
In some aspects, the first reaction mixture further comprises a PCR enhancer. In specific aspects, the PCR enhancer may comprise DMSO. In certain aspects, obtaining a substantially purified test genomic DNA sample may comprise purifying the DNA sample. In several aspects, the substantially purified test genomic DNA sample is of sufficient purity to provide at least 85%, 90%, 95% or 99% digestion of the DNA by said at least two methylation sensitive restriction endonucleases in 2 hours at 30° C. In particular aspects, the substantially purified test genomic DNA sample comprises 50 pg to 1,000 ng of DNA. In some specific aspects, the substantially purified test genomic DNA sample comprises less than 50 ng of DNA.
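As a hedged sketch, the completeness of digestion could be estimated from the cleavage control region, which is known to be unmethylated and should therefore be cut almost entirely. The calculation below compares that region against the copy number control in digested and undigested mixtures; the formula, the names and the 95% default threshold (one of the values listed above) are illustrative assumptions.

```python
def digestion_percent(ct_cleave_digest, ct_copy_digest,
                      ct_cleave_mock, ct_copy_mock,
                      amplification_factor=2.0):
    """Estimate percent digestion from the unmethylated cleavage control."""
    ddct = ((ct_cleave_digest - ct_copy_digest)
            - (ct_cleave_mock - ct_copy_mock))
    # Fraction of cleavage-control template that escaped digestion.
    surviving = min(amplification_factor ** -ddct, 1.0)
    return 100.0 * (1.0 - surviving)


def digestion_is_sufficient(percent, threshold=95.0):
    """Return True when the estimated digestion meets the chosen threshold."""
    return percent >= threshold


# Example: the cleavage control is delayed by five cycles after digestion.
pct = digestion_percent(30.0, 25.0, 25.0, 25.0)
print(pct, digestion_is_sufficient(pct))
```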
In certain aspects, the substantially purified test genomic DNA sample may be obtained from a urine, stool, saliva, blood or tissue sample. In some particular aspects, the substantially purified test genomic DNA sample is obtained from a biopsy sample. In other particular aspects, the substantially purified test genomic DNA sample is obtained from a urine sample. In several aspects, the first reaction mixture comprises at least three methylation sensitive restriction endonucleases. In further aspects, the at least three methylation sensitive restriction endonucleases comprise AciI, HinP1I and HpaII.
In some aspects, step (b) may further comprise contacting a portion of the test genomic DNA of the sample with a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second, third and fourth different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second, third and fourth different genomic regions for quantitative detection of amplified sequences from the first, second, third and fourth different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated, (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture, and (III) the third and fourth genomic regions are test regions having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture.
In further aspects, step (b) may additionally comprise contacting a portion of the test genomic DNA of the sample with a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second, third, fourth and fifth different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second, third, fourth and fifth different genomic regions for quantitative detection of amplified sequences from the first, second, third, fourth and fifth different genomic regions, wherein each of the probes comprises a distinct fluorescent label, and wherein (I) the first genomic region is a cleavage control that is known to be unmethylated, (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture, and (III) the third, fourth and fifth genomic regions are test regions having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture. In certain aspects, the third, fourth and fifth genomic regions may be regions of Unk05, Unk09 and SOX17. In still further aspects, a method of the embodiments further comprises determining whether the cells comprise an aneuploidy relative to one or more gene regions.
In several aspects, the third genomic region includes at least 4, 5, 6, 7 or 8 cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture. In some aspects, the primer pairs are complementary to sequences no more than 300, 275, 250, 225, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60 or 50 nucleotides apart. In certain aspects, the first genomic region is a genomic region of a housekeeping gene. The housekeeping gene may be GAPDH. In still further aspects, the second genomic region may be a genomic region of the POLR2A gene. In some aspects, the third genomic region is selected from the group provided in Table 1A or Table 2. In some specific aspects, the third genomic region is selected from the group consisting of DMRTA2, EVX2, Unk21, OTX1, SOX1, SEPT9, Unk05, Unk09, GALR1, Unk07, Unk19, TBX15, EEF1A2, TFAP2B, DCHS2 and SOX17. In a particular aspect, using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample comprises calculating the relative methylation percentages for the sample. In still further aspects, a method of the embodiments comprises the use of one or more of the probes or primer pairs provided in Table 1C.
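As a minimal sketch of how a candidate test region might be screened against the cut-site and length criteria above, the following counts recognition sites for AciI, HinP1I and HpaII in an amplicon sequence. The recognition sequences are the commonly published ones; the helper names, the use of total amplicon length as a stand-in for the distance between primer binding sites, and the example sequence are assumptions for illustration.

```python
# Recognition sequences written 5'->3' on the top strand. AciI is not
# palindromic, so its site is searched in both orientations.
RECOGNITION_SITES = {
    "AciI": ("CCGC", "GCGG"),
    "HinP1I": ("GCGC",),
    "HpaII": ("CCGG",),
}


def count_cut_sites(sequence, enzymes=RECOGNITION_SITES):
    """Count (possibly overlapping) recognition sites per enzyme."""
    sequence = sequence.upper()
    counts = {}
    for name, motifs in enzymes.items():
        total = 0
        for motif in motifs:
            start = sequence.find(motif)
            while start != -1:
                total += 1
                start = sequence.find(motif, start + 1)
        counts[name] = total
    return counts


def amplicon_meets_criteria(sequence, min_sites=3, max_length=300):
    """Check the total site count and a length limit for one amplicon."""
    total_sites = sum(count_cut_sites(sequence).values())
    return len(sequence) <= max_length and total_sites >= min_sites


example = "AACCGGTTGCGCAACCGCTT"  # hypothetical sequence, not from Table 1
print(count_cut_sites(example), amplicon_meets_criteria(example))
```

A copy number control candidate could be screened with the same helper by requiring a total count of zero.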
In a further embodiment, the invention provides a reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) a substantially purified genomic DNA sample; (vi) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second and third different genomic region in the DNA sample; and (vii) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third genomic region is a test region having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture. In some aspects, the third genomic region is selected from the group provided in Table 1A or Table 2. In still further aspects, the probes complementary to sequences in said first, second and third different genomic regions are selected from the probes provided in Table 1C.
In still a further embodiment, there is provided a method for determining a genomic DNA methylation profile in a sample comprising: (a) obtaining a test genomic DNA sample, which has been bisulfite converted; (b) contacting the test sample with a first reaction mixture comprising: (i) a hot-start DNA polymerase; (ii) a pH buffered salt solution; (iii) dNTPs; (iv) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first and second different genomic region in the DNA sample, wherein the primer pairs are complementary to sequences no more than 200 nucleotides apart; and (v) fluorescent probes complementary to sequences in said first and second different genomic regions for quantitative detection of amplified sequences from the first and second different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a copy number control region that does not comprise CpG dinucleotides; and (II) the second genomic region is a test region having an unknown amount of methylation and including at least five CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second genomic region; (c) subjecting the first reaction mixture to thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the sample in the first reaction mixture; and (d) using the detected fluorescent signals and fluorescent signal from a DNA methylation standard curve to determine the genomic DNA methylation profile in a sample.
In yet still a further embodiment, the invention provides a method for determining a genomic DNA methylation profile in a sample comprising: (a) obtaining a test genomic DNA sample, which has been bisulfite converted, and a methylation control genomic DNA sample that has been fully methylated; (b) contacting the test sample with a first reaction mixture comprising: (i) a hot-start DNA polymerase; (ii) a pH buffered salt solution; (iii) dNTPs; (iv) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first and second different genomic region in the DNA sample, wherein the primer pairs are complementary to sequences no more than 200 nucleotides apart; and (v) fluorescent probes complementary to sequences in said first and second different genomic regions for quantitative detection of amplified sequences from the first and second different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a copy number control region that does not comprise CpG dinucleotides; and (II) the second genomic region is a test region having an unknown amount of methylation and including at least five CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second genomic region; (c) contacting the methylation control bisulfite converted genomic DNA sample with a second reaction mixture having identical components as said first reaction mixture; (d) subjecting the first and second reaction mixtures to thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the samples in the first and second reaction mixtures; and (e) using the detected fluorescent signals and fluorescent signal from a DNA methylation standard curve to determine the genomic DNA methylation profile in a sample.
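For illustration, a DNA methylation standard curve is often built by running a dilution series of fully methylated control DNA and fitting Ct against the logarithm of input copies; an unknown Ct is then interpolated on the fitted line. The sketch below assumes that approach. The dilution values, Ct values and function names are hypothetical and are not taken from the examples of this specification.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate methylated copy number for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical dilution series of fully methylated control DNA.
slope, intercept = fit_standard_curve([10, 100, 1000, 10000],
                                      [35.0, 31.7, 28.4, 25.1])
print(slope, intercept, copies_from_ct(30.05, slope, intercept))
```

Dividing the interpolated methylated copy number by the copy number obtained for the copy number control region would give one possible estimate of the methylated fraction.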
In some aspects of the embodiments described herein, the methods further comprise using the detected fluorescent signals of the first genomic region to normalize DNA quantity across all tested samples. In certain aspects, the first reaction mixture may further comprise a PCR enhancer. In particular aspects, the PCR enhancer comprises DMSO. In several aspects, determining the genomic DNA methylation profile comprises determining the copy number of methylated DNA molecules. In still further aspects, the method may additionally comprise using the detected fluorescent signals to determine a ratio of methylation in the test sample to the reference methylation control. In specific aspects, the genomic DNA sample comprises 50 pg to 10 ng of DNA. In certain aspects, the genomic DNA sample may comprise DNA isolated from 6 to 1,500 cells.
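The relationship between DNA mass and cell number stated above can be approximated with a short calculation. The sketch assumes roughly 6.6 pg of genomic DNA per diploid human cell, a commonly quoted approximation that is not given in this specification.

```python
PG_DNA_PER_DIPLOID_CELL = 6.6  # assumed approximate value, in picograms


def cell_equivalents(mass_pg, pg_per_cell=PG_DNA_PER_DIPLOID_CELL):
    """Convert a genomic DNA mass in picograms to approximate cell equivalents."""
    return mass_pg / pg_per_cell


# 50 pg and 10 ng (10,000 pg) bracket the range recited above.
print(cell_equivalents(50), cell_equivalents(10_000))
```

Under this assumption the recited mass range corresponds to cell numbers of the same order as the 6 to 1,500 cells recited above; the exact figures depend on the per-cell DNA mass chosen.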
In yet still further aspects of the embodiments described herein, the methods may additionally comprise obtaining a genomic DNA sample and subjecting the genomic DNA sample to bisulfite conversion. In specific aspects, the genomic DNA is obtained from a urine, stool, saliva, blood or tissue sample. In other aspects, the genomic DNA is obtained from a biopsy sample. In a particular aspect, the genomic DNA is obtained from a urine sample.
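To illustrate why bisulfite conversion makes methylation readable by PCR, the following sketch performs the conversion in silico on a single top strand: unmethylated cytosines are read as thymine after conversion and amplification, while methylated cytosines are retained. The function name and the example sequence are hypothetical, and the model ignores incomplete conversion and the second strand.

```python
def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Return the top-strand sequence as read after bisulfite conversion.

    methylated_positions holds 0-based indices of methylated cytosines,
    which are protected from conversion.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C is read as T
        else:
            converted.append(base)
    return "".join(converted)


# The CpG cytosine at index 1 is methylated; the CpG at index 5 is not.
print(bisulfite_convert("ACGTCCGA", {1}))
print(bisulfite_convert("ACGTCCGA"))
```

Primers and probes that retain C at CpG positions therefore match only template that was methylated before conversion.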
In certain aspects, step (b) further comprises contacting the test sample with a first reaction mixture comprising: (i) a hot-start DNA polymerase; (ii) a pH buffered salt solution; (iii) dNTPs; (iv) DNA primer pairs for PCR amplification of at least a first, second and third different genomic region in the DNA sample, wherein the primer pairs are complementary to sequences no more than 200 nucleotides apart; and (v) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a copy number control region that does not comprise CpG dinucleotides; and (II) the second and third genomic regions are test regions having an unknown amount of methylation and including at least five CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second and third genomic regions.
In a particular aspect, step (b) may still further comprise contacting the test sample with a first reaction mixture comprising: (i) a hot-start DNA polymerase; (ii) a pH buffered salt solution; (iii) dNTPs; (iv) DNA primer pairs for PCR amplification of at least a first, second, third, fourth and fifth different genomic region in the DNA sample, wherein the primer pairs are complementary to sequences no more than 200 nucleotides apart; and (v) fluorescent probes complementary to sequences in said first, second, third, fourth and fifth different genomic regions for quantitative detection of amplified sequences from the first, second, third, fourth and fifth different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a copy number control region that does not comprise CpG dinucleotides; and (II) the second, third, fourth and fifth genomic regions are test regions having an unknown amount of methylation and including at least five CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second, third, fourth and fifth genomic regions.
In several aspects of the embodiments described herein, the second genomic region includes at least 6, 7, 8, 9 or 10 CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second region. In some specific aspects, one of said CpG dinucleotides in sequences that are complementary to DNA primer pairs includes a C positioned in the last five nucleotides at the 3′ end of the DNA primer pairs. In a further aspect, the C may be positioned at the 3′ end of the DNA primer pairs.
In certain aspects, the primer pairs are complementary to sequences no more than 170, 160, 150, 140, 130, 120, 110 or 100 nucleotides apart. In some aspects, each of the probes is no more than 40 bp in length. In some particular aspects, each of the probes is no more than 30 bp in length. In several aspects, each of the probes comprises a CG ratio of 30-80%. In specific aspects, each of the primers has a Tm of 55-62° C. In a further aspect, each of the probes may have a Tm of 65-72° C.
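The primer and probe design criteria above lend themselves to a simple programmatic check. The following is a sketch only: the helper names are hypothetical, melting temperature is supplied by the caller rather than computed (since the appropriate Tm model is not stated here), and the optional downstream base models the case in which the C of a CpG is the 3′ terminal nucleotide of the primer and its G lies in the template.

```python
def gc_percent(sequence):
    """Percent of G and C bases in an oligonucleotide."""
    sequence = sequence.upper()
    return 100.0 * sum(base in "GC" for base in sequence) / len(sequence)


def count_cpg(sequence):
    """Number of CpG dinucleotides in an oligonucleotide."""
    return sequence.upper().count("CG")


def cpg_c_in_3prime_window(primer, downstream_base="", window=5):
    """True if the C of a CpG lies within the last `window` primer bases.

    downstream_base is the template base immediately 3' of the primer.
    """
    combined = (primer + downstream_base).upper()
    n = len(primer)
    return any(combined[i:i + 2] == "CG" for i in range(max(n - window, 0), n))


def probe_meets_criteria(probe, tm_celsius, max_length=40):
    """Check length, GC content (30-80%) and Tm (65-72 C) for a probe."""
    return (len(probe) <= max_length
            and 30.0 <= gc_percent(probe) <= 80.0
            and 65.0 <= tm_celsius <= 72.0)


print(count_cpg("ACGCGT"), cpg_c_in_3prime_window("AAAAAACGTA"))
```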
In yet still further aspects, the copy number control region is a region of the COL2A1 gene. In some aspects, the second genomic region is selected from the group provided in Table 1A. In some particular aspects, the second genomic region is selected from the group consisting of DMRTA2, EVX2, Unk21, OTX1, SOX1, SEPT9, Unk05, Unk09, GALR1, Unk07, Unk19, TBX15, EEF1A2, TFAP2B, DCHS2 and SOX17. In certain aspects, using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample comprises calculating the relative methylation percentages for the sample.
In yet a further embodiment there is provided a synthetic polynucleotide sequence comprising a sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to one of the probe sequences selected from those provided in Table 1B or 1C, wherein the polynucleotide is conjugated to a reporter molecule. In some aspects, the synthetic polynucleotide comprises a sequence identical to one of the probes of Table 1B or 1C. In certain aspects, the reporter molecule is a fluorophore.
In still a further embodiment there is provided a primer pair, where the primers comprise a sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to one of the primer sequences selected from those provided in Table 1B or 1C. In some aspects, the primer pair comprises primers having a sequence identical to a primer pair of Table 1B or 1C.
In still further aspects, a kit is provided comprising reagents for performing qPCR and a recombinant polynucleotide sequence comprising a sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to one of the probe sequences selected from those provided in Table 1B or 1C, wherein the polynucleotide is conjugated to a reporter molecule.
In a further embodiment, the invention provides a method of treating a patient comprising determining a genomic methylation profile for the patient in accordance with any one of the embodiments and aspects described above and performing a treatment on the patient based on the genomic methylation profile. In certain aspects, the treatment comprises performing a biopsy of the patient. In some aspects, the treatment comprises administering an anti-cancer therapy to the patient. The anti-cancer therapy may be chemotherapy, radiotherapy, gene therapy, surgery, hormonal therapy, anti-angiogenic therapy or cytokine therapy.
In a further embodiment there is provided a method of detecting the presence of, or an increased risk of, bladder cancer or other cancers of the urinary tract in a patient comprising determining a methylation status in one or more genomic regions in a patient sample selected from the group provided in Table 1A wherein an increased level of methylation in one or more of the genomic regions of Table 1A relative to a reference level indicates that the patient has or is at risk of developing bladder cancer.
In some aspects, one or more genomic regions is selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, and SOX17 (as indicated in Table 1A). In certain aspects, the one or more genomic regions is selected from 2, 3, 4, 5, 6, 7, 8, 9, 10 or 11 of the genomic regions selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, and SOX17 (as indicated in Table 1A).
In other aspects, the one or more genomic regions is selected from the group consisting of GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B (as indicated in Table 1A). In some aspects, the one or more genomic regions is selected from 2, 3, 4, or 5 of the genomic regions selected from the group consisting of GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B (as indicated in Table 1A).
In certain aspects, the one or more genomic regions is selected from the group consisting of SCT, Unk 14, Unk 29, CERKL, and SHH (as indicated in Table 1A). In further aspects, the one or more genomic regions is selected from 2, 3, 4, or 5 of the genomic regions selected from the group consisting of SCT, Unk 14, Unk 29, CERKL, and SHH (as indicated in Table 1A).
In some aspects, said determining comprises determining a methylation status in two, three or more of said genomic regions. In certain aspects, said determining comprises determining a methylation status in 2, 3, 4, 5, 6, 7, 8, 9, 10 or more of said genomic regions.
In some aspects, said determining comprises determining a methylation status in 2, 3, 4, 5, 6, 7, 8, 9, 10 or more of said genomic regions, wherein the genomic regions are selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, SOX17, GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B (as indicated in Table 1A). In further aspects, said determining comprises determining a methylation status in each of the genomic regions: Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, SOX17, GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B (as indicated in Table 1A).
In certain aspects, the patient has been previously treated for or diagnosed with bladder cancer. In some aspects, the method is further defined as a method for detecting bladder cancer recurrence or a risk of bladder cancer recurrence.
In other aspects, said determining comprises analyzing DNA methylation in the sample using restriction endonuclease digestion and qPCR. In further aspects, the digestion reaction is completed in a first step, followed by the qPCR reaction in a second step.
In some aspects, the patient is a human. In certain aspects, the sample is a urine sample. In other aspects, the sample is a blood sample. In further aspects, the sample is obtained by drawing blood from the patient. In other aspects, the sample is obtained from a third party.
In certain aspects, determining a methylation status comprises determining the nucleotide positions in the genomic regions that comprise methylation. In some aspects, determining a methylation status comprises determining the proportion of methylation at nucleotide positions in the genomic region. In further aspects, determining a methylation status comprises determining the proportion of nucleotide positions that are methylated in the genomic region.
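The two proportions distinguished above can be made concrete with a small sketch. It assumes per-position counts of methylated and total observations (for example, sequencing reads) and a hypothetical 0.5 cutoff for calling a position methylated; neither the data layout nor the cutoff is specified in the text.

```python
def per_position_methylation(calls):
    """Proportion of methylation at each position.

    calls maps a nucleotide position to (methylated_count, total_count).
    """
    return {pos: meth / total
            for pos, (meth, total) in calls.items() if total > 0}


def proportion_positions_methylated(calls, cutoff=0.5):
    """Proportion of positions in the region that are called methylated."""
    levels = per_position_methylation(calls)
    if not levels:
        return 0.0
    called = sum(1 for level in levels.values() if level >= cutoff)
    return called / len(levels)


# Hypothetical counts for four CpG positions in one genomic region.
calls = {101: (8, 10), 105: (1, 10), 110: (6, 10), 120: (0, 10)}
print(per_position_methylation(calls))
print(proportion_positions_methylated(calls))
```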
In some aspects, the reference level is a level of methylation from a patient that does not have bladder cancer. In certain aspects, the method is further defined as a method for determining the severity of bladder cancer.
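As a minimal sketch of the comparison against a reference level, the following flags each genomic region whose measured methylation exceeds its reference and reports a positive result when at least a chosen number of regions is elevated. The region names are taken from the panel above, but every numeric value, the function names and the `min_regions` parameter are hypothetical.

```python
def regions_above_reference(sample_levels, reference_levels):
    """Return the regions whose methylation exceeds the reference level."""
    return sorted(region for region, level in sample_levels.items()
                  if region in reference_levels
                  and level > reference_levels[region])


def indicates_increased_risk(sample_levels, reference_levels, min_regions=1):
    """True when at least `min_regions` regions show increased methylation."""
    elevated = regions_above_reference(sample_levels, reference_levels)
    return len(elevated) >= min_regions


# Hypothetical percent-methylation values for illustration only.
reference = {"SOX17": 5.0, "OTX1": 4.0, "SEPT9": 6.0}
sample = {"SOX17": 32.0, "OTX1": 3.5, "SEPT9": 18.0}
print(regions_above_reference(sample, reference))
print(indicates_increased_risk(sample, reference))
```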
In a further embodiment, the invention provides a method of detecting the presence of, or an increased risk of, bladder cancer in a patient comprising obtaining a patient sample, determining a methylation status in one or more genomic regions selected from those in Table 1, and identifying the presence of, or an increased risk of, bladder cancer in the patient based on an increased level of methylation in one or more of the genomic regions relative to a reference level.
In yet a further embodiment, the invention provides a method for treating a patient having bladder cancer or at risk for having bladder cancer comprising administering a therapy to the patient, wherein the patient was previously determined to have an increased level of methylation in one or more of the genomic regions selected from those provided in Table 1 relative to a reference level. In some aspects, the therapy comprises administering an anti-cancer therapy to the subject. In further aspects, the anti-cancer therapy is chemotherapy, radiotherapy, gene therapy, surgery, hormonal therapy, anti-angiogenic therapy or cytokine therapy. In certain aspects, the anti-cancer therapy is a BCG therapy.
In still a further embodiment, the invention provides kits for analysis of DNA methylation. In some aspects, a kit is provided comprising a sealed container comprising primers or probes designed to detect methylation in one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen or more genomic regions of Table 1A. For example, a kit may comprise a primer pair for amplification of an interval of sequence in one of the genomic regions of Table 1 and a reagent for analysis of DNA methylation (e.g., a methylation sensitive restriction endonuclease). In a further aspect, a kit may comprise reagents for analysis of total DNA methylation levels.
In a further aspect, kits are provided for determining one or more methylation positions in a DNA sample. For example, a kit can comprise, at least, an active glucosyltransferase and a DNA endonuclease (e.g., MspI, TaqI or a methylation dependent DNA endonuclease, such as BisI, GlaI or McrBC). Kits according to the invention can further comprise one or more MSEs; a DNA methyltransferase (e.g., M.SssI and/or M.CviPI methyltransferase); an enzyme that converts 5′mC into 5′hmC (e.g., recombinant Tet1, Tet2 and/or Tet3 proteins); one or more reference DNA samples; an affinity purification column; a DNA ligase; a DNA polymerase; DNA sequencing reagents; a PCR buffer; instructions; methylation specific antibodies; and/or DNA primers. Kits may also comprise reagents to determine general urine parameters (pH, amount of hemoglobin, leukocytes), e.g., an Osumex 10P test kit or Uriscan (YD Diagnostic Corp.).
As used herein in the specification, “a” or “an” may mean one or more. As used herein in the claim(s), when used in conjunction with the word “comprising”, the words “a” or “an” may mean one or more than one.
The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.” As used herein “another” may mean at least a second or more.
Throughout this application, the term “about” is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
The methylome of cancer cells differs greatly from that of normal cells. In general, cancer cells show an increase in CpG island methylation, while other genomic elements, including transposons, lose their normal methylated status. Thus it is possible to detect the presence of cancer cells in a cell population through the analysis of the methylation status of CpG elements that are specifically methylated in cancer cells. Provided herein are multiplex methods for determining a genomic methylation profile in a subject. These new techniques allow for highly quantitative and rapid methods for determining the methylation profile from patient samples. These profiles can, in turn, be used to determine disease risk in patients. For example, methylation profiles can be used on their own or in conjunction with other diagnostic tests to determine if a patient has a cancer or to assess the aggressiveness of a cancer. Moreover, methylation profiles can be used to select the most effective therapy for a subject having a disease. For example, the profiles can be used to determine whether a cancer is likely to respond to a given chemotherapeutic agent or if surgical removal of cancer cells is likely to provide an effective treatment and prevent recurrence.
In some aspects, the inventors identify a panel of methylation markers (e.g., genomic regions) that can be assessed for methylation status to determine whether a subject has, or is at risk for development of, a bladder cancer. Briefly, cancer specific methylation markers were identified by analyzing samples from patients with bladder cancer and/or other cancers of the urinary tract (i.e., tissue samples, cells present in the urine or urine sediment, or alternatively normal biopsy samples) using the HM450 human methylation array and RRBS next-generation sequencing. A linear regression model was utilized to select the most significant methylation biomarkers which separated tumor and normal samples (
In particular, the inventors have used a unique methylation marker panel, along with controls and methylation sensitive restriction enzymes, to generate a novel reaction design that allows multiplex digestion and qPCR reaction in the same buffer to detect the methylation at specific CpG dinucleotides. In certain aspects, this method may be preferred as it has a great advantage in allowing for the analysis of CpG methylation without treating the DNA with sodium bisulfite at high temperature (a reaction that strongly impacts the DNA sequence and reduces the quality of DNA for downstream experiments).
II. Reagents and Kits
The kits may comprise suitably aliquoted reagents of the present invention, such as a glucosyltransferase (e.g., a β-glucosyltransferase) and one or more methylation-sensitive DNA endonucleases (e.g., MspI, ClaI, Csp6I, HaeIII, TaqαI, MboI, or McrBC) or a methylation dependent endonuclease such as BisI, GlaI or McrBC. Additional components that may be included in a kit according to the invention include, but are not limited to, MSEs (e.g., AatII, AccIII, AciI, AfaI, AgeI, AhaII, Alw26I, Alw44I, ApaLI, ApyI, AscI, Asp718I, AvaI, AvaII, Bme216I, BsaAI, BsaHI, BscFI, BsiMI, BsmAI, BsiEI, BsiWI, BsoFI, Bsp105I, Bsp119I, BspDI, BspEI, BspHI, BspKT6I, BspMII, BspRI, BspT104I, BsrFI, BssHII, BstBI, BstEIII, BstUI, BsuFI, BsuRI, CacI, CboI, CbrI, CceI, Cfr10I, ClaI, Csp68KII, Csp45I, CtyI, CviAI, CviSIII, DpnII, EagI, Ecl136II, Eco47I, Eco47III, EcoRII, EcoT22I, EheI, Esp3I, Fnu4HI, FseI, FspI, Fsp4HI, GsaI, HaeII, HaeIII, HgaI, HhaI, HinP1I, HpaII, HpyAIII, HpyCH4IV, ItaI, KasI, Kpn2I, LlaAI, LlaKR2I, MboI, MflI, MluI, MmeII, MroI, MspI, MstII, MthTI, NaeI, NarI, NciAI, NdeII, NgoMIV, NgoPII, NgoSII, NlaIII, NlaIV, NotI, NruI, NspV, PmeI, PmlI, Psp1406I, PvuI, RalF40I, RsaI, RspXI, RsrII, SacII, SalI, Sau3AI, SexAI, SfoI, SfuI, SmaI, SnaBI, SolI, SpoI, SspRFI, Sth368I, TaiI, TaqI, TflI, TthHB8I, VpaK11BI, or XhoI), oligonucleotide primers, reference DNA samples (e.g., methylated and non-methylated reference samples), distilled water, probes, a PCR buffer, dyes, sample vials, polymerase, ligase and instructions for performing methylation assays. In certain further aspects, reagents for DNA isolation, DNA purification, DNA clean-up and/or analysis of urine clinical parameters may also be included in a kit.
The components of the kits may be packaged either in aqueous media or in lyophilized form. The container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and preferably, suitably aliquoted. Where there is more than one component in the kit, the kit also will generally contain a second, third or other additional container into which the additional components may be separately placed. However, various combinations of components may be comprised in a vial. The kits of the present invention also will typically include a means for containing reagent containers in close confinement for commercial sale. Such containers may include cardboard containers or injection or blow-molded plastic containers into which the desired vials are retained.
When the components of the kit are provided in one or more liquid solutions, the liquid solution is an aqueous solution, with a sterile aqueous solution being preferred.
However, the components of the kit may be provided as dried powder(s). When reagents and/or components are provided as a dry powder, the powder can be reconstituted by the addition of a suitable solvent. It is envisioned that the solvent may also be provided in another container means.
In some aspects, labeled probes may be used to detect and/or quantify PCR amplification. Numerous reporter molecules that may be used to label nucleic acid probes are known. Direct reporter molecules include fluorophores, chromophores, and radiophores. Non-limiting examples of fluorophores include a red fluorescent squaraine dye such as 2,4-Bis[1,3,3-trimethyl-2-indolinylidenemethyl]cyclobutenediylium-1,3-dioxolate, an infrared dye such as 2,4 Bis[3,3-dimethyl-2-(1H-benz[e]indolinylidenemethyl)]cyclobutenediylium-1,3-dioxolate, or an orange fluorescent squaraine dye such as 2,4-Bis[3,5-dimethyl-2-pyrrolyl]cyclobutenediylium-1,3-diololate. Additional non-limiting examples of fluorophores include quantum dots, Alexa Fluor™ dyes, AMCA, BODIPY™ 630/650, BODIPY™ 650/665, BODIPY™-FL, BODIPY™-R6G, BODIPY™-TMR, BODIPY™-TRX, Cascade Blue, CyDye™, including but not limited to Cy2™, Cy3™, and Cy5™, a DNA intercalating dye, 6-FAM™, Fluorescein, HEX™, 6-JOE, Oregon Green™ 488, Oregon Green™ 500, Oregon Green™ 514, Pacific Blue™, REG, phycobiliproteins including, but not limited to, phycoerythrin and allophycocyanin, Rhodamine Green™, Rhodamine Red™, ROX™, TAMRA™, TET™, Tetramethylrhodamine, or Texas Red™. A signal amplification reagent, such as tyramide (PerkinElmer), may be used to enhance the fluorescence signal. Indirect reporter molecules include biotin, which must be bound to another molecule such as streptavidin-phycoerythrin for detection. Pairs of labels, such as fluorescence resonance energy transfer pairs or dye-quencher pairs, may also be employed.
III. Definitions
As used herein, a “methylation sensitive restriction endonuclease” (MSRE) is a restriction endonuclease that includes CG as part of its recognition site and has altered activity when the C is methylated as compared to when the C is not methylated (e.g., SmaI). Non-limiting examples of methylation sensitive restriction endonucleases include HpaII, BssHII, BstUI, SacII, EagI and NotI. An “isoschizomer” of a methylation sensitive restriction endonuclease is a restriction endonuclease that recognizes the same recognition site as a methylation sensitive restriction endonuclease but cleaves both methylated CGs and unmethylated CGs; for example, MspI is an isoschizomer of HpaII. “Restriction endonuclease” and “restriction enzyme” are used interchangeably herein.
As used herein the term “genomic region” refers to a region of genomic DNA encoding and controlling expression of a particular RNA or polypeptide (such as sequences coding for exons, intervening introns and associated expression control sequences) and its flanking sequence or other genomic regions of interest (e.g., repetitive elements or repeated regions of genomic DNA such as dispersed or interspersed retroelements, SINEs and LINEs, among other such elements). Thus, in some aspects, a genomic region is defined by the regions encoding the genomic regions listed in Table 1. It is, however, recognized in the art that methylation in a particular region (e.g., at a given CpG position or in an amplification interval) is generally indicative of the methylation status at proximal genomic sites. This is particularly true for regulatory elements like CpG islands. Accordingly, determining a methylation status of a particular genomic region can comprise determining a methylation status at a site or sites within about 100, 50, or 25 kb of a named genomic region. Thus, in some aspects, assessing methylation in genomic regions, such as those of Table 1, comprises assessing the methylation at one or more potential sites of methylation within 100, 50, or 25 kb (or preferably within 10 kb) of a potential methylation position listed in Table 1.
As used herein the term “genomic amplification interval” refers to a region of genomic DNA that can be amplified by PCR. As used herein an amplification interval comprises at least one CpG position that is a potential site of methylation. In some cases, the amplification interval comprises 2, 3, 4 or more potential sites of CpG methylation (e.g., wherein the CpG is in a sequence recognized by an MSE). In general an amplification interval is less than about 1,200 bp, such as between about 50 bp and 100, 200, 300, 400 or 500 bp. In certain aspects, the amplification interval is 130 bp or less.
As used herein “determining a methylation status” for an indicated genomic region means determining whether one or more positions in the DNA of the genomic region are methylated. Thus, in certain aspects, determining a methylation status for a genomic region comprises determining the methylation status of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more sites of potential DNA methylation. In other aspects determining a methylation status means determining the methylation status of one or more methylated sites in a differentially methylated region (DMR, e.g., in a window of about 1-10 bp, 10-100 bp, 100-200 bp, 100-1,000 bp or a larger window).
IV. Examples
The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
Example 1—Analysis of Methylation Using Restriction Endonuclease Digestion and Real-Time PCR
The analysis system is based on the principle that some restriction enzymes can cut sequences containing CG dinucleotides only if the C is not methylated (
Marker Design—A group of 39 genomic regions (see Table 1 below) was identified as methylated in bladder cancer and selected for further analysis. Many of the selected markers are not yet annotated (i.e., the gene associated with the CpG island is not yet known; these are named Unknown (Unk) #). Several control regions are included in Table 2 below. The LINE-1 controls were designed by the inventors and are highly methylated in normal tissues and less methylated in cancer. The GAPDH control was designed by the inventors to serve as a determination of digestion efficiency; its CpG island is unmethylated in all normal and cancer tissues. The POLR2A control marker was also designed by the inventors; it does not contain any restriction site, so it serves as a copy number control (for the determination of the amount of template used in each reaction). Other LINE-1 controls could be used as a copy number control (the amplicon does not contain any restriction site). Such a control will amplify about 16,500 copies of LINE-1 elements in the human genome, circumventing the problem that a deletion affecting a single-copy copy number control will impair the analysis. The usage of LINE-1 as a copy number marker is not yet confirmed, and the inventors have used the POLR2A control.
MSRE qPCR design—Four different methylation sensitive restriction endonucleases were selected (AciI, HpyCH4IV, HinP1I and HpaII). Each restriction endonuclease is a four-base cutter that is able to cut the restriction site only if the cytosine in the targeted CpG dinucleotide is unmethylated. All the amplicons (including the primer sequences) were designed to contain one or more recognition sites for at least three out of the four restriction endonucleases. This ensures a high digestion reliability (multiple restriction sites within the same amplicon are targeted, but it is sufficient that just one site is cut to impair the DNA amplification in the next step) and reduces the possibility that small changes in the DNA sequence (e.g., SNPs) result in false positives.
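The amplicon design rule above (recognition sites for at least three of the four enzymes) can be checked in silico. The following is a minimal sketch, not the inventors' design tool: the recognition sequences are the standard ones for AciI (CCGC, non-palindromic, so GCGG on the opposite strand), HpyCH4IV (ACGT), HinP1I (GCGC) and HpaII (CCGG); the function names and the example amplicon are hypothetical and were invented for illustration.

```python
# Recognition sequences of the four enzymes named in the text.
# AciI is non-palindromic, so both strand orientations are listed.
RECOGNITION_SITES = {
    "AciI": ("CCGC", "GCGG"),
    "HpyCH4IV": ("ACGT",),
    "HinP1I": ("GCGC",),
    "HpaII": ("CCGG",),
}


def count_sites(amplicon: str) -> dict:
    """Count recognition sites per enzyme, scanning the top strand only.

    Overlapping occurrences are counted separately.
    """
    seq = amplicon.upper()
    counts = {}
    for enzyme, motifs in RECOGNITION_SITES.items():
        counts[enzyme] = sum(
            1
            for motif in motifs
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        )
    return counts


def meets_design_rule(amplicon: str, min_enzymes: int = 3) -> bool:
    """True if at least `min_enzymes` different enzymes have a site."""
    counts = count_sites(amplicon)
    return sum(1 for n in counts.values() if n > 0) >= min_enzymes


# Hypothetical amplicon, invented for illustration only.
example = "TTACCGGATGCGCTTACGTAACCGCTT"
print(count_sites(example))
print(meets_design_rule(example))
```

The same scan could be applied to primer and probe sequences to confirm that they carry no recognition motif, if that is a design requirement.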
Moreover, the amplicon length range is 70 to 158 bp and all the primers were designed and verified to have efficiencies between 90 and 110% and an annealing temperature of 60° C. These are prerequisites for qPCR data reliability and for the standardization of the qPCR condition.
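Primer efficiency is commonly estimated from the slope of a standard curve (Ct plotted against log10 of template input) as E = 10^(-1/slope) - 1. The sketch below assumes that conventional approach; the text does not state how the inventors measured efficiency, and the dilution series shown is hypothetical.

```python
def standard_curve_slope(log10_inputs, ct_values):
    """Least-squares slope of Ct versus log10(template amount)."""
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    return sxy / sxx


def efficiency_percent(slope):
    """Amplification efficiency in percent; 100% means perfect doubling."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def within_window(efficiency, low=90.0, high=110.0):
    """Check the 90-110% acceptance window given in the text."""
    return low <= efficiency <= high


# Hypothetical 10-fold dilution series (log10 ng of template, Ct values).
log10_ng = [1.0, 0.0, -1.0, -2.0]
cts = [22.00, 25.32, 28.64, 31.96]
slope = standard_curve_slope(log10_ng, cts)
eff = efficiency_percent(slope)
print(round(slope, 2), within_window(eff))
```

A slope near -3.32 corresponds to an efficiency near 100%; shallower or steeper slopes move the estimate above or below that value.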
Digestion Conditions—All the restriction endonucleases were purchased from New England Biolabs and mixed together, resulting in a stock solution in which the final concentration of each enzyme was 2.5 U/μl. The digestion reaction solutions were prepared as follows: CutSmart 1×, Endonuclease-MIX 0.125 U/μl, Template 5 or 2.5 ng/μl; MgCl2 4 mM (Stock solution: 20 mM MgCl2 in 10 mM Tris-HCl pH 7.5; increasing Mg2+ concentration will increase the digestion efficiency even if the reaction contains 0.1 mM EDTA). The final template concentration was 5 ng/μl. Digestions were incubated overnight (16 h) at 30° C. in a thermocycler, after which the enzymes were inactivated by heating the reactions at 80° C. for 5 minutes. It was previously determined by the inventors that 30° C. is the optimal working temperature for all 4 enzymes (AciI loses a substantial percentage of its activity over time at 37° C. but not at 30° C.). A parallel negative reaction (with no enzymes) was also conducted.
qPCR Reaction Conditions—The Real Time reactions were performed using the BioRad CFX machine. Preliminary trials demonstrated that the complete denaturation of the template is a critical step for the qPCR reactions. This is particularly true for methylated CpG-rich templates, which are slightly more difficult to denature than non-methylated DNA (e.g. see
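The pipetting volumes implied by these concentrations follow from C1·V1 = C2·V2. The sketch below is illustrative only: the 20 μl reaction volume, the 10× CutSmart stock and the 50 ng/μl template stock are assumptions not stated in the text, while the enzyme mix (2.5 U/μl stock, 0.125 U/μl final), MgCl2 (20 mM stock, 4 mM final) and template (5 ng/μl final) values are taken from the paragraph above.

```python
def stock_volume(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock needed so that it is at `final_conc` in the reaction."""
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * reaction_volume_ul / stock_conc


def digestion_mix(reaction_volume_ul=20.0,         # assumed, not in the text
                  template_stock_ng_per_ul=50.0,   # assumed, sample dependent
                  template_final_ng_per_ul=5.0):
    """Return the volume (in microliters) of each component."""
    v = reaction_volume_ul
    volumes = {
        # 10x buffer stock is an assumption; the text gives only the 1x final.
        "cutsmart": stock_volume(1.0, 10.0, v),
        "enzyme_mix": stock_volume(0.125, 2.5, v),   # U/ul
        "mgcl2": stock_volume(4.0, 20.0, v),         # mM
        "template": stock_volume(template_final_ng_per_ul,
                                 template_stock_ng_per_ul, v),
    }
    water = v - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    volumes["water"] = water
    return volumes


for name, vol in digestion_mix().items():
    print(f"{name}: {vol:.2f} ul")
```

The parallel no-enzyme control can be obtained from the same calculation by replacing the enzyme mix volume with water.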
The inventors also verified the possibility to perform a multiplex reaction without decreasing the amplification efficiency.
The PCR Reaction solutions were prepared according to the following formula: ZymoTaq PreMix (2×), 10 μl; primer FW, 0.4 μM (final concentration); primer RV, 0.4 μM (final concentration); template, 10 ng (2 μl of digestion reaction); DMSO, 5% (final concentration); and ddWater to 20 μl. Reactions were amplified on a CFX BioRad qPCR machine with the thermal profile: 97° C. for 2 minutes; then forty-five cycles of 95° C. for 20 seconds, 60° C. for 30 seconds, and a final extension at 72° C. for 1 minute (new conditions are 60° C. for one minute avoiding the elongation step at 72° C.). All amplifications were then subjected to melt curve analysis with fluorescence measurements to ensure specific amplification and identity. The melt curve was 60° C. to 97° C. The amplicons were then stored at 15° C.
Digestion and qPCR reactions were done in separate tubes (two-step) since the single-step reaction strongly decreased the sensitivity of the method for some markers (
All the markers listed in Table 2 (including LINE-1 and DPM2 controls) were tested to verify their methylation status in cancer using the MSRE qPCR approach. qPCR reactions were thus performed using undigested and digested genomic DNA obtained from blood, the LD583 bladder cancer cell line and normal urine (from non-cancer individuals). All the samples were normalized against undigested blood gDNA (reference sample; considered to be 100%) and values were corrected for the amount of template loaded in the reaction (this was determined using the POLR2A control marker). Digestion efficiencies were determined using the GAPDH control marker. The results are shown in
Preferred markers were markers that did not show any methylation in blood and normal urine gDNA and were 100% methylated in the cancer cell line DNA. From this analysis, markers that were less than 2% methylated in blood or normal urine DNA were subdivided into three categories (see Table 3 below).
Blood is known to be a major urine contaminant; however, sperm might also contaminate urine samples. Because of this, the methylation levels of all 21 validated markers in Table 3 were also tested using sperm DNA. The results confirm that none of the 21 markers is methylated in sperm DNA (<1%).
Additionally, for the 16 markers belonging to categories A and B in Table 3, the methylation detection limit of the MSRE qPCR system was determined, and thus the minimum amplification value that can be considered higher than the background signal (background was determined in digested normal urine gDNA, which is not methylated; e.g. see
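The normalization described above for the MSRE qPCR readout (signal expressed relative to an undigested reference taken as 100%, with a POLR2A correction for template amount) can be written as a ΔΔCt calculation. The text does not give the exact formula, so the sketch below is an assumption: it uses the conventional 2^(-ΔΔCt) form with perfect doubling per cycle, and all Ct values are hypothetical.

```python
def relative_signal_percent(ct_marker_sample, ct_polr2a_sample,
                            ct_marker_ref, ct_polr2a_ref,
                            amplification_factor=2.0):
    """Marker signal in a sample as a percentage of the reference.

    Each marker Ct is first corrected for template amount using the
    POLR2A copy number control (delta Ct), then compared with the
    undigested reference (delta-delta Ct).
    """
    delta_sample = ct_marker_sample - ct_polr2a_sample
    delta_ref = ct_marker_ref - ct_polr2a_ref
    return 100.0 * amplification_factor ** -(delta_sample - delta_ref)


def digestion_efficiency_percent(ct_gapdh_digested, ct_polr2a_digested,
                                 ct_gapdh_undigested, ct_polr2a_undigested):
    """Digestion efficiency from the unmethylated GAPDH control.

    Any GAPDH signal that survives digestion is treated as uncut template.
    """
    residual = relative_signal_percent(ct_gapdh_digested, ct_polr2a_digested,
                                       ct_gapdh_undigested,
                                       ct_polr2a_undigested)
    return 100.0 - residual


# Hypothetical Ct values, for illustration only.
methylation = relative_signal_percent(30.0, 24.0, 25.0, 24.0)
efficiency = digestion_efficiency_percent(34.0, 24.0, 24.0, 24.0)
print(f"marker signal: {methylation:.3f}% of reference")
print(f"digestion efficiency: {efficiency:.2f}%")
```

If measured primer efficiencies deviate from 100%, `amplification_factor` can be set to 1 + E for the marker in question; this treats the marker and the control as amplifying with the same efficiency, which is a further simplification.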
Clinical trial urine samples from bladder cancer tumor patients and normal (non-cancer) urine samples were analyzed using the category A and B markers listed in Table 3. The % of the signal detected for a specific digested sample was calculated by comparison with the signal of the same sample but undigested (100%) (in Example 1, all the signals were calculated in comparison with the signal of undigested blood sample) (e.g.
The methylation signals in digested clinical samples compared with the same undigested samples are shown in
Markers 01, 16 and 04 (see red arrows in
The assay of Example 1 was further developed to increase its overall performance, including sensitivity, simplicity and detection limit. First, the panel of 16 cancer markers (Table 4) that have methylation levels of at least 50% in the LD583 bladder cancer cell line and generally lower than 1% in normal urine, blood and sperm samples was further analyzed to arrive at a panel of 12 markers for validation in clinical samples.
A small cohort of normal urine DNA samples (n=9) collected from individuals of different ages, genders and ethnicities was analyzed by the assay of Example 1. As shown in Table 5, all of the 9 samples were classified as negative for bladder cancer. Despite the fact that the cohort was very small, no significant differences in methylation background levels were linked to age, gender or ethnicity.
Moreover, the results also pointed out an increased background level for four markers (the four rightmost markers in Table 5). In particular, marker #04 had a methylation level >1% (up to 4%) in most of the samples, while marker #01 had an elevated background in four out of nine samples. Markers #16 (SEPT9) and #32 (EEF1A2), which are associated with relevant malignancies including colorectal, prostate, pancreas, breast and ovarian cancers, also had a high background in a few, but not all, samples. It is interesting, however, that the background level of these two markers is very similar in terms of sample pattern and magnitude. Of note, SEPT9 methylation levels in samples #03 and #05 were confirmed by a parallel experiment based on DNA bisulfite conversion instead of the CARE assay. This similarity might indicate that the increased methylation level in these samples is not due to technical issues (e.g., incomplete digestion) but is rather specifically elevated in some samples. This suggests that the slight increase in methylation of these two markers might not be random and might perhaps be indicative, or predictive, of a specific clinical status (i.e., risk of developing a specific clinical condition). Thus, marker #32 was kept in the marker panel. Markers #04, #01 and #16, instead, were not considered for further validation experiments, reducing the bladder cancer biomarker panel to 13 loci.
Next, the selected 13 bladder cancer biomarkers were tested using a small cohort (n=10+10) of bladder cancer samples and their corresponding adjusted normal tissues (non-cancer bladder tissue isolated along with the tumor at the moment of tumor removal) (Table 6). The overall methylation level of the 13 cancer biomarkers was significantly higher in tumor samples than in the adjusted normal ones. Seven out of 10 tumor samples had all 13 markers significantly methylated (>1%). Some markers, specifically #27, #36 and #30, were also highly methylated in the adjusted normal samples.
The data generated from this analysis were used to score every single marker. In particular, each marker was evaluated for: A) its ability to recognize a bladder malignancy; this parameter was determined as the number of tumor samples recognized as positive (methylation level >1%) by a specific marker, since each cancer-specific marker should be significantly methylated in all the tumor samples; B) the level of methylation of a specific marker in a specific tumor sample in comparison to the average methylation level across all the markers in the same sample (a high methylation level in cancer samples is desirable); and C) the number of adjusted normal samples not recognized as normal samples. This last parameter was considered important since, contrary to blood or normal urine, adjusted normal samples are expected to show an increase in bladder cancer marker methylation levels, because they are tissue samples often containing cells predisposed to the development of cancer, a phenomenon named “field effect” (e.g. Giovannucci and Ogino, 2005; Bernstein et al., 2007); it is therefore expected that the methylome of adjusted normal samples will be altered to some extent. It was reasoned that a good cancer marker is supposed to detect these differences. Based on these criteria the 13 markers were classified into three marker set categories (A, B and C; Table 7). Sets A and B contain the controls and the markers that will more likely be included in the final marker set, while set C includes all the other markers. Of note, marker #23 showed poor performance as a cancer marker (low methylation level in many tumor samples compared to the other markers); therefore it was not included in any set. Thus, the marker panel contains 12 biomarkers.
All the markers included in the three sets (Table 7) of the new panel were further validated using DNA isolated from normal urine samples collected from a larger cohort (n=31). This experiment confirmed the data already presented in Table 5, and methylation background was found to be low for all the markers except for marker #32 (which was found to be elevated in some samples).
In an additional experiment, it was tested whether the selected biomarker panel could be used to identify different types of malignancies. To this end, DNA extracted from bladder, prostate, endometrium and kidney tumors and their corresponding adjusted normal samples was analyzed. Also included in the analysis were one bladder sample isolated from a healthy individual (as a negative control) and three multiple myeloma samples (HA, MM1.S and CRF30).
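The three scoring criteria can be tabulated per marker from a matrix of percent methylation values. The sketch below is one possible reading of criteria A to C, not the inventors' scoring procedure: the data layout, the function name and all values are hypothetical, it assumes every sample reports every marker, and it only counts and averages without deciding how the three numbers are weighed against each other.

```python
THRESHOLD = 1.0  # percent methylation; the positivity cutoff used in the text


def score_markers(tumor, normal, threshold=THRESHOLD):
    """Score each marker on criteria A, B and C.

    `tumor` and `normal` map sample name -> {marker name: % methylation}.
    """
    markers = sorted({m for sample in tumor.values() for m in sample})
    scores = {}
    for marker in markers:
        # A) number of tumor samples called positive by this marker
        tumor_positive = sum(1 for s in tumor.values()
                             if s[marker] > threshold)
        # B) marker level relative to the mean of all markers in the
        #    same tumor sample, averaged over tumor samples
        relative = []
        for s in tumor.values():
            sample_mean = sum(s.values()) / len(s)
            if sample_mean > 0:
                relative.append(s[marker] / sample_mean)
        mean_relative = sum(relative) / len(relative) if relative else 0.0
        # C) number of adjusted normal samples called positive
        normal_positive = sum(1 for s in normal.values()
                              if s[marker] > threshold)
        scores[marker] = {
            "A_tumor_positive": tumor_positive,
            "B_mean_relative_level": mean_relative,
            "C_normal_positive": normal_positive,
        }
    return scores


# Hypothetical values, invented for illustration only.
tumor_example = {"T1": {"m1": 40.0, "m2": 0.5},
                 "T2": {"m1": 60.0, "m2": 20.0}}
normal_example = {"N1": {"m1": 2.0, "m2": 0.1},
                  "N2": {"m1": 0.2, "m2": 0.3}}
for marker, score in score_markers(tumor_example, normal_example).items():
    print(marker, score)
```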
Despite the small cohort used, the results clearly demonstrated that many of the biomarkers listed in sets A, B and C are heavily methylated in different types of cancer including bladder, prostate, endometrial and multiple myeloma tumors. Surprisingly, no increase of methylation was detected in kidney cancer. As expected, none of the markers was found to be significantly methylated (methylation level >1%) in DNA isolated from healthy bladder. Additional different types of cancer can potentially also be detected using only urine samples.
In addition to the trials described above, a pre-clinical study was performed in which a small cohort of 63 urine samples was analyzed using the enhanced CARE assay. As indicated in Table 9, this cohort includes 20 samples collected from healthy individuals (“Normal sample”) of different ages, genders and risk behaviors concerning bladder cancer development (the cohort includes four young smokers, <55 years old). The rest of the cohort (n=43; “Cancer sample”) is represented by samples collected from individuals with bladder cancer and other types of tumors; specifically, 33 samples were collected from persons affected by bladder cancer before cancer removal (“Bladder cancer before surgery”), 8 samples were collected from individuals who had recently been subjected to bladder tumor removal (“Bladder cancer after surgery”) and 2 other samples (collectively named “Non-bladder cancer”) were collected from individuals, #48N and #34S, affected by stomach cancer and colon adenocarcinoma, respectively.
Based on the data obtained from this analysis, a marker panel of #05, #36 and #09 has the highest diagnostic value; these markers were therefore used for a statistical cancer diagnostic model built utilizing the generalized linear model. This model classifies the population as positive or negative for bladder cancer based only on the data obtained from the CARE assay. ¾ and ¼ of the cohort (only the 20 normal and 33 bladder cancer samples were considered) were used as Training and Test Sets, respectively, and the results are represented in Table 10. Results indicate that the enhanced CARE assay was able to correctly classify most of the Test Set samples (85.71% of the controls and 100% of the bladder cancer samples). In addition, for the Test Set the enhanced CARE assay has a sensitivity of 100% and a specificity of 85.71%.
It is also interesting to note that most of the “Bladder cancer after surgery” samples were diagnosed as positive by the CARE assay (Table 9). These very interesting data support the “field effect” concept and are in agreement with the previous data represented in Table 6, in which most of the adjusted normal tissue samples were found to be positive using the CARE assay. However, it is not clear at this point whether exposure to the cancer environment is also able to trigger epigenetic changes in normal cells present in the vicinity of the tumor. Thus, urine from patients who were subjected to bladder tumor removal will be collected periodically and analyzed using the CARE assay to monitor the evolution of the methylation level of the bladder cancer biomarkers over time.
Another interesting result is that the colon adenocarcinoma urine sample (from a male individual) was clearly positive in the CARE assay. This might be due to the presence of metastasis in the bladder, or again to a “field effect” phenomenon, given the anatomical vicinity of colon and bladder. This second hypothesis is further supported by the fact that the stomach cancer urine sample (#48N; anatomically distant from the bladder) was found to be negative with the CARE assay.
Concerning the normal urine cohort, no major differences were observed between the different samples except in the subpopulation of healthy males older than 55 years (M>55 YO); in this group, three out of four individuals had a slightly elevated methylation background (generally <1%) in at least half of the biomarkers tested with the CARE assay. Of note, aging is one of the predominant risk factors for bladder cancer development, and the male population (especially Caucasian) is far more affected by this pathology than females. The CARE assay was able to detect slight increases of methylation in the older male subgroup of the normal cohort, which suggests that the CARE assay could discriminate individuals who are at higher risk of developing bladder cancer from those who are not. In this perspective, it might also be relevant that one out of four females older than 55 years (F>55 YO) also had an elevated methylation level for six biomarkers, while the other three individuals in the same subgroup had a bladder cancer marker methylation profile similar to those seen for younger individuals. Of note, all four young smokers were classified as negative using the enhanced CARE assay.
In conclusion, the enhanced CARE assay was able to correctly classify most of the normal and bladder cancer samples included in the Test Set cohort and has a sensitivity and specificity of 100% and 85.71%, respectively (Table 9 and Table 10). In addition, the assay detected as positive a urine sample collected from an individual affected by colon adenocarcinoma, indicating that the CARE assay has the potential to detect non-urinary tract cancers using urine samples. Moreover, the assay was able to detect a slight increase of biomarker methylation in the older male (>55 years old) group belonging to the healthy cohort (a risk population concerning bladder cancer development), suggesting that the enhanced CARE assay might represent a potentially interesting tool not only for bladder cancer recurrence diagnosis, but also to better determine the subpopulation at higher risk of developing bladder cancer for the first time.
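Sensitivity and specificity follow directly from the Test Set classification counts. The sketch below shows the arithmetic only: Table 10 is not reproduced here, so the counts are hypothetical. Seven controls with six called negative is one split that gives 85.71%, and the number of cancer samples (eight, all called positive) is invented for illustration.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity %, specificity %) for binary labels.

    Labels are True for bladder cancer and False for normal.
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and not predicted:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer and one normal sample")
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


# Hypothetical Test Set: 8 cancer samples, 7 controls (one false positive).
truth = [True] * 8 + [False] * 7
calls = [True] * 8 + [True] + [False] * 6
sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity {sens:.2f}%, specificity {spec:.2f}%")
```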
Example 4—Methods of Enhanced CARE AssayThe enhanced CARE assay of Example 3 has several improvements as compared to the assay of Example 1. First, the reaction buffer of the enhanced CARE assay is compatible with both digestion and qPCR steps. Therefore, the only manual step that the operator needs to do is the addition of the proper amount of template (5-10 ng) to the reaction buffer. After mixing, the reaction will be incubated in the qPCR machine that will automatically perform the digestion and the qPCR steps. This new buffer allows complete digestion in only 2 hours of incubation at 30° C. The reaction mix contains all the components necessary for both steps, including Taq polymerase, the restriction enzymes mix, deoxyribonucleotides, DNA primers and probes, magnesium, additives (e.g., DMSO) and salts, and the optimal concentration of each component was determined in order to maximize CARE assay performances in terms of accuracy, sensitivity and specificity. The best results were obtained using ZymoTaq™ Taq Polymerase, however other Hot-Start enzymes (e.g. AmpliTaq Gold® DNA Polymerase, Phusion® High-Fidelity DNA Polymerase) can be used to prepare the CARE assay reaction buffer.
In addition, the enhanced CARE assay allows multiplexing and the simultaneous analysis up to 5 biomarkers in a single reaction. All the 12 bladder cancer biomarkers and two internal controls present in set A, B and C (described above) were therefore grouped in three multiplex reactions (one for each set; see Table 11). Each amplicon was reviewed and new primers and dual-labeled TaqMan probes were specifically designed in order to maximize the robustness of CARE assay and the overall performances of the system. For each marker the amplification efficiency ranges between 90 and 110% and no significant differences (in terms of amplification efficiencies and background level) can be observed neither when singleplex and multiplex reaction are run in parallel, nor when digestion and qPCR steps are performed in one machine rather than in two separate steps (digestion in a thermocycler or incubator and qPCR in the Real-Time PCR machine). Importantly primers and probes were designed on sequences lacking any annotated SNPs and repeated element. Despite different methodologies and Real-Time machines can be potentially used for the qPCR step, we obtained the best results by combining TaqMan methodology and the CFX96TM-Real Time System (Bio Rad). In the new CARE assay version dual-labeled TaqMan probes are marked with FAM, HEX, Texas Red-X, Cy5 and Quasar 705 and quenched by either BHQ1 or BHQ2 molecules. However, other dyes like Biosearch Blue, TET, CAL Fluor Gold 540, JOE, VIC, CAL Fluor Orange 560, Quasar 570, Cy3, NED, TAMRA, CAL Fluor Red 590, Cy3.5, ROX, CAL Fluor Red 610, Texas Red, CAL Fluor Red 636, Pulsar 650, Quasar 670, Cy5.5, TEX 615, TYE 563, TYE 665, MAX, Yakima Yellow, ABI and JUN and different quenchers like TAMRA, QSY, BHQ2, BHQ3, Iowa Black can be successfully used for the qPCR step. Other system variants, like Molecular Bacons and Scorpions Probes can also be successfully used for the qPCR step.
A third important improvement made in the enhanced CARE assay is the increase in the number of restriction sites investigated for each amplicon. The newly designed amplicons contain at least 6 different restriction sites that are targeted by at least two of the three restriction endonucleases selected for this assay (AciI, HinP1I and HpaII). This, combined with the innovative chemistry of the new reaction mixture, ensures complete digestion in 2 hours (Table 13) and minimizes the risk of artifacts and nonspecific background signal.
In addition, none of the oligonucleotide (primer and probe) sequences contains a consensus motif for the aforementioned restriction endonucleases; this appears to prevent the (albeit very slight) degradation of the oligonucleotides during the digestion reaction, which could be caused by temporary and probably only partially specific interactions with other DNA molecules that might occur at the digestion temperature (30° C.).
Several other restriction enzymes, including AatII, Acc65I, AccI, AciI, AclI, AfeI, AgeI, AhdI, AleI, ApaI, ApaLI, ApeKI, AscI, AsiSI, AvaI, AvaII, BaeI, BanI, BbvCI, BceAI, BcgI, BcoDI, BfuAI, BfuCI, BglI, BmgBI, BsaAI, BsaBI, BsaHI, BsaI, BseYI, BsiEI, BsiWI, BslI, BsmAI, BsmBI, BsmFI, BspDI, BspEI, BsrBI, BsrFI, BssHII, BssKI, BstAPI, BstBI, BstUI, BstZ17I, BtgZI, Cac8I, ClaI, DpnI, DraIII, DrdI, EaeI, EagI, EarI, EciI, Eco53kI, EcoRI, EcoRV, FauI, Fnu4HI, FokI, FseI, FspI, HaeII, HgaI, HhaI, HincII, HinfI, HinP1I, HpaI, HpaII, Hpy166II, Hpy188III, Hpy99I, HpyAV, HpyCH4IV, KasI, MboI, MluI, MmeI, MspA1I, MwoI, NaeI, NarI, NciI, NgoMIV, NheI, NlaIV, NotI, NruI, Nt.BbvCI, Nt.BsmAI, Nt.CviPII, PaeR7I, PhoI, PleI, PluTI, PmeI, PmlI, PshAI, PspOMI, PspXI, PvuI, RsaI, RsrII, SacII, SalI, Sau3AI, Sau96I, ScrFI, SfaNI, SfiI, SfoI, SgrAI, SmaI, SnaBI, StyD4I, TfiI, TliI, TseI, TspMI, XhoI, XmaI and ZraI, can in principle be utilized for the CARE assay.
Controls: Another core component of the enhanced CARE assay is the set of controls. The current CARE assay version includes 4 different controls: A) an endogenous copy number control (POLR2A), B) an endogenous digestion control (GAPDH), C) an undigested sample control and D) an exogenous digestion control (standardized blood sample). Notably, the controls used for the CARE assay are not manipulated in vitro (e.g., chemically or enzymatically modified), which reduces the chance of introducing artifacts and errors.
Endogenous copy number control—POLR2A: One requirement for the reliability of the assay is the normalization of the signal obtained from each marker against the amount of sample DNA loaded in each reaction. Small differences in the amount of template loaded in different reactions are common in qPCR, and they can affect the results of the analysis if they are not monitored. An internal copy number control is therefore needed to precisely quantitate the amount of template used in each reaction and to correct the values obtained from the analysis. A portion of the coding sequence of the human POLR2A gene was chosen as the copy number control. That choice was made because A) POLR2A is an essential gene, so the control signal will always be detected; B) this control marker is a single-copy gene, so its use in multiplex qPCR will not interfere with the amplification of the other markers (as can happen with a multi-copy marker such as a transposable element); and C) the selected sequence does not contain target sites for any of the restriction endonucleases used in the assay, so the signal of the copy number control always reflects the amount of DNA used, regardless of whether the sample is digested. In principle, any locus satisfying the three points listed above can be used as an endogenous copy number control.
Endogenous digestion control—GAPDH: Another requirement for the reliability of the CARE assay is complete enzymatic digestion. In principle, a single cleavage event within a specific locus is sufficient to prevent the amplification of the associated target region. Nevertheless, as described previously, amplicons were designed to include at least 6 restriction sites recognized by at least two of the three restriction endonucleases used for the assay. This, combined with the new reaction mix composition, allows maximum specific enzymatic activity and strongly reduces the possibility of incomplete digestion. In addition, the completeness of digestion in each reaction was monitored by including an endogenous digestion control in the final CARE assay. This control is a DNA locus that A) is essential for cell survival, B) contains multiple restriction sites for all three enzymes used in the CARE assay and C) is always unmethylated in every cell type and tissue. The internal digestion control was designed on the CpG island of the human GAPDH gene; this housekeeping gene has an essential role in glycolysis and is widely used as a reference control for gene expression analysis in RT-qPCR reactions and immunoblots because it is expressed at an almost constant level in all cell types. In all of the digested samples tested with the CARE assay (a wide variety of specimens including biological fluids, tissues and cell types), the GAPDH signal was always lower than the maximum acceptable background level (1%), and in most cases it was not detectable at all (e.g., in Table 5). As with the copy number control, any locus that satisfies the three points above can be used as an endogenous digestion control; possible alternative endogenous digestion control loci can be designed on the CpG islands of the genes ATP5C1, TBP, GARS, LDHA and PGK1.
Undigested sample control: This control consists of a DNA sample processed exactly like the samples subjected to digestion, but without the restriction endonuclease mix. In this sample, the DNA therefore remains intact and all the marker copies initially added to the reaction are available for amplification. The amplification signal for each marker in this sample can therefore be considered 100%, and the percentage of methylated (uncut) marker detected in the parallel digested sample is calculated relative to it. Initially, a DNA sample isolated from a healthy blood donor was used as the undigested sample control. However, to minimize possible differences between this standard sample and urine DNA samples, an undigested reaction (run in parallel with the digested one) was performed for each urine sample analyzed with the CARE assay. The final assay therefore includes two distinct premixed reaction solutions, named D and U, for each marker set; only reaction solution D (not U) contains the restriction endonuclease mix needed for the digestion step. In general, it was concluded that running an undigested control for every sample is the most reliable way to precisely quantify the methylation level of each marker in each sample.
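The relationship between the D and U reactions can be sketched as a comparative CT calculation. The following is a minimal illustration only, assuming the textbook ΔΔCT form, normalization to the POLR2A copy number control, and an amplification factor of 2 per cycle; the function name and the CT values are hypothetical and are not taken from the assay software.

```python
def percent_methylation(ct_marker_d, ct_ref_d, ct_marker_u, ct_ref_u, amplification=2.0):
    """Percentage of marker copies that survive digestion.

    ct_marker_d / ct_ref_d: CT of the marker and of the copy number
    control (e.g. POLR2A) in the digested (D) reaction.
    ct_marker_u / ct_ref_u: the same two CT values in the undigested (U)
    reaction, whose signal is taken as 100%.
    amplification: fold increase per cycle (2.0 assumes 100% efficiency).
    """
    delta_d = ct_marker_d - ct_ref_d    # marker normalized to input DNA, digested
    delta_u = ct_marker_u - ct_ref_u    # marker normalized to input DNA, undigested
    delta_delta = delta_d - delta_u     # extra cycles caused by digestion
    return 100.0 * amplification ** (-delta_delta)


# Hypothetical example: digestion delays the marker by 3 cycles.
print(percent_methylation(30.0, 25.0, 27.0, 25.0))  # 12.5
```

Because the copy number control amplicon contains no restriction sites, its CT should be nearly identical in D and U; any residual difference is absorbed by the normalization.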
To further increase the precision of the assay, a second digestion control represented by a standardized sample of blood DNA was included. The fact that all the cancer markers included in the current assay are not (or only weakly) methylated in blood DNA (Table 12) allows the detection of potential problems that might occur during the digestion step and that are listed below:
A) Different markers are cleaved by different combinations of restriction endonucleases (some markers are cleaved by all three enzymes, others by just two); moreover, a different number of CpG dinucleotides is interrogated in different markers (ranging from 6 to 12). Therefore, although sufficient, the evaluation of digestion efficiency using only the GAPDH control marker can be imprecise (the GAPDH locus is targeted at 6 different sites by all 3 restriction endonucleases). In the extremely unlikely situation in which only one of the three enzymes is fully active, the GAPDH control will be completely digested, while cancer markers that are targeted only by the two other enzymes will be only partially digested. This will lead to an artificial increase in the apparent methylation of some cancer markers, potentially generating a false positive. An abnormal increase in methylation for some cancer markers in the exogenous digestion control (E.D.C.) will immediately alert the operator to this situation.
B) Moreover, if a general increase in the methylation background level (including for the GAPDH endogenous digestion control) is observed for a specific sample, it is impossible to know (in the absence of the E.D.C.) whether it is caused by decreased activity of the restriction enzymes (in which case the reaction mix needs to be replaced with a new batch) or by poor quality of the DNA sample (in which case the sample needs to be purified). Since the quality of the exogenous digestion control is standardized and all the markers in this sample are unmethylated, an increase in the background level for all the markers in the exogenous digestion control clearly indicates that the activity of the restriction enzymes in the reaction mixture is impaired and that the mix has to be replaced with a new batch.
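The troubleshooting logic of points A) and B) can be written as a small decision rule. This is a sketch under stated assumptions: the 1% limit is the maximum acceptable background given above for GAPDH, and applying the same limit to every marker in the E.D.C. is an assumption made here for illustration (some markers are weakly methylated in blood); the function name and return strings are hypothetical.

```python
BACKGROUND_LIMIT = 1.0  # percent; 1% background limit quoted in the text (assumed for all markers)


def interpret_controls(sample_gapdh_pct, edc_marker_pcts):
    """Suggest an action from the digestion controls.

    sample_gapdh_pct: residual GAPDH signal (%) in the digested test sample.
    edc_marker_pcts: dict of marker name -> residual signal (%) measured in
    the digested exogenous digestion control (standardized blood DNA).
    """
    elevated = sorted(m for m, pct in edc_marker_pcts.items() if pct > BACKGROUND_LIMIT)
    if elevated:
        # Known-unmethylated DNA was not fully cut: the enzymes are suspect.
        return "replace reaction mix (E.D.C. elevated: " + ", ".join(elevated) + ")"
    if sample_gapdh_pct > BACKGROUND_LIMIT:
        # Enzymes cut the standard DNA, so the problem lies with the sample.
        return "purify sample DNA"
    return "digestion complete"


print(interpret_controls(0.2, {"GAPDH": 0.0, "SOX1": 0.3}))  # digestion complete
```

Checking the E.D.C. first mirrors the reasoning above: only when the standardized DNA digests normally can a high background in the test sample be attributed to sample quality.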
In addition to the four controls described above, information was collected on different biochemical parameters of the urine sample before proceeding with the DNA extraction. The primary reason was to determine the possible influence of specific conditions (e.g., the increased presence of proteins and ketones in the urine, the urine pH value, etc.) on CARE assay performance. Secondly, other parameters may support the CARE assay diagnosis.
An analysis was performed using Multistix® 10 SG Reagent Strips (Siemens). None of the parameters monitored with this tool had an impact on CARE assay performance. Moreover, hematuria and high leukocyte and protein content were detected in the majority of the urine samples collected from bladder cancer patients but not in urine collected from healthy individuals.
One of the most remarkable characteristics of the CARE assay is its combination of high performance (in terms of robustness, detection limit, sensitivity and accuracy) with extreme simplicity and versatility. Indeed, based on the data obtained so far, the CARE assay has a sensitivity of 100% and a specificity of 85.71% (Table 11). In addition, the CARE assay has a remarkably low detection limit; indeed, it is able to detect with confidence fewer than 7 cancer cells in a sample containing more than 1500 cells (as shown in
On the other hand, the CARE assay is very simple, rapid, and user friendly. The whole assay (including digestion, qPCR step and data analysis) can be performed in less than 5 hours. For each marker set, two premixed reaction solutions (D and U) and the exogenous digestion control sample (E.D.C.) will be provided (
Reaction mixes D and U were both tested and found to be stable for several months at −20° C. Moreover, 10 freeze/thaw cycles did not alter the reaction performance (Table 14).
To obtain the CARE assay performance described above, the urine DNA quality (in terms of purity and integrity) should be high enough to guarantee efficient digestion and amplification during the digestion and qPCR steps, respectively. Therefore, a preservative agent (UCB™, Zymo Research) was added immediately after urine collection to prevent nucleic acid degradation before the DNA isolation step. It was observed that the addition of UCB™, while preserving the DNA integrity in the sample over time (Table 15), did not interfere with CARE assay performance when the Quick-DNA™ Urine Kit (Zymo Research) was used to isolate the DNA (Table 15).
As mentioned above, DNA quality might be a key prerequisite for a successful CARE assay analysis. Therefore, it was decided to test whether different urine DNA isolation kits are equally able to provide DNA of sufficient quality to perform the CARE assay. Three kits (Quick-DNA™ Urine Kit (Zymo Research), Supplier QI and Supplier NG) were tested in parallel using a single urine sample with or without UCB™ added 2 hours before the extraction (Table 16). When DNA was isolated with the different kits, no substantial differences were observed either in the total amount of DNA recovered (DNA was quantified using the Femto™ Human DNA Quantification Kit, Zymo Research; data not shown) or in the DNA amplification potential (CT mean values for the POLR2A internal control in different CARE reactions were very similar; Table 16). Surprisingly, however, the quality of the DNA isolated using the kit of Supplier NG was not sufficient to obtain good CARE assay performance. In particular, for the internal digestion control GAPDH and marker #27, the methylation background level was above the high confidence limit (1%). Moreover, this problem was more pronounced in the sample to which UCB™ was added (Table 16). This indicates that while the quality of the DNA isolated with the different kits is sufficient for the qPCR step, the same is not true for the digestion step. Therefore, it was concluded that very high DNA quality is a prerequisite for a successful CARE assay.
Algorithm used for data analysis: As mentioned above, the CFX96™ Real-Time System also performs the data analysis, based on the ΔΔCT algorithm. Briefly (an example is represented in
Alternatively, the degree of methylation for each marker can be expressed as the number of methylated DNA copies (resistant to digestion). From this perspective, when 10 ng of DNA is used (the amount of DNA derived from approximately 1500 cells, corresponding to 3000 DNA copies), the signal obtained from the undigested sample derives from the amplification of 3000 DNA copies, and the number of copies detected in the digested sample (DNA copies derived from cancer cells) can be calculated accordingly.
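The conversion from a methylation percentage to methylated copies is simple arithmetic. The sketch below uses only the figures stated above (10 ng of DNA from about 1500 cells, i.e. about 3000 copies of a single-copy locus); the function name and the example percentage are hypothetical.

```python
CELLS_PER_NG = 150.0    # from the text: 10 ng of DNA derives from ~1500 cells
COPIES_PER_CELL = 2.0   # diploid genome: ~1500 cells correspond to ~3000 copies


def methylated_copies(input_ng, pct_methylated):
    """Approximate number of digestion-resistant (methylated) copies.

    input_ng: amount of template DNA loaded in the reaction, in ng.
    pct_methylated: signal of the digested reaction as a percentage of
    the undigested reaction for the same marker.
    """
    total_copies = input_ng * CELLS_PER_NG * COPIES_PER_CELL
    return total_copies * pct_methylated / 100.0


# Hypothetical example: 0.5% residual signal with 10 ng of template.
print(methylated_copies(10.0, 0.5))  # 15.0
```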
Moreover, the signal obtained from the analysis of each sample and marker can be corrected for the average background signal obtained for that marker in a normal urine cohort. This will further increase the precision of the analysis; however, a larger and properly selected normal urine cohort has to be analyzed using the CARE assay in order to establish the most reliable correction value for each marker.
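One possible form of this correction is sketched below. It assumes the correction is a per-marker subtraction of the cohort mean with negative results set to zero; the text does not specify the exact operation, and the marker names and values are hypothetical.

```python
def background_corrected(pct_methylation, mean_background):
    """Subtract the per-marker mean background of a normal cohort.

    pct_methylation: dict of marker -> measured methylation (%) in a sample.
    mean_background: dict of marker -> mean methylation (%) of that marker
    in a normal urine cohort; markers without an entry are left uncorrected.
    Negative results are clamped to zero (an assumption of this sketch).
    """
    return {
        marker: max(0.0, pct - mean_background.get(marker, 0.0))
        for marker, pct in pct_methylation.items()
    }


# Hypothetical values for two markers.
print(background_corrected({"m1": 5.0, "m2": 0.2}, {"m1": 0.5, "m2": 0.4}))
# {'m1': 4.5, 'm2': 0.0}
```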
Thus, provided herein is a unique combination of cancer-specific biomarkers and an optimized system design that makes this assay a robust, accurate, sensitive, non-invasive and very simple epigenetic-based method for the early detection of bladder cancer from urine samples. Thanks to the newly designed reaction mix and the high degree of automation, a complete analysis using the CARE assay can be performed in less than 5 hours with minimal sample manipulation. Moreover, the unique combination of endogenous and exogenous controls, the meticulous amplicon design and the optimized reaction chemistry make this system robust, reproducible and accurate. The CARE assay has a sensitivity of 100%, a specificity of 85.71% and a very low detection limit (fewer than 7 cancer cells can be detected in a sample containing more than 1500 cells). In addition, the CARE assay has the potential to detect chromosomal aneuploidies and tumor types other than bladder cancer, and perhaps to identify individuals who have the highest risk of developing bladder cancer for the first time.
Example 5—Analysis of Methylation Using MMSP Assay
In order to precisely detect a predisposition to, or the incidence of, bladder cancer with limited biopsy or liquid biopsy samples from patients, a highly sensitive and specific multiplex assay was developed to quantify the number of methylated DNA molecules in a vast unmethylated background.
Bisulfite conversion: Genomic DNA extracted from patients' biopsy samples, including urine, blood or tissue samples, was bisulfite converted. A standard methodology for DNA methylation studies, bisulfite conversion chemically converts unmethylated cytosine residues to uracil, allowing the use of polymerase chain reaction to selectively amplify methylated genomic DNA over unmethylated DNA. The conversion efficiency of this assay is more than 99.5% and the lowest input amount of gDNA for bisulfite conversion is 50 pg, so the assay is feasible for processing as few as 10 diploid cells (
PCR design: The primers and probes targeting the biomarker genomic regions were designed under guidelines that consider melting temperature (Tm) (identical Tm=58-60° C. for primers and Tm=68-70° C. for probes), probe length (less than 30 bp) and GC content (30-80%) of primers and probes. In addition, the number of CpG dinucleotides covered by primers and probes was carefully controlled. For example, at least 7 CpG loci should be covered by one group of primers and probes to achieve the desired specificity of this assay. The position of CpG dinucleotides in primers and probes was also considered (e.g., at least one C in the last 5 nucleotides at the 3′ end of primers).
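The sequence-based guidelines above can be checked mechanically. The following sketch applies the stated rules literally (GC content of 30-80%, probe shorter than 30 nucleotides, at least 7 CpG dinucleotides across a primer/probe set, and a C within the last 5 nucleotides of each primer); melting temperature is omitted because the Tm calculation method is not stated. The function names and the sequences are hypothetical and are not the assay's oligonucleotides.

```python
def gc_percent(seq):
    """GC content of an oligonucleotide, in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_count(seq):
    """Number of CpG dinucleotides in an oligonucleotide (written 5' to 3')."""
    return seq.upper().count("CG")


def check_set(forward, reverse, probe):
    """Return a list of guideline violations for one primer/probe set."""
    oligos = {"forward": forward, "reverse": reverse, "probe": probe}
    problems = []
    for name, seq in oligos.items():
        if not 30.0 <= gc_percent(seq) <= 80.0:
            problems.append(name + ": GC content outside 30-80%")
    if len(probe) >= 30:
        problems.append("probe: length not below 30 nt")
    for name in ("forward", "reverse"):
        if "C" not in oligos[name].upper()[-5:]:
            problems.append(name + ": no C in the last 5 nucleotides of the 3' end")
    if sum(cpg_count(seq) for seq in oligos.values()) < 7:
        problems.append("set covers fewer than 7 CpG dinucleotides")
    return problems


# Hypothetical sequences, written 5' to 3'.
print(check_set("GTTTCGGTTACGTAGTTCGC", "AACCGAAACGCTACGACTAC", "TCGGTCGTTACGGATTCGTT"))  # []
```

An empty list means the set passes every sequence-based rule checked here; Tm and amplicon length would still need to be verified separately.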
To overcome the size variance of DNA fragments in biopsy samples, the length of the PCR amplicons was limited to 130 bp to 150 bp to avoid missing small DNA fragments in urine samples. (
Group A: DMRTA2, EVX2, Unk21, OTX1, CNC
Group B: SOX1, SEPT9, Unk05, Unk09, CNC
Group C: GALR1, Unk07, Unk19, TBX15, CNC
Group D: EEF1A2, TFAP2B, DCHS2, SOX17, CNC
Controls: The internal copy number control (CNC) of the assay is collagen type II, alpha 1 gene (COL2A1), which is a single copy gene. The entire amplicon of COL2A1 is devoid of any CpG dinucleotide in the original genome sequence. Therefore, the amount of input DNA can be measured for each reaction regardless of the methylation status of template DNA (Widschwendter et al., Cancer Research, 64: 3807-3813, 2004). Thus, for each group, the CNC reaction was included to check for sample quantity and integrity in the same reaction well.
M.SssI-modified gDNA (D5014-2) was used as a positive control of 100% Methylation (MC). For each testing plate (e.g., 96 well plate), 10 ng of MC sample is tested two or three times. The MC control allows for normalization of the intra-assay variations, including reagent batches and PCR instruments. (
The assay also showed good linearity for all of the 16 targeted genes based on repeated measurements of relative methylation percentage values on DNA mixtures containing 100%, 50%, 5%, 1%, 0.5% and 0% of methylated (M.SssI-treated) normal urine DNA (
Methylation analysis: For each analyte, the raw copy number of each targeted region and of the internal control was automatically calculated according to the standard curve. If more than one PCR reaction was performed for a sample, the mean value of the duplicates or triplicates was used for the subsequent analysis. The relative methylation percentage was calculated based on the following equation.
If the standard curve was not performed, then an alternative calculation for methylation percentage was based on delta CT as follows:
Validation: 35 urine samples from healthy individuals were collected and the genomic DNA was extracted using the Quick-DNA™ Urine Kit (Zymo, USA). In addition, genomic DNA was extracted from 9 bladder cancer tumor tissues using the ZR Urine DNA Isolation Kit and ZR Genomic DNA—Tissue Kits (Zymo, USA). The DNA concentration was measured by Nanodrop and 250 ng of each sample was subjected to bisulfite conversion using the EZ Direct kit (Zymo, USA). The converted DNA was eluted into nuclease-free water and diluted to about 2 ng/μl for polymerase chain reaction.
25 μl of reaction master mix (1 U AmpliGold Taq enzyme, 1× AmpliGold Buffer, 400 μM dNTP, 5.5 mM MgCl2, 300 nM primers, 100 nM probes, PCR enhancer (optional)) of Group A, B, C, D was premade separately and loaded into a 96-well plate (
Real-time PCR was performed on the CFX96 Touch™ Real-Time PCR Detection System. The raw copy number of each of the targeted regions in each sample was automatically reported by the instrument's program. The relative methylation percentages were calculated as described. As shown in
Validation #2: DNA was extracted from 15 bladder cancer urine samples, bisulfite converted and stored at −20° C. for more than one year. These samples were diluted with 80 μl of H2O for this assay. 25 μl of reaction master mix (1 U AmpliGold Taq enzyme, 1× AmpliGold Buffer, 400 μM dNTP, 5.5 mM MgCl2, 300 nM primers, 100 nM probes, 1.5 μl DMSO) of Group B was premade separately and loaded into a 96-well plate. 5 μl of sample DNA was used per reaction. Each sample was tested in duplicate. A technical duplicate of the standard curves was run on each 96-well plate. 5 μl of 100% methylated Positive Control (MC) sample was also tested twice.
Real-time PCR was performed on the CFX96 Touch™ Real-Time PCR Detection System. The raw copy number of each targeted region in each sample was automatically reported by the instrument's program. To ensure statistical significance, only the samples that showed at least 200 copies of COL2A1 were kept for the next step of analysis. The relative methylation percentages were calculated as described previously. As shown in
In addition, 3 months after therapy, the group B markers detected a dramatic drop in DNA methylation indicating the success of the therapy. In the following check-up, the group B markers confirmed that bladder cancer did not recur (
Validation #3: A pre-clinical study was conducted to further validate the assay in urine samples from healthy individuals and bladder cancer patients. 33 urine samples from bladder cancer patients were purchased from Geneticist-IIBGR (Glendale, Calif.). The urine samples were collected before transurethral resection of the bladder tumor (TURBT) and without any chemotherapy. 20 normal urine samples were collected internally (with consent) from donors of diverse age, gender and smoking history. All the urine samples were preserved in Urine Conditioning Buffer (Zymo Research, D3061-1) upon collection. Both cellular and cell-free DNA were extracted from the urine samples using the Quick-DNA™ Urine Kit (Zymo, USA). DNA concentration was measured by Nanodrop and 150 ng of each sample was subjected to bisulfite conversion using the EZ Direct kit (Zymo, USA). The converted DNA was eluted into nuclease-free water and diluted to about 2 ng/μl for polymerase chain reaction.
Biomarkers in Groups A, B, C and D were tested as in Assay Validation #1, except that each sample was tested three times. The relative methylation percentage (RMP) results for each sample were subjected to further statistical analysis. The best biomarker, Unk05 in Group B, was selected based on a special algorithm. Using Unk05 (cut-off value of 1.5%) allowed for stratification of the samples into a bladder cancer group and a bladder cancer-free group (sensitivity 90.9%; specificity 85%). Thus, the assay has enhanced sensitivity and specificity compared to current urine-based tumor markers in bladder cancer (
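How a single-marker cut-off translates into sensitivity and specificity can be sketched as follows. The 1.5% cut-off is taken from the text; treating a value equal to the cut-off as positive is an assumption, and the RMP values in the example are invented purely for illustration and are not study data.

```python
def sensitivity_specificity(cancer_rmps, normal_rmps, cutoff=1.5):
    """Sensitivity and specificity (both in %) of a single-marker cut-off.

    cancer_rmps: relative methylation percentages of samples from patients.
    normal_rmps: relative methylation percentages of samples from healthy donors.
    A sample is called positive when its RMP is at or above the cut-off.
    """
    true_pos = sum(1 for v in cancer_rmps if v >= cutoff)
    true_neg = sum(1 for v in normal_rmps if v < cutoff)
    sensitivity = 100.0 * true_pos / len(cancer_rmps)
    specificity = 100.0 * true_neg / len(normal_rmps)
    return sensitivity, specificity


# Invented RMP values, for illustration only.
sens, spec = sensitivity_specificity([12.0, 0.4, 3.1, 8.8], [0.2, 1.9, 0.0, 0.7])
print(sens, spec)  # 75.0 75.0
```

With the cohort sizes given above (33 patients and 20 healthy donors), the reported 90.9% and 85% would correspond to 30 of 33 and 17 of 20 correct calls, respectively; these counts are inferred from the percentages and are not stated in the text.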
Weeding out trivial biomarkers using the Random Forest algorithm: A statistical model was built to predict the probability of the presence of bladder cancer by measuring DNA methylation of a set of CpG sites in a urine sample. DNA methylation levels of twelve bladder cancer specific CpG sites (Table 7) of urine samples were obtained using the CARE assay and the multiplex methylation-specific qPCR (MMSP) assay. Urine samples were randomly split into a training set and a test set in a ratio of 75% to 25%, respectively. The subgroup of the twelve biomarkers with the lowest root mean squared error (RMSE) was identified by recursive feature selection using the random forest algorithm (Svetnik 2003; implemented in the R package caret).
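The random 75%/25% split can be illustrated with the standard library alone. This sketch covers only the splitting step, not the feature selection, which the text attributes to the R package caret; the function name, the seed, and the use of integer sample identifiers are assumptions made for illustration.

```python
import random


def split_samples(sample_ids, train_fraction=0.75, seed=0):
    """Randomly split sample identifiers into a training and a test set."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_train = int(len(ids) * train_fraction)  # rounds down
    return ids[:n_train], ids[n_train:]


# 53 samples, as in the cohort of 20 healthy individuals and 33 patients.
train, test = split_samples(range(53))
print(len(train), len(test))  # 39 14
```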
Assignment of different weights to the selected biomarkers by the Generalized Linear Model: A coefficient was assigned to each of the selected CpG sites and an intercept was obtained using the generalized linear regression model (Friedman, 2008; implemented in the R package glmnet), regressing the cancer status of the urine samples (absence or presence of bladder cancer) on DNA methylation. An alpha parameter of 0.5 was used to balance lower variance against fewer variables, and the lambda parameter that achieves the smallest error was chosen by cross-validation on the training data.
Calculation of probability of presence of bladder cancer using logistic regression: Logistic regression relates a binary outcome variable (like presence or absence of cancer) to a group of predictor variables (like a set of CpG sites) (Freedman 2009). The probability of presence of bladder cancer is calculated by the following formula: log(p/(1−p))=X0+CpG1*X1+ . . . +CpGn*Xn=X, where p represents the probability of bladder cancer, n is the number of selected CpG sites, X0 is the intercept, X1 is the coefficient of the CpG site 1, CpG1 is the DNA methylation value of the CpG site 1, and so on. X is the sum of the intercept and every biomarker's weighted DNA methylation value which is CpG*coefficient. Therefore, p=exp(X)/(1+exp(X)).
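The formula above can be evaluated directly. The sketch below computes X as the intercept plus the weighted methylation values and then p=exp(X)/(1+exp(X)); the function name and the numeric intercept, coefficient and methylation values are hypothetical and do not come from the fitted model.

```python
import math


def cancer_probability(intercept, coefficients, methylation):
    """Probability p such that log(p/(1-p)) = X0 + sum(CpGi * Xi).

    intercept: X0.
    coefficients: X1..Xn, one per selected CpG site.
    methylation: CpG1..CpGn, the DNA methylation value of each site.
    """
    if len(coefficients) != len(methylation):
        raise ValueError("one coefficient is needed per CpG site")
    x = intercept + sum(c * m for c, m in zip(coefficients, methylation))
    # Two algebraically equal forms of exp(x)/(1+exp(x)); picking by sign
    # avoids overflow in math.exp for very large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Hypothetical model with two CpG sites: X = -2.0 + 0.5*3.0 + 0.25*2.0 = 0.
print(cancer_probability(-2.0, [0.5, 0.25], [3.0, 2.0]))  # 0.5
```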
Results of the CARE assay: DNA methylation values of the twelve CpG sites were obtained for the 53 urine samples from 20 healthy individuals and 33 bladder cancer patients. The samples were randomly split into a Training set (39 samples; 13 healthy and 26 cancer) and a Test set (14 samples; 7 healthy and 7 cancer). A subgroup of three CpG sites of #05 (Unk 05), #36 (SOX17) and #09 (Unk09) was identified having the lowest RMSE error by the Random Forest algorithm (
Results of the MMSP assay: DNA methylation was measured using the MMSP assay for the same sample set as that of the CARE assay. The Random Forest algorithm identified one single CpG site R3N5 [See Table 12] with the lowest RMSE error (
Another hallmark of cancer cells is chromosomal instability, which frequently results in increased chromosomal ploidy or deletions. In particular, tetraploidy of chromosomes 3, 7 and 17 and loss of the chromosomal region 9p21 are frequently associated with urothelial carcinomas and accumulate during the development of bladder cancer, starting three years before diagnosis (Bonberg et al., 2014). Notably, markers #05, #07 and POLR2A (Sets A and B) are positioned on chromosomes 7, 17 and 17, respectively, making them suitable for the detection of frequent chromosomal aberrations associated with bladder cancer.
Although this aspect has not yet been investigated in depth, an interesting feature of the CARE assay is that it can provide information about the presence of chromosomal aneuploidy in the analyzed sample. Indeed, the qPCR signal obtained from each marker in the undigested sample reflects the number of DNA copies initially added to the reaction (each marker, including the endogenous controls, is a single-copy locus). Therefore, the comparison between the signal obtained from a specific marker in an undigested euploid sample (e.g., the undigested reaction performed for the E.D.C. control) and the signal detected for the same marker in an undigested clinical sample (undigested sample control) might reveal the presence of an extra or a missing copy of the marker under analysis. For example, if we analyze a urine sample collected from an individual affected by a bladder cancer in which marker SOX1 (#27) is duplicated, we might in principle expect the signal detected from SOX1 (but not from the other markers) in the undigested clinical sample to be doubled compared to the signal obtained from the undigested E.D.C. control.
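The comparison between the undigested clinical sample and the undigested E.D.C. can be sketched as a relative copy ratio. This illustration assumes that each marker is first normalized to a reference locus in the same reaction (to cancel differences in DNA input) and that amplification doubles the product every cycle; neither assumption is stated in the text, and the function name and CT values are hypothetical.

```python
def relative_copy_ratio(ct_marker_clin, ct_ref_clin, ct_marker_edc, ct_ref_edc):
    """Marker copy number in a clinical sample relative to a euploid control.

    All CT values come from undigested reactions. The reference locus is
    assumed to be present at the normal copy number in both samples.
    A ratio near 1.0 suggests no gain or loss; 2.0 suggests a duplication
    present in all cells of the sample.
    """
    delta_clin = ct_marker_clin - ct_ref_clin
    delta_edc = ct_marker_edc - ct_ref_edc
    return 2.0 ** (-(delta_clin - delta_edc))


# Hypothetical CT values: the marker appears one cycle earlier in the
# clinical sample than expected from the euploid control.
print(relative_copy_ratio(26.0, 25.0, 27.0, 25.0))  # 2.0
```

If only a fraction of the cells in the sample carry the duplication, the ratio moves toward 1.0 accordingly, which is why a small aneuploid subpopulation would be hard to detect this way.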
Although this aspect is interesting because it can provide additional information for bladder cancer diagnosis, the ability to identify chromosomal aberrations with the CARE assay depends strongly on the relative amounts of normal and cancer cells (and more specifically aneuploid cells) present in the original sample. Indeed, if the aberrant cancer cells represent only a small fraction of the cell population present in the urine sample, the detection of aneuploidy using the CARE assay will not be possible. More data regarding this aspect will be collected in future experiments.
All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
REFERENCES
The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.
- Babjuk et al., EAU guidelines on non-muscle-invasive urothelial carcinoma of the bladder, the 2011 update. Eur Urol 2011; 59: 997-1008.
- Bernstein et al., 2007
- Giovannucci and Ogino, 2005
- Lintula and Hotakainen, Developing biomarkers for improved diagnosis and treatment outcome monitoring of bladder cancer. Expert Opin Biol Ther 2010; 10:1169-80.
- Millan-Rodriguez et al., Primary superficial bladder cancer risk groups according to progression, mortality and recurrence. J Urol 2000; 164:680-4.
- Morgan and Clark, Bladder cancer. Curr Opin Oncol 2010; 22: 242-9.
- Parker and Spiess, Current and emerging bladder cancer urinary biomarkers. Scientific World Journal 2011; 11:1103-12.
- Reinert T, Methylation markers for urine-based detection of bladder cancer: The next generation of urinary markers for diagnosis and surveillance of bladder cancer. Adv Urol 2012; 2012:503271.
- Shelley et al., Intravesical therapy for superficial bladder cancer: A systematic review of randomised trials and meta-analyses. Cancer Treat Rev 2010; 36:195-205.
- Siegel et al., Cancer statistics, 2011: The impact of eliminating socioeconomic and racial disparities on premature cancer deaths. CA Cancer J Clin 2011; 61: 212-36.
- Sobin et al., TNM classification of malignant tumours. 7th ed. Wiley-Blackwell; 2009.
- Widschwendter et al., Cancer Research, 64: 3807-3813, 2004
Claims
1. A method for determining a genomic DNA methylation profile in a sample comprising:
- (a) obtaining a substantially purified test genomic DNA sample;
- (b) contacting a portion test genomic DNA of the sample with:
- a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second and third different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third genomic region is a test region having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture;
- (c) subjecting the first reaction mixture to digestion and thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the sample in the first reaction mixture; and
- (d) using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample.
2. The method of claim 1, further comprising:
- (a) obtaining a substantially purified test genomic DNA sample;
- (b) contacting a portion of the test genomic DNA of the sample with:
- a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second and third different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second and third different genomic regions for quantitative detection of amplified sequences from the first, second and third different genomic regions, wherein each of the probes comprises a distinct fluorescent label; and
- a second reaction mixture, identical to the first reaction mixture, but lacking the at least two methylation sensitive restriction endonucleases, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third genomic region is a test region having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture;
- (c) subjecting the first and second reaction mixtures to digestion and thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the samples in the first and second reaction mixtures;
- (d) using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample.
3-8. (canceled)
9. The method of claim 1, wherein the substantially purified test genomic DNA sample is obtained from a urine, stool, saliva, blood or tissue sample.
10. The method of claim 9, wherein the substantially purified test genomic DNA sample is obtained from a biopsy sample.
11. (canceled)
12. The method of claim 1, wherein the first reaction mixture comprises at least three methylation sensitive restriction endonucleases.
13. (canceled)
14. The method of claim 1, wherein step (b) further comprises:
- (b) contacting a portion of the test genomic DNA of the sample with:
- a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second, third and fourth different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second, third and fourth different genomic regions for quantitative detection of amplified sequences from the first, second, third and fourth different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third and fourth genomic regions are test regions having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture.
15. The method of claim 14, wherein step (b) further comprises:
- (b) contacting a portion of the test genomic DNA of the sample with:
- a first reaction mixture comprising: (i) at least two methylation sensitive restriction endonucleases; (ii) a hot-start DNA polymerase; (iii) a pH buffered salt solution; (iv) dNTPs; (v) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first, second, third, fourth and fifth different genomic region in the DNA sample; and (vi) fluorescent probes complementary to sequences in said first, second, third, fourth and fifth different genomic regions for quantitative detection of amplified sequences from the first, second, third, fourth and fifth different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a cleavage control that is known to be unmethylated; (II) the second genomic region is a copy number control that does not include any cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture; and (III) the third, fourth and fifth genomic regions are test regions having an unknown amount of methylation and including at least three cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture.
16. (canceled)
17. The method of claim 1, wherein the third genomic region includes at least 4, 5, 6, 7 or 8 cut sites for the methylation sensitive restriction endonucleases of the first reaction mixture.
18. The method of claim 1, wherein the primer pairs are complementary to sequences no more than 300, 275, 250, 225, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60 or 50 nucleotides apart.
19. The method of claim 1, wherein the first genomic region is a genomic region of a housekeeping gene.
20. The method of claim 19, wherein the housekeeping gene is GAPDH.
21. The method of claim 1, wherein the second genomic region is a genomic region of the POLR2A gene.
22. The method of claim 1, wherein the third genomic region is selected from the group provided in Table 1A or Table 2.
23. The method of claim 1, wherein the third genomic region is selected from the group consisting of DMRTA2, EVX2, Unk21, OTX1, SOX1, SEPT9, Unk05, Unk09, GALR1, Unk07, Unk19, TBX15, EEF1A2, TFAP2B, DCHS2 and SOX17.
24. The method of claim 1, wherein using the detected fluorescent signals to determine the genomic DNA methylation profile in a sample comprises calculating the relative methylation percentages for the sample.
25. The method of claim 1, wherein the fluorescent probes and/or primer pairs are selected from those provided in Table 1C.
26-28. (canceled)
29. A method for determining a genomic DNA methylation profile in a sample comprising:
- (a) obtaining a test genomic DNA sample, which has been bisulfite converted;
- (b) contacting the test sample with a first reaction mixture comprising: (i) a hot-start DNA polymerase; (ii) a pH buffered salt solution; (iii) dNTPs; (iv) DNA primer pairs for polymerase chain reaction (PCR) amplification of at least a first and second different genomic region in the DNA sample, wherein the primer pairs are complementary to sequences no more than 200 nucleotides apart; and (v) fluorescent probes complementary to sequences in said first and second different genomic regions for quantitative detection of amplified sequences from the first and second different genomic regions, wherein each of the probes comprises a distinct fluorescent label, wherein: (I) the first genomic region is a copy number control region that does not comprise CpG dinucleotides; and (II) the second genomic region is a test region having an unknown amount of methylation and including at least five CpG dinucleotides in sequences that are complementary to DNA primer pairs and the probe for the second genomic region;
- (c) subjecting the first reaction mixture to thermal cycling, while detecting fluorescent signals from the fluorescent probes, thereby performing real time PCR on the sample in the first reaction mixture; and
- (d) using the detected fluorescent signals and fluorescent signal from a DNA methylation standard curve to determine the genomic DNA methylation profile in a sample.
30-58. (canceled)
59. A method of detecting the presence of, or an increased risk of, bladder cancer or other cancers of the urinary tract in a patient comprising determining, in a patient sample, a methylation status in one or more genomic regions selected from the group provided in Table 1A, wherein an increased level of methylation in one or more of the genomic regions of Table 1A relative to a reference level indicates that the patient has or is at risk of developing bladder cancer.
60. The method of claim 59, wherein the one or more genomic regions is selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, and SOX17.
61. The method of claim 60, wherein the one or more genomic regions comprise 2, 3, 4, 5, 6, 7, 8, 9, 10 or 11 of the genomic regions selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, and SOX17.
62-66. (canceled)
67. The method of claim 59, wherein said determining comprises determining a methylation status in 2, 3, 4, 5, 6, 7, 8, 9, 10 or more of said genomic regions.
68. The method of claim 59, wherein said determining comprises determining a methylation status in 2, 3, 4, 5, 6, 7, 8, 9, 10 or more of said genomic regions, wherein the genomic regions are selected from the group consisting of Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, SOX17, GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B.
69. The method of claim 68, wherein said determining comprises determining a methylation status in each of the genomic regions: Unk 09, Unk 05, DCHS2, OTX1, Unk 07, EVX2, SEPT9, SOX1, Unk 19, Unk 21, SOX17, GALR1, TBX15, EEF1A2, DMRTA2, and TFAP2B.
70. The method of claim 59, wherein the patient has been previously treated for or diagnosed with bladder cancer.
71. The method of claim 70, further defined as a method for detecting bladder cancer recurrence or a risk of bladder cancer recurrence.
72. The method of claim 59, wherein said determining comprises analyzing DNA methylation in the sample using restriction endonuclease digestion and qPCR.
73. The method of claim 72, wherein the digestion reaction is completed in a first step, followed by the qPCR reaction in a second step.
74. The method of claim 59, wherein the patient is a human.
75. The method of claim 59, wherein the sample is a urine sample.
76. The method of claim 59, wherein the sample is a blood sample.
77-78. (canceled)
79. The method of claim 59, wherein determining a methylation status comprises determining the nucleotide positions in the genomic regions that comprise methylation.
80. The method of claim 59, wherein determining a methylation status comprises determining the proportion of methylation at nucleotide positions in the genomic region.
81. The method of claim 59, wherein determining a methylation status comprises determining the proportion of nucleotide positions that are methylated in the genomic region.
82. The method of claim 59, wherein the reference level is a level of methylation from a patient that does not have bladder cancer.
83-85. (canceled)
86. A method of detecting the presence of, or an increased risk of, bladder cancer in a patient comprising:
- (i) obtaining a patient sample;
- (ii) determining a methylation status in one or more genomic regions selected from those in Table 1; and
- (iii) identifying the presence of, or an increased risk of, bladder cancer in the patient based on an increased level of methylation in one or more of the genomic regions relative to a reference level.
87-94. (canceled)
95. A synthetic polynucleotide sequence comprising a sequence at least 90% identical to one of the probe sequences selected from those provided in Table 1B or 1C, wherein the polynucleotide is conjugated to a reporter molecule.
96. The polynucleotide of claim 95, wherein the reporter molecule is a fluorophore.
97. The polynucleotide of claim 95, comprising a sequence at least 95% identical to one of the probe sequences selected from those provided in Table 1B or 1C.
98. The polynucleotide of claim 95, comprising a sequence identical to one of the probe sequences selected from those provided in Table 1B or 1C.
99. A kit comprising at least two primer pairs and at least two probes for amplification and detection of a gene region selected from those provided in Table 1A.
100. The kit of claim 99, comprising at least three, four or five primer pairs and at least three, four or five probes for amplification and detection of a gene region selected from those provided in Table 1A.
101. The kit of claim 99, wherein the at least two primer pairs and at least two probes are selected from those provided in Table 1B or 1C.
102. The kit of claim 99, further comprising one or more of the following: (i) instructions; (ii) reagents for real-time qPCR; (iii) a methylation sensitive endonuclease; and (iv) a control DNA sample.
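By way of illustration only, the relative methylation percentage recited in claim 24 may be estimated from the real time PCR signals of the first (digested) and second (undigested) reaction mixtures of claim 2. The following Python sketch uses a conventional delta-delta-Ct calculation, which is an assumption and not a formula taken from the claims. The names WellCts, remaining_fraction and relative_methylation_percent, the amplification efficiency of 2.0 per cycle, and the 5% cleavage control limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WellCts:
    """Ct values read from one multiplex reaction (hypothetical container)."""
    cleavage_control: float     # first genomic region, known to be unmethylated
    copy_number_control: float  # second genomic region, no cut sites
    test_region: float          # third genomic region, unknown methylation


def remaining_fraction(target_dig, ref_dig, target_mock, ref_mock, efficiency=2.0):
    """Fraction of target template surviving digestion.

    Assumes the same per-cycle amplification efficiency for every amplicon
    (2.0 means perfect doubling); this is an illustrative assumption.
    """
    ddct = (target_dig - ref_dig) - (target_mock - ref_mock)
    return efficiency ** (-ddct)


def relative_methylation_percent(digested, mock, max_cleavage_residual=0.05):
    """Estimate percent methylation of the test region.

    The copy number control normalizes for DNA input. The cleavage control
    should be almost fully cut; if more than max_cleavage_residual
    (an assumed limit) survives, the digestion is treated as incomplete.
    """
    residual = remaining_fraction(
        digested.cleavage_control, digested.copy_number_control,
        mock.cleavage_control, mock.copy_number_control,
    )
    if residual > max_cleavage_residual:
        raise ValueError(f"incomplete digestion: {residual:.1%} of cleavage control remains")
    fraction = remaining_fraction(
        digested.test_region, digested.copy_number_control,
        mock.test_region, mock.copy_number_control,
    )
    # Measurement noise can push the estimate slightly above 1; cap at 100%.
    return 100.0 * min(fraction, 1.0)


if __name__ == "__main__":
    digested = WellCts(cleavage_control=32.0, copy_number_control=25.0, test_region=27.0)
    mock = WellCts(cleavage_control=24.0, copy_number_control=25.0, test_region=26.0)
    print(relative_methylation_percent(digested, mock))  # 50.0
```

In this sketch the test region amplifies one cycle later after digestion than in the undigested mixture, once both are normalized to the copy number control, so about half of the template is taken to have been protected by methylation. A reference level of the kind recited in claim 59 would be compared against the returned percentage in a separate step.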
Type: Application
Filed: Sep 16, 2020
Publication Date: Jun 3, 2021
Inventors: Wei GUO (Irvine, CA), Paolo PIATTI (Irvine, CA), Xiaojing YANG (Irvine, CA), Xi-Yu JIA (Irvine, CA)
Application Number: 17/022,345