SYSTEM, METHOD, COMPUTER-ACCESSIBLE MEDIUM AND APPARATUS FOR DNA MAPPING
Exemplary embodiments of the present disclosure can include, for example, an atomic force microscopy (AFM) system, including a cantilever(s), an optical pickup unit(s) (OPU(s)) including a laser positioned over the cantilever(s), and a power source providing noise with a noise level that is below 300 picometers. The noise level of the power source can be below 200 picometers. A digitizing arrangement can be included which can be associated with the OPU. The digitizing arrangement(s) can have a bandwidth of about 2 MHz. The OPU(s) can have a detection bandwidth of at least 80 MHz. The exemplary apparatus can be combined with a chemical protocol and statistical signal processing and image analysis procedures to map DNA at high speed and accuracy.
This application claims benefit of U.S. provisional patent applications 62/442,672, filed Jan. 5, 2017, and 62/443,325, filed Jan. 6, 2017, the complete contents of which are hereby incorporated by reference.
FIELD OF THE DISCLOSURE
The present disclosure relates generally to deoxyribonucleic acid (“DNA”) mapping, and more specifically, to exemplary embodiments of an exemplary system, method, computer-accessible medium and apparatus for DNA mapping.
BACKGROUND INFORMATION
Progress in whole genome sequencing using short read technologies (e.g., less than 150 base pairs (“bp”)) has reinvigorated interest in high resolution physical mapping to fill technical gaps not well addressed by sequencing. For example, short range sequencing can be combined with long range maps to produce haplotypically correct whole genome sequences, and to provide an easy method to detect structural variants larger than the size of the sequence reads. Such a procedure can better elucidate fundamental biology involved in structural genomic changes in tumorigenesis (e.g. translocation), uncharacterized polymorphisms in a population, null models for stratified populations utilized to regularize genome wide association studies and modeling evolutionary processes such as recombination, gene conversion and duplication. State of the art physical mapping approaches involve using low-quality long reads (e.g., Pacbio, Oxford Nanopore), dilution mapping (e.g., Moleculo, 10×), mutational mapping (e.g., Museq), single molecule optical restriction mapping (e.g., Opgen, Bionano), or ultra-high coverage sequencing. Despite recent progress, these technologies remain expensive, incomplete, computationally intense, or all of the above.
Long range genetic variations such as deletion, duplications, inversions and translocations can play a significant role in complex diseases such as cancer (see, e.g., References 1-5). Current, high-throughput, genomic procedures, such as short-read next-generation sequencing, work best to study variations in relatively short sequences (e.g., several hundred base pairs) and phased sequencing utilizes various complicated approaches; repetitive sequences, haplotypic ambiguities, as well as amplification-induced bias, are existing hurdles for next generation sequencing (“NGS”) analysis. In contrast, traditional optical mapping can only resolve variations larger than several thousand base pairs (see, e.g., References 6 and 7). Nano-channel augmented (see, e.g., Reference 8) and super-resolution optical mapping (see, e.g., References 9 and 10) are more precise (e.g., down to +/−100 bp), but the experimental protocols to achieve this are still lengthy and complex, and utilize sophisticated Bayesian procedures involving multiple hyper-parameters (see, e.g., Reference 11).
Thus, it may be beneficial to provide an exemplary system, method, computer-accessible medium and apparatus for DNA mapping which can overcome at least some of the deficiencies described herein above.
SUMMARY OF EXEMPLARY EMBODIMENTS
Exemplary embodiments of the present disclosure can include, for example, an atomic force microscopy (AFM) system, including a cantilever(s), a scanning probe arrangement(s) including a laser positioned over a portion of the cantilever(s) which contacts a surface of a sample(s), wherein a tilt angle of the at least one cantilever with respect to the at least one scanning probe arrangement is less than 10 degrees, and a power source, wherein the AFM system is configured to generate a displacement noise that is less than 300 picometers. The noise level of the power source can be below 200 picometers. A digitizing arrangement can be included which can be associated with the scanning probe arrangement(s). The digitizing arrangement(s) can have a bandwidth of about 2 MHz. The scanning probe arrangement(s) can have a detection bandwidth of at least 80 MHz.
In some exemplary embodiments of the present disclosure, a transparent sample plate positioned below the cantilever(s) can be included. A light emitting arrangement positioned under the sample plate can be included which can be configured to emit a light through the sample plate. The light emitting arrangement(s) can include a light(s) and a mirror(s). The cantilever(s) can have a spring constant of less than about 0.03 newtons per meter.
In certain exemplary embodiments of the present disclosure, a camera(s) positioned above the scanning probe arrangement(s) can be included. The laser can be positioned directly above the cantilever(s). The scanning probe arrangement(s) can include a plurality of scanning probe arrangements, the cantilever(s) can include a plurality of cantilevers and each of the scanning probe arrangements can be positioned above a corresponding one of the cantilevers. A computer hardware arrangement configured to adjust a position of the laser relative to the cantilever(s) can be included.
A further exemplary embodiment of the present disclosure for generating information regarding a portion(s) of a deoxyribonucleic acid (DNA) sample(s) can be provided, which can include, for example, receiving data related to a plurality of markers on the portion(s) of the DNA sample(s), determining a distance between at least two markers of the plurality of markers, and generating the information regarding the portion(s) of the DNA sample(s) based on the distance. The information can be (i) a map of the portion(s) of the DNA sample(s), or (ii) a species count of the portion(s) of the DNA sample(s).
In certain exemplary embodiments of the present disclosure, the information can be generated using a contour length of the portion(s) of the DNA sample(s) when the distance is above a particular distance, which can be about 125-175 base pairs, or about 140-160 base pairs, or about 150 base pairs. The information can be generated using a distribution of lengths of DNA molecules of a homogeneous population when the distance is between a first distance and a second distance, where the first distance can be about 350-450 base pairs, about 375-425 base pairs, or about 400 base pairs and the second distance can be about 50-100 base pairs, about 70-80 base pairs, or about 75 base pairs. The information can be generated by summarizing a distribution of lengths of DNA molecules as a median or a mode of the distribution of lengths when the distance is below a particular distance, which can be about 350-450 base pairs, about 375-425 base pairs, or about 400 base pairs.
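The distance-dependent choice of estimation procedure described above can be illustrated with a short Python sketch. It is only one possible reading of the ranges: the constant names and the function `applicable_strategies` are hypothetical, the thresholds are simply the "about" values quoted in the text, and because the quoted ranges overlap, the sketch returns every strategy whose range contains the measured marker spacing rather than a single choice.

```python
from typing import List

# Hypothetical threshold names; values are the "about" figures quoted in the text (base pairs).
CONTOUR_MIN_BP = 150      # contour length can be used above this spacing
POPULATION_MAX_BP = 400   # "first distance" for the homogeneous-population approach
POPULATION_MIN_BP = 75    # "second distance" for the homogeneous-population approach
SUMMARY_MAX_BP = 400      # below this spacing, lengths can be summarized by median or mode


def applicable_strategies(distance_bp: float) -> List[str]:
    """Return the names of all estimation strategies whose range contains the spacing."""
    strategies = []
    if distance_bp > CONTOUR_MIN_BP:
        strategies.append("contour_length")
    if POPULATION_MIN_BP < distance_bp < POPULATION_MAX_BP:
        strategies.append("population_length_distribution")
    if distance_bp < SUMMARY_MAX_BP:
        strategies.append("median_or_mode_summary")
    return strategies


if __name__ == "__main__":
    for spacing in (50, 100, 200, 500):
        print(spacing, applicable_strategies(spacing))
```

How overlapping ranges would be resolved in practice (for example, by preferring one strategy over another) is not specified in the text and is left open here.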
A further exemplary embodiment of the present disclosure for mapping nucleotide molecules, can include, for example, (i) incubating a target nucleotide in a magnesium-free mixture, where the mixture can include a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) protein and a guide RNA, and where incubating the target nucleotide with the CRISPR-Cas protein can bind the CRISPR-Cas protein to the target nucleotide to form a CRISPR-Cas/target nucleotide complex without the CRISPR-Cas protein cleaving the target nucleotide, (ii) depositing the CRISPR-Cas/target nucleotide complex on a flat surface, where after the depositing the CRISPR-Cas/target nucleotide complex can be bound to the flat surface and (iii) imaging the CRISPR-Cas/target nucleotide complex on the flat surface by using a scanning probe-labelling technique, where prior to imaging, substantially all unbound CRISPR-Cas protein or guide RNA can be removed.
In certain exemplary embodiments of the present disclosure, the target nucleotide is deoxyribonucleic acid (DNA). The target nucleotide can also be a DNA/RNA hybrid. The DNA can be a PCR amplicon. The DNA can be genomic DNA obtained from a biological sample.
A biological sample may be of any biological tissue or fluid. Such samples include, but are not limited to, bodily fluids which may or may not contain cells, e.g., blood (e.g., whole blood, serum or plasma), urine, synovial fluid, saliva, and joint fluid; or tissue or fine needle biopsy samples, such as from bone or cartilage.
In some exemplary embodiments, the systems and methods described herein are performed on DNA, e.g. genomic DNA, with sizes ranging from tens to hundreds of thousands of base pairs. In other embodiments, smaller strands of DNA are analyzed, for example, under 300 bp in length, or under 250 bp or under 200 bp in length. The magnesium-free mixture can include EDTA, where the EDTA can chelate any magnesium in the mixture to render the mixture magnesium-free. The magnesium-free mixture can be a magnesium-free deposition buffer that can include magnesium-alternates, which can be zinc, polyamine or 3-aminopropylsilatrane. The CRISPR-Cas protein can be Cas9 or a modified Cas9. The guide RNA can be an sgRNA (single guide RNA), where the sgRNA can be designed to target a specific target nucleotide sequence marker.
In some exemplary embodiments of the present disclosure, before depositing the CRISPR-Cas/target nucleotide complex on a flat surface, any unbound CRISPR-Cas protein or guide RNA can be removed. After depositing the CRISPR-Cas/target nucleotide complex on a flat surface, any unbound CRISPR-Cas protein or guide RNA can be removed. The flat surface can be a mica surface. The flat surface can be a transparent surface. The scanning probe-labelling technique can be atomic force microscopy. The atomic force microscopy can include an atomic force microscopy (AFM) system which can include a cantilever(s), an optical pickup unit(s) (OPU) including a laser positioned over the cantilever(s), and a power source, where the AFM system can be configured to generate a displacement noise that is less than 300 picometers.
In certain exemplary embodiments of the present disclosure, after imaging the CRISPR-Cas/target nucleotide complex on the flat surface, the image can be used for de novo mapping of the target nucleotide. After imaging the CRISPR-Cas/target nucleotide complex on the flat surface, the image can be used for quantitating the amount of the target nucleotide. After incubating, the CRISPR-Cas protein can be fixed to the target nucleotide by adding formaldehyde to the mixture.
A further exemplary embodiment of the present disclosure for mapping nucleotide molecules, can be provided, which can include, for example, (i) incubating a deoxyribonucleic acid (DNA) molecule in a magnesium-free mixture, where the mixture can include a CRISPR-Cas9 protein and an sgRNA, and further where incubating the DNA molecule with the CRISPR-Cas9 protein can bind the CRISPR-Cas9 protein to the DNA molecule to form a CRISPR-Cas9/DNA complex without the CRISPR-Cas9 protein cleaving the DNA molecule, (ii) fixing the CRISPR-Cas9/DNA complex, where formaldehyde can be added to the mixture, (iii) depositing the CRISPR-Cas9/DNA complex on a mica surface, where after the depositing the CRISPR-Cas9/DNA complex can be bound to the mica surface, and (iv) imaging the CRISPR-Cas9/DNA complex on the mica surface by using atomic force microscopy, where before imaging any unbound CRISPR-Cas9 protein or sgRNA can be removed.
These and other objects, features and advantages of the exemplary embodiments of the present disclosure will become apparent upon reading the following detailed description of the exemplary embodiments of the present disclosure, when taken in conjunction with the appended claims.
Further objects, features and advantages of the present disclosure will become apparent from the following detailed description taken in conjunction with the accompanying Figures showing illustrative embodiments of the present disclosure, in which:
Throughout the drawings, the same reference numerals and characters, unless otherwise stated, are used to denote like features, elements, components or portions of the illustrated embodiments. Moreover, while the present disclosure will now be described in detail with reference to the Figures, it is done so in connection with the illustrative embodiments and is not limited by the particular embodiments illustrated in the Figures and the appended paragraphs.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
The exemplary system, method, computer-accessible medium and apparatus can utilize an exemplary physical mapping procedure characterizing, for example, genomic DNA using CRISPR/Cas9 which can be used for constructing high resolution maps from few or even single DNA molecules, with a one-step labeling chemistry, and high speed (“HS”) atomic force microscopy (“AFM”) detection making it inexpensive and scalable (see e.g.,
The efficiency and precision of Cas9 labeling was demonstrated using several gene-specific sgRNAs targeted to BRCA1, HER2 and TERT gene sequences (see, e.g.,
Existing optical mapping methods can be based on recognition of short sequences by enzymes such as restriction nucleases, nicking endonucleases and methyltransferases (see, e.g., References 9, 10 and 12-14); however, only a limited number of different sequences can be available, and the distribution of recognition sites of such enzymes throughout the genome can be uneven. Further, while CRISPR/Cas9 nickase was recently applied to introduce sequence-specific fluorescent labels for optical DNA mapping (see, e.g., Reference 15), the nickase-mediated labeling protocols used can be multistep processes, including several enzymatic reactions (e.g., introduction of nicks, incorporation of labeled nucleotide with terminal transferase or polymerase, etc.) that utilize 300-900 ng of DNA labeling input amount (see, e.g., Reference 8). In contrast, the CRISPR/Cas9 labeling protocol used here can be simple, and can consist of mixing all components together followed by incubation, fixing and purification. 100 ng of DNA input amount was used; however, decreasing the amount down to below 10 ng does not significantly change labeling efficiency, and a lower limit was not determined. Optimization of the exemplary protocol can be as follows: when the labeling rate is not sufficient, the concentrations of either sgRNA or Cas9 or both can be increased. The exemplary labeling protocol can also be easily incorporated into an automated sample preparation workflow. This can be applied to quantify RNA sequences as it was shown that Cas9 can cleave a DNA-RNA hybrid, although with reduced efficiency (see, e.g., References 16 and 17).
Custom sgRNAs were created to label the smaller TERT and HER2 amplicons (e.g., approximately 650 bp), for the purposes of accurately determining positioning accuracy and labeling kinetics. For the TERT amplicon, two unique sgRNAs were tested independently and in combination; for HER2, five unique sgRNAs were tested independently and in combination. A ladder series of typical polymerase chain reaction (“PCR”) amplicons (e.g., 75 bp to 300 bp long), each labeled at each end, was synthesized and measured.
The spatial resolution of labeling, for example, the precision of location of the label on DNA molecule, can be an important parameter in DNA mapping (see, e.g., Reference 12). As shown in the graph of
As shown in the graph in
The exemplary system, method, computer-accessible medium and apparatus can replace a laser vibrometer, the most costly hardware component (e.g., approximately $60,000), with an optical pickup unit (“OPU”), for example from a DVD player (see, e.g., Reference 21 and
The OPU can have onboard voice coil actuators to focus the detection beam, and possess integrated laser control electronics and high bandwidth (e.g., greater than 10 MHz) signal amplifiers, making it ideal for rapid, high precision displacement sensing. The OPU HS-AFM produced similar quality images as the vibrometer, with a lower frame rate of 0.5-1 frames per second due to the utilized frame averaging. In these images, the Cas9 labels can be sharply visible, as can be the DNA backbone in some cases. The Cas9 intra-label spacing on doubly-labeled TERT and HER2 amplicons was determined with an accuracy and precision equivalent to that achieved with the significantly more expensive laser vibrometer.
With no modifications, this OPU HS-AFM can be used to detect and measure length polymorphisms between closely spaced markers (e.g., less than 400 bp). The multiple Cas9 labels can also be used as simple single molecule barcodes, making the technology useful in counting applications such as detecting gene copy number variation, digital PCR or transcriptional profiling. Long DNAs (e.g., greater than 10 kbp) can be stretched linearly for genomic mapping using exemplary DNA elongation procedures such as micro- or nano-channels, pressure and electrokinetic flow devices, or laser/magnetic tweezers.
Achievable improvements in signal-to-noise and drift compensation (see, e.g., References 10 and 23) can enable the OPU to reliably detect the DNA backbone, as well as the Cas9 labels. Moreover, DVD OPUs have typical detection bandwidths of 80 MHz, and Blu-ray OPUs can be even higher, at 400 MHz. Thus, if higher sampling rate digitization electronics are used, the pixel rate of an OPU-based instrument can be increased from 2 MS/s to as high as 80 or even 400 MS/s, depending on the OPU model used. By increasing the scan speed and image size, these higher pixel rates can be directly translated into measurement throughput. The exemplary system, method, computer-accessible medium and apparatus can reduce the overall cost of the HS-AFM system by more than an order of magnitude. The cost of the OPU detector can be roughly that of the cantilever itself, which can allow the exemplary system, method, computer-accessible medium and apparatus to unlock scalable high throughput (see e.g., diagrams shown in
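As a rough illustration of how the pixel rates quoted above could translate into throughput, the following Python sketch computes the acquisition time of a single frame at each rate. The 1000×1000 pixel frame size is an assumption borrowed from the image size given elsewhere in this description, the function name `frame_time_s` is hypothetical, and the figures are upper bounds set by digitization alone; scanner mechanics and frame averaging are ignored.

```python
def frame_time_s(pixels_per_frame: int, pixel_rate_hz: float) -> float:
    """Time needed to digitize one frame at a given pixel (sample) rate."""
    return pixels_per_frame / pixel_rate_hz


# Assumption: 1000 x 1000 pixel frames, as in the exemplary images described elsewhere.
PIXELS_PER_FRAME = 1000 * 1000

# Pixel rates quoted in the text, in samples per second.
RATES_HZ = {
    "2 MS/s digitizer": 2e6,
    "DVD OPU bandwidth limit": 80e6,
    "Blu-ray OPU bandwidth limit": 400e6,
}

if __name__ == "__main__":
    for label, rate in RATES_HZ.items():
        t = frame_time_s(PIXELS_PER_FRAME, rate)
        print(f"{label}: {t * 1e3:.1f} ms per frame, up to {1.0 / t:.0f} frames/s")
```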
The exemplary system, method, computer-accessible medium and apparatus can utilize a DVD laser, although a CD laser can be used as well. Lasers with shorter wavelength (e.g., Blu-ray, which is about 405 nm) can be used to increase the resolution. The noise floor of the exemplary system, method, computer-accessible medium and apparatus can be lower than 300 picometers (e.g., RMS) in order to resolve the DNA molecule backbone. It can also be lower than 3 nanometers to resolve the CRISPR labels. A power source with very low electronic noise can be utilized (e.g., a battery source). The detector can include four photodiode elements which can sense the shape of the reflected laser spot. In an exemplary embodiment, the signal from each of the four elements can be separately and simultaneously digitized to provide optimal performance. In some exemplary cases, the signal can be recorded in a differential format (e.g., order), such as A-D and B-C. In this exemplary case, only two composite signals need be recorded (see e.g.,
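The differential recording of the four photodiode signals can be sketched as follows. This is an illustration under a stated assumption, not a description of the actual electronics: it assumes the common astigmatic-detection convention in which quadrants A and C are diagonal to each other and the focus error signal is (A + C) − (B + D). Under that assumption, the two recorded composites A−D and B−C are sufficient to recover the focus error signal, since (A − D) − (B − C) = (A + C) − (B + D). The names `Quadrants`, `differential_pair` and `focus_error` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Quadrants(NamedTuple):
    """Signals from the four photodiode elements (arbitrary units)."""
    a: float
    b: float
    c: float
    d: float


def differential_pair(q: Quadrants) -> Tuple[float, float]:
    """The two composite signals named in the text: A-D and B-C."""
    return q.a - q.d, q.b - q.c


def focus_error(a_minus_d: float, b_minus_c: float) -> float:
    """Assumed astigmatic focus error signal, (A + C) - (B + D), from the two composites."""
    return a_minus_d - b_minus_c


if __name__ == "__main__":
    q = Quadrants(a=1.2, b=0.8, c=1.1, d=0.9)
    ad, bc = differential_pair(q)
    print("A-D:", ad, "B-C:", bc, "focus error:", focus_error(ad, bc))
```

Recording two composites instead of four raw channels halves the digitization load, at the cost of losing the sum signal (A + B + C + D) that would otherwise be available for normalization.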
Primers and sgRNA were obtained from Integrated DNA Technologies. Sequences are shown in Table 2. sgRNA were prepared according to the manufacturer's recommendations. Wild-type Cas9 protein was purchased from New England Biolabs, Cas9 (e.g., D10A) nickase and Cas9 (e.g., D10A and H840A) from Novateinbio.
For the BRCA1 amplicon, Alu-family repeats containing the GG sequence to be used as a PAM site were identified with RepeatMasker software. Using BLAST software, five perfect and five 1- or 2-base mismatched sites to the chosen sgRNA sequence were found, which are designated Alu-sgRNA in Table 3 below.
As shown in the diagram of
The exemplary protocol can be based on in vitro digestion of DNA with Cas9 protocol, e.g., New England Biolabs (see e.g., flow diagram shown in
For HS-AFM sample preparation (e.g., all samples except BRCA1), imaging and data analysis, an exemplary experimental protocol was used (see, e.g., References 25 and 26). One microliter of 100 ng/μl solution of the amplicons in deposition buffer (e.g., 10 mM TRIS, 10 mM MgCl2, pH=7.6) was deposited on a freshly cleaved mica surface, incubated for 1.5 minutes in a humid environment, rinsed three times with 200 μl of MQ water, baked at 120° C. for 20 minutes and cooled down to 40° C. in an oven. For BRCA1, the only modification was made at deposition: the mica surface was tilted by 45 degrees to enable gravity-driven flow.
Exemplary Preparation of the DNA-CRISPR Complexes in Magnesium Free Solution
The solutions used in preparation and storage of the complexes preferably should not contain magnesium to prevent magnesium-mediated enzymatic activity of CRISPR nuclease. This can be obtained by the addition of chelating agents such as EDTA. Minimization of exposure of DNA-CRISPR complexes to magnesium during deposition on AFM slides can be beneficial. The complexes can be deposited within several seconds after addition of magnesium-containing deposition buffer. Incubation time is preferably less than 30-90 seconds. Alternatively, magnesium-free deposition chemistry can be applied (e.g. zinc, polyamine, 3-aminopropylsilatrane (“APS”) etc.). The flat surface onto which the complex is deposited can be “pre-treated” with Mg++ or Ni++ or amino-silane, etc., which enhances binding of the complex to the flat surface by creating an adhesive layer independently of the magnesium-free incubation mixture. Minimization of exposure of the prepared AFM slide to a humid environment can be beneficial. The slides can then be transferred to a humidity-free chamber after preparation.
Exemplary Application of Short CRISPR-Labeled DNA Molecules (<300 bp)
The spatial properties and conformational changes of A-, B-, Z-, and other forms of duplex DNA and non-canonical DNA structures such as triplexes, quadruplexes and four-way junctions of various sequence composition at different experimental conditions (pH, temperature, ions etc.) can be examined. This can include, for example, formation of triplexes or quadruplexes under certain conditions.
Structural variations of the above DNA structures caused by small molecules, proteins and other DNA-specific molecules can be examined. The small molecules can include, but are not limited to, intercalators and major and minor groove binders. Binding to DNA can be either covalent or non-covalent. The alterations can include conformational changes of DNA duplex (e.g. unwinding of DNA helix, bending of DNA molecule, B- to A-form transition) and non-canonical DNA structures (e.g. changing the geometry of four-way junction) caused by binding of DNA-specific molecule(s). CRISPR-labeled DNA molecules can be employed as a highly sensitive biosensor to detect the presence of DNA-specific agents.
Damaged DNA, such as covalent DNA adducts, single-strand breaks, thymine dimers and abasic sites, can be examined, which can include: (i) DNA sequence content between the labels, (ii) mismatched duplexes, (iii) methylation of DNA bases, (iv) GC content of the bases between the labels, (v) conformational changes to molecules bound to the DNA backbone in between the labels (such as enzymes, histones, etc.), or (vi) highly sensitive amplification-free sizing of DNA. Two CRISPR recognition sites (either the same or different) are introduced during a two-step melting-annealing-extension protocol. This can be applied, for example, to sizing Circulating Cell-Free DNA (“ccfDNA”).
Individual amplicons in multiplex real-time PCR can be identified and quantified. Amplicon sequences can be used to target CRISPR, or CRISPR recognition sites can be introduced at the amplicon ends during amplification. This exemplary approach can provide sequence-specificity comparable to hydrolysable probe PCR with significantly increased multiplexing level.
Genetic variations, such as Single Nucleotide Polymorphism (“SNP”), short insertions and deletions (e.g., InDels) and others can occur. All genetic alterations resulting in change of DNA size can be visualized directly. For other genetic variations (e.g. SNP) another exemplary strategy can be used. To detect a small fraction of mutation at the background of wild-type DNA (e.g. presenting in ccfDNA) both mutated and wild type DNA can be co-amplified, and hybridized to form mismatched duplexes which can be resolved with mismatch-specific nucleases (e.g. T7 Endonuclease I) and labeled with CRISPR close to or at cleavage site. This exemplary approach can facilitate specific detection of amplicon bearing mutation(s).
In some embodiments, mutated nucleotides are detected using a rolling circle amplification scheme. As known in the art, a vector is used to ligate the target nucleotide into a closed circle. The nucleotide is then amplified as a rolling circle to produce tandem copies and then labeled with CRISPR as described herein. The wildtype sequence will produce a repeated pattern of labels at known intervals. A mutated sequence will not bind label as efficiently as the wildtype and thus will produce irregular spacing between labels. In some embodiments, the CRISPR label is targeted to match the mutant sequence. Non-mutant sequences would thus have less CRISPR labeling. In some embodiments, this technique is used for more accurate measurement of spacing between labels. For example, less than 10 tandem repeats may be used to precisely measure the spacing that is less than 200-300 base pairs. With more precise measurement of spacing, more species, each with slightly different spacing between labels, can be identified in a mixture.
In order to improve labeling technology, modified Cas9 and guide RNA as well as other CRISPR systems can be used in addition to wild-type Cas9. One type of the protein modifications can be employed to improve sequence-specificity, for example, discrimination between perfect and mismatch recognition sites (e.g. genetically modified high-fidelity versions of Cas9). Another type of the protein and/or guide RNA modifications can facilitate changing volume of tertiary CRISPR-guide RNA-DNA to discriminate different sequences with HS-AFM (e.g. chemical attachment of biotin for further labeling with streptavidin). Other CRISPR systems with different PAM sequence can be employed to complement Cas9 (e.g. CRISPR-Cpf1 with TTTN PAM site). Additionally, other particles bound to DNA can be used as labels.
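One simple way to flag the irregular label spacing described for rolling circle products is to compare the spread of the inter-label distances to their mean. The Python sketch below is illustrative only: the use of the coefficient of variation, the 0.15 threshold, and the names `label_spacings`, `spacing_cv` and `looks_irregular` are assumptions introduced here, not part of the described method.

```python
from statistics import mean, pstdev
from typing import List


def label_spacings(positions_nm: List[float]) -> List[float]:
    """Distances between consecutive labels along one tandem-repeat molecule."""
    ordered = sorted(positions_nm)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def spacing_cv(positions_nm: List[float]) -> float:
    """Coefficient of variation of the inter-label spacings (0 for perfectly regular)."""
    spacings = label_spacings(positions_nm)
    return pstdev(spacings) / mean(spacings)


def looks_irregular(positions_nm: List[float], cv_threshold: float = 0.15) -> bool:
    """Hypothetical criterion: spacing spread above the threshold suggests a missed label."""
    return spacing_cv(positions_nm) > cv_threshold


if __name__ == "__main__":
    wildtype_like = [0.0, 100.0, 200.0, 300.0, 400.0]  # label on every repeat
    mutant_like = [0.0, 100.0, 300.0, 400.0]            # one repeat left unlabeled
    print(looks_irregular(wildtype_like), looks_irregular(mutant_like))
```

In practice the threshold would have to be calibrated against the measured spacing precision of the instrument.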
Multiplex Ligation-Dependent Probe Amplification (MLPA) with CRISPR-Labeled Amplicons AFM-Based Readout
The systems and methods described herein also provide a substantial improvement to existing MLPA techniques. The improvement relates to the different readout method used. Standard MLPA readout is size-based electrophoretic separation of amplicons. Described herein are methods of detection and identification of individual amplicon molecules based on recognition of the amplicons' unique pattern of CRISPR labels.
Advantages of CRISPR/AFM detection of MLPA amplicons include:
1. Better sensitivity and dynamic range of AFM compared to currently employed capillary electrophoresis. This allows for detection of rare species in multiplex PCR and is important in applications such as highly sensitive mutation quantification and gene expression studies.
2. Using amplicons of the same size to eliminate size-related amplification bias in multiplex PCR: longer probes give lower signal. This allows for improving the precision of the technique and increasing the size of amplicons.
3. Increasing multiplexing level.
Features of the technique are:
1. MLPA probe and primer design. The MLPA probe sequence design is done using standard recommendations with the exception of amplicon size: it may be longer than recommended, e.g. greater than 300-400 bp. MLPA probes sequences should contain CRISPR recognition sites; these sites may be naturally occurring and/or artificially introduced.
2. MLPA sample preparation and reactions. Nucleic acid extraction and purification, reverse transcription (in case of RNA), preparation of reaction solution, MLPA ligation, treatment with methylation-sensitive nucleases (in case of methylation analysis) and all other pre-amplification steps are done with the standard MLPA protocol. If longer than standard MLPA probes are used, the amplification conditions should be optimized correspondingly (i.e. increasing extension time).
3. MLPA reaction post-amplification treatment. The amplicons can be treated with enzymes such as mismatch-specific nuclease, if necessary. The amplicons are labeled with CRISPR, purified and imaged with AFM.
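The readout idea above, identifying each amplicon molecule by its pattern of CRISPR labels and then counting, can be sketched in Python as a nearest-pattern lookup. Everything specific in the sketch is hypothetical: the probe names, the expected fractional label positions in `LIBRARY`, the 0.05 tolerance, and the function names. Because an imaged molecule has no known orientation, each observed pattern is compared in both directions.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical library: probe name -> expected label positions as fractions of amplicon length.
LIBRARY: Dict[str, List[float]] = {
    "probe_A": [0.25],
    "probe_B": [0.25, 0.60],
    "probe_C": [0.40, 0.50],
}


def pattern_error(observed: List[float], expected: List[float]) -> Optional[float]:
    """Worst-case positional mismatch, trying both molecule orientations."""
    if len(observed) != len(expected):
        return None  # a different number of labels cannot match
    target = sorted(expected)
    forward = sorted(observed)
    reverse = sorted(1.0 - f for f in observed)
    errors = [
        max((abs(o - t) for o, t in zip(candidate, target)), default=0.0)
        for candidate in (forward, reverse)
    ]
    return min(errors)


def identify(label_positions_nm: List[float], molecule_length_nm: float,
             tolerance: float = 0.05) -> Optional[str]:
    """Return the best-matching probe name, or None if nothing is within tolerance."""
    observed = [p / molecule_length_nm for p in label_positions_nm]
    best_name, best_error = None, tolerance
    for name, expected in LIBRARY.items():
        error = pattern_error(observed, expected)
        if error is not None and error <= best_error:
            best_name, best_error = name, error
    return best_name


if __name__ == "__main__":
    # (label positions in nm, molecule length in nm) for three imaged molecules
    molecules = [([150.0], 200.0), ([52.0, 118.0], 200.0), ([20.0, 180.0], 200.0)]
    print(Counter(identify(labels, length) for labels, length in molecules))
```

Since every probe can share the same overall length, the label pattern rather than the size carries the identity, which is what allows same-size amplicons to be multiplexed.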
The full length of the molecule and the distance from one end of the molecule to the label of focus were used to bisect the molecule into a shorter and a longer segment. The proportion of the full length represented by the segment of interest can be the target sequence. The theoretical expected value of the proportion can also be determined using expected measurements of full length and site location.
For each set of data, the distribution of these proportions, denoted p, around this value can vary as a function of image quality, deposition procedure and measurement errors as calculated by the Matlab analysis program. Proportions can be bounded by 0 and 1, and can often be analyzed by transforming them into logit form (see, e.g., Reference 27) defined as, for example: logit(p) = ln(p / (1 − p)).
The logit can be the natural log of the long segment proportion divided by the proportion's complement (e.g., one minus the proportion). This exemplary transformation can facilitate traditional linearized distribution statistics including means, standard deviations and confidence intervals based on the logits. The means and the lower and upper limits of the logit scale confidence intervals can then be back-transformed through exponentiation using, for example: p = exp(L) / (1 + exp(L)), where L is the logit-scale mean or confidence limit.
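The logit workflow described above (compute the long-segment proportion, transform to logits, compute the mean and confidence limits, and back-transform) can be sketched in Python as follows. This is a minimal sketch under assumptions: the interval is taken as a normal-approximation confidence interval of the mean (z = 1.96), which is one plausible reading of the text, and the function names and example measurements are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def logit(p: float) -> float:
    """Natural log of the proportion divided by its complement; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Back-transform a logit-scale value to a proportion through exponentiation."""
    return math.exp(x) / (1.0 + math.exp(x))


def long_segment_proportion(full_length_nm: float, end_to_label_nm: float) -> float:
    """Fraction of the molecule occupied by the longer of the two segments."""
    return max(end_to_label_nm, full_length_nm - end_to_label_nm) / full_length_nm


def proportion_ci(full_lengths_nm: List[float], end_to_label_nm: List[float],
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Return (lower, mean, upper) on the proportion scale.

    Assumption: the interval is mean +/- z * standard error, computed on the logit scale.
    """
    logits = [logit(long_segment_proportion(f, d))
              for f, d in zip(full_lengths_nm, end_to_label_nm)]
    centre = mean(logits)
    half_width = z * stdev(logits) / math.sqrt(len(logits))
    return inv_logit(centre - half_width), inv_logit(centre), inv_logit(centre + half_width)


if __name__ == "__main__":
    # Hypothetical measurements (nm) for four molecules carrying the same label site.
    lower, centre, upper = proportion_ci([200.0, 210.0, 190.0, 205.0],
                                         [60.0, 65.0, 55.0, 62.0])
    print(f"long-segment proportion: {centre:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Working on the logit scale keeps the back-transformed limits inside the interval from 0 to 1, which a confidence interval computed directly on the proportions would not guarantee.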
The percent of outliers for the 95% confidence interval of each label site are shown in Table 4 below. Additionally, an exemplary image of a Cas9-labeled BRCA1 amplicon is shown in
The exemplary system, method, computer-accessible medium and apparatus according to an exemplary embodiment of the present disclosure can utilize a very low spring constant triangle cantilever (e.g., Bruker Nano, MSNL, 0.01-0.03 N/m spring constant) operating in contact mode with no z-feedback. Exemplary spring constants can vary around 0.03 N/m (e.g., between 0.03 N/m and 0.04 N/m). The cantilever vertical deflection was measured by a 2.5 MHz bandwidth laser Doppler vibrometer (e.g., Polytec) using a height decoder module (see, e.g., Reference 28). The sample was translated in the fast (e.g., 1000 Hz) and slow (e.g., 4 Hz) scan directions by a piezo-actuated flexure stage capable of 3 μm deflection in both axes. Images of size 2×2 μm, 1000×1000 pixels were captured in raster mode and rendered using customized LabView software (see, e.g., Reference 29). The high-speed flexure stage was mounted on a stick-slip x-y positioner with less than 100 nm repeatability on both axes (e.g., Smaract, Inc.).
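The image geometry given above implies a pixel size that can be related to DNA length with a short calculation. The Python sketch below assumes a nominal rise of 0.34 nm per base pair for B-form DNA; the actual rise for molecules deposited on mica may differ, so the conversion is illustrative only, and the constant and function names are hypothetical.

```python
SCAN_SIZE_NM = 2000.0   # 2 x 2 micrometer image, from the text
PIXELS_PER_LINE = 1000  # 1000 x 1000 pixels, from the text
NM_PER_BP = 0.34        # assumption: nominal B-form DNA rise per base pair


def nm_per_pixel(scan_size_nm: float = SCAN_SIZE_NM, pixels: int = PIXELS_PER_LINE) -> float:
    """Lateral size of one pixel."""
    return scan_size_nm / pixels


def bp_to_pixels(length_bp: float) -> float:
    """Approximate extent, in pixels, of a fully extended DNA segment."""
    return length_bp * NM_PER_BP / nm_per_pixel()


if __name__ == "__main__":
    print(f"pixel size: {nm_per_pixel():.1f} nm")
    print(f"a 650 bp amplicon spans about {bp_to_pixels(650):.0f} pixels")
    print(f"a 75 bp amplicon spans about {bp_to_pixels(75):.0f} pixels")
```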
Exemplary AFM Data Processing/Analysis
An exemplary image-processing program called AFMExplorer was used to analyze images. AFM images can then be flattened and pre-filtered to reduce noise, followed by adaptive thresholding based on pixel height to recognize regions corresponding to DNA molecules; a binary skeletonization procedure can be used to determine the best backbone contour for each molecule, and the molecule length in nanometers can be calculated by a cubic spline fit to the backbone pixel set. The exemplary system, method, computer-accessible medium and approach can be used for obtaining an exemplary enzyme recognition. For example, the height change on the backbone of the molecule can be detected due to each CRISPR label and the length in nanometers of the distance from one end of the molecule to the label site can be calculated. In the case of more than one label the measurements can be continued to each additional site. Manual Queueing can be used to correct or ignore automation errors; otherwise the molecule identification and measurement can be fully automatic.
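The label-detection step described above, finding the height change along the backbone and measuring the distance from one end of the molecule to each label, can be sketched in Python. This is not the AFMExplorer implementation: it assumes the backbone has already been extracted as an ordered list of points with heights, it measures length along a simple polyline rather than a cubic spline fit, and the 1.5 nm height threshold and function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in nanometers


def cumulative_arc_length(points_nm: List[Point]) -> List[float]:
    """Distance along the backbone from the first point to each point (polyline)."""
    arc = [0.0]
    for (x0, y0), (x1, y1) in zip(points_nm, points_nm[1:]):
        arc.append(arc[-1] + math.hypot(x1 - x0, y1 - y0))
    return arc


def find_labels(points_nm: List[Point], heights_nm: List[float],
                label_height_nm: float = 1.5) -> Tuple[float, List[float]]:
    """Return (contour length, end-to-label distances).

    Each run of consecutive backbone points above the height threshold is treated
    as one label, located at the tallest point of the run.
    """
    arc = cumulative_arc_length(points_nm)
    labels = []
    i, n = 0, len(heights_nm)
    while i < n:
        if heights_nm[i] > label_height_nm:
            j = i
            while j + 1 < n and heights_nm[j + 1] > label_height_nm:
                j += 1
            peak = max(range(i, j + 1), key=lambda k: heights_nm[k])
            labels.append(arc[peak])
            i = j + 1
        else:
            i += 1
    return arc[-1], labels


if __name__ == "__main__":
    backbone = [(2.0 * k, 0.0) for k in range(11)]  # straight 20 nm molecule
    heights = [0.5] * 11
    heights[3], heights[4], heights[8] = 3.0, 2.5, 2.8  # two label bumps
    print(find_labels(backbone, heights))
```

With more than one label, the returned list gives the distance from the same end to each successive site, matching the measurement order described in the text.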
Large numbers of identical molecules can be measured in order to obtain a precise average value for their length by determining the spacing between labels. In some exemplary cases, the center-to-center distance between labels can be determined, rather than the contour length of the molecule between labels. This method can be more tolerant of poor imaging conditions because the label can be taller than the DNA backbone. The exemplary image processing can also be simpler in this exemplary case (see e.g.,
As shown in the diagrams of
As shown in
Further, the exemplary processing arrangement 2005 can be provided with or include an input/output arrangement 2035, which can include, for example a wired network, a wireless network, the internet, an intranet, a data collection probe, a sensor, etc. As shown in
Introduction. Described herein are two substantial technical advances in DNA nanotechnology and single molecule genomics: (1) a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA, and (2) the use of DVD optics to image DNA molecules with high-speed AFM. As proof of principle, we used this new ‘nanomapping’ method to detect and precisely map BCL2-IGH translocations present in lymph node biopsies of follicular lymphoma patients. This HS-AFM ‘nanomapping’ technique can be complementary to both sequencing and other physical mapping approaches.
In this study, HS-AFM was used to create precise DNA single molecule physical maps, and scalability was shown by integrating consumer electronics (DVD optical pickup units). To date, the impetus behind the field of high-speed AFM development has been the visualization and study of biomolecular processes. The instruments designed to achieve this goal all sacrifice sample versatility to provide a small (e.g., 1×1 μm) window into the nanoscale. The lack of scalability of these measurements prohibits their use in physical mapping, which requires both high data rates and wide area coverage. In contrast, the present method for high-speed AFM directly addresses this need. It follows design principles distinct from the rest of the field: by operating in a high-speed contact mode, bandwidth bottlenecks are avoided and unprecedented rates of nanoscale measurements are enabled.
To unlock the potential of HS-AFM-based ‘nanomapping’, a new use of Cas9 as a stable and specific “programmable nanoparticle” was developed. One very significant discovery described herein is the relative stability of the Cas9-sgRNA-DNA complex in the face of the harsh perturbation generated by the AFM tip moving at linear speeds of up to 10 millimeters per second. Previous work on DNA labeling with CRISPR-Cas9 has focused on its properties and uses as an enzyme, which is very different from the present use of it as a ‘nanoparticle’. Molecule fragmentation is a major drawback of nicking-based labeling schemes. DNA with closely spaced nicks (e.g., <~200 bp) is not stable, and will not remain intact during processing. This problem impairs precisely locating translocation breakpoints, or localizing short insertions or deletions. It also puts an upper bound on the maximum length of the molecule, because very long molecules will inevitably contain two nearby nicking sites. Using Cas9 as a nanoparticle instead of a nicking enzyme avoids these drawbacks.
BCL2-IGH Translocation Mapping.
An important application of ‘nanomapping’ as described herein is detecting cancer-related structural variants of diagnostic and prognostic significance. In the clinical lab, fluorescent in situ hybridization (FISH) and PCR remain the mainstays; unfortunately, they fail in a significant fraction of cases, due either to insufficient resolution (FISH) or the fact that the vast majority of structural variant breakpoints are scattered widely and thus cannot be localized a priori for amplification by PCR. While microarrays can improve the detection of copy number variations, they are not a replacement for FISH, for example, because they cannot detect un-localized balanced translocations.
As shown in
sgRNAs targeting the JH elements in IGH and the MBR region of BCL2 were designed.
Clinical Validation of the AFM Translocation Assay.
Multiple myeloma is characterized by translocations involving the IgH locus on chromosome 14q32 and a number of partners, including CCND1, FGFR3 and c-MAF (see Reference 31). These have been reported at frequencies of ~15-20%, 15% and 2-6%, respectively (see References 32 and 33). In addition to diagnostic value, these translocations have prognostic significance. This study will validate the ability of the AFM translocation assay to accurately identify these translocations and to follow them prospectively in patients. When patients with myeloma undergo diagnostic biopsies, an aliquot of the marrow aspirate sample will be submitted for multiplex AFM assay for chromosome 14q32 and its translocation partners. The patients will also undergo standard testing, specifically FISH and cytogenetics. FISH will be performed using a standardized set of MM-specific probes. Following initial therapy, the patients with identified translocations will undergo repeat testing at each of the following time points: pre-transplant, post-transplant day 60-90, six months, and one year, to identify evidence of minimal residual disease.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
The foregoing merely illustrates the principles of the disclosure. Various modifications and alterations to the described embodiments will be apparent to those skilled in the art in view of the teachings herein. It will thus be appreciated that those skilled in the art will be able to devise numerous systems, arrangements, and procedures which, although not explicitly shown or described herein, embody the principles of the disclosure and can be thus within the spirit and scope of the disclosure. Various different exemplary embodiments can be used together with one another, as well as interchangeably therewith, as should be understood by those having ordinary skill in the art. In addition, certain terms used in the present disclosure, including the specification, drawings and claims thereof, can be used synonymously in certain instances, including, but not limited to, for example, data and information. It should be understood that, while these words, and/or other words that can be synonymous to one another, can be used synonymously herein, that there can be instances when such words can be intended to not be used synonymously. Further, to the extent that the prior art knowledge has not been explicitly incorporated by reference herein above, it is explicitly incorporated herein in its entirety. All publications referenced are incorporated herein by reference in their entireties.
EXEMPLARY REFERENCES
The following references are hereby incorporated by reference in their entireties:
- 1. Kwong, A. et al. The importance of analysis of long-range rearrangement of BRCA1 and BRCA2 in genetic diagnosis of familial breast cancer. Cancer Genet-Ny 208, 448-454 (2015).
- 2. Judkins, T. et al. Clinical significance of large rearrangements in BRCA1 and BRCA2. Cancer-Am Cancer Soc 118, 5210-5216 (2012).
- 3. Seong, M. W. et al. A multi-institutional study of the prevalence of BRCA1 and BRCA2 large genomic rearrangements in familial breast cancer patients. Bmc Cancer 14 (2014).
- 4. Gad, S. et al. Identification of a large rearrangement of the BRCA1 gene using colour bar code on combed DNA in an American breast/ovarian cancer family previously studied by direct sequencing. J Med Genet 38, 388-392 (2001).
- 5. Welcsh, P. L. & King, M. C. BRCA1 and BRCA2 and the genetics of breast and ovarian cancer. Hum Mol Genet 10, 705-713 (2001).
- 6. Jing, J. P. et al. Automated high resolution optical mapping using arrayed, fluid-fixed DNA molecules. Proceedings of the National Academy of Sciences of the United States of America 95, 8046-8051 (1998).
- 7. Lai, Z. W. et al. A shotgun optical map of the entire Plasmodium falciparum genome. Nature Genetics 23, 309-313 (1999).
- 8. Das, S. K. et al. Single molecule linear analysis of DNA in nano-channel labeled with sequence specific fluorescent probes. Nucleic Acids Research 38 (2010).
- 9. Baday, M. et al. Multicolor Super-Resolution DNA Imaging for Genetic Analysis. Nano Lett 12, 3861-3866 (2012).
- 10. Vranken, C. et al. Super-resolution optical DNA Mapping via DNA methyltransferase-directed click chemistry. Nucleic Acids Res 42, e50 (2014).
- 11. Anantharaman, T. S., Mysore, V. & Mishra, B. Fast and cheap genome wide haplotype construction via optical mapping. (2005).
- 12. Reed, J. et al. Identifying individual DNA species in a complex mixture by precisely measuring the spacing between nicking restriction enzymes with atomic force microscope. J R Soc Interface 9, 2341-2350 (2012).
- 13. Levy-Sakin, M. & Ebenstein, Y. Beyond sequencing: optical mapping of DNA in the age of nanotechnology and nanoscopy. Curr Opin Biotechnol 24, 690-698 (2013).
- 14. Neely, R. K. et al. DNA fluorocode: A single molecule, optical map of DNA with nanometre resolution. Chem Sci 1, 453-460 (2010).
- 15. McCaffrey, J. et al. CRISPR-CAS9 D10A nickase target-specific fluorescent labeling of double strand DNA for whole genome mapping and structural variation analysis. Nucleic Acids Res 44, e11 (2016).
- 16. O'Connell, M. R. et al. Programmable RNA recognition and cleavage by CRISPR/Cas9. Nature 516, 263-266 (2014).
- 17. Abudayyeh, O. O. et al. C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573 (2016).
- 18. De Brakeleer, S., De Greve, J., Lissens, W. & Teugels, E. Systematic Detection of Pathogenic Alu Element Insertions in NGS-Based Diagnostic Screens: The BRCA1/BRCA2 Example. Hum Mutat 34, 785-791 (2013).
- 19. Puget, N. et al. A 1-kb Alu-mediated germ-line deletion removing BRCA1 exon 17. Cancer Res 57, 828-831 (1997).
- 20. Petrij-Bosch, A. et al. BRCA1 genomic deletions are major founder mutations in Dutch breast cancer patients. Nat Genet 17, 341-345 (1997).
- 21. Quercioli, F., Tiribilli, B., Ascoli, C., Baschieri, P. & Frediani, C. Monitoring of an atomic force microscope cantilever with a compact disk pickup. Review of Scientific Instruments 70, 3620-3624 (1999).
- 22. Hwu, E. T., Huang, K. Y., Hung, S. K. & Hwang, I. S. Measurement of cantilever displacement using a compact disk/digital versatile disk pickup head. Japanese Journal of Applied Physics Part 1-Regular Papers Brief Communications & Review Papers 45, 2368-2371 (2006).
- 23. Hwu, E. T. et al. Anti-drift and auto-alignment mechanism for an astigmatic atomic force microscope system based on a digital versatile disk optical head. Review of Scientific Instruments 83 (2012).
- 24. Jia, H., Guo, Y., Zhao, W. & Wang, K. Long-range PCR in next-generation sequencing: comparison of six enzymes and evaluation on the MiSeq sequencer. Sci Rep 4, 5737 (2014).
- 25. Reed, J. et al. Identifying individual DNA species in a complex mixture by precisely measuring the spacing between nicking restriction enzymes with atomic force microscope. J R Soc Interface 9, 2341-2350 (2012).
- 26. Mikheikin, A. et al. High-Speed Atomic Force Microscopy Revealing Contamination in DNA Purification Systems. Anal Chem 88, 2527-2532 (2016).
- 27. Warton, D. I. & Hui, F. K. C. The arcsine is asinine: the analysis of proportions in ecology. Ecology 92, 3-10 (2011).
- 28. Payton, O. D., Picco, L., Miles, M. J., Homer, M. E. & Champneys, A. R. Improving the signal-to-noise ratio of high-speed contact mode atomic force microscopy. Rev Sci Instrum 83 (2012).
- 29. Klapetek, P. et al. Large area high-speed metrology SPM system. Nanotechnology 26 (2015).
- 30. Sundstrom, A. et al. Image Analysis and Length Estimation of Biomolecules Using AFM. Ieee T Inf Technol B 16, 1200-1207 (2012).
- 31. Inagaki A, Tajima E, Uranishi M, Totani H, Asao Y, Ogura H, Masaki A, Yoshida T, Mori F, Ito A, Yano H, Ri M, Kayukawa S, Kataoka T, Kusumoto S, Ishida T, Hayami Y, Hanamura I, Komatsu H, Inagaki H, Matsuda Y, Ueda R, Iida S. Global real-time quantitative reverse transcription-polymerase chain reaction detecting proto-oncogenes associated with 14q32 chromosomal translocation as a valuable marker for predicting survival in multiple myeloma. Leuk Res. 2013 December; 37(12):1648-55.
- 32. Hervé A L, Florence M, Philippe M, Michel A, Thierry F, Kenneth A, Jean-Luc H, Nikhil M, Stéphane M. Molecular heterogeneity of multiple myeloma: pathogenesis, prognosis, and therapeutic implications. J Clin Oncol. 2011 May 10; 29(14):1893-7.
- 33. Sawyer J R. The prognostic significance of cytogenetics and molecular profiling in multiple myeloma. Cancer Genet. 2011 January; 204(1):3-12.
Claims
1. An atomic force microscopy (AFM) system, comprising:
- at least one cantilever;
- at least one scanning probe arrangement including a laser positioned over a portion of the at least one cantilever which contacts a surface of at least one sample, wherein a tilt angle of the at least one cantilever with respect to the at least one scanning probe arrangement is less than 10 degrees; and
- a power source,
- wherein the AFM system is configured to generate a displacement noise that is less than 300 Picometers.
2. The AFM system of claim 1, wherein the noise level of the power source is below 200 Picometers.
3. The AFM system of claim 1, further comprising at least one digitizing arrangement associated with the at least one scanning probe arrangement.
4. The AFM system of claim 3, wherein the at least one digitizing arrangement has a bandwidth of about 2 MHZ.
5. The AFM system of claim 1, wherein the at least one scanning probe arrangement has a detection bandwidth of at least 80 MHZ.
6. The AFM system of claim 1, further comprising a transparent sample plate positioned below the at least one cantilever.
7. The AFM system of claim 6, further comprising at least one light emitting arrangement positioned under the sample plate configured to emit a light through the sample plate.
8. The AFM system of claim 7, wherein the at least one light emitting arrangement includes at least one light and at least one mirror.
9. The AFM system of claim 1, wherein the at least one cantilever has a spring constant of less than about 0.03 newtons per meter.
10. The AFM system of claim 1, further comprising at least one camera positioned above the at least one scanning probe arrangement.
11. The AFM system of claim 1, wherein the laser is positioned directly above the at least one cantilever.
12. The AFM system of claim 1, wherein (i) the at least one scanning probe arrangement includes a plurality of scanning probe arrangements, (ii) the at least one cantilever includes a plurality of cantilevers, and (iii) each of the scanning probe arrangements is positioned above a corresponding one of the cantilevers.
13. The AFM system of claim 1, further comprising a computer hardware arrangement configured to adjust a position of the laser relative to the at least one cantilever.
14. A method of mapping nucleotide molecules, comprising:
- incubating a target nucleotide in a magnesium-free mixture, wherein the mixture comprises a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (Cas) protein and a guide ribonucleic acid (RNA), and wherein incubating the target nucleotide with the CRISPR-Cas protein binds the CRISPR-Cas protein to the target nucleotide to form a CRISPR-Cas/target nucleotide complex without the CRISPR-Cas protein cleaving the target nucleotide;
- depositing the CRISPR-Cas/target nucleotide complex on a flat surface, wherein after deposition the CRISPR-Cas/target nucleotide complex is bound to the flat surface; and
- imaging the CRISPR-Cas/target nucleotide complex on the flat surface by using atomic force microscopy, wherein prior to imaging, substantially all unbound CRISPR-Cas protein or guide RNA is removed and wherein the atomic force microscopy comprises an atomic force microscopy (AFM) system which comprises: at least one cantilever; at least one scanning probe arrangement including a laser positioned over a portion of the at least one cantilever which contacts a surface of at least one sample, wherein a tilt angle of the at least one cantilever with respect to the at least one scanning probe arrangement is less than 10 degrees; and a power source, wherein the AFM system is configured to generate a displacement noise that is less than 300 Picometers.
15. The method of claim 14, wherein the target nucleotide is deoxyribonucleic acid (DNA).
16. The method of claim 14, wherein the target nucleotide is a DNA/RNA hybrid.
17. The method of claim 15, wherein the DNA is a polymerase chain reaction (PCR) amplicon.
18. The method of claim 15, wherein the DNA is genomic DNA obtained from a biological sample.
19. (canceled)
20. The method of claim 14, wherein the magnesium-free mixture includes EDTA, wherein the EDTA chelates any magnesium in the mixture to render the mixture magnesium-free.
21. The method of claim 14, wherein the magnesium-free mixture is a magnesium-free deposition buffer that comprises magnesium-alternates selected from the group consisting of zinc, polyamine, and nickel.
22. The method of claim 14, wherein the CRISPR-Cas protein is Cas9.
23. The method of claim 14, wherein the CRISPR-Cas protein is a modified Cas9.
24. The method of claim 14, wherein the guide RNA is an sgRNA, wherein the sgRNA is designed to target a specific target nucleotide sequence marker.
25. The method of claim 14, wherein at least two or more guide RNA targeting different nucleotide sequences are present in the mixture.
26. The method of claim 14, wherein before depositing the CRISPR-Cas/target nucleotide complex on a flat surface, any unbound CRISPR-Cas protein or guide RNA is removed.
27. The method of claim 14, wherein after depositing the CRISPR-Cas/target nucleotide complex on a flat surface, any unbound CRISPR-Cas protein or guide RNA is removed.
28. The method of claim 14, wherein the flat surface is a mica surface.
29. The method of claim 14, wherein the flat surface is a transparent surface.
30-31. (canceled)
32. The method of claim 14, wherein after imaging the CRISPR-Cas/target nucleotide complex on the flat surface, the image is used for de novo mapping of the target nucleotide.
33. The method of claim 14, wherein after imaging the CRISPR-Cas/target nucleotide complex on the flat surface, the image is used for quantitating the amount of the target nucleotide.
34. The method of claim 14, further comprising a step of fixing the CRISPR-Cas protein to the target nucleotide after said incubating step.
35. A method of mapping nucleotide molecules, comprising:
- incubating a double-stranded deoxyribonucleic acid (dsDNA) molecule in a magnesium-free mixture, wherein the mixture comprises a CRISPR-Cas9 protein and an sgRNA, and wherein incubating the dsDNA molecule with the CRISPR-Cas9 protein binds the CRISPR-Cas9 protein to the dsDNA molecule to form a CRISPR-Cas9/dsDNA complex without the CRISPR-Cas9 protein cleaving the dsDNA molecule;
- fixing the CRISPR-Cas9/dsDNA complex by adding formaldehyde to the mixture;
- depositing the CRISPR-Cas9/dsDNA complex on a mica surface, wherein after the deposition the CRISPR-Cas9/dsDNA complex is bound to the mica surface; and
- imaging the CRISPR-Cas9/dsDNA complex on the mica surface by using AFM, wherein before imaging any unbound CRISPR-Cas9 protein or sgRNA is removed, and wherein the AFM comprises:
- an AFM system, comprising: at least one cantilever; at least one scanning probe arrangement including a laser positioned over a portion of the at least one cantilever which contacts a surface of at least one sample, wherein a tilt angle of the at least one cantilever with respect to the at least one scanning probe arrangement is less than 10 degrees; and a power source,
- wherein the AFM system is configured to generate a displacement noise that is less than 300 Picometers.
36-45. (canceled)
Type: Application
Filed: Jan 5, 2018
Publication Date: Nov 21, 2019
Inventors: Jason REED (Midlothian, VA), Bhubaneswar MISHRA (Great Neck, NY), Andrey MIKHEYKIN (Richmond, VA), Loren PICCO (Bristol), Oliver PAYTON (Bristol), Freddie RUSSELL-PAVIER (Bristol)
Application Number: 16/475,704