METHODS AND SYSTEMS TO ANALYZE REACTIONS USING AN INFORMATION SYSTEM
Disclosed are example methods and systems to determine the quantity of an analyte initially present in a chemical and/or biological reaction. Also disclosed are computer implemented methods and systems to automate portions of the analysis comprising mathematical or graphical analysis of an amplification reaction.
This patent is a continuation of U.S. patent application Ser. No. 12/189,358, entitled, “Method and System for Analyzing Reactions Using an Information System,” which was filed on Aug. 11, 2008, which is a divisional application of U.S. patent application Ser. No. 10/991,025, entitled, “Method and System for Analyzing Reactions Using an Information System,” which was filed on Nov. 17, 2004, which claims the benefit of U.S. Provisional Patent Application Ser. No. 60/527,389, entitled “Method and/or System for Analyzing Reactions Using an Information System,” which was filed on Dec. 6, 2003, and all of which are incorporated herein by reference in their entireties.
COPYRIGHT NOTICE
Pursuant to 37 C.F.R. 1.71(e), applicants note that this disclosure contains material that is subject to and for which is claimed copyright protection, such as, but not limited to, source code listings, screen shots, user interfaces, user instructions, and any other aspects of this submission for which copyright protection is or may be available in any jurisdiction. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or patent disclosure, as it appears in the records of the Patent and Trademark Office. All other rights are reserved, and all other reproduction, distribution, creation of derivative works based on the contents, public display, and public performance of the application or any part thereof are prohibited by applicable copyright law.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to analysis of data of nucleic acid amplification reactions. More specifically, the invention relates to an information system and method for making determinations regarding chemical and/or biological reactions. The invention also involves an alternate method of quantifying nucleic acids in a sample comprising amplification of a target nucleic acid and analysis of data obtained during the amplification reaction. The invention further involves a diagnostic system and/or kit using real-time nucleic acid amplification including, but not limited to, PCR analysis.
2. Discussion of the Art
In many different industrial, medical, biological, and/or research fields, it is desirable to determine the quantity of a nucleic acid of interest. Some methods of quantifying nucleic acids of interest involve amplifying them and observing a signal proportional to the quantity of amplified products made; other methods involve generating a signal in response to the presence of a target nucleic acid, which signal accumulates over the duration of the amplification reaction. As used herein, nucleic acid amplification reaction refers both to amplification of a portion of the sequence of a target nucleic acid and to amplification and accumulation of a signal indicative of the presence of a target nucleic acid, with the former often being preferred to the latter. The quantification of nucleic acids is made more difficult or less accurate or both because data captured during amplification reactions are often significantly obscured by signals that are not generated in response to the target nucleic acid (i.e., noise). Furthermore, the data captured by many monitoring methods can be subject to variations and lack of reproducibility due to conditions that can change during a reaction or change between different instances of a reaction. In view of the above, there is a need to develop improved means of quantifying a nucleic acid. Where quantification of nucleic acids is enabled by amplification reactions, there is also a need to improve current methods of detecting suspect or invalid amplification reactions. There further remains a need to improve current abilities to distinguish amplification reactions that do not detect a target nucleic acid (i.e., negative reactions) from amplification reactions that yield weak signals because of low quantities of a target nucleic acid in a sample, a degree of inhibition of the amplification reaction, or other causes. The present invention provides improvements in these areas as is disclosed below.
A non-exhaustive list of references providing background information regarding the present invention follows:
- Livak, K. and Schmittgen, T., Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2^(-ΔΔCT) Method, METHODS 25: 402-408 (2001) doi:10.1006/meth.2001.1262.
- Bustin S A, Absolute quantification of mRNA using real-time reverse transcription PCR assays, Journal of Molecular Endocrinology 25: 169-193 (2000).
- Bustin S A., Quantification of mRNA using real-time reverse transcription PCR: trends and problems, J Mol Endocrinol. 29: 23-29 (2002).
While the inventors cannot guarantee that the following website will remain available and do not necessarily endorse any opinions expressed therein, an interested person may wish to refer to the website www.wzw.turn.de/gene-quantification/index.shtml for useful background information.
The discussion of any works, publications, sales, or activity anywhere in this submission, including in any documents submitted with this application, is not intended to be an admission of any manner that any such work constitutes prior art, unless explicitly stated to the contrary. Similarly, the discussion of any activity, work, or publication herein is not an admission that such activity, work, or publication was known in any particular jurisdiction.
Real-time PCR is an amplification reaction used for the quantification of target nucleic acids in a test sample. Conventionally, skilled artisans typically view the amplification reaction as comprising three distinct phases. First, there is a background or baseline phase, in which the target nucleic acid is being amplified but the signal proportional to the quantity of the target nucleic acid cannot be detected because it is too small to be observed relative to signals independent of the target (sometimes called “background” or “background signal”). Next, there is a logarithmic phase in which the signal grows substantially logarithmically because the signal is substantially proportional to the quantity of target nucleic acid in the amplification reaction and is greater than the background signal. Finally, the growth in the signal slows during a “plateau” phase reflecting less than logarithmic amplification of the target nucleic acid. As is known in the art, the time at which the logarithmic phase crosses a threshold value, which is a value somewhat greater than the value of the background signal, is reproducibly related to the log of the concentration of the target nucleic acid. This prior art method is generically referred to as the Ct method, perhaps so named for the Cycle at which the signal crosses the threshold. Ct analysis is reasonably reproducible and accurate, but suffers from some drawbacks, which need not be discussed here to understand the present invention.
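By way of illustration only, the threshold-crossing idea described above can be sketched in Python. The function name `threshold_cycle`, the use of linear interpolation between the two cycles that bracket the threshold, and the example readings are assumptions made for this sketch; they are not taken from any particular prior art implementation.

```python
def threshold_cycle(signal, threshold):
    """Return the fractional cycle at which the signal first crosses the threshold.

    signal[0] is assumed to be the reading at cycle 1. The crossing point is
    located by linear interpolation between the two bracketing readings
    (an assumption of this sketch). Returns None if the threshold is never crossed.
    """
    for i in range(1, len(signal)):
        lo, hi = signal[i - 1], signal[i]
        if lo < threshold <= hi:
            # signal[i - 1] is cycle i and signal[i] is cycle i + 1
            return i + (threshold - lo) / (hi - lo)
    return None


# Hypothetical readings: flat baseline followed by doubling
readings = [1.0, 1.0, 1.0, 2.0, 4.0, 8.0]
print(threshold_cycle(readings, 3.0))    # crossing lies between cycles 4 and 5
print(threshold_cycle(readings, 100.0))  # threshold never reached
```

Because a single threshold crossing uses only the two readings that bracket the threshold, a sketch of this kind makes plain how little of the reaction data the threshold approach relies on.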
U.S. Pat. No. 6,303,305 discloses a method of quantification of nucleic acids employing PCR reactions. The method disclosed employs the nth derivative of the growth curve of a fluorescent nucleic acid amplification reaction. This method effectively avoids the need to perform a baseline correction, but provides no reliable method of determining reactive from non-reactive samples, and does not reasonably suggest how to use an nth derivative calculation to assess the validity of the results obtained. In addition, nucleic acid amplification signals resulting from any artifacts in the system (e.g., crosstalk or positive bleedover—defined infra) cannot be distinguished from true positive responses using the methods disclosed therein and can lead to false positive results. However, the first derivative calculation disclosed by U.S. Pat. No. 6,303,305 provides an efficiency related value that is useful in the context of the present invention. The skilled artisan can refer to U.S. Pat. No. 6,303,305 for additional details relating to calculation of a first derivative of a nucleic acid amplification signal growth curve. U.S. Pat. No. 6,303,305 is incorporated by reference only in the United States of America, and other jurisdictions permitting incorporation by reference, to the extent it discloses the calculation of the first derivative of a nucleic acid amplification growth curve. However, U.S. Pat. No. 6,303,305 does not disclose or suggest the uses of this efficiency related value described in this disclosure (below).
Co-owned U.S. Provisional Patent Application No. 60/527,389, filed Dec. 6, 2003, discloses a method for analyzing a nucleic acid amplification reaction in which the log of the signal from an amplification reaction is examined for the maximum gradient or slope. This value, which for any data set corresponds to a point a certain period of time or number of cycles after the initiation of the amplification reaction, is called the MGL of the reaction. The MGL is useful in certain embodiments of the present invention, particularly in those that distinguish qualitatively those samples comprising little target nucleic acid from those samples that do not contain target nucleic acid. U.S. Patent Application No. 60/527,389, filed Dec. 6, 2003 is incorporated herein by reference in its entirety.
SUMMARY OF THE INVENTION
The present invention provides a method for determining whether a sample contains a nucleic acid of interest, for quantifying this nucleic acid, and for assessing the validity or quality of the data used to reach the preceding qualitative and quantitative determinations.
The method of this invention comprises contacting a sample with amplification or detection reagents or both in order to amplify the nucleic acid (as the term “amplified” is used herein). The amplification reaction generates signals indicative of the quantity of the target nucleic acid present in the sample, which signals are recorded at numerous points during the amplification reaction. The signal can be measured and recorded as a function of time value, or in the alternative, cycle number.
Suitable “efficiency related transforms” viewed or calculated as a function of time are determined for the amplification reaction, and the point in the amplification reaction of the maximum of the efficiency related transform, the magnitude of the maximum of the efficiency related transform, or the width (or similar parameter) of a peak in the plot of the efficiency related transform as a function of time can be used to obtain information about the reaction. This point in the reaction represents the point in time or the amplification cycle at which the maximum of the efficiency related transform occurs. Advantageously, the maximum of the efficiency related transform for a particular reaction, as well as the duration and magnitude of substantial changes in the calculated efficiency related transform, have consistently reproducible relationships to the initial concentration of a target nucleic acid in a sample, to the reliability of the data and information generated by the assay, to the presence or absence of a bona fide target nucleic acid, and to other parameters of the reaction. Advantageously, these relationships hold even in the presence of substantial noise and unpredictable variations in the signal(s) generated by the amplification reaction. As used herein, the term “maximum”, as applied to efficiency related transforms, is intended to include the minimum of the efficiency related transform when the reciprocal of the efficiency related transform is used. One can use the inverse ratio, in which, in the case of a curve, the curve will start at a value of approximately 1 in the baseline region, decrease during the growth region, and return approximately to one in the plateau region. The use of this transform would allow one to use the magnitude and the position of the trough instead of the magnitude and position of the peak for analysis. 
This transform is implemented in a manner that is essentially equivalent to the ratio method in which the maximum of the efficiency related transform for a particular reaction is employed.
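By way of illustration only, the ratio transform and its reciprocal can be sketched in Python as follows. The helper names `ratio_curve` and `synthetic_growth_curve`, and the sigmoid parameters used to generate example data, are hypothetical and serve only to produce a curve with a baseline, a growth region, and a plateau.

```python
import numpy as np


def ratio_curve(signal):
    """Ratio of sequential data points: signal[n + 1] / signal[n]."""
    s = np.asarray(signal, dtype=float)
    return s[1:] / s[:-1]


def synthetic_growth_curve(cycles=40, midpoint=25.0, baseline=1.0,
                           plateau=5.0, steepness=0.6):
    """Hypothetical sigmoid-shaped signal on a nonzero baseline (illustration only)."""
    n = np.arange(1, cycles + 1)
    return baseline + plateau / (1.0 + np.exp(-steepness * (n - midpoint)))


signal = synthetic_growth_curve()
ratio = ratio_curve(signal)

# Position and magnitude of the maximum of the efficiency related transform
peak_index = int(np.argmax(ratio))
peak_value = float(ratio[peak_index])

# The reciprocal starts near 1, dips during growth, and returns to about 1,
# so its trough carries the same information as the peak of the ratio.
inverse = 1.0 / ratio
trough_index = int(np.argmin(inverse))

print(peak_index, peak_value, trough_index)
```

In this sketch the trough of the reciprocal falls at the same position as the peak of the ratio, which is the sense in which the two forms of the transform are equivalent for analysis.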
In all embodiments, signals from the amplification reaction are measured at intervals of time appropriate for the amplification reaction during the amplification reaction. These signals can be referred to as time-based or periodic measurements, such that every measurement of the signal generated for a particular reaction can be expressed as a function of time. In some embodiments, the amplification reaction is cyclical (e.g., as in PCR). Because cycles often have a substantially uniform duration, it is frequently convenient to substitute a “cycle number” for a time measurement. Accordingly, in some embodiments of the present invention, a region of data identified by one or more methods on an information processing system as described herein can correspond to a cycle number. However, some cyclical amplification reactions have cycles of non-uniform duration. For these amplification reactions, it may be preferable to measure time in non-uniform measures. For example, the theoretical extent of amplification in a PCR reaction having cycles of varying duration will be linked more directly to the number of cycles performed rather than the duration of the reaction. Accordingly, the skilled artisan will readily appreciate that the time-based measurements can easily be scaled to reflect the underlying amplification reaction. As is known in the art, it is often useful to interpolate data and results between cycle numbers, which gives rise to the concept of a fractional cycle number “FCN.” Similarly, in reactions where measurements are based on time, events can be measured in fractional time units.
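By way of illustration only, one way to interpolate between measured cycles to obtain a fractional cycle number is to fit a parabola through the three points surrounding the largest sampled value. The choice of parabolic interpolation, the assumption of uniformly spaced measurements, and the function name `fractional_peak` are assumptions of this sketch; other interpolation schemes could be substituted.

```python
def fractional_peak(x, y):
    """Estimate the position and height of a peak between sample points.

    Fits a parabola through the largest sample and its two neighbors.
    Assumes x is uniformly spaced (e.g., whole cycle numbers).
    Returns (position, height); falls back to the sampled maximum at the edges.
    """
    i = max(range(len(y)), key=y.__getitem__)
    if i == 0 or i == len(y) - 1:
        return float(x[i]), float(y[i])
    y0, y1, y2 = y[i - 1], y[i], y[i + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0:
        return float(x[i]), float(y1)
    offset = 0.5 * (y0 - y2) / curvature        # in units of one sampling step
    step = x[i + 1] - x[i]
    position = x[i] + offset * step
    height = y1 - 0.25 * (y0 - y2) * offset
    return position, height


# Hypothetical transform values sampled at whole cycles, true peak at cycle 10.3
cycles = list(range(1, 21))
values = [5.0 - (c - 10.3) ** 2 for c in cycles]
print(fractional_peak(cycles, values))
```

For reactions measured in time rather than cycles, the same routine would return a fractional time value, provided the sampling interval is uniform.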
In further embodiments, the invention advantageously involves a system or method or both for analyzing a reaction sample, such as a PCR reaction sample, that uses a substantial set of available reaction kinetics data to identify a region of interest, rather than using a very limited data set, such as where a reaction curve crosses a threshold.
In certain embodiments, an identified region can be used to determine one or more qualitative results, or quantitative data analysis results, or both. The reaction point of the maximum of the efficiency related transform can be used to determine the concentration of a target nucleic acid in a sample or to determine qualitatively whether any target analyte is present in a test sample. These and other values can be compared with reference quantities in generally the same way that a threshold cycle number (Ct) or fractional threshold cycle number can be used in the prior art.
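By way of illustration only, comparison of a reaction point with reference quantities can be sketched as a standard curve in Python. The calibrator concentrations and FCN values below are hypothetical numbers chosen for the example, and the assumption that the reaction point is linear in the log of the concentration is borrowed from the way threshold cycle values are conventionally used.

```python
import numpy as np

# Hypothetical calibrators: known concentrations and their measured reaction points
cal_concentration = np.array([1e2, 1e3, 1e4, 1e5])   # copies per reaction (illustrative)
cal_fcn = np.array([30.0, 26.7, 23.4, 20.1])         # fractional cycle numbers (illustrative)

# Fit FCN = slope * log10(concentration) + intercept
slope, intercept = np.polyfit(np.log10(cal_concentration), cal_fcn, 1)


def estimate_concentration(fcn):
    """Invert the fitted standard curve to estimate concentration from a reaction point."""
    return 10.0 ** ((fcn - intercept) / slope)


print(estimate_concentration(25.05))  # falls between the 1e3 and 1e4 calibrators
```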
The reaction point corresponding to the maximum of the efficiency related transform can be understood as indicating or being derived from a cycle number that is located at a relatively consistent point with respect to reaction efficiency, such as at a maximum of reaction efficiency or a region consistently related to a maximum of reaction efficiency or consistently related to some other reaction progression. Different methods can be used to determine a reaction point related to a maximum of reaction efficiency. This value can comprise adjusted FCN values (e.g., FCN_MR Adj. and FCN_Int. Adj.), as described below. In certain embodiments of this invention, methods of the invention can determine FCN values for multiple reaction signals, such as a target and/or a control, and use those values in determining reaction parameters, including, but not limited to, quantity of target nucleic acid initially present in a sample and the validity of the results generated by an amplification reaction.
The present invention can identify a value indicative of the reaction efficiency (at times, herein, generally referred to as an “efficiency related value” (ERV)) at one or more regions on a signal growth curve. A specific efficiency related value is referred to as a MaxRatio value or MR. MaxRatio refers to one possible method for calculating an efficiency related value as further discussed herein. This is one example of a method for determining an ERV and illustrative examples herein that refer to MR should also be understood to include other suitable methods for determining an efficiency related value, including, but not limited to, the maximum gradient of the log of the growth curve, as described in co-owned U.S. Patent Application No. 60/527,389, filed Dec. 6, 2003, the maximum first derivative of the signal obtained from the amplification reaction (e.g., as disclosed in U.S. Pat. No. 6,303,305), and the maximum difference between two sequential signals obtained from the amplification reaction. Thus, this invention is involved with an analytical method that identifies two values for a reaction curve: (1) one value related to a cycle number or time value and (2) one value indicating an efficiency related value. The invention can use those two values in analysis of reaction data performed using an information-handling system and method of using the system. Examples of two such values are the FCN and MR values of specific embodiments discussed below.
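By way of illustration only, the alternative efficiency related values listed above can be computed side by side in Python. The function name `efficiency_related_values`, the use of `numpy.gradient` as a stand-in numerical derivative, and the synthetic sigmoid data are assumptions of this sketch and do not reproduce the specific calculations of the cited references.

```python
import numpy as np


def efficiency_related_values(signal):
    """Return {name: (index, value)} for the maximum of several candidate transforms.

    For "ratio" and "difference" the index i refers to the pair of readings
    (i, i + 1); for the two gradient transforms it refers to reading i.
    """
    s = np.asarray(signal, dtype=float)
    transforms = {
        "ratio": s[1:] / s[:-1],                 # ratio of sequential data points
        "log_gradient": np.gradient(np.log(s)),  # gradient of the log of the growth curve
        "first_derivative": np.gradient(s),      # numerical first derivative
        "difference": np.diff(s),                # difference of sequential data points
    }
    results = {}
    for name, curve in transforms.items():
        i = int(np.argmax(curve))
        results[name] = (i, float(curve[i]))
    return results


# Hypothetical sigmoid-shaped signal on a nonzero baseline (illustration only)
n = np.arange(1, 41)
signal = 1.0 + 5.0 / (1.0 + np.exp(-0.6 * (n - 24.6)))

for name, (index, value) in efficiency_related_values(signal).items():
    print(f"{name:17s} index={index:2d} value={value:.4f}")
```

Each transform yields the pair of values described above: a position related to a cycle number or time value, and a magnitude serving as an efficiency related value. On data of this shape the ratio and log-gradient maxima tend to fall earlier than the maximum difference, because they weight growth relative to the current signal level.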
This invention is also involved with a method and system that uses two values as discussed above that are determined from a reaction under examination to compare that reaction to one or more criteria data sets. A criteria comparison can be used to determine and/or correct any results and/or quantifications as described herein. Criteria data can be derived by generating pairs of cycle number related values-efficiency related values (e.g., FCN-MR pairs) from multiple calibration reactions of known quantity or known concentration or both.
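By way of illustration only, a criteria comparison built from cycle number related value and efficiency related value pairs can be sketched in Python. The calibration pairs below are hypothetical numbers, and the choice of a straight-line fit with a tolerance of `k` residual standard deviations is an assumption of this sketch rather than a description of any particular criteria curve.

```python
import numpy as np

# Hypothetical FCN-MR pairs from calibration reactions of known concentration
cal_fcn = np.array([18.0, 22.0, 26.0, 30.0, 34.0])
cal_mr = np.array([1.52, 1.50, 1.47, 1.45, 1.42])

# Fit the expected efficiency related value as a function of the reaction point
slope, intercept = np.polyfit(cal_fcn, cal_mr, 1)
residual_sd = float(np.std(cal_mr - (slope * cal_fcn + intercept), ddof=2))


def meets_criteria(fcn, mr, k=4.0):
    """True if the observed MR is no more than k residual SDs below the fitted line."""
    expected = slope * fcn + intercept
    return bool(mr >= expected - k * residual_sd)


print(meets_criteria(26.0, 1.47))  # consistent with the calibration reactions
print(meets_criteria(26.0, 1.05))  # efficiency related value far below expectation
```

A reaction that fails such a comparison could, for example, be flagged for review rather than reported, although the appropriate response depends on the assay.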
This invention also involves one or more techniques for performing efficiency analysis of reaction data. This analysis can be used separately from or in conjunction with the cycle number related value-efficiency related value analysis discussed herein. Efficiency analysis can be used to find a region of interest for making a determination about reaction data, such as, for comparison to calibration data sets, in a way similar to Ct analysis as understood in the art.
The present invention also provides a method for analyzing a nucleic acid amplification reaction, in which a sample containing a nucleic acid is contacted with amplification agents and placed under suitable amplification conditions to amplify a portion of the nucleic acid in the sample. During the amplification reaction, signals that are proportional to the amount of the target nucleic acid present are periodically measured at a suitable interval. Conveniently, the interval can correspond to the duration of a cycle for those amplification reactions that are cyclical. The signals are then manipulated to determine an efficiency related transform for the amplification reaction. Any suitable efficiency related transform can be used for the invention. Efficiency related transforms preferred in the context of the present invention include the slope of the line, which can be determined by many techniques, including, but not limited to, difference calculations on sequential data points, determining the first derivative of a line fitted to the growth curve of the reaction signal, and determining the gradient, slope, or derivative of the log of the growth curve (i.e., Log (growth curve)). More preferably, the efficiency related transform is the ratio of sequential data points, sometimes referred to herein as the ratio curve. When the efficiency related transform for the reaction is known, a plot of the efficiency related transform as a function of time (preferably expressed in the units used to measure the signal) (or mathematical manipulation yielding information similar to a plot) can be used to identify a peak value. However, a plot is not required. The width of the peak in the selected range of acceptable peak widths can be determined by any suitable technique or method. 
However, a preferred method for determining the acceptable peak width involves statistically analyzing the degree of variance in peak widths obtained from objectively normal amplification reactions that are very similar to or even identical to the amplification method analyzed by the method of this invention. In the reaction analyzed, an unknown test sample is usually used in place of samples used to characterize the amplification reaction or an analyte assay. If the peak width of the analyzed amplification reaction falls within the prescribed range of acceptable peak widths, the reaction is declared normal; if the peak width of the analyzed amplification reaction does not fall within the prescribed range of acceptable peak widths, the reaction is identified as having provided sub-optimal, aberrant, or otherwise questionable signals. The width of the leading half of the efficiency related transform peak is evaluated. This evaluation is a more forgiving measurement of amplification reaction validity, and therefore may be preferred in some instances, but generally not in all instances.
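By way of illustration only, the peak width check described above can be sketched in Python. Measuring the width at half of the peak height above a baseline of 1, and setting the acceptable range to the mean plus or minus `k` standard deviations of widths from reference reactions, are assumptions of this sketch; the function names and example numbers are likewise hypothetical.

```python
import statistics


def peak_widths(curve, baseline=1.0, fraction=0.5):
    """Return (full_width, leading_half_width) of the main peak, in sampling intervals.

    The width is measured between the outermost contiguous points around the
    maximum whose values stay at or above baseline + fraction * (peak - baseline).
    """
    peak = max(range(len(curve)), key=curve.__getitem__)
    level = baseline + fraction * (curve[peak] - baseline)
    left = peak
    while left > 0 and curve[left - 1] >= level:
        left -= 1
    right = peak
    while right < len(curve) - 1 and curve[right + 1] >= level:
        right += 1
    return right - left, peak - left


def acceptable_range(reference_widths, k=3.0):
    """Range of acceptable widths derived from objectively normal reference reactions."""
    mean = statistics.mean(reference_widths)
    sd = statistics.stdev(reference_widths)
    return mean - k * sd, mean + k * sd


def is_normal(width, reference_widths, k=3.0):
    """True if the width falls within the acceptable range."""
    low, high = acceptable_range(reference_widths, k)
    return low <= width <= high


# Hypothetical ratio curve and reference widths
curve = [1.0, 1.0, 1.1, 1.3, 1.5, 1.4, 1.2, 1.0, 1.0]
print(peak_widths(curve))
print(is_normal(6, [5, 6, 6, 7, 6]), is_normal(12, [5, 6, 6, 7, 6]))
```

The second value returned by `peak_widths` corresponds to the leading half of the peak, so the same acceptance test can be applied to it when the more forgiving measurement is preferred.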
The invention further involves an information system and/or program able to analyze captured data. Data can be captured as image data from observable features of the data, and the information system can be integrated with other components for capturing, preparing, and/or displaying sample data. Representative examples of systems in which the invention can be employed include, but are not limited to, the BioRad® i-Cycler®, the Stratagene® MX4000®, and the ABI Prism 7000® systems. Similarly, the present invention provides a computer product capable of executing the method of this invention.
Various embodiments of the present invention provide methods and/or systems that can be implemented on a general purpose or special purpose information handling system by means of a suitable programming language, such as Java, C++, C#, Cobol, C, Pascal, Fortran, PL1, LISP, assembly, etc., and any suitable data or formatting specifications, such as HTML, XML, dHTML, TIFF, JPEG, tab-delimited text, binary, etc. For ease of discussion, various computer software commands useful in the context of the present invention are illustrated in MATLAB® commands. The MATLAB software is a linear algebra manipulator and viewer package commercially available from The Mathworks, Natick, Mass. (USA). Of course, in any particular implementation (as in any software development project), numerous implementation-specific decisions can be made to achieve the developer's specific goals, such as compliance with system-related and/or business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a developmental effort might be complex and time-consuming, but would nevertheless be a routine undertaking of software engineering for those of ordinary skill in the art having the benefit of this disclosure.
The invention will be better understood with reference to the following drawings and detailed descriptions. For purposes of clarity, this discussion refers to devices, methods, and concepts in terms of specific examples. However, the invention and aspects thereof may have applications to a variety of types of devices and systems.
Furthermore, it is well known that logic systems and methods such as those described herein can include a variety of different components and different functions in a modular fashion. Different embodiments of the invention can include different combinations of elements and functions and may group various functions as parts of various elements. For purposes of clarity, the invention is described in terms of systems that include many different components and combinations of novel components and known components. No inference should be taken to limit the invention to combinations requiring all of the novel components in any illustrative embodiment of this invention.
As used herein, “the invention” should be understood to include one or more specific embodiments of the invention (unless explicitly indicated to the contrary). Many variations according to the invention will be understood from the teachings herein to those of ordinary skill in the art.
As used herein, the expression “efficiency related value” means a value that has a consistent relationship to the efficiency of an amplification reaction. The expression “efficiency related transform” means a mathematical transformation involving the response in an amplification reaction that is used to determine an efficiency related value. The expression “reaction point” means a point during a reaction at which an efficiency related value occurs. The reaction point can be a point in time measured from the beginning of the reaction. Alternatively, the reaction point can be a point that denotes a cycle measured from the beginning of the reaction. The term “derivative” means the slope of a curve at a given point in the curve.
The present invention is directed to the analysis of a sample containing an analyte. The analyte can be a nucleic acid. In the context of the present invention, copies of a portion of the analyte are made (hereinafter “amplified”) in a manner that generates a detectable signal during amplification. The signal is indicative of the progress of the amplification reaction, and preferably is related either to the quantity of analyte and copies of the analyte present in a test sample, or is related to the quantity of the copies of the analyte produced by the reaction. The amplification is preferably configured to allow logarithmic accumulation of the target analyte (e.g., as in a PCR reaction), and in a more preferred embodiment, the amplification is a PCR reaction in which data are collected at regular time intervals and/or at a particular point in each PCR cycle.
Many systems have been developed that are capable of amplifying and detecting nucleic acids. Similarly, many systems employ signal amplification to allow the determination of quantities of nucleic acids that would otherwise be below the limits of detection. The present invention can utilize any of these systems, provided that a signal indicative of the presence of a nucleic acid or of the amplification of copies of the nucleic acid can be measured in a time-dependent or cycle-dependent manner. Some preferred nucleic acid detection systems that are useful in the context of the present invention include, but are not limited to, PCR, LCR, 3SR, NASBA, TMA, and SDA.
Polymerase Chain Reaction (PCR) is well-known in the art and is essentially described in Saiki et al., Science 230:1350-1354 (1985); Saiki et al., Science 239:487-491 (1988); Livak et al., U.S. Pat. Nos. 5,538,848; 5,723,591; and 5,876,930, and other references. PCR can also be used in conjunction with reverse transcriptase (RT) and/or certain multifunctional DNA polymerases to transform an RNA molecule into a DNA copy, thereby allowing the use of RNA molecules as substrates for PCR amplification by DNA polymerase. See Myers et al., Biochem. 30:7661-7666 (1991).
Ligation chain reactions (LCR) are similar to PCR with the major distinguishing feature that, in LCR, ligation instead of polymerization is used to amplify target sequences. LCR is described inter alia in Backman et al., European Patent 320 308; Landegren et al., Science 241:1077 (1988); Wu et al., Genomics 4:560 (1989). In some advanced forms of LCR, specificity can be increased by providing a gap between the oligonucleotides, which gaps must be filled in by template-dependent polymerization. This can be especially advantageous if all four dNTPs are not needed to fill the gaps between the oligonucleotide probes and all four dNTPs are not supplied in the amplification reagents. Similarly, rolling circle amplification (RCA) is described by Lisby, Mol. Biotechnol. 12(1):75-99 (1999), Hatch et al., Genet. Anal. 15(2):35-40 (1999) and others, and is useful in the context of the present invention.
Isothermal amplification reactions are also known in the art and useful in the context of the present invention. Examples of isothermal amplification reactions include 3SR as described by Kwoh et al., Proc. Nat. Acad. Sci. (USA) 86: 1173-1177 (1989) and further developed in the art; NASBA as described by Kievits et al., J. Virol. Methods 35:273-286 (1991) and further developed in the art; and Strand Displacement Amplification (SDA) method as initially described by Walker et al., Proc. Nat. Acad. Sci. (USA) 89:392-396 (1992) and U.S. Pat. No. 5,270,184 and further developed in the art.
Thus, many amplification or detection systems requiring only that signal gains indicative of the quantity of a target nucleic acid can be measured in a time-dependent or cycle-dependent manner are useful in the context of the present invention. Other systems having these characteristics are known to the skilled artisan, and even though not discussed above, are useful in the context of the present invention.
Analysis of the data collected from the amplification reaction can provide answers to one or more of the following questions:
(1) Was the target sequence found?
(2) If yes, what was the initial level or quantity of the target sequence?
(3) Is the result correct?
(4) Did the reaction series run correctly?
(5) Was there inhibition of the desired or expected reaction?
(6) Is the sample preparation recovery acceptable?
(7) Is the calibration to any reference data, if used, still valid?
According to some embodiments of this invention, one or more of these questions can be answered by identifying a region of interest (e.g., an FCN) and an efficiency related value (e.g., an MR) of a target and/or internal control reaction. In other embodiments, one or more of these questions can be answered by comparing such values to data sets herein referred to as criteria data, criteria curves, and/or criteria data sets. In additional embodiments, one or more of these questions can be answered by comparing such values obtained for an internal control, e.g., a 2nd amplification control reaction, in the same reaction mixture as its criteria data. In still further embodiments, one or more of these questions can be answered by comparing such values obtained for the target reaction to such values obtained for an internal control reaction in the same reaction mixture as their respective criteria data.
For clarity, the invention will be illustrated with reference to real-time PCR reactions, which are one class of measuring and monitoring techniques of high interest in automated and manual systems for detecting and quantifying human nucleic acids, animal nucleic acids, plant nucleic acids, and nucleic acids of human, non-human animal, and plant pathogens. Real-time PCR is also well adapted to detection of bio-warfare agents and other living or viral organisms in the environment. Real-time PCR combines amplification of nucleic acid (NA) sequence targets with substantially simultaneous detection of the amplification product. Optionally, detection can be based on fluorescent probes or primers that are quenched or are activated depending on the presence of a target nucleic acid. The intensity of the fluorescence is dependent on the concentration or amount of the target sequence in a sample (assuming, of course, that the quantity of the target is above a minimal detectable limit and is less than any saturation limit). This quench/fluoresce capability of the probe allows for homogeneous assay conditions, i.e., all the reagents for both amplification and detection are added together in a reaction container, e.g., a single well in a multi-well reaction plate. Electronic detection systems, target-capture based systems, and aliquot-analysis systems and techniques are other forms of detection systems useful in the context of the present invention so long as a given system accumulates data indicative of the quantity of target present in a sample during various time points of a target amplification reaction.
In PCR reactions, the quantity of target nucleic acid doubles at each cycle until reagents become limiting or are exhausted, significant competition arises, or other factors accumulate over the course of the reaction. When a PCR reaction exactly doubles the target in a particular cycle, the reaction is said to have an efficiency (e) of 1 (i.e., e=1) for that cycle. After numerous cycles, detectable quantities of the target can be created from a very small and initially undetectable quantity of target. Typically, PCR cycling protocols consist of about 30 to 50 cycles of amplification, but PCR reactions employing more or fewer cycles are known in the art and useful in the context of the present invention.
In the real-time PCR reactions described below to illustrate the present invention, the reaction mixture includes an appropriate reagent cocktail of oligonucleotide primers, fluorescent dye-labeled oligonucleotide probes capable of being quenched when not bound to a complementary target nucleic acid, amplification enzymes, deoxynucleotide triphosphates (dNTPs), and additional support reagents. Also, a second fluorescent dye-labeled oligonucleotide probe for detection of an amplifiable “control sequence” or “internal control” and a “reference dye”, which optionally may be attached to an oligonucleotide that remains unamplified throughout a reaction series, can be added to the mixture for a real-time PCR reaction. Thus, some real-time PCR systems use a minimum of three fluorescent dyes in each sample or reaction container (e.g., a well). PCR systems using additional fluorescent probe(s) for the detection of a second target nucleic acid are known in the art and are useful in the context of the present invention.
Systems that plot and display data for one or more reactions (e.g., each well in a multi-well plate) are also useful in the context of the present invention. These systems optionally calculate values representing the fluorescence intensity of the probe as a function of time or cycle number (CN) or both as a two-dimensional plot (y versus x). Thus, the plotted fluorescence intensity can optionally represent a calculation from multiple dyes (e.g., the probe dye and/or the control dye normalized by the reference dye) and can include subtraction of a calculated background signal. In PCR systems, such a plot is generally referred to as a PCR amplification curve and the data plotted can be referred to as the PCR amplification data.
In PCR, data analysis can be made difficult by a number of factors. Accordingly, various steps can be performed to account for these factors. For example, captured light signals can be analyzed to account for imprecision in the light detection itself. Such imprecision can be caused by errors or difficulties in resolving the fluorescence of an individual dye among a plurality of dyes in a mixture of dyes (described below as “bleedover”). Similarly, some amount of signal can be present (e.g., “background signal”) and can increase even when no target is present (e.g., “baseline drift”). Thus, a number of techniques for removing the background signal, preferably including the baseline drift, trend analysis, and normalization are described herein and/or are known in the art. These techniques are useful but are not required in the context of the present invention. (Baseline drift or trending can be caused by many sources, such as, for example, dye instability, lamp instability, temperature fluctuations, optical alignment, sensor stability, or combinations of the foregoing. Because of these factors and other noise factors, automated methods of identifying and correcting the baseline region are prone to errors.)
In PCR, the answers of interest are typically determined from a growth curve, which characteristically starts out nearly flat during the early reaction cycles, when insufficient doubling has occurred to cause a detectable signal, and then rises exponentially until one or more reaction-limiting conditions, such as exhaustion of one or more reactants, begin to influence the amplification reaction or the detection process.
A number of methods have been proposed and have been used in research and other settings to analyze PCR-type reaction data. Typically, these methods attempt to detect when the reaction curve has reached a particular point, generally during a period of exponential or near-exponential signal growth (also known as “the log-linear phase”). While not wishing to be bound by any theory, the inventors believe that the earliest point(s) in which the log linear phase can be observed above the baseline or background signal provides the most useful information about the reaction and that the slope of the log-linear phase is a reflection of the amplification efficiency. Some prior art references erroneously suggest that for the slope to be an indicator of real amplification (rather than signal drift), there has to be an inflection point, which is the point on the growth curve where the log-linear phase ends. The inflection point can also represent the greatest rate of change along the growth curve. In some reactions where inhibition occurs, the end of the exponential growth phase may occur before the signal emerges from the background.
In running a PCR analysis, it is generally desired to determine one or more assay results regarding the initial amount/concentration of the target molecules. For discussion purposes, results may be expressed by answers to at least one of four questions:
(1) Was the target molecule present at all in the initial sample (e.g., a positive/negative detection result)?
(2) What was the absolute quantity of the initial target present?
(3) What is the confidence (e.g., sometimes expressed as a confidence value that the answers to questions 1 or 2 are correct)?
(4) What is the relative amount of the target present in two different samples?
A number of methods have been proposed and can be used in research and other settings to answer one or more of these questions.
Data for PCR reactions is often collected one time in each cycle for each dye that is measured (i.e., fluorescence determined) in a reaction. While such data is useful in the context of the present invention, more precise quantification can be carried out by interpolation between the data points acquired at each cycle. In this way, the data can be analyzed to generate “fractional cycle numbers”, and points of interest can be determined to be coincident with a particular cycle number or at a reaction point between any pair of cycle numbers.
One problem with methods that rely on thresholds, particularly in diagnostic settings where it is desirable to fix thresholds, is that these methods can be susceptible to errors due to the presence of noise factors, particularly systematic noise factors, such as, for example, “crosstalk” and “bleedover”. Crosstalk can generally be understood as occurring when a signal from an assay in one location (such as one well in a multi-well plate) causes an anomaly in a signal in a different, usually adjacent, assay location. Bleedover can generally be understood as occurring in situations where more than one signal or data set is detected from the reaction. While detection dyes for a reaction are selected to be largely independent from each other and to have individual fluorescence emission spectra, the emission spectra sometimes overlap such that the emission spectrum from one dye will bleed over into the emission spectrum of a different dye.
Both crosstalk and bleedover can have the effect of either increasing or decreasing the calculated measurement of interest. Furthermore, in both cases, there can be situations where the curve itself can have an anomaly due to either or both of these phenomena. Systematic noise factors such as crosstalk and bleedover can be especially difficult to deal with when performing a baseline correction.
In some systems of the prior art, in order to detect low-level signals for either qualitative results or quantitative results, a low threshold is generally required. However, the use of a low threshold makes it particularly difficult to discriminate between a false positive signal due to crosstalk and a correct positive signal, because either can cause the PCR curve to rise above an amplification threshold, thereby suggesting that a target analyte is present. Positive and negative bleedover can also present problems. Positive bleedover can produce false-positive results or cause falsely elevated estimates of the initial quantity of target in a sample, while negative bleedover can cause falsely depressed estimates of the initial quantity of target in a sample or falsely indicate the absence of a target in a test sample.
The method or system of this invention can reproducibly identify a region in a reaction curve or data, preferably using an information processing system, which can then be used to provide results based on the amplification reaction data. The invention can identify this region regardless of the base level of the signal, even in the presence of substantial noise. The invention can furthermore identify a value that is representative of efficiency at that region. This value can be used in determining primary results or in adjusting results or in determining confidence values as described herein, or all of the foregoing.
The invention can be illustrated by a specific example, shown below. In this example, an information processing system is used to analyze data representing the growth curve of an amplification reaction. In the amplification, a “peak” is generated by one step in the data analysis. The location of this peak (measured in time units or in cycles from the initiation of the amplification reaction) is referred to as the fractional cycle number (FCN) and the maximum value of the peak is referred to as the ERV (efficiency related value). These values can be used in a method to identify an efficiency related value region and to determine an efficiency related value at this peak. Both of these values can be understood as being derived from a method that analyzes the shape of the reaction curve regardless of the intensity of the amplification signal, which intensity of amplification signal can vary from reaction to reaction and from instrument to instrument, despite starting with identical samples. The reaction curve is a representation of the reaction wherein a signal substantially indicative of the quantity of target in a reaction is plotted as a function of time or, when appropriate, cycle number. The FCN can be understood as being consistently related to a point of maximum growth efficiency of a reaction curve, and the ERV can be understood as being consistently related to the efficiency at that point.
In some embodiments of this invention, analytical methods can optionally, and advantageously, be employed without use of baseline correction. In some systems and methods of this invention, a reference dye is not needed.
The present invention allows objective quantification of the quantity of a target present in a test sample without the need to calculate a subjective and variable threshold or a Ct value, as employed in some techniques of the prior art. Furthermore, the invention can use information that is available for determining the degree of inhibition in a reaction by analyzing the shape of the PCR amplification curve, including data that previously has generally been ignored, such as data in cycles after a Ct.
General methods for generating and using data pairs determined from reaction curve data will be understood from the examples below. For clarity, these examples refer to a specific set of data and specific functions for analyzing that data, though the invention is not limited to the examples discussed.
Example 1 Captured Data
By way of example, a typical real-time PCR reaction detection system generates a data file that stores the signal generated from one or more detection dyes.
Although optional, normalization can be performed on the captured data in several different ways. One method involves dividing the target and control values at each cycle reading by the corresponding reference dye signal. Alternatively, the divisor can be the average reference value over all cycles or an average over certain cycles. In another alternative embodiment, the divisor can be the average of the target dye or the control dye or the target dye and the control dye over one or more earlier (baseline) cycles, when no amplification signal is detected. Any known normalization method can be employed in a data analysis. The invention can be used with data that has already been normalized by a PCR system.
Because normalization is optional, the present invention can be used to analyze reaction data without the use of a normalization or reference dye. Alternatively, the target signal or the control signal or both can be used for normalization.
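The normalization options described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular PCR system; the function names and the sample readings are hypothetical.

```python
def normalize_per_cycle(signal, reference):
    """Divide each reading by the reference dye reading from the same cycle."""
    return [s / r for s, r in zip(signal, reference)]


def normalize_by_mean(signal, divisor_readings):
    """Divide every reading by the average of the supplied divisor readings.

    divisor_readings may be the reference dye over all cycles, the reference
    dye over selected cycles, or the target or control dye over early
    baseline cycles, matching the alternatives described in the text.
    """
    mean = sum(divisor_readings) / len(divisor_readings)
    return [s / mean for s in signal]


# Hypothetical readings for five cycles (not data from the patent).
target = [100.0, 102.0, 101.0, 150.0, 300.0]
reference = [50.0, 51.0, 50.5, 50.0, 50.0]

per_cycle = normalize_per_cycle(target, reference)
by_mean = normalize_by_mean(target, reference)
```

Per-cycle division also cancels cycle-to-cycle fluctuations that affect both dyes, whereas division by an average only rescales the curve.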
Example 3 Scaling
Scaling is optional but can be performed to make it easier for a human operator to visualize the data. Scaling does not affect analytical results. Scaling can be carried out in addition to normalization, in the absence of normalization, or before or after normalization.
One method of scaling involves dividing each data set value by the average of the values during some early cycles, generally in the baseline region before any positive data signal is detected. In this example, readings 4 through 8 were averaged and normalization was performed first.
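A minimal Python sketch of this scaling step is shown below. The function name and the sample data are hypothetical; only the use of readings 4 through 8 as the baseline window comes from the example above.

```python
def scale_by_baseline(signal, first_reading=4, last_reading=8):
    """Divide every reading by the mean of an early baseline window.

    Readings are numbered from 1, so readings 4 through 8 correspond to
    list indices 3 through 7.
    """
    window = signal[first_reading - 1:last_reading]
    baseline_mean = sum(window) / len(window)
    return [value / baseline_mean for value in signal]


# Hypothetical normalized readings: flat baseline followed by growth.
normalized = [2.0] * 10 + [3.0, 5.0, 8.0]
scaled = scale_by_baseline(normalized)
```

After scaling, the baseline sits near 1, which makes curves from different wells easier to compare by eye.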
One or more digital filtering methods can be applied to the captured data to “clean up” the signal data sets and to improve the signal to noise ratio. Many different filtering algorithms are known. The present invention can employ a four-pole filter with no zeros. This eliminates the potential for overshoot of the filtered signal. As an example, this can be implemented with the MATLAB function “filtfilt” provided with the MATLAB Signal Processing Toolbox, which both forward and backward filters to eliminate any phase lag (time delays). An example of parameters and MATLAB function call is as follows:
b=0.3164;
a=[1.0000 -1.0000 0.3750 -0.0625 0.0039];
data(:,:,assay)=filtfilt(b,a,data(:,:,assay));
data(:,:,ic)=filtfilt(b,a,data(:,:,ic));
In this example, “b” and “a” contain the filter coefficients. “data(:,:,assay)” and “data(:,:,ic)” contain the captured data that may or may not have been normalized, scaled, or both. In this case, the filtered data is both normalized and scaled.
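For readers without MATLAB, the idea of forward and backward filtering with these coefficients can be sketched in plain Python. This is a simplified illustration, not a reimplementation of “filtfilt”: MATLAB's function additionally chooses initial conditions to suppress start-up and ending transients, which this sketch does not attempt. The function names are hypothetical.

```python
def all_pole_filter(x, b, a):
    """Apply y[n] = (b*x[n] - a[1]*y[n-1] - ... - a[4]*y[n-4]) / a[0]."""
    y = []
    for n, xn in enumerate(x):
        acc = b * xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc / a[0])
    return y


def forward_backward_filter(x, b, a):
    """Filter forward, then filter the reversed result, to cancel phase lag."""
    forward = all_pole_filter(x, b, a)
    backward = all_pole_filter(forward[::-1], b, a)
    return backward[::-1]


# Coefficients from the example above: four poles, no zeros.
b = 0.3164
a = [1.0000, -1.0000, 0.3750, -0.0625, 0.0039]

# A constant input should pass through essentially unchanged away from the
# edges, because the filter gain at zero frequency is b / sum(a) = 1.
flat = [1.0] * 40
filtered = forward_backward_filter(flat, b, a)
```

The coefficients correspond, to four decimal places, to a quadruple pole at 0.25: (1−0.25z⁻¹)⁴ expands to 1−z⁻¹+0.375z⁻²−0.0625z⁻³+0.0039z⁻⁴, and b=0.75⁴≈0.3164 sets unity gain for a constant signal.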
An optional slope removal method can be used to remove any residual slope that is present in the early baseline signal before any detectable actual signal is produced. This procedure may also be referred to as baselining, but in some embodiments, the offset is not removed, only the slope. According to this invention, for slope removal, both the target (DYE 1) and control (DYE2) signals are examined simultaneously. Whichever signal comes up first defines the forward regression point, and the method generally goes back 10 cycles. If 10 cycles back is before cycle 5, then cycle 5 is used as the initial regression point to avoid any earlier signal transients. A linear regression line is calculated using the signal data between these points and the slope of the regression for each dye is subtracted from that dye's signal. In this case, the slope removal is applied to the normalized, scaled, and filtered data discussed above.
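The slope removal step can be sketched in Python as follows. The sketch assumes the forward regression point (the cycle at which the first signal comes up) has already been identified by some other means, since the text does not specify how it is detected; the function name and the choice to leave the first reading unchanged are also assumptions.

```python
def remove_baseline_slope(signal, forward_cycle, lookback=10, earliest_cycle=5):
    """Subtract the baseline slope from a signal while keeping its offset.

    Cycles are numbered from 1. The regression window runs from
    max(forward_cycle - lookback, earliest_cycle) to forward_cycle.
    """
    start = max(forward_cycle - lookback, earliest_cycle)
    if forward_cycle <= start:
        raise ValueError("regression window needs at least two cycles")
    cycles = list(range(start, forward_cycle + 1))
    values = [signal[c - 1] for c in cycles]
    mean_c = sum(cycles) / len(cycles)
    mean_v = sum(values) / len(values)
    sxx = sum((c - mean_c) ** 2 for c in cycles)
    sxy = sum((c - mean_c) * (v - mean_v) for c, v in zip(cycles, values))
    slope = sxy / sxx
    # Remove only the trend; the value at cycle 1 is left unchanged.
    return [v - slope * i for i, v in enumerate(signal)]


# Hypothetical drifting baseline: offset 2.0 with a drift of 0.01 per cycle.
drifting = [2.0 + 0.01 * c for c in range(1, 31)]
flattened = remove_baseline_slope(drifting, forward_cycle=20)
```

In use, the same forward regression point would be applied to both the target and control signals, with a separate slope computed and subtracted for each dye.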
An embodiment of the method of this invention is the MaxRatio method. In this method, the ratio between sequential measurements is calculated, thereby yielding a series of ratios, each of which can be indexed to a time value or cycle number. Many suitable means of calculating these ratios exist, and any suitable means can be used. The simplest way of performing this ratio calculation utilizes the following function:
Ratio(n)=s(n+1)/s(n)
where n represents the cycle number and s(n) represents the signal at cycle n. This calculation provides a curve that starts at approximately 1 in the baseline region of the response, increases to a maximum during the growth region, and returns to approximately 1 in the plateau region. A MATLAB expression that performs this calculation efficiently is the following:
Ratio=s(2:end,:)./s(1:end-1,:),
where “s” represents the signal response matrix, with each column representing a separate response.
Although the MaxRatio algorithm is usable as described, it is convenient to shift the curve by subtracting a constant, e.g., about one (1), from each point. This operation provides a transformation of the original response, which starts near zero in the baseline region, rises to a peak in the growth region of the curve, and returns near zero in the plateau region. This shifted ratio calculation is described by the following function:
R(n)=s(n+1)/s(n)−1
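The ratio and shifted ratio calculations can be sketched in Python as follows. The function names and the sample response are hypothetical; the arithmetic mirrors the MATLAB expression above for a single response.

```python
def ratio_curve(s):
    """Ratio of each reading to the preceding one: s(n+1) / s(n)."""
    return [s[n + 1] / s[n] for n in range(len(s) - 1)]


def shifted_ratio_curve(s):
    """Ratio curve shifted down by one so that the baseline sits near zero."""
    return [r - 1.0 for r in ratio_curve(s)]


# Hypothetical response: flat baseline, growth, then plateau.
response = [1.0, 1.0, 1.0, 1.5, 3.0, 6.0, 9.0, 9.9, 10.0, 10.0]
shifted = shifted_ratio_curve(response)
```

For this response the shifted ratio is zero in the baseline, reaches 1.0 where the signal doubles between readings, and falls back to zero in the plateau.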
In order to enhance cycle number resolution, an interpolation can be performed. Many ways of accomplishing this operation are known in the art. One method of interpolating in the context of the invention is cubic spline interpolation, which provides a smooth interpolation, so that even the second derivative of the captured data sets will be continuous. The invention can be used to interpolate the entire data series. The invention can be used to determine a region of interest and then to interpolate only in that region to achieve sub-periodic, or sub-cycle, resolution. An example of a MATLAB command for performing a cubic spline interpolation is as follows:
out=interp1(x,in,x2,'spline')
where “x” represents the period (or cycle) numbers (1, 2, 3 . . . ), “in” represents the uninterpolated signal at those cycles, “x2” represents the higher resolution period (or cycle) vector (1.00, 1.01, 1.02, . . . ) and “out” represents the interpolated signal that corresponds to the fractional cycles in “x2”.
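A comparable interpolation can be sketched in plain Python. Note the assumption: this sketch uses a natural cubic spline (second derivative set to zero at both ends), whereas MATLAB's 'spline' option uses not-a-knot end conditions, so values near the first and last cycles can differ slightly. The function name and the sample signal are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(x, y):
    """Return a function that evaluates a natural cubic spline through (x, y)."""
    n = len(x) - 1                      # number of intervals
    h = [x[i + 1] - x[i] for i in range(n)]
    m = [0.0] * (n + 1)                 # second derivatives; zero at both ends
    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
        rhs = [6.0 * ((y[j + 2] - y[j + 1]) / h[j + 1]
                      - (y[j + 1] - y[j]) / h[j]) for j in range(n - 1)]
        for j in range(1, n - 1):       # forward elimination
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        for j in range(n - 2, -1, -1):  # back substitution
            m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]

    def evaluate(xq):
        # Locate the interval containing xq, clamping to the data range.
        i = min(max(bisect_right(x, xq) - 1, 0), n - 1)
        a = (x[i + 1] - xq) / h[i]
        b = (xq - x[i]) / h[i]
        return (a * y[i] + b * y[i + 1]
                + ((a ** 3 - a) * m[i] + (b ** 3 - b) * m[i + 1])
                * h[i] ** 2 / 6.0)

    return evaluate


# Hypothetical signal at whole cycles, interpolated to 0.01-cycle resolution.
cycles = [1, 2, 3, 4, 5, 6]
signal = [1.0, 1.1, 1.6, 3.0, 5.2, 6.0]
spline = natural_cubic_spline(cycles, signal)
fine_cycles = [1 + 0.01 * k for k in range(501)]
fine_signal = [spline(c) for c in fine_cycles]
```

Building the index from integer multiples of the step (1 + 0.01·k) avoids the drift that accumulates when 0.01 is added repeatedly.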
It should be understood that the steps described above can be performed in different orders, such as, for example, filtering first, followed by baselining before scaling. However, if the interpolation is performed before the ratio calculation, care must be taken to select the appropriate interpolated response values for the ratio calculation. It is important that the interval between ratio values remain the same. Thus, if cycles are used as the period of measurement, and interpolation increases the time resolution to 0.01 cycles, then the shifted ratio at x=2.35 would be R=s(3.35)/s(2.35)−1.
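The point about keeping a one-cycle interval can be sketched in Python: when the signal has been interpolated to a step of 0.01 cycles, the ratio pairs each sample with the sample 100 positions later, not with its immediate neighbor. The function name and the synthetic data are hypothetical.

```python
def shifted_ratio_interpolated(fine_signal, step):
    """Shifted ratio s(x + 1) / s(x) - 1 on a signal sampled every `step` cycles."""
    lag = round(1.0 / step)             # samples per whole cycle
    return [fine_signal[i + lag] / fine_signal[i] - 1.0
            for i in range(len(fine_signal) - lag)]


# Synthetic signal that exactly doubles every cycle, sampled every 0.01 cycle.
step = 0.01
doubling = [2.0 ** (1 + k * step) for k in range(401)]
ratios = shifted_ratio_interpolated(doubling, step)
```

Because the synthetic signal doubles every cycle, every shifted ratio is 1 (efficiency e=1); pairing adjacent interpolated samples instead would give about 0.007 and understate the growth per cycle.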
Example 8 Finding Peaks to Determine FCN and ERV (e.g., MR) of Target and Control
Another step is to select peaks in the data series. This operation involves the steps of (1) finding local peaks and (2) selecting from local peaks one or more peaks for further analysis, optionally using criteria data (defined infra).
A peak-finding algorithm identifies where the slope of the curve changes from positive to negative, which represents a local maximum. The algorithm identifies the locations and the magnitude of the peaks. An example of a MATLAB function to do this calculation is as follows:
In the method discussed above, a number of local maximum peaks are often identified for both the target data and the control data. Various methods can be used for selecting which of these local maximum peaks will be used for determining an FCN and ERV.
Typically, and in particular during well-behaved reactions, the highest peak or maximum peak is selected. In many situations, this selection provides the most reproducible reaction point from which to perform further calculations as discussed herein. However, in some situations, a first peak, or first peak above a particular cutoff or after a particular number of cycles is preferable. Thus, in particular examples, a Max Peak or First Peak selection can be employed where Max Peak finds the largest peak in the shifted ratio curve while First Peak finds the first peak that is higher than some selected value.
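Peak finding and the Max Peak and First Peak selections can be sketched in Python as follows. This is an illustrative sketch, not the MATLAB function referred to above; the function names, the cutoff values, and the sample curve are hypothetical.

```python
def find_local_peaks(x, r):
    """Return (location, magnitude) where the slope turns from positive to non-positive."""
    peaks = []
    for i in range(1, len(r) - 1):
        if r[i] - r[i - 1] > 0 and r[i + 1] - r[i] <= 0:
            peaks.append((x[i], r[i]))
    return peaks


def select_max_peak(peaks):
    """Max Peak: the largest peak in the shifted ratio curve, or None."""
    return max(peaks, key=lambda p: p[1]) if peaks else None


def select_first_peak(peaks, cutoff):
    """First Peak: the earliest peak whose magnitude exceeds the cutoff, or None."""
    for peak in peaks:
        if peak[1] > cutoff:
            return peak
    return None


# Hypothetical shifted ratio curve with a small early peak and a main peak.
cycle_numbers = [1, 2, 3, 4, 5, 6, 7]
shifted_ratio = [0.0, 0.1, 0.05, 0.02, 0.5, 0.3, 0.0]
peaks = find_local_peaks(cycle_numbers, shifted_ratio)
fcn, mr = select_max_peak(peaks)
```

The location of the selected peak plays the role of the FCN and its magnitude the role of the ERV (here the MR); on interpolated data the location would be a fractional cycle number.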
Once criteria data are determined, these data can also be used to determine which peak to select for an ERV determination during actual operation, particularly for weak or noisy signals.
An information appliance or system apparatus can also be used to perform the methods of this invention.
The analytic methods described herein can be applied to reactions containing either known or unknown target concentrations. In one embodiment, known target nucleic acid concentrations will be included in calibration wells in a reaction carried out in a multi-well reaction plate, and the ERV and value of the reaction point will be used from these known concentration samples to perform quantification. Known concentrations may also be used to develop criteria data as further described herein.
Example 10 Determining Criteria Curve/Criteria Data Sets
In other embodiments, efficiency related values (e.g., MR values) can be plotted as a function of their reaction point values (FCN values) for a number of data sets of known concentration in order to generate a characteristic criteria curve for a particular assay. The criteria curve is characteristic of a particular assay formulation and detection protocol and can be used to reliably determine positive/negative results, to determine whether a particular result should be discarded as unreliable, to determine a confidence measure of a result, or any combination of the foregoing. In general, pairs of reaction data that lie below a criteria curve indicate non-reactive samples, or non-functional reactions, such as reactions encountering significant inhibition.
Criteria data can be used to select which peaks to report or to use in reaction analysis, or both. Criteria data provide an automatic and reliable method for discriminating between negative results (e.g., target not present at all) and results showing low amount of target.
Multiple, relatively simple criteria data sets can be used to provide characteristic criteria curves for a number of assays. One useful approach involves taking the mean of the MR values for the set of negative responses and adding to this value a multiple of the standard deviation of the MR values for the negative responses. For the example shown in
(X1,Y1)=(1,0.036)
(X2,Y2)=(20,0.036)
(X3,Y3)=(25,0.026)
(X4,Y4)=(45,0.026)
As a further example, the criteria curve shown in
(X1,Y1)=(1,0.10)
(X2,Y2)=(10,0.10)
(X3,Y3)=(20,0.05)
(X4,Y4)=(40,0.05)
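The use of such criteria points can be sketched in Python as follows. Two assumptions are made that the text does not state: the criteria curve is taken to be piecewise linear between the listed points, and the standard deviation is the sample standard deviation. The function names, the multiplier k, and the negative-response values are hypothetical.

```python
from statistics import mean, stdev


def criteria_level_from_negatives(negative_mr_values, k):
    """Mean MR of negative responses plus k standard deviations."""
    return mean(negative_mr_values) + k * stdev(negative_mr_values)


def criteria_value(points, fcn):
    """Piecewise-linear criteria curve; flat beyond the first and last points."""
    if fcn <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if fcn <= x1:
            return y0 + (y1 - y0) * (fcn - x0) / (x1 - x0)
    return points[-1][1]


def is_above_criteria(points, fcn, mr):
    """True when the (FCN, MR) pair lies above the criteria curve."""
    return mr > criteria_value(points, fcn)


# First set of criteria points listed above.
criteria_points = [(1, 0.036), (20, 0.036), (25, 0.026), (45, 0.026)]
level_at_22_5 = criteria_value(criteria_points, 22.5)
```

A pair above the curve would be treated as reactive, while a pair below it would indicate a non-reactive sample or a non-functional reaction.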
Criteria curves and/or criteria data sets, including sets having different shapes or more complex shapes or both, can be determined without undue experimentation. The intended use of the PCR application will call for different approaches to establishing criteria lines. The skilled artisan will readily appreciate that when high sensitivity is desired in an assay, a low criteria line is used. Conversely, if an assay is designed for differentiating sequence variants, such as population consensus sequence (i.e., a “wild type” sequence) versus polymorphic or variant sequences (e.g., a “single nucleotide polymorphism”), then a criteria line of higher value can be used, because the detection of limiting quantities of target nucleic acid is not usually required in the determination of sequence variants.
The particular example shown in
Generally, as further discussed herein, a FCN-MR response is determined for samples of known concentration across the target concentration range of interest to define the “normal” response. Additional studies in a population of samples that challenge the assay reaction may be run to see how much deterioration in MR is acceptable before the assay performance is compromised. These types of characterization analyses can be used to establish criteria data or sets of criteria data independently of the standard deviation or other characteristics of the noise or baseline observed when samples that do not contain target nucleic acid are treated under amplification conditions.
According to other embodiments of the invention, criteria data also can be determined in ways similar to determining a Ct for Ct analysis, as has been done in the prior art. A particular assay under design can be performed a number of times to characterize its typical MR-FCN response. From this typical response, the criteria data set can be defined. However, unlike Ct analysis, in FCN-MR analysis the response is independent of intensity of signal and is easily reproducible, even across instruments of a particular type that produce highly variable results with identical samples.
Example 11 Alternative Region of Interest
It has been empirically found that the FCN value of an efficiency related value as determined above can be advantageously adjusted to provide an even more reproducible quantification value. For example,
Thus, the invention involves determining an offset from the cycle number of maximum efficiency value (herein referred to as an FCN2 value), which is the location of another point on a reaction curve that can be used for analysis as described herein. In further embodiments, an Efficiency Related Value Threshold (ERVT) or Ratio Threshold (RT) value can be selected and used to determine a cycle number region of interest. An ERVT or RT can be an automatically or empirically determined value for a particular assay. The RT value can be set near to or at a criteria data level that is determined at the latter cycles during assay calibration.
One embodiment of a method of this invention starts at the FCN value on the shifted ratio curve and determines an earlier reaction point where the curve crosses the RT value. This reaction point is reported as an FCN2 value. It is believed that the FCN2 value provides improved linearity in samples having low copy numbers, in contrast with FCN values for certain assays, such as reactions where non-specific product formation reduces the efficiency of product formation in samples having low copy numbers.
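The backward search for the FCN2 value can be sketched in Python as follows. The linear interpolation between the two samples that bracket the crossing is an assumption of this sketch, as are the function name, the RT value, and the sample curve.

```python
def find_fcn2(x, shifted_ratio, fcn_index, rt):
    """Walk back from the FCN peak to where the curve first crosses rt.

    Returns the interpolated crossing location, or None if the curve never
    drops below rt before the peak.
    """
    for i in range(fcn_index, 0, -1):
        if shifted_ratio[i] >= rt and shifted_ratio[i - 1] < rt:
            fraction = (rt - shifted_ratio[i - 1]) / (shifted_ratio[i] - shifted_ratio[i - 1])
            return x[i - 1] + fraction * (x[i] - x[i - 1])
    return None


# Hypothetical shifted ratio curve with its peak (the FCN) at cycle 5.
cycle_axis = [1, 2, 3, 4, 5, 6]
curve = [0.0, 0.02, 0.10, 0.30, 0.50, 0.20]
fcn2 = find_fcn2(cycle_axis, curve, fcn_index=4, rt=0.06)
```

Here the curve rises through RT=0.06 between cycles 2 and 3, so the FCN2 value falls at about cycle 2.5, earlier than the FCN at cycle 5.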
In this example, the curve of one response flattens out early and differs in shape from the curve of the other response, and the shifted ratio curve shows a difference. The early flattening can cause the earlier peak. In this example, the FCN2 values are more closely matched than the FCN values. In general, FCN and FCN2 values have been found to be more precise (lower standard deviations) than Ct values. While these examples focus on use of the MR, it will be appreciated that other measures of the efficiency of the amplification reaction can be employed in the FCN and FCN2 embodiments of the present invention. Other efficiency related transforms useful in the context of the present invention include, but are not limited to, (a) use of first derivative, (b) use of the differences between sequential periodic data points, and (c) use of the slope or gradient of the log of the growth curve.
Example 12 Quantification Using MR-FCN Analysis
Quantification is often desired in various types of reaction analysis. In PCR reactions, for example, quantification generally refers to an analysis of a reaction to estimate a starting amount or concentration of a target having an unknown concentration. The invention involves methods or systems or both for using an efficiency related value and a cycle number value (e.g., FCN) to perform a quantification. Specifically, the ERV of a test sample is compared to one or more of the ERV of at least one calibrator, preferably at least two calibrators, and, optionally, 3, 4, 5, or 6 calibrators, each of which contains a known quantity of a target nucleic acid.
In further embodiments, quantification can generally be understood as involving one or more calibration data captures and one or more quantification data captures. The calibration data and quantification are related using a quantification relationship or equation.
In calibration, a relationship between captured data, or a value derived from captured data (such as an FCN, FCN2, or MR, or combination of the foregoing), and one or more known starting concentration reactions is used to establish one or more parameters for a quantification equation. These parameters can then be used to determine the starting concentrations of one or more unknown reactions.
Various methods and techniques are known in the art for performing quantification and/or calibration in reaction analysis. For example, in diagnostic PCR settings, it is not uncommon to analyze test samples in a 96-well reaction plate. In each 96-well reaction plate, some wells are dedicated to calibration reactions with samples having known initial concentrations of target. The calibration values determined for these samples can then be used to quantify the samples of unknown concentration in the other wells of the plate.
Two general types of calibration methods are referred to as one-point calibration and standard curve (e.g., multiple points) calibration. Examples of these types are set forth below. Any suitable calibration method, however, can be used in the context of the present invention.
When there is no inhibition or interference, the PCR reaction proceeds with the target sequence showing exponential growth, so that after N cycles of replication, the initial target concentration has been amplified according to the relationship:
ConcN∝Conc0×(1+e)^N
which can also be expressed as:
Conc0∝ConcN/(1+e)^N
where ConcN represents the concentration of amplified target after N reaction cycles, Conc0 represents the initial target concentration before amplification, N represents the cycle number and e represents the efficiency of the target amplification. Quantitative data analysis is used to analyze real time PCR reaction curves so as to determine Conc0 to an acceptable degree of accuracy. Previous Ct analysis methods attempt to determine a cycle number at a reaction point where the ConcN is the same for all reactions under analysis. The FCN value determined by the methods of the invention provides a good estimate for the cycle number N for an assay in which no significant inhibition or signal degradation over the dynamic range of input target concentrations is demonstrated. The following proportionality relationship between a starting concentration and FCN can be used:
Conc0(FCN)∝1/(1+e)^FCN
where Conc0 (FCN) represents the estimate of the initial target concentration determined by using the FCN value as determined by the methods of this invention. In other words, the lower the starting concentration of target, the higher the FCN value determined for the PCR reaction. This relationship can be used for both calibration data and for quantification data.
This proportionality relationship can also be expressed as an equivalence, such as
Conc0(FCN)=K×1/(1+e)^FCN
where K represents a calibration proportionality constant. For calibration data, Conc0(FCN) represents a known concentration, such as 500,000 copies of target nucleic acid/mL; the exponent FCN is an FCN cycle number determined as described above; and e represents the efficiency value for a reaction, with e=1 indicating a doubling each cycle. These factors combine to form a relationship that allows determination of the proportionality constant. Determination of the proportionality constant can only be made if there is a priori knowledge of the efficiency, e, of the amplification reaction. This a priori knowledge enables a one-point calibration. For quantification data, FCN values are determined for reactions involving samples having unknown concentrations of target. The FCN values are then converted to concentration values by use of the above equation.
If the efficiency, e, is not known a priori, then a standard curve quantification method can be used. In this case, for calibration data, different samples having different levels of known concentration are amplified, and the FCN values of the samples are determined. These FCN values can be plotted against the log (base 10) of the known concentrations to describe a log (concentration) vs. FCN response. For an assay that demonstrates no significant inhibition or signal degradation over the dynamic range of input target concentrations, this response is typically well-fitted by a linear curve. The following equation describes the form of this standard curve:
Log10(Conc0(FCN))=m×FCN+b
where Log10(Conc0(FCN)) represents the log (base 10) of the initial target concentration, m represents the slope of the linear standard curve, and b represents the intercept of the linear standard curve. By using two or more known concentration calibration samples, a linear regression can be applied to determine the slope, m, and intercept, b, of the standard curve. For quantification data, FCN values are determined for reactions involving test samples of unknown concentration, which values are then converted to log (concentration) values by use of the above linear equation. Results can be reported in either log (concentration) or concentration units by the appropriate conversion.
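The standard curve procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the examples: the function names `fit_standard_curve` and `quantify` and the calibrator FCN values are assumptions made for illustration, and an ordinary least-squares fit stands in for whatever regression tool is actually used.

```python
import math


def fit_standard_curve(fcn_values, known_concs):
    """Least-squares fit of log10(conc) = m * FCN + b from calibrators."""
    xs = list(fcn_values)
    ys = [math.log10(c) for c in known_concs]
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more calibrators with matching lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibrator FCN values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx               # slope of the linear standard curve
    b = mean_y - m * mean_x     # intercept of the linear standard curve
    return m, b


def quantify(fcn, m, b):
    """Convert an FCN value to (log10 concentration, concentration)."""
    log_conc = m * fcn + b
    return log_conc, 10 ** log_conc


# Hypothetical calibrator data (FCN values invented for illustration).
cal_fcn = [30.1, 23.4, 16.9]
cal_conc = [500.0, 50_000.0, 5_000_000.0]   # copies/mL
m, b = fit_standard_curve(cal_fcn, cal_conc)
log_conc, conc = quantify(25.0, m, b)
print(f"slope={m:.4f} intercept={b:.4f} log10(conc)={log_conc:.3f}")
```

Results can be reported from either element of the tuple returned by `quantify`, matching the choice between log (concentration) and concentration units.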
It should be noted that the one-point calibration equation is easily converted to this linear standard curve form:
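$$\mathrm{Conc}_0(\mathrm{FCN}) = K \times \frac{1}{(1+e)^{\mathrm{FCN}}}$$
$$\log_{10}(\mathrm{Conc}_0(\mathrm{FCN})) = -\log_{10}(1+e) \times \mathrm{FCN} + \log_{10}(K)$$
Comparing this with the standard curve form, the slope is m = −log10(1+e) and the intercept is b = log10(K). The linear coefficient m can therefore be used to calculate the efficiency of the particular PCR reaction, since e = 10^(−m) − 1.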
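As a small worked sketch of that conversion, the following Python functions map between the one-point parameters (K, e) and the standard curve parameters (m, b). The function names are hypothetical; the arithmetic follows directly from the log form of the one-point equation.

```python
import math


def one_point_to_linear(k, e):
    """Return (m, b) of log10(conc) = m * FCN + b for given K and e."""
    return -math.log10(1 + e), math.log10(k)


def efficiency_from_slope(m):
    """Invert m = -log10(1 + e) to recover the efficiency e."""
    return 10 ** (-m) - 1


# A perfectly doubling reaction (e = 1) has a slope of -log10(2).
m, b = one_point_to_linear(k=1.0e9, e=1.0)
print(m, b, efficiency_from_slope(m))
```

Note that the slope here is that of log (concentration) as a function of FCN. If a curve is instead fitted with FCN as a function of log (concentration), its slope must be inverted before applying `efficiency_from_slope`.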
Example 13 Quantification Adjustments
When PCR reactions are subjected to inhibition, the resulting real-time PCR signal intensity can be depressed or delayed. The effect of this signal degradation on an efficiency related value such as MR is a reduction in that value. In addition, the effect of signal degradation on the fractional cycle number is generally to identify the FCN at an earlier cycle number than would be expected for the uninhibited reaction. These factors cause the plot of log (concentration) as a function of FCN to be less well described by a linear curve fitting function. Although higher order curve fitting functions can be applied for a standard curve, a linear fit requires fewer calibration levels and is simpler to calculate.
Some of these problems can be addressed in a standard curve analysis by incorporating an ERV or Intensity value into the quantification relationships as discussed above. Thus, the equations above can be rewritten as:
$$\mathrm{Conc}_0(\mathrm{FCN}_{\text{Intensity Adj}}) \propto \frac{\mathrm{Intensity}}{(1+e)^{\mathrm{FCN}}}$$
$$\mathrm{Conc}_0(\mathrm{FCN}_{\text{MR Adj}}) \propto \frac{\mathrm{MR}}{(1+e)^{\mathrm{FCN}}}$$
where Intensity represents the response intensity (above background) at the determined FCN value, and MR represents the MR value as described previously. Conc0 (FCN Intensity Adj) represents the estimate of the initial concentration of the target determined by using the FCN value adjusted by using the Intensity value, and Conc0 (FCN MR Adj) represents the estimate of the initial concentration of the target determined by using the FCN value adjusted by using the MR value.
These expressions take advantage of the relationship observed between the intensity at the selected FCN cycle or the MR determined at the selected FCN cycle, or both, and the change to the FCN value in the presence of inhibition, as discussed above. The net effect is that the right hand side of each proportionality expression above is relatively insensitive to inhibition and other factors that affect the PCR amplification curve. The expressions therefore provide significant robustness for determining the concentration values of the target.
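The following discussion further explains the properties and relationships of FCN, FCN Intensity Adj, and FCN MR Adj. Assuming the efficiency, e, is 1, the previous expressions can be simplified to:
$$\mathrm{Conc}_0(\mathrm{FCN}) \propto \frac{1}{2^{\mathrm{FCN}}}$$
$$\mathrm{Conc}_0(\mathrm{FCN}_{\text{Intensity Adj}}) \propto \frac{\mathrm{Intensity}}{2^{\mathrm{FCN}}}$$
$$\mathrm{Conc}_0(\mathrm{FCN}_{\text{MR Adj}}) \propto \frac{\mathrm{MR}}{2^{\mathrm{FCN}}}$$
Taking the log base two of these expressions yields, up to an additive constant that depends only on the proportionality constant:
$$\log_2(\mathrm{Conc}_0(\mathrm{FCN})) = -\mathrm{FCN} + \text{const}$$
$$\log_2(\mathrm{Conc}_0(\mathrm{FCN}_{\text{Intensity Adj}})) = -(\mathrm{FCN} - \log_2(\mathrm{Intensity})) + \text{const}$$
$$\log_2(\mathrm{Conc}_0(\mathrm{FCN}_{\text{MR Adj}})) = -(\mathrm{FCN} - \log_2(\mathrm{MR})) + \text{const}$$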
From the right sides of the expressions come the values for compensating for intensity or MR to adjust the FCN value by means of the following formulas:
$$\mathrm{FCN}_{\text{Int. Adj.}} = \mathrm{FCN} - \log_2(\mathrm{Intensity})$$
$$\mathrm{FCN}_{\text{MR Adj.}} = \mathrm{FCN} - \log_2(\mathrm{MR})$$
This calculation then provides quantification by using adjusted FCN values in a manner analogous to using FCN values or Ct values. It should be noted that these adjusted FCN values, when used in the way Ct values are used to determine the concentrations of the target in unknown samples, provide significant robustness to inhibition and other factors that affect PCR amplification. The plot of log (concentration) vs. these adjusted FCN values is generally well fitted by a linear standard curve. Thus, the present invention provides a method for determining the quantity of a target nucleic acid in a sample comprising the steps of (a) finding the period of time or cycle number of an amplification reaction corresponding to a maximum of an efficiency related value, preferably of an MR, (b) adjusting that value by subtracting a logarithm of the Intensity or a logarithm of the MR, and (c) comparing the value obtained to calibration data obtained using the same methodology.
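The two adjustment formulas can be illustrated with a short Python sketch. The function names and the sample numbers are assumptions chosen for illustration; they are not measured values. The example shows how a lower MR partially offsets an earlier FCN, as described above for inhibited reactions.

```python
import math


def fcn_intensity_adjusted(fcn, intensity):
    """FCN adjusted by intensity: FCN - log2(Intensity)."""
    if intensity <= 0:
        raise ValueError("intensity above background must be positive")
    return fcn - math.log2(intensity)


def fcn_mr_adjusted(fcn, mr):
    """FCN adjusted by the MR value: FCN - log2(MR)."""
    if mr <= 0:
        raise ValueError("MR must be positive")
    return fcn - math.log2(mr)


# Hypothetical pair of reactions with the same input concentration.
normal = fcn_mr_adjusted(fcn=22.0, mr=0.40)
inhibited = fcn_mr_adjusted(fcn=21.0, mr=0.20)   # earlier FCN, lower MR
print(normal, inhibited)
```

With these invented numbers, halving MR adds one cycle to the adjusted value, which cancels the one-cycle shift in the raw FCN.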
Example 14 Standard Curve Calibration
Development of a standard curve from known concentrations and use thereof for quantification is well known in the art and can be further understood from the following example. In a typical case, a number of calibration reactions (such as in wells in which the initial concentrations are known) are used during each amplification or series of amplifications to perform the calibration operation. One problem that arises with attempting to quantify a target nucleic acid in a sample through a large range of possible initial concentrations is that quantification of lower quantities of target nucleic acid in any particular reaction becomes more difficult. For example,
Because calibration runs in a reaction plate are relatively expensive, it is conventional to collect a minimal acceptable number of calibration data sets. For example, in one implementation, two replicates each of the 500; 50,000; and 5,000,000 copies/mL samples are run along with the diagnostic assays and the replicates at each level are averaged, thereby requiring perhaps six wells in a 96 well plate to be used for calibration reactions.
Because the relationship between the cycle numbers and the log of the calibrator concentration is substantially linear, a linear regression can be performed between a log (e.g., log10) of the calibrator concentrations and the cycle number. This regression can easily be performed via the Excel program and other mathematical analysis software.
In each of the curve fit equations, the x-axis displays values of the Log10 [Target] actual or known concentration. Thus, solving for x provides an expression for converting from cycle number related values to Log10 (Target) calculated concentration of the assay. If the assay response is not linear with Log (Target), a higher order or more complex regression, or a larger number of calibration reactions, or both, can be used. In this example, the following equations were determined:
FCN=−3.0713*Log10(Conc0)+31.295
FCN2=−3.0637*Log10(Conc0)+25.006
FCNMR adj=−3.2344*Log10(Conc0)+33.271
FCNInt. adj=−3.2870*Log10(Conc0)+32.775
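Solving each of the fitted equations above for Log10(Conc0) gives the conversion from a cycle number related value to a calculated concentration. The sketch below does this in Python using the four slope and intercept pairs listed above; the dictionary keys and the function name `log10_conc` are illustrative names, not part of the original analysis.

```python
# (slope, intercept) of value = slope * Log10(Conc0) + intercept,
# copied from the four fitted equations in this example.
CURVES = {
    "FCN": (-3.0713, 31.295),
    "FCN2": (-3.0637, 25.006),
    "FCN_MR_adj": (-3.2344, 33.271),
    "FCN_Int_adj": (-3.2870, 32.775),
}


def log10_conc(value, kind="FCN"):
    """Solve value = slope * x + intercept for x = Log10(Conc0)."""
    slope, intercept = CURVES[kind]
    return (value - intercept) / slope


def conc(value, kind="FCN"):
    """Calculated concentration in the calibrators' units."""
    return 10 ** log10_conc(value, kind)


print(log10_conc(20.0, "FCN"), conc(20.0, "FCN"))
```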
In order to examine the different characteristics of calibrations using the different cycle number related values described above, quantification can be performed on various samples having known concentrations, and the concentrations calculated compared with the known concentrations. In one example of such a comparison, the standard curves having the parameters generated above were used to carry out quantification of the assay responses shown in
A one-point calibration can be used for quantification. In this case, two wells at the 50,000 copies/mL concentration (log10 (concentration) of 4.7) were used for calibration. In order to calculate the calibration constant, the following equation is used: $K = \mathrm{Conc}_0 \times 2^{\mathrm{FCN}}$, where K represents the calibration constant, Conc0 represents the known concentration of the calibrator, FCN represents the fractional cycle number of the calibrator, and the efficiency of the reaction, e, as described earlier, is assumed to be 1. Similar calibration constants can be generated using the proportionality relationships for FCN2, FCN MR Adj., and FCN Int. Adj.
In this case, the constant was generated for each of the two wells and the average was used. Once the calibration constant is generated, the concentration for each assay is calculated with the following equation: $\mathrm{Conc} = K_{\mathrm{FCN}} / 2^{\mathrm{FCN}}$, where K_FCN is the averaged calibration constant determined from the FCN values of the calibrator wells.
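The one-point calculation can be sketched as follows. This is a minimal Python illustration under the stated assumption e = 1 (exposed here as a default parameter); the function names and the calibrator FCN values are hypothetical.

```python
def calibration_constant(known_conc, fcn, efficiency=1.0):
    """K = Conc0 * (1 + e)**FCN for one calibrator well."""
    return known_conc * (1.0 + efficiency) ** fcn


def one_point_quantify(fcn, k, efficiency=1.0):
    """Conc = K / (1 + e)**FCN for a sample of unknown concentration."""
    return k / (1.0 + efficiency) ** fcn


# Two calibrator wells at 50,000 copies/mL (FCN values are invented).
calibrator_fcns = [16.9, 17.1]
constants = [calibration_constant(50_000.0, f) for f in calibrator_fcns]
k_avg = sum(constants) / len(constants)

# Quantify a hypothetical unknown sample.
print(one_point_quantify(20.3, k_avg))
```

Because K grows exponentially with FCN, averaging the constants (as described above) weights the later-cycle well more heavily than averaging the FCN values first would; which choice is preferable is left open here.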
As can be seen, the FCN results are elevated at the lowest two concentrations and accurate from log (Conc) equals 3.7 and above. FCN2 shows improved accuracy at low concentrations compared to FCN, but under-quantifies at log (Target) equal to 5.7 and 6.7. FCN-MR adjusted shows good linearity over the entire range with slight over-quantification at the two lowest concentrations. FCN-Intensity adjusted also shows good linearity with very slight under-quantification at the lowest two concentrations. Accordingly, each of these embodiments works well and the skilled artisan can readily select from among these options.
As discussed above, an FCN-MR analysis can be used to characterize a particular reaction as positive or negative or to compare the reaction to criteria data, or both. These values can be used to quantify a reaction. A variety of quantification methods can benefit from FCN-MR analysis rather than Ct analysis.
In one embodiment, an FCN value, an FCN2 value, or an adjusted FCN value can be used in any way that a Ct value has been used in the prior art. Typically, but not necessarily, FCN, FCN2, or FCN-adjusted analysis can be applied to various sets of calibration data to thereby develop reference data curves or an equation for comparing the result of a reaction in which the concentration of target is unknown to the results of reactions in which the concentration of target is known. Thus, the present invention can be used to develop reference data and to perform a comparison wherein two values (e.g., FCN-MR) are used both for developing reference data and also for making a comparison to that data.
While experiments using the MR method regularly used different preprocessing steps on the captured data set before processing the data set with a ratio function, most of these steps are not required. In particular, experimental results have indicated that scaling, normalization by a reference dye, baselining (both offset and slope correction), and filtering are not required. However, filtering has generally been found to be desirable as it improves performance in the presence of noise. Slope correction (for the baseline region) has also been found to be desirable as it slightly improves discrimination between samples that do not contain target nucleic acid and those that contain very little target nucleic acid or suffer from significant inhibition of the amplification reaction. Generally, however, when FCN Intensity Adj is used, it is preferable to use a normalization technique, such as, but not limited to, scaling or normalization to a reference dye.
Example 17 MR Algorithm Applied to HBV Data Using a One-Point Calibration
HBV assays of control solutions ranging from 10 copies/reaction to 10^9 copies/reaction and negatives were processed on an ABI Prism 7000 with six replicates at each concentration. The captured data was processed using only a digital filter. FCN values were then calculated using a MR algorithm as described above. The concentrations were calculated by means of a one-point calibration using three of the responses at 10^9 copies/reaction as a reference calibrator.
Even without normalization, scaling, or baselining, the resulting quantification was very good, with the exception of an acceptable amount of over-quantification of the 10 copies/reaction and 100 copies/reaction samples (i.e., the Log (Target)=1 and 2 samples). There was a very clear distinction between the negatives and the 10 copies/reaction assays, with no false positives or false negatives. Additional results indicated that when the same data was quantified with Ct analysis, the 10 copies/reaction and 100 copies/reaction assays were also slightly over-quantified, and the precision at all concentrations above 10 copies/reaction was better with the MR analysis. In this case, the Ct results were normalized, baselined, and calibrated by means of a two-point calibration with three replicates each at concentrations of 10^3 and 10^7 copies/reaction.
In this example, HIV assays of control solution were performed with negatives and at concentrations of 50 copies/mL and 100 copies/mL through 10^6 copies/mL, in replicates of six. The responses were processed by means of the MR algorithm using FCN MR Adj. with normalizing and baselining.
It has been found that pairs of reaction time or cycle number values and efficiency related values (e.g., pairs of FCN-MR values) can provide valuable information about a nucleic acid amplification reaction, e.g., a PCR reaction, which can be further enhanced by considering data pairs for both the internal control and target amplification reactions. While pairs for a target reaction alone carry important information about reaction efficiency and can be used for comparison with criteria data, additional factors that arise in processing samples or in the samples themselves may be better analyzed by considering control data as well.
For example, in processing specimens for use in PCR or other suitable amplification reactions, the sample can carry various inhibitors into the reaction, which might be detectable through assessment of target data only. However, abnormal recovery of target nucleic acid during sample preparation typically would not be detected by analysis of a single amplification reaction. Furthermore, a target nucleic acid may possess polymorphic sequences that could impair detection of the target nucleic acid, e.g., if a probe is used that binds to a polymorphic region of the sequence. Mismatches caused by the polymorphic sequence in this region would affect the detected signal, and, consequently, the amplification might not appear as abnormal or inhibited using the evaluation of data pairs for a single amplification. Co-analysis of an internal control together with analysis of the target amplification responses can provide accurate quantification of the target nucleic acid in such samples when other methods would typically indicate an invalid reaction.
Thus, pairs of reaction time or cycle number values and efficiency related values can be used together to assess the validity of a given reaction, such as in a given container or well. One could design the internal control (IC) amplification reaction to be comparable in robustness to the target amplification reaction, or slightly less robust. Robustness in this context means the sensitivity of the reaction performance to factors that can affect the PCR processing pathway, such as inhibition that results from sample preparation or the samples themselves, or to variability in transferring of the reaction mixture by pipette, such as transferring inaccurate amounts of amplification reagents by pipette.
Example 20 Multiple Criteria Data Curves
Multiple criteria curves for the pairs of cycle number value and efficiency related value (e.g., FCN-MR pairs) can be developed and can have different uses or levels of importance, particularly for use with validity determination. For example, a first criteria curve can be selected so as to be able to discriminate reactive amplification signals from non-reactive responses. A second criteria curve can be selected so as to be more constraining than the first type, so that it would be useful in identifying sample responses that lead to accurate quantification in contrast to those having partial inhibition that might have lower confidence in quantification.
For example, the first type of criteria data, which differentiates reactive and non-reactive amplification reactions, can be referred to as “MR criteria data.” These data act as a cutoff threshold: reactive responses will have MR values that exceed the MR criteria data, whereas negative samples will have MR values that will not exceed the criteria value or criterion line. The criteria data is preferably set so that neither noise in the response signal nor biases such as cross-talk or bleed-over exceed the criteria.
The second type of criteria data is referred to as the MR normal range. This range would be the range of MR values for a given FCN over which quantification of the sample is accurate. If a signal response is suppressed, the MR value observed will drop. As the MR value decreases due to inhibition, the FCN value can shift to earlier cycles, whereas a threshold based Ct might shift to later cycles. The MR normal range would be the range for MR values in a criteria data set for which a chosen value related to a cycle number would provide an accurate quantitative result for the sample when used to determine the concentration of target in the sample from the assay standard curve.
The “MR normal range” can be developed using a Bivariate Fit of the MR by FCN as will be understood in the art.
A statistically derived confidence interval, as shown, is a systematic approach to determining which data points represent “normal” responses and should therefore be quantified. Data points lying outside this interval are exceptional and are preferably identified to a human operator by a software program so that further investigation can be made.
In alternative embodiments, such a curve can be simplified in the form of one or more straight-line segments. This simplification can in some cases be performed by a technician viewing the raw data or may be derived from an alpha interval as discussed above.
A similar statistical fit can be performed on the internal control (IC) data.
Thus, the present invention also provides a method for analyzing an amplification reaction, the method comprising establishing a “confidence corridor”, which is a range of selected values provided in pairs in which the first value is a maximum efficiency related value (which is preferably the MR), and the second value is a time value or cycle number value at a reaction point (which optionally can be fractional). The method further comprises determining whether a maximum efficiency value occurring at any particular periodic time value or cycle number value at a reaction point (which optionally can be fractional) falls within the selected range. If the value does not, then further investigation, or disregarding the results, is indicated. Any suitable method can be used to establish the selected confidence corridor. Preferred methods include setting the confidence corridor about 1, 2, 3, 5, 10, or any other suitable number of standard deviations from the mean of data obtained from a set of reactions used to characterize the assay. Another suitable method involves modifying the confidence corridor by observing known aberrant or discrepant results and modifying the confidence corridor to exclude a portion of those aberrant or discrepant results in future assays. The use of the confidence corridor of the present invention can be applied to target nucleic acid quantification, analysis of any of standards, calibrators, controls, or to combinations of the foregoing.
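One way such a confidence corridor could be set up is sketched below in Python. This is an illustration under explicit assumptions, not the statistical procedure used for the figures: it fits a straight line of MR against FCN to characterization data and places the corridor a fixed number of residual standard deviations on either side of that line. The function names, the default of 3 standard deviations, and the data points are all hypothetical.

```python
import math


def fit_corridor(pairs, n_sd=3.0):
    """Fit MR = slope * FCN + intercept and return the corridor half-width.

    pairs: iterable of (fcn, mr) from reactions characterizing the assay.
    Returns (slope, intercept, half_width), half_width = n_sd * residual SD.
    """
    pts = list(pairs)
    n = len(pts)
    if n < 3:
        raise ValueError("need at least three characterization reactions")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("FCN values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in pts)
    resid_sd = math.sqrt(ss_res / (n - 2))
    return slope, intercept, n_sd * resid_sd


def in_corridor(fcn, mr, corridor):
    """True if the (FCN, MR) pair lies inside the corridor."""
    slope, intercept, half_width = corridor
    return abs(mr - (slope * fcn + intercept)) <= half_width


# Hypothetical characterization data and one suspect reaction.
data = [(10.0, 1.0), (20.0, 0.9), (30.0, 1.0), (40.0, 0.9)]
corridor = fit_corridor(data)
print(in_corridor(25.0, 0.95, corridor), in_corridor(25.0, 0.50, corridor))
```

A pair that falls outside the corridor would be flagged for further investigation rather than quantified. A constant-width band is a simplification; a statistically derived interval generally widens away from the center of the characterization data.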
Example 21 Validity Analysis
Thus, a validity check can optionally proceed as a series of questions regarding the internal control (IC) and/or target data.
As shown in the figure, an invalid result can be further characterized or explained by considering one or more characteristics of the target MR.
Thus, by combining the analysis based on multiple targets and using both cycle number and efficiency related values, one can distinguish an inhibited sample from a sample that suffered from poor nucleic acid recovery during sample preparation. The analysis makes use of pre-established knowledge of the assay that is contained in the internal control and target criteria data.
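A series of validity questions of this kind might be encoded as in the Python sketch below. The specific rules are assumptions made for illustration only and are not taken from the figure: a low internal control MR is treated as suggesting inhibition (consistent with the MR drop described above), and an internal control that appears late with a normal MR is treated as suggesting poor recovery. The class, function, labels, and numeric ranges are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriteriaRange:
    """Closed interval of acceptable values from assay criteria data."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def classify_internal_control(ic_fcn, ic_mr, fcn_range, mr_range):
    """Return a validity label for one reaction (illustrative rules only)."""
    if ic_mr < mr_range.low:
        # Assumed rule: depressed efficiency points to inhibition.
        return "invalid: possible inhibition"
    if ic_fcn > fcn_range.high:
        # Assumed rule: normal efficiency but late signal points to
        # less internal control template, e.g. poor recovery.
        return "invalid: possible poor recovery"
    if not (fcn_range.contains(ic_fcn) and mr_range.contains(ic_mr)):
        return "invalid: outside criteria data"
    return "valid"


# Hypothetical criteria data for the internal control.
IC_FCN = CriteriaRange(24.0, 27.0)
IC_MR = CriteriaRange(0.30, 0.45)

for fcn, mr in [(25.5, 0.38), (24.5, 0.12), (30.0, 0.36)]:
    print(fcn, mr, classify_internal_control(fcn, mr, IC_FCN, IC_MR))
```

In practice the target FCN-MR pair would be examined alongside the internal control pair before a final call is made.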
Example 22 Validity Determination Using Peak Width
In contrast to the conventional Ct analyses in the prior art, which present only a single value describing an amplification response, an efficiency related value analysis (and preferably an MR analysis) can provide an efficiency related transform curve with data corresponding to the time value or cycle number value of the entire amplification reaction or any portion thereof. It has been discovered that within a specific assay formulation, normal assay responses generate highly reproducible efficiency related transform curves. One characteristic in particular is the width of the peak of the efficiency related transform curve. It has been found that the width of the peak of the efficiency related transform, e.g., as defined by its width at the half maximum height, varies very little even when the magnitude of the fluorescence intensity varies greatly.
Any suitable method can be used to determine the width of the peak of the efficiency related value.
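As one possible method, the width at half maximum can be estimated by linear interpolation between the sampled points on either side of the peak. The Python sketch below is a minimal illustration that assumes the transform curve has a baseline near zero and a single dominant peak; the function name and the sample curve are hypothetical. It returns both the full width and the width of the rising side only (from the rising half-maximum crossing to the peak), which depends only on data before the peak.

```python
def peak_widths(cycles, values):
    """Return (full_width, rising_half_width) at half maximum height."""
    if len(cycles) != len(values) or len(values) < 3:
        raise ValueError("need matching sequences of at least three points")
    peak_i = max(range(len(values)), key=lambda i: values[i])
    peak = values[peak_i]
    if peak <= 0:
        raise ValueError("peak must be positive (baseline assumed near 0)")
    half = peak / 2.0

    def crossing(i0, i1):
        # Linear interpolation of the cycle where the curve equals `half`.
        x0, y0, x1, y1 = cycles[i0], values[i0], cycles[i1], values[i1]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    left = right = None
    for i in range(peak_i, 0, -1):            # walk toward earlier cycles
        if values[i - 1] <= half:
            left = crossing(i - 1, i)
            break
    for i in range(peak_i, len(values) - 1):  # walk toward later cycles
        if values[i + 1] <= half:
            right = crossing(i, i + 1)
            break
    if left is None or right is None:
        raise ValueError("curve does not fall to half maximum on both sides")
    return right - left, cycles[peak_i] - left


# Hypothetical transform curve sampled once per cycle.
cycles = [18, 19, 20, 21, 22, 23, 24]
curve = [0.01, 0.05, 0.20, 0.40, 0.22, 0.06, 0.01]
print(peak_widths(cycles, curve))
```

The resulting width can then be compared with the range of widths observed for normal responses of the same assay.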
Peak width can be used to detect an abnormal assay response. The full peak width calculation was applied to the assay data that contained the abnormal response shown in
The full peak width calculation will be affected by abnormal variations in amplification response that occur both before and after the reaction point value (e.g., the FCN) of the efficiency related value. Abnormal variations that occur after the reaction point value of the efficiency related value are not considered for an assay validity test, because they cannot affect assay quantification by the MR method. This option can readily be achieved using the half peak width calculation illustrated in
The systems of this invention can be incorporated into a multiplicity of suitable computer products or information instruments. Some details of a MR software implementation are provided below. Specific user interface descriptions and illustrations are taken to illustrate specific embodiments only and any number of different user interface methods known in the information processing art can be used in systems embodying this invention. The invention can also be used in systems where virtually all of the options described below are preset, calculated, or provided by an information system, and, consequently, provide little or no user interface options. In some cases, details and/or options of a prototype system are described for exemplification purposes; many of these options and/or details may not be relevant or available for a production system.
Furthermore, software embodiments can include various functionalities, such as, for example, processing reactions with one or two target reactions, or one or more internal control reactions, or reference data, or combinations of the foregoing. A software system suitable for use in this invention can provide any number of standard file handling functions such as open, close, printing, saving, etc.
The invention can also be embodied in whole or in part within the circuitry of an application specific integrated circuit (ASIC) or a programmable logic device (PLD). In such a case, the invention can be embodied in a computer understandable descriptor language, which may be used to create an ASIC, or PLD, that operates as described herein.
Other Embodiments
The invention has now been described with reference to specific embodiments. Other embodiments will be apparent to those of skill in the art. In particular, a viewer digital information appliance has generally been illustrated as a computer workstation such as a personal computer. However, the digital computing device is meant to be any information appliance suitable for performing the logic methods of the invention, and could include such devices as digitally enabled laboratory systems or equipment, a digitally enabled television, a cell phone, a personal digital assistant, etc. Modification within the spirit of the invention will be apparent to those skilled in the art. In addition, various different actions can be used to effect interactions with a system according to specific embodiments of the present invention. For example, a voice command may be spoken by an operator, a key may be depressed by an operator, a button on a client-side scientific device may be depressed by an operator, or selection using any pointing device may be effected by the user.
It is understood that the examples and embodiments described herein are for illustrative purposes and that various modifications or changes in light thereof will be suggested by the teachings herein to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the claims.
All publications, patents, and patent applications cited herein or filed with this application, including any references filed as part of an Information Disclosure Statement, are incorporated by reference in their entirety.
Claims
1.-17. (canceled)
18. A method comprising:
- partitioning data points associated with a signal between data points indicative of an amplification of a target nucleic acid in a sample and data points not indicative of the amplification of the target nucleic acid, wherein the data points include data pairs comprising an efficiency related value and a corresponding fractional cycle number; and
- determining the presence of the target nucleic acid in the sample based on the data points.
19. The method of claim 18, wherein the efficiency related value is a maximum ratio of sequential data points in the signal.
20. The method of claim 18, wherein the efficiency value is the highest peak in a signal transform.
21. The method of claim 18, wherein the efficiency value is the first peak above a threshold in a signal transform.
22. The method of claim 18 further comprising interpolating additional data points between the data points to define the efficiency related value and the fractional cycle number.
23. The method of claim 18 further comprising:
- determining a mean of data points associated with multiple signals;
- establishing a confidence corridor within a range around the mean; and
- using only data points within the confidence corridor to determine the presence of the target nucleic acid.
24. The method of claim 18, wherein the partitioning the data points based on amplification of the target nucleic acid is based on a threshold.
25. The method of claim 18 further comprising partitioning data points indicative of the amplification of the target nucleic acid between data points indicative of amplification inhibition and data points not indicative of amplification inhibition.
26. The method of claim 25, wherein the partitioning the data points indicative of amplification inhibition is based on a multiple of a standard of deviation of the efficiency related values.
27. The method of claim 18 further comprising:
- determining a width of the signal at half of a maximum height of the signal;
- comparing the width to a range of widths;
- validating the amplification reaction if the width is within the range of widths; and
- determining an abnormal assay if the width is not within the range of widths.
28. A method comprising:
- partitioning data points associated with a signal indicative of an amplification of a target nucleic acid in a sample between data points indicative of amplification inhibition and data points not indicative of amplification inhibition, wherein the data points not indicative of amplification inhibition form quantifiable data points; and
- determining the presence of the target nucleic acid in the sample based on the quantifiable data points.
29. A system comprising:
- an interface to obtain a signal proportional to an amount of a target nucleic acid in a sample in contact with an amplification agent to induce an amplification reaction, the signal obtained during the amplification reaction; and
- a processor to: determine efficiency related values based on the signal; determine respective fractional cycle numbers of the efficiency related values; partition data points comprising data sets of efficiency related values and corresponding fractional cycle numbers between data points indicative of the amplification reaction and data points not indicative of the amplification reaction; determine the presence of the nucleic acid based on the data points; and calculate a concentration of the target nucleic acid based on the fractional cycle numbers of the data points indicative of the amplification reaction where the presence of the target nucleic acid has been detected.
30. The system of claim 29, wherein the efficiency related value is a maximum ratio of sequential data points in the signal.
31. The system of claim 29, wherein the efficiency value is the highest peak in a signal transform.
32. The system of claim 29, wherein the efficiency value is the first peak above a threshold in a signal transform.
33. The system of claim 29, wherein the processor is to interpolate additional data points between the gathered data points to define the efficiency related value and the fractional cycle numbers.
34. The system of claim 29, wherein the processor is to:
- determine a mean of data points associated with multiple signals;
- establish a confidence corridor within a range around the mean; and
- use only data points within the confidence corridor to determine the presence of the target nucleic acid.
35. The system of claim 29, wherein the processor is to partition the data points based on the amplification of the target nucleic acid based on a threshold.
36. The system of claim 29, wherein the processor is to partition the data points indicative of the amplification of the target nucleic acid between data points indicative of amplification inhibition and data points not indicative of amplification inhibition.
37. The system of claim 36, wherein the processor is to partition the data points indicative of the amplification inhibition based on a multiple of a standard of deviation of the efficiency related values.
38. The system of claim 29, wherein the processor is to:
- determine a width of the signal at half of a maximum height of the signal;
- compare the width to a range of widths;
- validate the amplification reaction if the width is within the range of widths; and
- determine an abnormal assay if the width is not within the range of widths.
39. A tangible machine readable medium having instructions, which when read, cause a machine to at least:
- partition data points associated with a signal between data points indicative of an amplification of a target nucleic acid in a sample and data points not indicative of the amplification of the target nucleic acid, wherein the data points include data pairs comprising an efficiency related value and a corresponding fractional cycle number; and
- determine the presence of the target nucleic acid in the sample based on the data points.
40. The medium of claim 39, wherein the efficiency related value is a ratio of sequential data points in the signal.
41. The medium of claim 39, wherein the efficiency value is the highest peak in a signal transform.
42. The medium of claim 39, wherein the efficiency value is the first peak above a threshold in a signal transform.
43. The medium of claim 39, wherein the instructions further cause the machine to interpolate additional data points between the gathered data points to define the efficiency related value and the fractional cycle numbers.
44. The medium of claim 39, wherein the instructions further cause the machine to:
- determine a mean of data points associated with multiple signals;
- establish a confidence corridor within a range around the mean; and
- use only data points within the confidence corridor to determine the presence of the target nucleic acid.
45. The medium of claim 39, wherein the instructions cause the machine to partition the data points based on the amplification of the target nucleic acid based on a threshold.
46. The medium of claim 39, wherein the instructions cause the machine to partition the data points indicative of the amplification of the target nucleic acid between data points indicative of amplification inhibition and data points not indicative of amplification inhibition.
47. The medium of claim 46, wherein the instructions cause the machine to partition the data points indicative of the amplification inhibition based on a multiple of a standard of deviation of the efficiency related values.
48. The medium of claim 39, wherein the instructions further cause the machine to:
- determine a width of the signal at half of a maximum height of the signal;
- compare the width to a range of widths;
- validate the amplification reaction if the width is within the range of widths; and
- determine an abnormal assay if the width is not within the range of widths.
Type: Application
Filed: Jan 13, 2012
Publication Date: Aug 23, 2012
Inventors: ERIC B. SHAIN (Glencoe, IL), John M. Clemens (Wadsworth, IL), Tzyy-Wen Jeng (Vernon Hills, IL), George J. Schneider (Barrington, IL)
Application Number: 13/350,571
International Classification: G06F 19/00 (20110101);