METHODS FOR FLUORESCENCE DATA CORRECTION

The invention concerns (computer implemented) methods for data correction, in particular of fluorescence data, related systems, software, graphic user interfaces and the use thereof. More in particular, the invention describes a method for adjusting a part of the measurements taken from a plurality of measurements, under the same time-dependent environment, at different time points, in which said adjustments are performed in order to obtain calculated measurements which can be compared as if they were taken at the same time point (and hence under the same environment).

Description
AREA OF APPLICATION OF THE INVENTION

The invention concerns (computer implemented) methods for data correction, in particular, fluorescence data, related systems, software, graphic user interfaces and the use thereof.

BACKGROUND TO THE INVENTION

Although the concept of using (computer implemented) methods for data correction exists, these methods are mostly integrated in the measurement systems and as such they are not very accurate or flexible.

PURPOSE OF THE INVENTION

The purpose of the invention is to provide (computer implemented) methods for data correction, related systems, software, graphic user interfaces and the use thereof, which are more accurate and/or more flexible, which can optionally be used standalone, and which are in particular suitable for fluorescence data.

BRIEF DESCRIPTION OF THE INVENTION

The invention comprises various (computer implemented) sub-methods for data correction, related systems, software, graphic user interfaces and the use thereof, in particular suitable for fluorescence data; the invention also provides a preferred ordering of these sub-methods. The underlying software could be used standalone or be integrated in the measuring equipment; as such it may run on a standalone processor, it may be integrated in the measurement equipment, or it may be split (i.e. partially on the measuring equipment, partially on a standalone computer).

The underlying software could also consist of basic software and a part that is client- and/or application-specific. The inventive contribution concerns the recognition that the data obtained from the measuring equipment deviate from what could ideally be expected, even after processing the data with software integrated in the measurement equipment. Moreover, said deviation could even be caused by the co-integrated software, and careful correction of the defects of the measuring equipment is thus necessary. This correction should preferably happen in several steps and, even more preferably, with these steps sequenced in a specific order.

A further inventive contribution is recognising the fact that these corrections are client-, application- and/or measuring-equipment-specific, and as such providing sufficient options to deal herewith, for example by providing the possibility of using multiple data formats, calibration data and/or other settings parameters. However, the invention does not only focus on the correction of the data as such, but is also aimed at improving the use of the data for the detection of target molecules, more specifically the detection of infections, which is typically done via peak detection. The data analysis following the correction of the data must thus preferably be integrated in the software and also have the required parameters and validation methods.

The various aspects of the invention are described in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the flow diagram of an embodiment of the invention.

FIG. 2 shows the fluorescence frequency spectrum and the consequences of the time shift.

FIG. 3 shows the fluorescence frequency spectrum and the consequences of colour compensation.

FIG. 4 shows a reference melting curve.

FIG. 5 shows a melting curve without compensation.

FIG. 6 shows a melting curve with compensation.

FIG. 7 shows data without subtraction of a background signal.

FIG. 8 shows data after subtraction of a background signal.

FIGS. 9A and 9B show the effect of using a moving average.

FIG. 10 illustrates the problems of loss of data points.

FIG. 11 shows a flow diagram of the method.

FIG. 12 shows the flow diagram of a validation method.

FIG. 13 illustrates a peak detection.

FIG. 14 illustrates a shoulder infection.

FIG. 15 illustrates a multiple infection.

FIG. 16 illustrates a method for the data analysis.

FIG. 17 illustrates the use of 3-value-score in various areas.

FIG. 18 illustrates the method of the time shift.

NUMBERED STATEMENTS WITH REGARD TO THE INVENTION

Statement 1. Method for adjusting a subset of measurements (630, 660, 670) of at least a first parameter (600) from multiple parameters (600, 610, 620), each being subjected to the same time-dependent environment, where the measurements are taken at different time points (700, 703, 706) and where the adjustments result in calculated values (630*, 660*, 670*) which can be compared with corresponding measurements (650, 680, 690) of at least a second parameter (620), as if they were taken at the time (702, 705, 708) when the corresponding measurements of the second parameter (620) were determined, where the method includes the steps of:

    • (1) calculating a number of measurements (630, 660, 670) each taken at a different time (700, 703, 706) for a first parameter (600);
    • (2) determining the time points (702, 705, 708) when the corresponding measurements (650, 680, 690) of the second parameter (620) were determined; and
    • (3) determining the calculated values (630*, 660*, 670*) representative for the first parameter (600) at the time points (702, 705, 708) of the second parameter (620), based on the measurements of the first parameter (630, 660, 670).

Statement 2. Method according to statement 1, wherein the measurements are fluorescence data of real time PCR experiments, and the time-dependent environment is reflected by the temperature during the PCR reaction.

Statement 3. Method according to statement 1 or 2, wherein the calculated values of the first parameter are determined by interpolation of the measurements of the first parameter with regard to the time points of the second parameter.

Statement 4. Method for adjusting fluorescence measurements (300, 310, 320) of a sample, wherein the measurements are taken in at least two partially overlapping wavelength areas, which method comprises the steps of:

    • (1) determining the fluorescence measurements,
    • (2) determining the information indicative for said overlap,
    • (3) adjustment of the fluorescence measurements, by using mathematical multiplication operations based on the information from step (2), to reduce the influence of said overlap.

Statement 5. Method according to statement 4, where the adjustment is such that there is no over-compensation.

Statement 6. Method for adjusting fluorescence measurements, comprising performing the method according to one of statements 1 to 3, followed by performing the method according to statement 4 or 5.

Statement 7. Method according to one of statements 1 to 6, further comprising the steps of: determining the presence of background signals for at least one of the parameters; and correcting (deducting) the adjusted measurements obtained by the methods of statements 1 to 6, for said background signals.

Statement 8. Method according to statements 1 to 7, further comprising the step of levelling the adjusted measurements obtained by the methods of statements 1 to 7, preferably by using a moving average.

Statement 9. Method for determining the presence of one or more target molecules in a sample, comprising: (1) performing the method of one of the previous statements on measurements obtained from the sample, in order to obtain the calculated values, (2) determining the presence or absence of the target molecules based on the calculated values from step (1).

Statement 10. Method according to statement 9, wherein in step (2), determining the presence or absence of the target molecules based on the calculated values from step (1) is performed by determining the maximum values and taking into account a 3-value-score for the reliability of this calculation.

Statement 11. Method according to statements 9 or 10, further comprising: (3) characterisation of the target molecules by performing a symmetrical analysis on the aforementioned adjusted values, around the specified maximum value.

Statement 12. A computer program product, active on a processor for performing calculations according to one of the steps of the method according to one of the statements 1 to 11.

Statement 13. A machine readable storage medium that stores the computer program product of the previous statement.

Statement 14. Use of a method according to statements 1 to 12.

Statement 15. A graphic user-interface that is suitable for use of the methods according to statements 1 to 12.

In a specific embodiment, the invention also comprises a method as defined herein, wherein the above-mentioned steps are performed by loading and/or calculating the measurements/time points by means of a computer. It should be specifically noted here that in this specific embodiment the time points are indeed taken into account, whereas the prior art does not foresee any time correction and therefore does not acknowledge the effect of the time shift on the detection of target molecules, and more specifically on the detection of infections in the samples to be analyzed. In particular, the time points are taken into account, either by explicitly reading the time points together with the raw data or by implicitly determining these time points by using the features of the measuring device (for example, the change it makes to the environmental factors, i.e. the temperature), and these are used in the further calculations to improve the detection of the target molecules, and more specifically, of the infections in the samples to be analyzed.

Where the different steps are performed one after the other in the method, every step is performed using the calculated values received from the previous step.

DETAILED DESCRIPTION

The invention includes methods that can be implemented in software, in particular for use in calculating, correcting and processing data used in bio-technological applications, such as fluorescence in PCR analysis, and as such, it is aimed at applications for detection of target molecules, more specifically infections in the samples to be analysed.

The below represents a possible embodiment, however, the invention is not limited to this embodiment alone.

In FIGS. 2 and 3, the x-axis (10) represents the wavelength of the measured light and the y-axis represents the amount of measured light (fluorescence).

In FIGS. 4 to 10 (here only illustrated in FIG. 4), FIGS. 13 to 15 and FIG. 17, the x-axis (30) represents the temperature (indicative for the environment of the measurement sample) and the y-axis (40) represents processed fluorescence data (for example, minus the derivative of those data with respect to the temperature).

Software for Performing Methods According to this Invention

In a particular embodiment of the invention, it was decided to develop a standalone software application. This simplifies the use and the installation of the software for the end-user. This way, all data remain on one system, which appears safer when there are many users. The methods according to this invention work on data obtained from systems for PCR analysis. Such systems often contain an integrated software package to visualise the measured data. However, these software packages are in the first instance aimed at research and development, which is at the same time also their biggest disadvantage. Every time the user wants to view a sample, he needs to perform various actions (mouse clicks) and thus takes some time to retrieve the data, especially if he/she wants to analyse multiple samples. However, from this integrated software it is possible to export data to, for example, an XML file. This way, all data related to a certain run are saved in one file. This XML file, or other suitable files (e.g. text files (.txt)), is used in the method according to this invention and can be captured by the related software. One of the aspects of the invention is thus the ability to process data received in various formats.

This way, the user is able to export the data after the full cycle in the PCR device, with only a single mouse click. From there he could use the method of the current invention to perform the full data analysis.

In addition to the increased efficiency and accuracy, it is also possible via the method of this present invention to lift the technology of the users to a higher level. Modifications made with the method of this invention enable users to detect multiple parameters (for example Fluorescence channels) that increase the options of their tests (e.g. multiplex).

Loading the Data

Loading the data into the software to perform the methods according to this invention is performed using completely self-developed code. The XML file that must be loaded consists of 150 000 lines. Since we only want to use some of this information, we filter the desired data from this file. An object is created for every sample and all objects are kept in a list. Every sample object contains the raw fluorescence data and the temperature for the three parameters measured (the FAM, ROX and CY5 channels). The flow diagram represented in FIG. 1 schematically shows the code written to capture all data from the XML file.
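
By way of illustration only, the following sketch (in Python) shows one possible way of capturing such an export into one object per sample. The element and attribute names used ("sample", "channel", "point", "temp", "value") are hypothetical and chosen here for clarity; they do not necessarily correspond to the actual export format of the measuring equipment.

import xml.etree.ElementTree as ET

class Sample:
    """Holds the raw fluorescence and temperature data of one sample for the three channels."""
    def __init__(self, name):
        self.name = name
        # one list of (temperature, fluorescence) pairs per measured channel
        self.channels = {"FAM": [], "ROX": [], "CY5": []}

def load_samples(xml_path):
    """Filter the desired data from the exported XML file and return one object per sample."""
    samples = []
    root = ET.parse(xml_path).getroot()
    for sample_el in root.iter("sample"):                 # hypothetical tag name
        sample = Sample(sample_el.get("name"))
        for channel_el in sample_el.iter("channel"):      # hypothetical tag name
            label = channel_el.get("label")               # e.g. "FAM", "ROX" or "CY5"
            if label not in sample.channels:
                continue                                  # ignore anything but the three channels
            for point_el in channel_el.iter("point"):     # hypothetical tag name
                temperature = float(point_el.get("temp"))
                fluorescence = float(point_el.get("value"))
                sample.channels[label].append((temperature, fluorescence))
        samples.append(sample)
    return samples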

Editing the Data

Before we can analyse the data, we must first edit and compensate the data. We perform four steps to edit raw data.

    • 1. Time shift
    • 2. Colour comp
    • 3. Subtraction
    • 4. Moving average

Time Shift

Time shift correction is one of the first corrections we perform. This correction is required to compare various parameters (fluorescence channels) at the same temperature.

In a first aspect, the invention thus comprises a method for adjusting a subset of measurements (630, 660, 670) of at least a first parameter (600) from multiple parameters (600, 610, 620), each being subjected to the same time-dependent environment, where the measurements are taken at different time points (700, 703, 706) and where the adjustments result in calculated values (630*, 660*, 670*) which can be compared with corresponding measurements (650, 680, 690) of at least a second parameter (620), as if they were taken at the time (702, 705, 708) when the corresponding measurements of the second parameter (620) were determined, where the method includes the steps of:

    • (1) calculating a number of measurements (630, 660, 670) each taken at a different time (700, 703, 706) for a first parameter (600);
    • (2) determining the time points (702, 705, 708) when the corresponding measurements (650, 680, 690) of the second parameter (620) were determined; and
    • (3) determining the calculated values (630*, 660*, 670*) representative for the first parameter (600) at the time points (702, 705, 708) of the second parameter (620), based on the measurements of the first parameter (630, 660, 670).

The measuring hardware internally continuously increases the temperature during the PCR reaction. Since the various fluorescence channels are measured one after the other, every channel is read at a different temperature. This way there is a slight shift on the temperature scale between the measured fluorescence in, for example, the FAM and ROX channels and between the ROX and CY5 channels (FIG. 2). The time shift correction is performed on the raw data. In the software, the temperatures from the first channels are compensated to the temperature of the last of these channels; for example, the FAM and ROX channels are compensated to the temperature of CY5. Through this correction, we are able to compare the measurements received from the various channels at the same temperature. The time shift interpolates the data based on the temperature from the current and the next channel. Depending on the temperature measured in the first channels (for example ROX or FAM), we compensate more or less. For example, we compensate the FAM and ROX temperature based on the CY5 temperature. This ensures that the corrected data (calculated values) move to the slightly higher temperature at which the following measurement takes place.

During the development of this time shift correction we assessed, based on the data originating from various exports, how we could optimally perform this compensation. Eventually we decided to move the temperatures from the first channels (e.g. FAM and ROX) to the last channel (e.g. CY5). However, it is possible to use any channel as reference and to correct the values from the other channels to it. In a specific embodiment of the software, the time shift correction will be the first correction applied to the raw data. As, in this specific embodiment, the artificially incorrect reference values (for the environment, for example represented by the temperature, and actually underlying the incorrect reference time) are no longer used in the further steps, these further steps take place more optimally and therefore finally result in an improved detection of target molecules and, more especially, in a better detection of the infections in the samples to be analyzed. The insight of this specific embodiment, more specifically that the time correction should take place before the colour compensation, requires that any colour compensation on the device should be switched off. This will be explained further.

The time shift correction is performed both on the fluorescence data of the sample to be analysed and on the data of the negative control. This negative control (reference sample) is a sample that only contains the reagents and no clinical material. Time shift preferably takes place on the data from the first channels (for example the FAM and ROX channel).

FIG. 18 illustrates measurements (dark grey, for example 630), set out along a time axis, for various parameters (600, 610, 620), taken at various time points (700-708) and calculated measurements (light grey, for example 630*), suitable to be included on the time axis of measurements of 1 of these parameters (600), though at a time (702) when a measurement (650) is available in the time axis of measurements for another parameter (620), to allow comparison of the measurements (650) and (630*).

Mathematically, the data are interpolated to the temperature of the last channel, CY5, preferably by interpolation on a curve with X and Y values.
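
By way of illustration only, the following sketch shows one possible form of this interpolation using numpy; the function name, array names and example values are chosen here for clarity and are not part of the invention as such.

import numpy as np

def time_shift(channel_temps, channel_fluo, reference_temps):
    """Interpolate a channel's fluorescence onto the temperature points of the reference channel."""
    # np.interp expects the original temperatures in increasing order; values outside
    # the measured range are held constant at the first/last measurement.
    return np.interp(reference_temps, channel_temps, channel_fluo)

# Example: the FAM channel is read slightly earlier (cooler) than the CY5 channel
# during the continuous temperature increase, so its values are moved to the
# slightly higher CY5 temperatures (illustrative numbers only).
fam_temps = np.array([45.0, 46.0, 47.0, 48.0])
fam_fluo = np.array([0.90, 0.85, 0.70, 0.40])
cy5_temps = np.array([45.3, 46.3, 47.3, 48.3])
fam_at_cy5 = time_shift(fam_temps, fam_fluo, cy5_temps)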

Colour Compensation

The colour compensation is a very important step in obtaining a correct analysis later.

In a next aspect, the invention thus includes a method for adjusting fluorescence measurements (300, 310, 320) of a sample, where the measurements are taken in at least two partially overlapping wavelength areas, which method includes the steps of: (1) determining the fluorescence measurements, (2) determining information indicative for said overlap, (3) adjusting the fluorescence measurements, by using mathematical multiplication operations based on the information from step (2), to reduce the influence of said overlap.

During the real-time PCR process, measurements take place in three channels. This causes a radiation of the signal from the first channel (eg. the FAM channel) to the second channel (eg. the ROX channel) and from the second channel (eg. the ROX channel) to the last channel (eg. the CY5 channel). This radiation can clearly be seen in FIG. 3. The curve corresponding with the first channel (300) partially overlaps with the curve from the second channel (310). This overlapping part is the radiation (330) of the first channel (FAM) into the second channel (ROX). This also happens with the signal of the second channel (310), which radiates in the curve of the last signal (320). This last radiation (340) is the radiation of the second channel (ROX) into the last channel (CY5).

The percentage of radiation between the various channels depends on the device used. In order to perform a correct colour compensation, a calibration must be performed on every device to determine to what extent there is radiation between the various channels. Without this correction, errors could occur during the analysis due to peaks that radiate from one channel into another channel. In order to perform a colour compensation, we must first calculate the correction factors, which we can later use to correct the measured data. We calculate these correction factors based on the data originating from a colour compensation run or calibration run. The software was developed in such a manner that once this run has been captured, the correction factors are stored on the hard disk of the user. The user will from then on be able to reload and use these correction factors every time he opens the software.

Colour compensation is performed on the raw data. The correction that is performed depends on a colour compensation run. In such a run, the fluorescence is also measured in the three different channels. The only difference with a normal run is that the samples do not contain all fluorescent labels, but that there are samples with only ROX, CY5 or FAM labels. By radiating and measuring one of the samples in the three channels during the analysis, it is possible to determine the amount of radiation into the other channels. These measurements are only possible when there is only one fluorescent label in the samples.

FIGS. 4 and 5 indicate the purpose of the colour compensation.

Assume we have a peak in ROX as seen in FIG. 4. This peak has a height of approximately 0.7. If we determined via the colour compensation run that the ROX channel radiates 35% into the CY5 channel, we would need to see a peak of approximately 0.25 in the CY5 channel when the colour compensation is turned off. (FIG. 5).

Once we activate the colour compensation, the peak disappears in the CY5 channel. Since there is no longer any radiation, the peak, originating from the radiation of ROX, disappears. We now have a minimum in CY5 that is actually indicative of an over-compensation (FIG. 6). With the over-compensation, an infection in CY5 at the same place as the radiation would also disappear. We optimised the colour compensation algorithm in the software so that there will not be any over-compensation, such that the peaks that are indeed an infection are not mistakenly filtered out.

Looking at this colour compensation in more detail, we compensate the data based on the correction factors. We only need to apply the colour compensation on the second and subsequent channels (ROX and CY5), because only these channels could have radiation. The first channel measured (FAM) will not have any radiation from channels with a lower wavelength.

The correction from FAM to ROX is explained first. To compensate the radiation of FAM in the ROX channel, a function is written which takes the following parameters: the raw fluorescence data from the ROX channel, the (possibly time-shift-corrected) data from the FAM channel and the radiation percentage of the FAM channel. Based on the abovementioned data we calculate a corrected dataset, which we can use further as the corrected data for the ROX channel.
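
By way of illustration only, the following sketch shows one possible form of such a multiplicative correction. The guard against over-compensation (clamping the corrected signal at zero) is merely an assumption made for this sketch; the optimised algorithm of the invention itself is not reproduced here, and the names and example values are chosen for clarity.

import numpy as np

def colour_compensate(rox_raw, fam_shifted, radiation_fraction):
    """Remove the estimated radiation of the FAM channel from the ROX channel."""
    corrected = rox_raw - radiation_fraction * fam_shifted   # multiplicative correction factor
    # Guard against over-compensation: do not let the corrected signal become negative.
    # (This clamp is only a simple placeholder to illustrate the idea of avoiding
    # over-compensation; it is not the optimised algorithm described in the text.)
    return np.maximum(corrected, 0.0)

# Illustrative example with a radiation percentage of 35%.
rox_raw = np.array([0.10, 0.35, 0.60, 0.35, 0.10])
fam_shifted = np.array([0.10, 0.70, 1.00, 0.70, 0.10])
rox_corrected = colour_compensate(rox_raw, fam_shifted, 0.35)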

The colour compensation happens based on an algorithm we developed that can be tailor-made for the end-user, which improves the effect in comparison to the existing algorithms available. In a specific embodiment of the invention, the user should therefore switch off any colour compensation present on the device. More concretely, this specific embodiment therefore comprises establishing a number of measurements, each taken at a different time point, for the first parameter, whereby the measurements are fluorescence data of real-time PCR experiments and whereby the colour compensation on the device is switched off. In an alternative embodiment of the invention (for example in a case where the user cannot switch off any colour compensation present on the device), the method of the invention will, besides the necessary colour compensation, also compensate for any aberrant colour compensation of the device.

For the colour compensation, we will measure the radiation for every device once. This happens by loading a specific run on the device. By radiating and measuring only one channel, we will know the amount of radiation. Based on the percentage measured here, we will later be able to correct other runs via multiplication.

Subtraction

Subtraction is a possible third correction that we use. With this subtraction, we want to remove the background of the signal. After subtraction, we only retain the pure data without an artificial increase by a specific background or noise.

FIG. 7 shows that the curve for the negative control (400) is not completely at 0, but in this case around 0.1. This signal is the result of a sample with all PCR reagents, but without a DNA sample in the reaction. Theoretically, it is not possible for any products to form without DNA. Theoretically, it should also not be possible to measure any fluorescence in one of the channels of this negative control. If we test this in practice we clearly see a light background. This background is indicated on the graph as curve (400). In the corrected graph (FIG. 8) we removed this background by subtracting the negative control signal from both the fluorescence data of the sample and the fluorescence data of the negative control. This way, the curve (410) of the negative control is perfectly at 0 for all points. The graph (420) that reflects the fluorescence data of the sample is lower as a result of this subtraction. Depending on the height of the background, the sample data will drop more or less as a result of this subtraction.

This subtraction should preferably be the last correction for rounding off the data. This subtraction can only take place in the ROX- or CY5 channel. If we had to perform a subtraction in the FAM channel, the negative control, which is read in the FAM channel, would not be interpreted correctly due to this. The subtraction would make the control signals in the FAM channel appear more negative than they effectively are.

Mathematically, the subtraction consists of a difference between the signal and the background.
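
By way of illustration only, a minimal sketch of this subtraction follows, including the rule described above that the FAM (control) channel is left untouched; the function and variable names are chosen here for clarity and the numbers are purely illustrative.

def subtract_background(channel, sample_fluo, control_fluo):
    """Subtract the negative-control signal, but only in the ROX and CY5 channels."""
    if channel == "FAM":
        return list(sample_fluo)   # the FAM control channel is not corrected
    return [s - c for s, c in zip(sample_fluo, control_fluo)]

# Illustrative example: a background of about 0.1 is removed from the sample data.
corrected = subtract_background("ROX", [0.12, 0.45, 0.80, 0.44], [0.10, 0.10, 0.11, 0.10])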

Moving Average

The moving average is not a correction to rectify measurement errors. However, this moving average was included in the software. After investigation we saw that it was efficient to optimise the data based on this moving average. There were no visible spikes on the raw data, but we saw that the form of the graphs was not always optimal. After further analysing this problem, we concluded that this was due to the current protocol, where only one measurement per 1° C. increase is determined. By compensating the data via the moving average, the irregularities in the data disappeared.

By using a moving average, we were able to “smooth” the graph. We could experimentally conclude that this smoothing would later simplify the detection of target molecules, in particular infections.

The moving average is the last correction of the data. This correction does not take place on the raw data, but on the derivative data. We must always use the derivative data to calculate the gradient of the curve to ensure maximum detection.

FIG. 9 shows the advantage of this correction. In the screenshots we see how the curve in graph (A) is rather angular, while it is smoother after correction with the moving average in graph (B). The smoothing of this graph was increased by applying the moving average to the derivative data. The very light maximum that was present in graph (A) was practically removed in graph (B) due to the smoothing. This shows that peak detection on smoothed data is even easier. Mathematically, a moving average happens by replacing a point by the average of the point and one or more of the neighbouring points.
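
By way of illustration only, the following sketch shows a centred moving average with an adjustable dx, in line with the dx = 1 behaviour described further below under the correction parameters; the names and example values are chosen here for clarity.

def moving_average(values, dx=1):
    """Replace each point by the average of itself and its dx neighbours on both sides."""
    window = 2 * dx + 1
    return [sum(values[i - dx:i + dx + 1]) / window
            for i in range(dx, len(values) - dx)]

# Illustrative example: the angular curve becomes smoother, at the cost of
# dx points at the front and dx points at the back.
smoothed = moving_average([0.10, 0.30, 0.20, 0.60, 0.50, 0.70], dx=1)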

Correction Parameters

In order to perform all corrections, a number of mathematical parameters can be used. This way the extent of the correction can be adjusted experimentally. The correction of the data is important to perform a correct peak detection. The amount of smoothing is a balance between a smooth curve and maintaining sufficient resolution in order not to lose small infections from the results.

Dx: this first parameter determines how we calculate the derivative values, based on the adjustable parameter dx (the differentiation interval). The higher this dx value, the larger the span over which the difference is taken and the smoother our curve will be. This value must also not be too high, because that could lead to insufficient resolution to perform a correct analysis. A number of points are always lost when deriving the data. In an embodiment, the differencing of the data was designed in such a way that not all data points at the back are lost. For example, assume that we know that the melting point temperature when detecting the first target molecule from the panel is at least 53° C., while the melting point temperature when detecting the last target molecule from the panel is at most 80° C., and that we can measure with the measuring equipment between 45° C. and 85° C. From this we can lose a few data points at the front (low temperature) and at the back (high temperature) without losing useful data points.

FIG. 10 shows how the data can be derived. Since we view the gradient between the points dx positions to the left and right of a measuring point, we lose dx points at the front and at the back (see respectively (800) and (810)). This way we are able to derive properly without losing any data points close to the values relating to the first or last target molecule.
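
By way of illustration only, the following sketch computes the derivative values over an adjustable interval dx, losing dx points at the front and at the back as in FIG. 10. The exact formula is an assumption made for this sketch; the minus sign follows the melting-curve convention mentioned in the description of the figures (minus the derivative of the fluorescence with respect to the temperature).

import numpy as np

def derivative(temps, fluo, dx=1):
    """Gradient between the points dx positions to the left and to the right of each point."""
    temps = np.asarray(temps, dtype=float)
    fluo = np.asarray(fluo, dtype=float)
    grad = (fluo[2 * dx:] - fluo[:-2 * dx]) / (temps[2 * dx:] - temps[:-2 * dx])
    # dx points are lost at the front and at the back (cf. (800) and (810) in FIG. 10);
    # the result is returned as -dF/dT against the remaining temperatures.
    return temps[dx:-dx], -grad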

Moving Average DX: Moving average can, as described before, be used to compensate the minor irregularities in the curve. This creates a smoother curve that simplifies peak detection. Through this correction, the number of maxima is reduced: we filter out the maxima which do not correspond with a target molecule.

Here too, we work in such a manner that we lose a point at the front and at the back. The principle is identical to the manner in which we derive. If a dx of 1 is used for the moving average, the average will be calculated based on one point in front of and one point behind the current point, plus the point itself. If a dx of 0 is chosen for the moving average, the part between the current and the next point will be taken into consideration. This way we only lose one point at the back.

In an embodiment, a limited subtraction is combined with a moving average. Both mathematical operations will smoothen the curve to some extent.

Extra percentage colour compensation: It is possible to increase the correction factors for the colour compensation by a percentage in the software. This way, the user can increase the influence of the colour compensation. In certain cases, this could lead to an improved correction.

Apart from the option to set the quantity of correction for the various corrections, it is possible to eliminate one or more corrections individually. The software was developed in such a manner that every compensation can be included or excluded separately, while only the other corrections will continue to happen correctly.

Conclusion Processing of the Data

FIG. 11 shows the full flowchart of the function which corrects the data. The function requires a sample number and a channel as parameters. Based on these data, the correct data are calculated for the desired sample and the selected channel, and the curve is displayed. The flow of this algorithm was established by experimentally searching for the most suitable manner to correct the data. Apart from the various methods to correct the data, we also looked into the sequence in which the various steps should be taken. We had to determine the parameters for the various corrections, which led to a search for the correct balance between smoothing and a good correction that does not lead to any loss of data.

FIG. 11 shows in step (100) how the calculated values are determined for the parameter whose measurements are loaded (fluorescence channel), by using these loaded measurements, in order to obtain calculated values representative for the parameter under the environment at the loaded time points. FIG. 11 shows in step (200) the adjustment of the aforementioned calculated values to suppress the influence of the frequency overlap, by using the loaded information. FIG. 11 shows in step (300) how the presence of background signals is determined for at least one of the parameters, and how the adjusted measurements are corrected (subtracted) for these background signals. FIG. 11 shows in step (400) the smoothing, preferably through a moving average, of the calculated values. FIG. 11 also shows the preferred sequence of the sub-methods and steps.

Validation of Negative Control

Before we start analysing the data, we check the IAC (internal amplification control). By checking the IAC we can determine whether the reaction in the kit occurred correctly. Depending on the kit, there is a temperature at which the IAC signal must have a minimum if no infections are found in the ROX or CY5 channel. We have included a few parameters for the validation of the IAC. When we tested the software, we established that certain weak infections were not detected. After analysing these specific cases, we came to the conclusion that it was not the parameters that had to be adjusted, but that a better negative balance ensured that we could realise a better normalisation, which meant that the weak infections that would have been missed earlier came to the surface. When the user chooses to validate the IAC, he receives a notification with the option to enter a new IAC if the previously entered IAC does not comply with the requirements of a valid IAC. If the user gives the same IAC as before, or the current IAC, the software will use this IAC. A valid IAC is a valid negative signal with a sufficient difference between the first and last fluorescence value. See FIG. 12.

Data Analysis

After all data are accurately corrected, the data will be analysed. This analysis consists of filtering the infection peaks from the full data set. As seen from FIG. 13, not all maxima are truly the maximum for a certain infection. This then also illustrates the inventive contribution of the invention. There is an enormous variety of data, but the algorithm must process the data universally and always only filter the correct maxima from the full data set. In order to do this, we developed a set of functions that analyse the data set or a portion of this data set. From the moment when a maximum successfully passes this analysis one could conclude that the maximum represents a certain infection.

We split the analysis into single infections and multiple infections. This way an infection is always scored at its absolute maximum, so there is never a possibility of missing even the most obvious infection of a multiple infection. From the moment this maximum complies with a combination of parameters, there will be a further investigation to see whether there is a shoulder to the left or right of the maximum found.

In certain cases we are not dealing with a shoulder infection, but with a multiple infection that can be identified by its enormous width at the bottom. We must also perform an analysis of these infections, which can only be performed in the cases where we can derive from the data whether this is potentially a multiple infection. FIG. 14 clearly shows the shoulder case: to the left of the scored infection there is a smaller, weaker infection that gives the curve a certain shoulder. FIG. 15 shows a clear example of a multiple infection without a shoulder.

The major difference between both curves lies in the symmetry of the curve. A multiple infection without a shoulder will always be, to a large extent, symmetric with regard to the vertical symmetry line through the maximum found. In contrast, a shoulder infection will be symmetrical at the top but always asymmetrical at shoulder height. Based on this symmetry calculation, we also search for multiple infections or shoulder infections.

From a general point of view the discovered method thus enables, based on the aforementioned adjusted measurements, the detection of the presence of the aforementioned contamination, by determining the maxima and by classifying the contamination by performing a symmetry analysis on the aforementioned adjusted measurements concerning the specified maximum.

During the analysis we have used a lot of parameters to filter the infection peaks from the full data set.

Thus, as indicated earlier, the ability to set parameters of the methods and the underlying software contribute to the invention. The various parameters that could be used are discussed in more detail below.

Dynamic factor threshold positive: Determines how many times the threshold for clearly positive infections lies higher than the average value of the negative control. For example, the average value of the negative control is 0.1 and the Dynamic factor threshold positive is 2.5, then the peak must be higher than 0.25 to be considered as clearly positive.

Dynamic factor Threshold Negative: Identical to the previous parameter. This value only determines the minimum threshold for the uncertain area. An infection that is higher than this threshold but still lower than the positive threshold falls in the uncertain area. This uncertain area is a zone that includes the uncertain cases. The user should then once again visually check the infections from this uncertain area.

Absolute factor raw: In order to consider a signal as an infection, there must be an absolute difference between the sample data and the negative control that is higher than this parameter. The difference here always concerns the raw data and the first point from both data sets. The difference occurs based on the fluorescence of the sample and the fluorescence of the negative control.

Dynamic factor raw: A signal can only be scored as an infection when the first data point is higher than the measurement of the negative control multiplied by this parameter. Here too, the raw fluorescence data is used for the calculation.

Width: This parameter contains the minimum width that an infection peak must have. The distance from the top used to look at the width is determined by the PercentageWidth parameter.

Percentage width: This parameter gives the percentage from where the width must be viewed. This percentage is always seen from the top. Assume we have a maximum with a Y value of 1 and this percentage is set at 15%, then the width of the peak will be viewed from a height of 0.85. This way, an infection must have a certain width and height, something that lots of maxima do not have and that can be found at infection peaks. Based on this and the previous parameter, we filter quite a lot of incorrect maxima from the area around the background.

Width Bottom & Percentage bottom border: These parameters work identically to the normal width and the related percentage. These parameters are only used to view the width at the bottom. They play an important role in the detection of multiple infections without a shoulder. Since these peaks are characterised by an enormous width at the bottom, they are easy to detect based on these values.
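
By way of illustration only, the following sketch measures the width of a peak at a level a given percentage below its maximum (e.g. 15% below a maximum of 1.0 gives a level of 0.85), as used by the Width and Percentage width parameters; the same function can be reused with the bottom-width parameters. The implementation details and names are assumptions made for this sketch.

def peak_width(temps, values, peak_index, percentage):
    """Width of the peak at a level 'percentage' below its maximum."""
    level = values[peak_index] * (1.0 - percentage / 100.0)
    left = peak_index
    while left > 0 and values[left - 1] >= level:
        left -= 1
    right = peak_index
    while right < len(values) - 1 and values[right + 1] >= level:
        right += 1
    return temps[right] - temps[left]

def wide_enough(temps, values, peak_index, percentage, minimum_width):
    """Reject maxima that do not have the minimum width required for an infection peak."""
    return peak_width(temps, values, peak_index, percentage) >= minimum_width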

Absolute threshold: This threshold is an absolute threshold. A peak must always be higher than this value to be seen as an infection. Even if the peak is above the dynamic threshold, it will not be a valid infection peak if it is not higher than this value. This parameter was created to filter out maxima in the background. In certain situations we found maxima in negative signals that were otherwise still scored as infections.

Double infection peak minimum height: this parameter works in the same way as the abovementioned parameter, but records a minimum height for double infections.

Symmetry difference left right: In order to detect shoulders, we view the symmetry at a certain height. A shoulder must have a deviation from the symmetry that is higher than this value to be scored as a shoulder. This deviation is calculated based on the ratio between the left and right part of the symmetry axis. For example, when a signal has to deviate by 30%, the ratio between the left part of the symmetry axis and the right part must be lower than 0.7 or higher than 1.3.

Symmetry height: The height at which the symmetry axis is viewed depends on this percentage. Here too there will be a certain percentage drop from the Y-value.

If we view the peak detection in detail in a particular embodiment, the flow diagram of FIG. 16 can be followed. As can be seen, we first detect every maximum applicable to one or more infections. From the moment a maximum has been detected, we further analyse whether this maximum is single or multiple.
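
By way of illustration only, the following sketch shows one possible form of the symmetry test described above (Symmetry difference left right and Symmetry height): the left and right parts of the symmetry axis are compared at a given height, and a shoulder is flagged when the ratio deviates more than the allowed fraction (e.g. 30% gives a ratio below 0.7 or above 1.3). The implementation details are an interpretation made for this sketch, not the exact algorithm of the software.

def is_shoulder(temps, values, peak_index, symmetry_height_pct, max_deviation=0.30):
    """Flag a shoulder when the left/right parts of the symmetry axis deviate too much."""
    level = values[peak_index] * (1.0 - symmetry_height_pct / 100.0)
    left = peak_index
    while left > 0 and values[left - 1] >= level:
        left -= 1
    right = peak_index
    while right < len(values) - 1 and values[right + 1] >= level:
        right += 1
    left_part = temps[peak_index] - temps[left]
    right_part = temps[right] - temps[peak_index]
    if right_part == 0:
        return True                       # fully one-sided: treat as asymmetric
    ratio = left_part / right_part
    # e.g. a required deviation of 30% means the ratio must be below 0.7 or above 1.3
    return ratio < 1.0 - max_deviation or ratio > 1.0 + max_deviation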

Automatic Detection of Samples

Before starting the analysis, the software must preferably first check which sample has been identified as negative control. The user can specify tags in the software, which can be used to automatically search for negative controls or to differentiate a mix1 and mix2 sample. When the user submits his tags once, these are saved on the hard disk of the system and these will always be loaded when the user starts the software.

When we evaluated the data analysis and compared it with prior art software, we saw that the data analysis of the invention is similar in 90% of the cases for all compared software packages. The difference between the software packages lies in the other 10%. This last 10% of the peaks includes the very weak infections, the shoulders and the multiple infections. Because this concerns medical diagnostics, the software score must be 100% trustworthy. If even 2% of the cases cannot be scored accurately, all samples must be checked visually to ensure that the score was correct.

In order to solve this problem, a negative, an uncertain and a certain positive zone are used in an embodiment of the invention. The parameter setting must be determined in such a manner that we are certain that a peak that lands in the green or positive area is indeed positive. The situations where the peak lands in the uncertain area must still be checked visually by the user. This way the number of samples that must be analysed manually is reduced to the number that falls in the orange zone. This way we also do not score any false positives. FIG. 17 shows the zones with the accompanying result. The invention is unique in the use of more than 2 zones, preferably 3 zones (500 positive, 510 uncertain, 520 negative). This approach is consciously chosen because it leads to a more reliable result.

In general the methods thus detect, based on the aforementioned adjusted measurements, the presence of said contamination, preferably by determining the maximum and by including a 3-value-score of the reliability of this determination.
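
By way of illustration only, the following sketch classifies a peak into the three zones using the dynamic thresholds derived from the negative control (cf. the Dynamic factor threshold parameters) and an absolute threshold; the threshold values used here are purely illustrative and the names are chosen for clarity.

def score_peak(peak_height, control_mean, factor_positive=2.5, factor_negative=1.5,
               absolute_threshold=0.05):
    """Classify a peak as positive, uncertain or negative (cf. zones 500, 510, 520 in FIG. 17)."""
    if peak_height < absolute_threshold:
        return "negative"
    if peak_height >= factor_positive * control_mean:
        return "positive"
    if peak_height >= factor_negative * control_mean:
        return "uncertain"                # to be checked visually by the user
    return "negative"

# Example from the description: control mean 0.1 and positive factor 2.5 give a
# positive threshold of 0.25.
print(score_peak(0.30, 0.10))   # positive
print(score_peak(0.20, 0.10))   # uncertain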

Output of the Data

After the correction and analysis have been performed on the data, it must also be possible to export the data. By creating an export, the user gets a nice global overview with the results of the complete run.

The software also offers the option to display, in the software, an overview of the infections found for a certain sample, with the accompanying curves for the selected channel. If the user wants to score the full run automatically, he should instead select an export. This export is always created by the software as a PDF and a CSV file. The CSV makes it possible to easily process the data further in spreadsheets such as Excel. The PDF export has a different layout depending on the mode in which it is created. In the software we distinguish the experimental mode, in which it is possible to compare 2 parameter settings with one another, from the mode to score infections. A second mode is the mode to score peak names. This mode was developed to automatically score a run and to quickly receive a summary with the various samples and the infections found.

In this mode a distinction is made internally, in the software, between a 1-mix assay and a 2-mix assay. It is necessary that both samples have the same name, with a specific tag at the end that indicates whether it concerns mix1 or mix2. A PDF of a 2-mix assay in the "score peak names" mode always shows the mixes next to each other, with the infections found and the accompanying data below the curves. For the user, nothing changes about the software: the user chooses the assay he wants to use and, depending on the assay used, the software processes the export in a different manner, without any interference by the user. In the background there is nevertheless a clear difference between the processing of a 2-mix assay and a 1-mix assay. In the 2-mix mode, one sample is also displayed per page. If only one of the mixes was included in the run for a certain sample, the software will indicate this by leaving the column to the left or right of the corresponding mix empty.

In a "score peak names" export, the PDF contains a table with a summary of the results on the first page. This way it is easy to quickly obtain a global summary of the full run. After this summary there is a table with the parameters used during the run.

Claims

1. Method for adjusting a subset of measurements of at least a first parameter from multiple parameters, each being subjected to the same time-dependent environment, where the measurements are taken at different time points and where the adjustments result in calculated values which can be compared with corresponding measurements of at least a second parameter, as if they were taken at the time when the corresponding measurements of the second parameter were determined, where the method includes:

(1) determining a number of measurements, each taken at a different time, for the first parameter;
(2) determining the time points when the corresponding measurements of the second parameter were determined; and
(3) determining the calculated values representative for the first parameter, at the time points of the second parameter, based on the measurements of the first parameter wherein the measurements are fluorescence data of real time PCR experiments, and the time-dependent environment is indicated by the temperature during the PCR reaction.

2. Method according to claim 1, wherein the calculated value for the first parameter is obtained by interpolation of the measurements of the first parameter with regard to the time points of the second parameter.

3. Method for adjusting fluorescence measurements, comprising the method according to claim 1, followed by the method for adjusting fluorescence measurements of a sample, wherein the measurements are taken in at least two partially overlapping wavelength areas, which method includes: (1) determining the fluorescence measurements, (2) determining information indicative for the aforementioned overlap, (3) adjusting the fluorescence measurements, by using mathematical multiplication operations based on the information from (2), to reduce the influence of the aforementioned overlap.

4. Method according to claim 3, in which the adjusting is such that no overcompensation takes place.

5. Method according to claim 1, further comprising: determining the presence of background signals for at least one of the parameters; and correcting the adjusted measurements obtained for the aforementioned background signals.

6. Method according to claim 1, further comprising smoothing the adjusted measurements that were obtained.

7. Method for determining the presence of one or more target molecules in a sample, including: (1) performing the method of claim 1 on measurements obtained from the sample, to achieve the calculated values, (2) determining the presence or absence of the target molecules based on the calculated value from (1).

8. Method according to claim 7, where in (2), determining the presence or absence of the target molecules based on the calculated values from (1) is performed by calculating the maximum and including a 3-value-score for the reliability of this determination.

9. Method according to claim 7, further including: (3) characterisation of the target molecules by performing a symmetrical analysis on the aforementioned adjusted measurements, concerning the specified maximum.

10. A computer program product, working on a processor for performing one of the calculations according to the methods according to claim 1.

11. A non-volatile machine-readable storage medium that saves the computer program product of claim 10.

12. (canceled)

13. A graphic user interface that is suitable for performing the methods according to claim 1.

Patent History
Publication number: 20170032083
Type: Application
Filed: Apr 13, 2015
Publication Date: Feb 2, 2017
Applicant: UgenTec BVBA (Hasselt)
Inventors: Wouter UTEN (Hasselt), Martin REIJANS (Maastricht), Yves Antonius OZOG (Zonhoven)
Application Number: 15/303,105
Classifications
International Classification: G06F 19/20 (20060101);