IDENTIFYING IONS FROM MASS SPECTRAL DATA

Info

Publication number: 20080302957
Type: Application
Filed: Jun 2, 2008
Publication Date: Dec 11, 2008
Inventors: Yongdong Wang (Wilton, CT), Ming Gu (Yardley, PA)
Application Number: 12/131,888

Abstract

A method for identifying isotope patterns in mass spectral data, comprising obtaining a desired mass spectral peak shape function; obtaining mass spectral data composed of actual isotope patterns to be analyzed; calculating a theoretical isotope pattern from the known elemental composition of at least one basic ion whose isotope pattern is representative of the ions to be analyzed, by using the mass spectral peak shape function; comparing quantitatively corresponding parts of the theoretical isotope pattern to those of the mass spectral data; calculating a numerical metric to measure similarity between the theoretical isotope pattern and the actually measured isotope pattern; and utilizing the numerical metric as an indication of the possible presence of ions whose isotope patterns resemble that of the basic ion. A computer for, and a computer readable medium having computer readable code thereon for, performing the methods. A mass spectrometer having an associated computer for performing the methods.

Description


This application claims priority, under 35 U.S.C. §119(e), from provisional patent applications Ser. No. 60/941,656 filed on Jun. 2, 2007 and 60/956,692 filed on Aug. 18, 2007. The entire contents of these applications are incorporated herein, in their entireties.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS/PATENTS

The entire contents of the following documents are incorporated herein by reference in their entireties:

U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006; U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007; International Patent Application PCT/US2007/069832, filed on May 28, 2007; and U.S. patent application Ser. No. 11/830,772, which was filed on Jul. 30, 2007 and which claims priority from provisional patent application Ser. No. 60/833,862 filed on Jul. 29, 2006.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to mass spectrometry systems. More particularly, it relates to mass spectrometry systems that are useful for the analysis of complex mixtures of molecules, including large and small organic molecules such as proteins or peptides, environmental pollutants, pharmaceuticals and their metabolites, and petrochemical compounds, to methods of analysis used therein, and to a computer program product having computer code embodied therein for causing a computer, or a computer and a mass spectrometer in combination, to effect such analysis.

2. Prior Art

In drug metabolism studies, researchers typically create a radio-labeled version of the parent drug before dosing the drug in animal or human test subjects. Through biotransformations, the drug will be transformed into its metabolites, ranging from just a few to as many as 50-70 metabolites. By detecting and following the radioactivity, researchers can trace these biotransformations and account for the metabolites. The sample is typically injected into an LC/MS system for analysis, where various metabolites are separated in (retention) time and detected by mass spectrometry. While these metabolites can be traced by a radioactivity detector in a split flow arrangement in parallel to mass spectrometry, the identification of these metabolites will ultimately have to rely on mass spectrometry due to its mass (m/z) measuring capability. Unfortunately, in many cases the biological sample, even after extensive clean-up, sample preparation, and LC separation, still suffers from significant matrix or background ion interferences, making metabolite identification a time-consuming and tedious process. To help with the mass spectral identification of possible metabolites, researchers may dose test subjects with a mixture of the native and radio-labeled compound, creating a unique mass spectral signature that is easier for researchers to spot in a mass spectrum. Subject to limitations on total dosage, radioactivity exposure for a given test species, mass spectral saturation, and the uncertainty surrounding the ratio between the native and the radio-labeled version of the drug, metabolite identification remains a daunting task for researchers, even with the aid of radioactivity tracing.

After an ion has been identified as possibly drug-related, it is typically necessary to confirm its elemental composition before structural elucidation through further MS/MS experimentation, or even isolation for NMR analysis. Because of the various backgrounds present, higher resolution mass spectrometry is typically desired in order to avoid interference from the matrix or background ions. Higher resolution mass spectrometry systems such as TOF, qTOF, Orbi-Trap, or FT ICR MS offer two distinct advantages: fewer spectral interferences and higher mass accuracy. Even with elaborate calibration schemes such as lock mass, dual spray, and internal calibration, obtaining a unique elemental composition remains a challenge, even at the extremely high mass accuracy of 100 ppb.

A previous approach, as in U.S. Pat. No. 6,983,213 and International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005, provides a novel method for calibrating mass spectra for improved mass accuracy and line shape correction to improve the ability to perform elemental composition analysis or formula identification.

Very high mass accuracy can be obtained on so-called unit mass resolution systems in accordance with the techniques taught in U.S. Pat. No. 6,983,213.

Accurate line shape calibration provides a highly reliable metric to assist in unambiguous formula identification by matching the measured spectra to calculated candidate formulas, as in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005.

However, obtaining unique elemental composition from conventional to high resolution mass spectrometry systems remains a challenge to practitioners of mass spectrometry.

Thus, there exists a significant gap between what current mass spectral systems can offer and what is being achieved at present using existing technologies for mass spectral analysis.

SUMMARY OF THE INVENTION

It is an object of the invention to provide a mass spectrometry system and a method for operating a mass spectrometry system that overcomes the disadvantages described above, in accordance with the methods described herein.

It is another object of the invention to provide a storage medium having thereon computer readable program code for causing a mass spectrometry system to perform the method in accordance with the invention.

An additional aspect of the invention is, in general, a computer readable medium having thereon computer readable code for use with a mass spectrometer system having a data analysis portion including a computer, the computer readable code being for causing the computer to analyze data by performing the methods described herein. The computer readable medium preferably further comprises computer readable code for causing the computer to perform at least one of the specific methods described.

Of particular significance, the invention is also directed generally to a mass spectrometer system for analyzing chemical composition, the system including a mass spectrometer portion, and a data analysis system, the data analysis system operating by obtaining calibrated continuum spectral data by processing raw spectral data; generally in accordance with the methods described herein. The data analysis portion may be configured to operate in accordance with the specifics of these methods. Preferably the mass spectrometer system further comprises a sample preparation portion for preparing samples to be analyzed, and a sample separation portion for performing an initial separation of samples to be analyzed. The separation portion may comprise at least one of an electrophoresis apparatus, a chemical affinity chip, or a chromatograph for separating the sample into various components.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing aspects and other features of the present invention are explained in the following description, taken in connection with the accompanying drawings, wherein:

FIG. 1 is a block diagram of a mass spectrometer in accordance with the invention.

FIG. 2 is a flow chart of the steps in the identification of isotopically similar ions used by the system of FIG. 1.

FIG. 3A to FIG. 3F are graphical representations of some results obtained during the process of FIG. 2.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 1, there is shown a block diagram of an analysis system 10, that may be used to analyze proteins or other molecules, as noted above, incorporating features of the present invention. Although the present invention will be described with reference to the single embodiment shown in the drawings, it should be understood that the present invention can be embodied in many alternate forms of embodiments. In addition, any suitable types of components could be used.

Analysis system 10 has a sample preparation portion 12, other detector portion 23, a mass spectrometer portion 14, a data analysis system 16, and a computer system 18. The sample preparation portion 12 may include a sample introduction unit 20, of the type that introduces a sample containing proteins, peptides, or a small molecule drug of interest to system 10, such as the Finnigan LCQ Deca XP Max, manufactured by Thermo Electron Corporation of Waltham, Mass., USA. The sample preparation portion 12 may also include an analyte separation unit 22, which is used to perform a preliminary separation of analytes, such as the proteins to be analyzed by system 10. Analyte separation unit 22 may be either a chromatography column or an electrophoresis separation unit, such as a gel-based separation unit manufactured by Bio-Rad Laboratories, Inc. of Hercules, Calif., both of which are well known in the art. In general, a voltage is applied to the unit to cause the proteins to be separated as a function of one or more variables, such as migration speed through a capillary tube, isoelectric focusing point (Hannesh, S. M., Electrophoresis 21, 1202-1209 (2000)), or mass (one-dimensional separation), or by more than one of these variables, such as by isoelectric focusing and by mass. An example of the latter is known as two-dimensional electrophoresis.

The mass spectrometer portion 14 may be a conventional mass spectrometer and may be any one available, but is preferably one of MALDI-TOF, quadrupole MS, ion trap MS, qTOF, TOF/TOF, or FTMS. If it has a MALDI or electrospray ionization ion source, such ion source may also provide for sample input to the mass spectrometer portion 14. In general, mass spectrometer portion 14 may include an ion source 24, a mass analyzer 26 for separating ions generated by ion source 24 by mass to charge ratio, an ion detector portion 28 for detecting the ions from mass analyzer 26, and a vacuum system 30 for maintaining a sufficient vacuum for mass spectrometer portion 14 to operate efficiently. If mass spectrometer portion 14 is an ion mobility spectrometer, generally no vacuum system is needed and the data generated are typically called a plasmagram instead of a mass spectrum.

In parallel to the mass spectrometer portion 14, there may be other detector portion 23, to which a portion of the flow is diverted for nearly parallel detection of the sample in a split flow arrangement. This other detector portion 23 may be a single channel UV detector, a multi-channel UV spectrometer, a refractive index (RI) detector, a light scattering detector, a radioactivity monitor (RAM), etc. RAM is most widely used in drug metabolism research for ¹⁴C-labeled experiments where the various metabolites can be traced in near real time and correlated to the mass spectral scans.

The data analysis system 16 includes a data acquisition portion 32, which may include one or a series of analog to digital converters (not shown) for converting signals from ion detector portion 28 into digital data. This digital data is provided to a real time data processing portion 34, which processes the digital data through operations such as summing and/or averaging. A post-processing portion 36 may be used to do additional processing of the data from real time data processing portion 34, including library searches, data storage and data reporting.

Computer system 18 provides control of sample preparation portion 12, mass spectrometer portion 14, other detector portion 23, and data analysis system 16, in the manner described below. Computer system 18 may have a conventional computer monitor or display 40 to allow for the entry of data on appropriate screen displays, and for the display of the results of the analyses performed. Computer system 18 may be based on any appropriate personal computer, operating for example with a Windows® or UNIX® operating system, or any other appropriate operating system. Computer system 18 will typically have a hard drive 42, on which the operating system and the program for performing the data analysis described below are stored. A drive 44 for accepting a CD or floppy disk is used to load the program in accordance with the invention onto computer system 18. The program for controlling sample preparation portion 12 and mass spectrometer portion 14 will typically be downloaded as firmware for these portions of system 10. Data analysis system 16 may be a program written to implement the processing steps discussed below, in any of several programming languages such as C++, JAVA or Visual Basic.

As described in U.S. Pat. No. 6,983,213, for a given standard ion of known elemental composition, the acquired profile mode mass spectral data y₀ and its theoretical counterpart y are related to each other through

$(g \otimes y_{0}) = (g \otimes y) \otimes p$ (Equation 1)

where $\otimes$ represents convolution, g represents a small Gaussian, and p represents the mass spectral peak shape function. When y₀, y, and g are known, the mass spectral peak shape function p can be readily calculated through deconvolution.
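By way of illustration only, the deconvolution implied by Equation 1 may be sketched as follows. The source does not specify a deconvolution algorithm; the regularized Fourier-domain division used here, the circular (wrap-around) convolution, the function names, the `eps` value, and the synthetic isotope sticks are all assumptions made for this example.

```python
import numpy as np

def circular_convolve(a, b):
    """Circular convolution of two equal-length real vectors via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def deconvolve_peak_shape(y0, y, g, eps=1e-12):
    """Solve (g (*) y0) = (g (*) y) (*) p for p by regularized Fourier division."""
    lhs = np.fft.fft(circular_convolve(g, y0))
    rhs = np.fft.fft(circular_convolve(g, y))
    # eps guards against near-zero Fourier coefficients; its value is an
    # assumption for this sketch, not a parameter taken from the source.
    p_hat = lhs * np.conj(rhs) / (np.abs(rhs) ** 2 + eps)
    return np.real(np.fft.ifft(p_hat))

def wrapped_gaussian(n, sigma):
    """Unit-area Gaussian centered on index 0, wrapping around the ends."""
    x = np.arange(n)
    x = np.minimum(x, n - x)  # circular distance from index 0
    w = np.exp(-0.5 * (x / sigma) ** 2)
    return w / w.sum()

# Synthetic demonstration with made-up isotope sticks and peak widths.
n = 256
y = np.zeros(n)
y[[100, 101, 102]] = [1.0, 0.3, 0.05]    # theoretical stick spectrum y
p_true = wrapped_gaussian(n, 3.0)        # "unknown" peak shape function
g = wrapped_gaussian(n, 1.0)             # small Gaussian g
y0 = circular_convolve(y, p_true)        # simulated measured profile data
p_est = deconvolve_peak_shape(y0, y, g)
print(np.max(np.abs(p_est - p_true)))
```

On this noise-free synthetic input the recovered `p_est` agrees with `p_true` to within numerical precision; with real data, the regularization strength would need to be chosen with the noise level in mind.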

When the measured y₀ is a linear combination of two ions at varying relative signal levels, such as the native and radio-labeled versions of a small molecule drug, additional parameters need to be introduced, such that:

$y_{0} = c_{1} y_{1,0} + c_{2} y_{2,0}$ (Equation 2)

$y = c_{1} y_{1} + c_{2} y_{2}$ (Equation 3)

As long as the two additional parameters c₁ and c₂ are known, or their ratio c₁/c₂ or c₂/c₁ is given, the same approach outlined in U.S. Pat. No. 6,983,213 can be used to arrive at the peak shape function p. When their relative concentrations are not known, as is the case in drug metabolism research, due to incomplete isotope replacement reaction, an iterative approach to arrive at c₁/c₂ and p has been disclosed in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005 and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.

While the iterative approach generally produces excellent results, there are situations in which it is not preferred due to at least two considerations: it may be computationally intensive, and its convergence is not always guaranteed. For this reason, a more direct, computationally efficient, and reliable approach is disclosed here as a preferred embodiment, in a few distinct steps:

- a. Examining the theoretical isotope clusters from the native and the corresponding isotope labeled version, one finds that the two vectors y₁ and y₂ are, for the most part, simply shifted versions of each other. For example, the Verapamil native drug C₂₇H₃₉N₂O₄⁺ and its radio-carbon labeled version ¹⁴CC₂₆H₃₉N₂O₄⁺ are for the most part shifted versions of each other, with the radio-labeled version shifted on the mass axis by 14.00324-12.00000 or +2.00324 Da. Based on this observation, one could proceed by setting c₂ (or c₁) to zero and c₁ (or c₂) to one (in Equations 2 and 3) and performing the deconvolution in Equation 1 to calculate a peak shape function p, which would contain the true peak shape function duplicated twice with 2.00324 Da spacing in between. While one could perform peak detection and peak selection to select one of the two peaks as the peak shape function and jump to step c, a preferred approach will be described next, which is more generally applicable, even at less than unit mass resolution, and with stable ¹³C isotope labeling, where the mass spacing between the two peaks would be smaller and reliable separation of the peaks becomes more challenging.
- b. Treat the above calculated p as y₀ in Equation 1, create a new y through Equation 3 by deleting all isotopes other than the monoisotopes from the theoretically calculated isotope distributions y₁ and y₂, and set c₁=1 and c₂ to some initial estimate (or vice versa, with c₂=1 and c₁ set to some initial estimate, with no loss of generality in the following descriptions). Note that the y₁ and y₂ thus created would be spaced exactly the same mass distance apart as the two peaks in p (the new y₀). Typically the initial estimate for c₂ can be easily obtained from the sample preparation process. Applying another round of deconvolution based on Equation 1 creates a new peak shape p, which now contains primarily one single peak shape function. To the extent that c₂ is in error, there will still be a small peak in the deconvoluted p, either negative or positive, depending on the sign of the error in c₂. This small extra peak can be zeroed out to arrive at a cleaned-up version of p.
- c. Treat c₁ and c₂ from Equation 3 as unknowns, insert Equation 3 into Equation 1, and simplify to arrive at

$y_{0} = c_{1} (y_{1} \otimes p) + c_{2} (y_{2} \otimes p)$ (Equation 4)

which can now be solved to obtain updated values for c₁ and c₂ using the same y₀, y₁, and y₂ from step b above and the cleaned-up version of p, through, for example, least squares linear regression.

- d. Repeat steps a-c above as necessary using the updated concentrations c₁ and c₂. Typically only one more calculation through step a results in a true peak shape function p with the complication from interfering ion(s) completely removed.
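The least squares regression of step c (Equation 4) may be sketched as follows, purely as an illustration. The function name, the Gaussian peak shape, the grid spacing, the stick abundances, and the 10-point shift between the two clusters are hypothetical values chosen for the example and are not taken from the source.

```python
import numpy as np

def estimate_concentrations(y0, y1, y2, p):
    """Least squares estimate of c1, c2 in y0 = c1*(y1 (*) p) + c2*(y2 (*) p)."""
    design = np.column_stack([
        np.convolve(y1, p, mode="same"),   # y1 broadened by the peak shape
        np.convolve(y2, p, mode="same"),   # y2 broadened by the peak shape
    ])
    coeffs, *_ = np.linalg.lstsq(design, y0, rcond=None)
    return coeffs

# Hypothetical peak shape: a unit-area Gaussian on a 21-point support.
x = np.arange(-10, 11)
p = np.exp(-0.5 * (x / 2.0) ** 2)
p /= p.sum()

# Made-up stick spectra; the labeled cluster is the native one shifted by 10 points.
y1 = np.zeros(200)
y1[[80, 85, 90]] = [1.0, 0.30, 0.05]
y2 = np.zeros(200)
y2[[90, 95, 100]] = [1.0, 0.30, 0.05]

# Simulated measurement with "true" concentrations 0.6 and 0.4.
y0 = 0.6 * np.convolve(y1, p, mode="same") + 0.4 * np.convolve(y2, p, mode="same")
c1, c2 = estimate_concentrations(y0, y1, y2, p)
print(c1, c2)
```

Because the two broadened clusters overlap but are not proportional to each other, the two-column design matrix has full rank and the regression separates their contributions.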

It should be noted that with the monoisotopic peak from the native ion from a higher resolution system, where the monoisotopic peak is baseline resolved from other isotopes, the true peak shape function p can be directly obtained without iteration, and the relative concentrations c₁and c₂can be obtained from the above Equation 4 in a single step. Once the true peak shape function p is obtained, one may proceed with the mass spectral calibration as referenced in U.S. Pat. No. 6,983,213 to calibrate for the mass axis while also transforming the peak shape into a desired or target peak shape function that is mathematically definable. Alternatively, but less desirably, one could leave the raw mass spectral data as is, except that the peak shape function is now known and numerically represented by p. This completes Step 230 in FIG. 2.

One can now move to the next stage, Step 240 in FIG. 2, to construct an ion pattern to be searched in the rest of the mass spectral data for the possible presence of similar or “resembling” ions that would also show similar isotope patterns. This is useful for researchers in the drug metabolism area where a parent drug along with its unique isotope pattern gives rise to various metabolites exhibiting similarly unique isotope patterns. The unique isotope pattern may come from the parent drug itself due to the presence of Br or Cl elements in its elemental composition, or from the mixing of the native drug with its isotope labeled version. If the metabolites contain the same number of Br or Cl elements as the drug itself and the drug metabolism pathway is indifferent with respect to certain isotopes (which is the case in most applications), similar isotope patterns will be observed for the metabolites. Since various metabolites come at different masses and chromatographic retention times, it is difficult and time consuming to spot these isotope patterns in a typical LC/MS run that generates a data matrix on the order of 4000 time points by 8000 mass points in the presence of matrix and background ions typical of biological samples.

This similarity in isotope patterns among the parent drug and its various metabolites will now be exploited for an automatic algorithm to identify the possible presence of these resembling ions without actually knowing their precise elemental compositions. Once the peak shape function p has been obtained along with the concentration ratio c₁/c₂ between the two basic ions (those of known chemical composition, such as, for example, a parent drug, the isotope labeled version of the parent drug, a known fragment of the parent drug or its isotope labeled version, a known metabolite or its fragment, and the isotope labeled version of the known metabolite or its fragment from drug metabolism studies; i.e., those of a composition that is known or has already been determined), a mass spectral isotope pattern t can be established by

$t = (c_{1} y_{1} + c_{2} y_{2}) \otimes p$ (Equation 5)

where y₁ and y₂ are theoretically calculated from the elemental compositions of the basic ion and its isotope labeled version, respectively (Step 240 in FIG. 2). This isotope pattern t can be used to fit to a segment of mass spectral data r through the following model

$r = Kc + e$ (Equation 6)

where r is an (n×1) matrix of the profile mode mass spectral data, digitized at n m/z values; c is a (k×1) matrix of regression coefficients which are representative of the concentrations of k components in matrix K; K is an (n×k) matrix composed of profile mode mass spectral responses for the k components, all sampled at the same n m/z points as r; and e is an (n×1) matrix of a fitting residual with contributions from random noise and any systematic deviations from this model. The k columns of the matrix K will contain the isotope pattern t (for example, in its first column, without loss of generality, for ease of subsequent description) and any background or baseline components, which may or may not vary with mass (as additional columns). A least squares solution to Equation 6 leads to

$\hat{c} = K^{+} r$ (Equation 7)

where K⁺ (dimensioned as k×n) is the pseudo inverse of the matrix K, a process well established in matrix algebra, as referenced in U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.
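Equations 5 through 7 may be sketched together as follows. This is an illustrative example under stated assumptions: the function names, the choice of a constant and a linear baseline column in K, the Gaussian peak shape, and all numerical values are hypothetical and are not specified in the source.

```python
import numpy as np

def isotope_pattern(y1, y2, c1, c2, p):
    """Equation 5: t = (c1*y1 + c2*y2) convolved with the peak shape p."""
    return np.convolve(c1 * y1 + c2 * y2, p, mode="same")

def fit_segment(r, t):
    """Equations 6-7: fit r = K c + e, with K = [t, constant, linear baseline]."""
    n = len(r)
    K = np.column_stack([t, np.ones(n), np.linspace(-1.0, 1.0, n)])
    K_pinv = np.linalg.pinv(K)   # k x n; each row acts as a digital filter on r
    return K_pinv @ r, K, K_pinv

# Hypothetical unit-area Gaussian peak shape and made-up isotope sticks.
n = 120
x = np.arange(-10, 11)
p = np.exp(-0.5 * (x / 2.0) ** 2)
p /= p.sum()
y1 = np.zeros(n)
y1[[40, 45, 50]] = [1.0, 0.30, 0.05]
y2 = np.zeros(n)
y2[[50, 55, 60]] = [1.0, 0.30, 0.05]

t = isotope_pattern(y1, y2, 0.5, 0.5, p)

# Simulated segment: 3 units of the pattern on top of a sloping baseline.
r = 3.0 * t + 0.2 + 0.1 * np.linspace(-1.0, 1.0, n)
c_hat, K, K_pinv = fit_segment(r, t)
print(c_hat)
```

The first element of `c_hat` is the contribution of the isotope pattern t; the remaining elements absorb the baseline, so a flat or sloping background does not inflate the estimated pattern amplitude.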

Note that in Equation 7, each row in K⁺ serves as a digital filter applied to the mass spectral segment r to arrive at a concentration vector c containing the contribution of each component, including the ion isotope pattern t and any components included in matrix K. These digital filters in K⁺ can be calculated once in a limited mass spectral range and then applied to a mass spectral segment in an extended mass range in a sliding window, much like a convolution filter, in Step 240 in FIG. 2, to generate a concentration vector c from Equation 7 for each retention time and mass location combination. In the special case where there is just one column involved in matrix K, i.e., no baseline or background involved besides the isotope pattern t itself, it can be proved that the digital filter is of the same form as the isotope pattern t itself, subject to a scaling factor. A residual can be calculated for each such retention time and mass location combination as

$\hat{e} = r - K\hat{c}$ (Equation 8)

in Steps 250 and 260 in FIG. 2. This residual vector can be further reduced into a scalar by taking the 2-norm, i.e., the square root of the sum of squares of all elements involved, or root mean square error, and converted into a relative residual error as

$e = \|\hat{e}\|_{2} / \|r\|_{2}$
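The sliding-window application of the digital filter, together with the residual of Equation 8 and the relative residual error, may be sketched as follows for the single-column case (K containing only t, with no baseline terms). The function name, the synthetic pattern, and the synthetic spectrum are assumptions made for illustration; assigning an infinite relative residual to an all-zero window is likewise a choice made for this sketch.

```python
import numpy as np

def scan_for_pattern(spectrum, t):
    """Slide t along a spectrum; return (amplitude, relative residual) per window."""
    n = len(t)
    K = t.reshape(-1, 1)                    # single-column K: pattern only
    K_pinv = np.linalg.pinv(K)              # computed once, reused for every window
    windows = np.lib.stride_tricks.sliding_window_view(spectrum, n)
    c_hat = windows @ K_pinv[0]             # Equation 7 at each offset
    resid = windows - np.outer(c_hat, t)    # Equation 8 at each offset
    norms = np.linalg.norm(windows, axis=1)
    rel = np.full(len(windows), np.inf)     # all-zero windows keep rel = inf
    np.divide(np.linalg.norm(resid, axis=1), norms, out=rel, where=norms > 0)
    return c_hat, rel, K_pinv

# Made-up two-peak isotope pattern on a 31-point window.
x = np.arange(31)
t = np.exp(-0.5 * ((x - 8) / 2.0) ** 2) + 0.8 * np.exp(-0.5 * ((x - 22) / 2.0) ** 2)

# Synthetic spectrum: a resembling ion at offset 120 and an unrelated peak near 235.
spectrum = np.zeros(300)
spectrum[120:151] += 5.0 * t
spectrum[220:251] += 4.0 * np.exp(-0.5 * ((x - 15) / 2.0) ** 2)

c_hat, rel, K_pinv = scan_for_pattern(spectrum, t)
best = int(np.argmin(rel))
print(best, c_hat[best], rel[best])
```

In this single-column case the filter row equals $t^{T}/(t^{T}t)$, i.e., the pattern itself up to a scaling factor, consistent with the observation made above.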

While one can relate this residual error directly to the likelihood for the presence of a resembling ion, it may be more convenient intuitively to convert this residual error into a numeric metric that increases when the measured isotope pattern more closely resembles the given isotope pattern t given in Equation 5. This numeric metric may be equal to the t-statistic or one minus the p-value as disclosed in U.S. Pat. No. 6,983,213 and U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007; corresponding to International Patent Application PCT/US2007/069832, filed on May 28, 2007, or some other appropriate function of the residual error. This corresponds to Step 280 in FIG. 2.

FIG. 3A shows the average of a few mass spectral scans from a retention time window corresponding to a faint radioactivity monitor (RAM) signal and FIG. 3B shows a corresponding resemblance weight factor calculated as:

$w_{i} = r_{i} 2^{- \frac{e_{i}}{a}}$

where the subscript i refers to mass spectral data point i, $r_{i}$ and $e_{i}$ are the mass spectral raw signal and residual corresponding to mass spectral data point i based on the above calculations using a mass spectral segment centered around mass spectral data point i, and a is a user-settable parameter that takes on the form of:

a = 0.15, for $e_{i}$ < 0.15 (i.e., less than 15% relative residual error)

a = 0.05, for $e_{i}$ ≥ 0.15 (i.e., 15% or more relative residual error)
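A minimal sketch of the resemblance weight factor, using the piecewise values of a given above, might read as follows. The function name is hypothetical; the thresholds and the formula are those stated in the text.

```python
def resemblance_weight(r_i, e_i):
    """w_i = r_i * 2**(-e_i / a), with a chosen from the relative residual e_i."""
    a = 0.15 if e_i < 0.15 else 0.05
    return r_i * 2.0 ** (-e_i / a)

# A perfect match keeps the full raw signal; poorer matches are down-weighted.
for e in (0.0, 0.075, 0.30):
    print(e, resemblance_weight(100.0, e))
```

With these settings the weight falls off gently below 15% relative residual error and much more steeply at or above it, so that poorly matching regions contribute little to the weighted spectrum.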

Comparing the zoomed-in versions of FIGS. 3A and 3B shown in FIGS. 3C and 3D, respectively, it is clear which mass spectral region contains ions of high resemblance to the basic ions and one may proceed further in identifying the elemental compositions for these high likelihood ions, using the approach outlined in International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005, and International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006.

These high likelihood ions and their elemental compositions are reported out by computer 18 (FIG. 1) by being displayed on the monitor 40 and/or by printing on a printer (not shown) associated with computer 18.

Since all weights across the mass spectrum can be summed up into a total weight and plotted as a function of chromatographic retention time (FIG. 3F), a time-dependent data trace very similar in nature to a RAM data trace (shown in FIG. 3E) can be generated, where each peak indicates a retention time point where possible high-resemblance ions exist. It should be pointed out, however, that RAM is only applicable to cases involving radio-labeled compounds, whereas this novel approach does not require radioactivity for detection, and may actually be more sensitive due to the typically higher sensitivity available through mass spectrometry detection. Due to the lack of RAM sensitivity, it is sometimes required, in in vitro experiments, to work with 100% radio-labeled compound without mixing in the native compound, causing possible cell or enzyme deaths and making mass spectral identification of drug-related ions difficult without the unique isotope pattern from a 50%:50% mixture. With this new approach, one can reduce the amount of radioactivity exposure to test subjects or even eliminate it completely by working with stable isotopes such as ¹³C labeling instead. On the other hand, in the presence of a good quality RAM signal, the above calculations can be significantly sped up by focusing only on the retention time region where there is a rise in RAM signal.
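The total weight trace and the optional restriction to scans with elevated RAM signal may be sketched as follows. The function names, the use of a fixed threshold to represent "a rise in RAM signal", and the small data values are assumptions made for illustration only.

```python
def total_weight_trace(weights_by_scan):
    """Sum the resemblance weights over all m/z points within each scan."""
    return [sum(scan) for scan in weights_by_scan]

def scans_to_process(ram_signal, threshold):
    """Indices of scans whose RAM reading exceeds a user-chosen threshold."""
    return [i for i, value in enumerate(ram_signal) if value > threshold]

# Made-up weights: three scans (retention times) by four m/z points.
weights_by_scan = [
    [0.0, 0.1, 0.0, 0.0],
    [0.5, 4.0, 2.5, 0.0],
    [0.0, 0.2, 0.1, 0.0],
]
trace = total_weight_trace(weights_by_scan)
selected = scans_to_process([1.0, 9.0, 1.5], threshold=5.0)
print(trace, selected)
```

Plotting `trace` against retention time gives a curve analogous to the RAM trace, and `selected` indicates which scans would be processed if the calculation were limited to regions of elevated RAM signal.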

For reasons discussed in U.S. Pat. No. 6,983,213; International Patent Application PCT/US2004/013096, filed on Apr. 28, 2004; U.S. patent application Ser. No. 11/261,440, filed on Oct. 28, 2005; International Patent Application PCT/US2005/039186, filed on Oct. 28, 2005; International Patent Application PCT/US2006/013723, filed on Apr. 11, 2006; and U.S. patent application Ser. No. 11/754,305, filed on May 27, 2007; International Patent Application PCT/US2007/069832, filed on May 28, 2007, it is preferred to carry out all of the above calculations using the profile mode mass spectral data and have the raw profile mode data calibrated for both mass and peak shape. The above calculations can, however, be carried out in centroid mode, with or without peak shape calibration, with inferior results. In this case, the peak shape function described in this application becomes a delta function with just one non-zero element in the entire peak shape vector.

While the description above uses a pair of ions as basic ions for ease of discussion, the same approach applies to cases involving three or more ions. For example, when there are two ¹⁴C replacements with incomplete reaction, it is possible to have a mixture that is a linear combination of native, singly ¹⁴C-labeled, and doubly ¹⁴C-labeled ions. The identical process and algorithm can be utilized for such multiple ¹⁴C labeling experiments by simply augmenting the relevant matrices, including K, c, and K⁺, and adding y₃. Although there appear to be three concentration elements in this case, there are actually only two independent concentration elements due to the closure rule:

$c_{3} = 1 - c_{1} - c_{2}$

which can be utilized to reduce the number of unknowns estimated and improve the numerical and statistical stability of the calculations. As a special case, when there is only one ion involved as the basic ion for the metabolism study of a Br- or Cl-containing drug, all of the above calculations and algorithms still apply, except that there are no concentration estimation steps b, c, or d.
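One way the closure rule might be used to reduce the number of unknowns is sketched below. Substituting c₃ = 1 − c₁ − c₂ into the three-ion analogue of Equation 4 gives y₀ − (y₃⊗p) = c₁((y₁−y₃)⊗p) + c₂((y₂−y₃)⊗p), which is solved for two unknowns only. The sketch assumes that y₀ has been normalized so that the concentrations sum to one; the function names and the synthetic, already-broadened patterns are hypothetical.

```python
import numpy as np

def fit_with_closure(y0, y1p, y2p, y3p):
    """Estimate c1, c2 after substituting c3 = 1 - c1 - c2 into the linear model."""
    design = np.column_stack([y1p - y3p, y2p - y3p])
    (c1, c2), *_ = np.linalg.lstsq(design, y0 - y3p, rcond=None)
    return c1, c2, 1.0 - c1 - c2

def broadened(center, n, sigma=2.0):
    """Made-up, already peak-shape-broadened pattern for one ion."""
    x = np.arange(n)
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

n = 100
y1p, y2p, y3p = broadened(30, n), broadened(40, n), broadened(50, n)

# Simulated normalized measurement with concentrations 0.5, 0.3, and 0.2.
y0 = 0.5 * y1p + 0.3 * y2p + 0.2 * y3p
c1, c2, c3 = fit_with_closure(y0, y1p, y2p, y3p)
print(c1, c2, c3)
```

Estimating two coefficients instead of three leaves one more degree of freedom for the residual, which is the stability benefit referred to above.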

For all the analyses described above, it may be advantageous to transform the m/z axis into another more appropriate axis beforehand, to allow for analysis with a uniform peak shape function in the transformed axis, as pointed out in U.S. Pat. No. 6,983,213 and International Patent Application PCT/US2004/034618 filed on Oct. 20, 2004.

The process described above includes a fairly comprehensive series of steps, for purposes of illustration and completeness. However, there are many ways in which the process may be varied, including leaving out certain steps, or performing certain steps beforehand or “off-line”. For example, it is possible to follow all the above approaches while including disjoint isotope segments (segments that are not contiguous with one another, but have gaps between them in the spectrum), especially with data measured from higher resolution MS systems, so as to avoid mass spectrally separated interference peaks that are located within, but do not directly overlap, the isotope cluster of an ion of interest. Furthermore, one may wish to include in the above analysis only the isotopic peaks that do not overlap with interferences, using exactly the same vector or matrix algebra during the quantitative comparison Step 250 in FIG. 2. If the disjoint isotope segments pose a mathematical difficulty in terms of derivative calculations, one may consider zero-filling the left-out regions in the isotope cluster before the relevant calculations, or leaving out the regions with interferences after the derivative calculations. Lastly, one may wish to perform a weighted regression in Equations 1 to 8 to better account for the signal variance, as referenced in U.S. Pat. No. 6,983,213.
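The use of disjoint segments and of a weighted regression may be sketched together as follows. The function name, the boolean mask used to exclude an interfered region, the form of the weights, and the synthetic data are all assumptions made for this illustration; the source does not prescribe a particular weighting scheme.

```python
import numpy as np

def masked_weighted_fit(K, r, keep, weights=None):
    """Weighted least squares on the rows of r = K c + e selected by `keep`."""
    K_sub, r_sub = K[keep], r[keep]
    if weights is None:
        w = np.ones(len(r_sub))
    else:
        w = np.sqrt(weights[keep])   # scaling rows by sqrt(weight) weights squared errors
    c_hat, *_ = np.linalg.lstsq(K_sub * w[:, None], r_sub * w, rcond=None)
    return c_hat

# Made-up two-peak pattern plus a constant baseline column.
n = 80
x = np.arange(n)
t = np.exp(-0.5 * ((x - 20) / 2.0) ** 2) + 0.8 * np.exp(-0.5 * ((x - 30) / 2.0) ** 2)
K = np.column_stack([t, np.ones(n)])

# Simulated segment with an interference occupying points 40-49.
r = 2.0 * t + 0.5
r[40:50] += 3.0

keep = np.ones(n, dtype=bool)
keep[40:50] = False                  # leave the interfered region out of the fit
weights = 1.0 / (r + 1.0)            # hypothetical inverse-variance style weights

c_masked = masked_weighted_fit(K, r, keep, weights)
c_naive = masked_weighted_fit(K, r, np.ones(n, dtype=bool))
print(c_masked, c_naive)
```

In this synthetic case the masked fit recovers the simulated amplitude and baseline, whereas the fit over the full segment is biased by the interference.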

Although matrix operations are used to describe the process, including Equations 1 to 8, mathematically equivalent operations such as digital filtering, convolution, deconvolution, correlation, auto-correlation, regression, optimization, and fitting may also be utilized to the same effect, as is well known to one skilled in the art of digital signal processing and numerical analysis.

This invention discloses an approach to calculate or calibrate the actual peak shape function in order to achieve the best possible results. One may bypass the determination of the actual peak shape function and instead simply assume a peak shape function to proceed with the ion isotope pattern identification, at the cost of somewhat inferior results.
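
The assumed-peak-shape route can be sketched in Python as follows: a stick isotope distribution is convolved with an assumed Gaussian peak shape function to give a theoretical profile-mode pattern. The Gaussian shape, the 0.5 FWHM, the grid spacing, and the function name `profile_from_sticks` are assumptions for illustration; the two sticks use approximate masses and natural abundances of the chlorine isotopes as a simple two-peak example.

```python
import numpy as np

def profile_from_sticks(masses, abundances, mz, fwhm):
    """Convolve a stick isotope distribution with an assumed Gaussian
    peak shape function sampled on the uniform grid `mz`."""
    step = mz[1] - mz[0]

    # Place each isotope stick on the nearest grid point.
    sticks = np.zeros_like(mz)
    idx = np.rint((np.asarray(masses) - mz[0]) / step).astype(int)
    np.add.at(sticks, idx, abundances)

    # Assumed Gaussian peak shape, truncated at +/- 3 FWHM, unit height.
    half = int(np.ceil(3 * fwhm / step))
    offsets = np.arange(-half, half + 1) * step
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)

    # An odd-length kernel with mode="same" keeps peaks centered on the sticks.
    return np.convolve(sticks, kernel, mode="same")

mz = np.linspace(34.0, 38.0, 4001)      # 0.001 grid spacing
masses = [34.969, 36.966]               # approximate 35Cl and 37Cl masses
abundances = [0.7576, 0.2424]           # approximate natural abundances

profile = profile_from_sticks(masses, abundances, mz, fwhm=0.5)
print("apex m/z:", mz[np.argmax(profile)])
```

Replacing the Gaussian kernel with a measured or calibrated peak shape function would leave the rest of the calculation unchanged, which is why the choice of peak shape can be treated as an interchangeable input.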

It is noted that the terms “mass” and “mass to charge ratio” are used somewhat interchangeably in connection with information or output as defined by the mass to charge ratio axis of a mass spectrometer. This is a common practice in the scientific literature and in scientific discussions, and no ambiguity will arise for one skilled in the art when the terms are read in context.

The methods of analysis of the present invention can be realized in hardware, software, or a combination of hardware and software. Any kind of computer system, or other apparatus adapted for carrying out the methods and/or functions described herein, is suitable. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when loaded and executed, controls the computer system, which in turn controls an analysis system, such that the system carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system (which in turn controls an analysis system), is able to carry out these methods.

Computer program means or computer program in the present context include any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after conversion to another language, code or notation, and/or reproduction in a different material form.

Thus the invention includes an article of manufacture, which comprises a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the article of manufacture comprises computer readable program code means for causing a computer to effect the steps of a method of this invention. Similarly, the present invention may be implemented as a computer program product comprising a computer usable medium having computer readable program code means embodied therein for causing a function described above. The computer readable program code means in the computer program product comprises computer readable program code means for causing a computer to effect one or more functions of this invention. Furthermore, the present invention may be implemented as a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for causing one or more functions of this invention.

It is noted that the foregoing has outlined some of the more pertinent objects and embodiments of the present invention. The concepts of this invention may be used for many applications. Thus, although the description is made for particular arrangements and methods, the intent and concept of the invention are suitable and applicable to other arrangements and applications. It will be clear to those skilled in the art that other modifications to the disclosed embodiments can be effected without departing from the spirit and scope of the invention. The described embodiments ought to be construed to be merely illustrative of some of the more prominent features and applications of the invention. Thus, it should be understood that the foregoing description is only illustrative of the invention. Various alternatives and modifications can be devised by those skilled in the art without departing from the invention. Other beneficial results can be realized by applying the disclosed invention in a different manner or modifying the invention in ways known to those familiar with the art. Thus, it should be understood that the embodiments have been provided as an example and not as a limitation. Accordingly, the present invention is intended to embrace all alternatives, modifications and variances which fall within the scope of the appended claims.

Claims

1. A method for identifying isotope patterns from mass spectral data, comprising:

obtaining a desired mass spectral peak shape function;

obtaining mass spectral data composed of actual isotope patterns to be analyzed;

calculating a theoretical isotope pattern from known elemental composition of at least one basic ion whose isotope pattern is representative of the ions to be analyzed, by using the mass spectral peak shape function;

comparing quantitatively corresponding parts of the theoretical isotope pattern to that of the mass spectral data;

calculating a numerical metric to measure similarity between the theoretical isotope pattern and the actually measured isotope pattern; and

utilizing the numerical metric as an indication for possible presence of ions whose isotope patterns resemble that of the basic ion.

2. The method of claim 1, wherein the desired peak shape function is one of assumed peak shape function, actual measured peak shape function, actual calculated peak shape function, target peak shape function from a mass spectral calibration involving peak shape, and a delta function if the obtained mass spectral data is centroid data.

3. The method of claim 1, wherein the actual isotope pattern is measured in profile mode.

4. The method of claim 3, wherein measurement of similarity between the theoretical isotope pattern and the actually measured isotope pattern is performed at a MS resolution higher than unit mass resolution.

5. The method of claim 4, wherein the measured isotope pattern is converted to have a desired peak shape function.

6. The method of claim 5, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.

7. The method of claim 1, wherein the actual isotope pattern is a linear combination of at least two basic ions and/or their fragments.

8. The method of claim 7, wherein the at least two basic ions and/or their fragments include native and isotope labeled versions of the ion or fragment.

9. The method of claim 1, wherein the actual isotope pattern is generated by an ion with signature elements, with distinct isotope patterns.

10. The method of claim 1, wherein the measured mass spectral response is calibrated to have a desired peak shape function.

11. The method of claim 10, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.

12. The method of claim 1, wherein the theoretical isotope pattern is calculated by convolution of isotope distribution and a desired peak shape function.

13. The method of claim 12, wherein a desired peak shape function is one of assumed peak shape function, actual peak shape function, measured and calculated peak shape function, and target peak shape function from a mass spectral calibration involving peak shape.

14. The method of claim 12, wherein the isotope distribution is theoretically calculated from the elemental composition of at least one basic ion.

15. The method of claim 12, wherein the theoretical isotope pattern is calculated as a linear combination of more than one basic ion with the linear combination coefficients being at least one of a user input and calculated from the actual isotope patterns.

16. The method of claim 7, wherein the linear combination coefficients of relevant ions are calculated from the actual isotope patterns and the known elemental compositions of these relevant ions.

17. The method of claim 16, wherein the calculation of the linear combination coefficients of relevant ions involves at least one of convolution, deconvolution, matrix multiplication, matrix inversion, and iteration.

18. The method of claim 1, wherein the basic ion is at least one of the parent drug, the isotope labeled version of the parent drug, a known fragment of the parent drug or its isotope labeled version, a known metabolite or its fragment, and the isotope labeled version of the known metabolite or its fragment from drug metabolism studies.

19. The method of claim 1, wherein the ions to be analyzed are metabolites from drug metabolism studies.

20. The method of claim 1, wherein the quantitative comparison comprises at least one of a digital filtering, matrix multiplication, matrix inversion, convolution, deconvolution, correlation, auto-correlation, regression, and fitting.

21. The method of claim 20, wherein the quantitative comparison includes at least one of baseline, backgrounds, and other known ions in the same mass spectral range.

22. The method of claim 1, wherein the numerical metric is derived from residual error.

23. The method of claim 22, wherein the numerical metric is a weight calculated as a function of the residual error such that a higher weight corresponds to a smaller residual error and hence a higher probability of presence of an ion whose isotope pattern resembles that of the basic ion.

24. The method of claim 23, wherein the weight can be summed over the entire mass spectral range into a total weight and plotted as a function of retention time in LC/MS or GC/MS analysis.

25. The method of claim 24, further comprising comparing the total weight when plotted as a function of retention time to the radioactivity trace in radio-labeled drug metabolism studies.

26. The method of claim 24, further comprising using the total weight when plotted as a function of retention time to replace output data from a radioactivity detector.

27. The method of claim 1, further comprising determining elemental compositions for ions resembling the basic ions.

28. The method of claim 27, wherein the ions resembling the basic ions include multiple ions.

29. The method of claim 28, wherein the multiple resembling ions follow the same linear combination relationship as the basic ions.

30. The method of claim 29, wherein both the basic ion and the resembling ion each contain the native and the isotope-labeled version of the same ion.

31. A computer programmed to perform the method of claim 1.

32. The computer of claim 31, in combination with a mass spectrometer for obtaining mass spectral data to be analyzed by said computer.

33. A computer readable medium having computer readable code thereon for causing a computer to perform the method of claim 1.

34. A mass spectrometer having associated therewith a computer for performing data analysis functions of data produced by the mass spectrometer, the computer performing the method of claim 1.