Reagents and Methods for Identifying Gene Targets for Treating Cancer

Info

Publication number: 20070154933
Type: Application
Filed: Feb 9, 2007
Publication Date: Jul 5, 2007
Applicant: The Board of Trustees of the University of Illinois (Urbana, IL)
Inventors: Thomas Priminano (Chicago, IL), Bey-Dih Chang (Lombard, IL), Igor Roninson (Loudonville, NY)
Application Number: 11/673,521

Abstract

The invention provides methods and reagents for identifying mammalian genes necessary for tumor cell growth as targets for developing drugs that inhibit expression of said genes and inhibit tumor cell growth thereby.

Description

This application is a divisional of U.S. patent application Ser. No. 10/199,820 filed Jul. 19, 2002 which claims priority to U.S. Provisional Application Ser. No. 60/306,730, filed Jul. 20, 2001.

This application was supported by a grant from the National Institutes of Health, No. R01 CA62099. The government may have certain rights in this invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention is related to methods and reagents for inhibiting tumor cell growth. Specifically, the invention identifies genes necessary for tumor cell growth as targets for developing drugs to inhibit such genes and thereby inhibit tumor growth. The invention provides methods for screening compounds to identify inhibitors of said genes, and methods for using said inhibitors to inhibit tumor cell growth. The invention also provides peptides encoded by genetic suppressor elements of the invention and mimetics and analogues thereof for inhibiting tumor cell growth. Also provided by the invention are normalized random fragment cDNA libraries prepared from tumor cells of one or a plurality of tumor cell types wherein the cDNA fragments can be induced by treating recipient cells with a physiologically-neutral stimulating agent.

2. Summary Of The Related Art

The completion of the draft sequence of the human genome has provided the art with a partial list of known and putative human genes, the total number of which is estimated to be between 30,000 and 45,000 (Venter et al., 2001, Science 291: 1304-1351; Lander et al., 2001, Nature 409: 860-921). These genes provide many potential targets for drugs, some of which may be useful in preventing the growth of cancers. However, the development of clinically useful gene-targeting anticancer drugs could be greatly facilitated by the ability to narrow down the list of human genes to those that are involved in the primary feature of cancer, uncontrolled tumor growth. It would be especially useful to identify genes necessary for the growth of tumor cells and to determine which of the genes play a tumor-specific role and are not required for normal cell growth. These genes are particularly attractive targets for developing tumor-specific anticancer agents.

Most of the effort in tumor-specific drug targeting in the prior art has focused on oncogenes, the function of which has been associated with different forms of cancer (Perkins and Stern, 1997, in CANCER: PRINCIPLES AND PRACTICE OF ONCOLOGY, DeVita et al., eds., (Philadelphia: Lippincott-Raven), pp. 79-102). Oncogene targets have been viewed in the art as being more “tumor-specific” than “normal” cellular enzymes that are targeted by the drugs used in present chemotherapeutic regimens. The tumor specificity of oncogenes has been suggested primarily by the existence of oncogene-associated genetic changes, such as mutations or rearrangements, specific to neoplastic cells. Although oncogenes are mutated or rearranged in some cases, in other cases they are merely expressed at elevated levels or at inappropriate stages of the cell cycle, without changes in the structure of the gene product (Perkins and Stern, 1997, Id.). Even when mutated, proteins encoded by oncogenes rarely acquire a qualitatively novel function relative to the “normal” protooncogene products. Hence, products of mutated, rearranged or overexpressed oncogenes generally perform the same biochemical functions as their normal cell counterparts, except that the functions of the activated oncogene products are abnormally regulated.

It is noteworthy that none of the “classical” oncogenes known in the art have been identified as targets for clinically useful anticancer drugs discovered by traditional mechanism-independent screening procedures. Rather, the known cellular targets of chemotherapeutic drugs, such as dihydrofolate reductase (inhibited by methotrexate and other antifolates), topoisomerase II (“poisoned” by epipodophyllotoxins, anthracyclines or acridine drugs), or microtubules that form the mitotic spindle (the targets of Vinca alkaloids and taxanes) are essential for growth and proliferation of both normal and neoplastic cells. Tumor selectivity of anticancer drugs appears to be based not merely on the fact that their targets function primarily in proliferating cells, but rather on tumor-specific response to the inhibition of anticancer drug targets. For example, Scolnick and Halazonetis (2000, Nature 406: 430-435) disclosed that a high fraction of tumor cell lines are deficient in a gene termed CHFR. In the presence of antimicrotubular drugs, CHFR appears to arrest the cell cycle in prophase. CHFR-deficient tumor cells, however, proceed into drug-impacted abnormal metaphase (Scolnick and Halazonetis, 2000, Id.), where they die through mitotic catastrophe or apoptosis (Torres and Horwitz, 1998, Cancer Res. 58: 3620-3626). In addition to CHFR, tumor cells are frequently deficient in various cell cycle checkpoint controls, and exploiting these deficiencies is a major direction in experimental therapeutics (O'Connor, 1997, Cancer Surv. 29: 151-182; Pihan and Doxsey, 1999, Semin. Cancer Biol. 9: 289-302). In most cases, however, the reasons that inhibition of anticancer drug targets selectively induces cell death or permanent growth arrest in tumor cells are unknown. There is therefore a need in the art to identify additional molecular targets in tumor cells, inhibition of which would arrest tumor cell growth.

One method known in the art for identifying unknown genes or unknown functions of known genes is genetic suppressor element technology, developed by some of the present inventors (in U.S. Pat. Nos. 5,217,889, 5,665,550, 5,753,432, 5,811,234, 5,866,328, 5,942,389, 6,043,340, 6,060,134, 6,083,745, 6,083,746, 6,197,521, 6,268,134, 6,281,011 and 6,326,488, each of which is incorporated by reference in its entirety). Genetic suppressor elements (GSEs) are biologically active cDNA fragments that interfere with the function of the gene from which they are derived. GSEs may encode antisense RNA molecules that inhibit gene expression or peptides corresponding to functional protein domains, which interfere with protein function as dominant inhibitors. The general strategy for the isolation of biologically active GSEs involves the preparation of an expression library containing randomly fragmented DNA of the target gene or genes. This library is then introduced into recipient cells, followed by selection for the desired phenotype and recovery of biologically active GSEs from the selected cells. By using a single cDNA as the starting material for GSE selection, one can generate specific inhibitors of the target gene and map functional domains in the target protein. By using a mixture of multiple genes or the entire genome as the starting material, GSE selection allows one to identify genes responsible for a specific cellular function, since such genes will give rise to GSEs inhibiting this function. In a variation of this approach, the vector used for library preparation contains sequences permitting regulated expression of cDNA fragments cloned therein.

This method can be used to identify genes required for tumor cell growth by subjecting the cells to negative growth selection. One example of this type of selection is known in the art as bromodeoxyuridine (BrdU) suicide selection, which has long been used to select conditional-lethal mutants (Stetten et al., 1977, Exp. Cell Res. 108: 447-452) and growth-inhibitory DNA sequences (Padmanabhan et al., 1987, Mol. Cell. Biol. 7: 1894-1899). The basis of BrdU suicide selection is the destruction of cells that replicate their DNA in the presence of BrdU. BrdU is a photoactive nucleotide that incorporates into DNA and causes lethal DNA crosslinking upon illumination with white light in the presence of Hoechst 33342. The only cells that survive this selection are cells that do not replicate their DNA while BrdU is present, such as cells that express growth-inhibitory genes or GSEs. One advantage of this method is very low background of surviving cells. When used with GSE libraries under the control of an inducible vector, this selection method excludes spontaneously arising BrdU-resistant mutants by the insensitivity of their phenotype to the presence or absence of the inducing agent. Another major advantage of this technique is its sensitivity for weak growth-inhibitory GSEs: even if only a small fraction of GSE-containing cells are growth-inhibited by GSE induction, such cells will survive BrdU suicide and will give rise to a recovering clone.
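The logic by which inducer dependence separates GSE-carrying clones from spontaneous BrdU-resistant mutants can be sketched in Python. This is an illustrative simplification only: the function name `classify_clone` and the boolean survival inputs are hypothetical, and actual scoring would rest on replicate colony counts rather than a yes/no outcome.

```python
def classify_clone(survives_with_inducer, survives_without_inducer):
    """Classify a clone by its survival of BrdU suicide selection
    performed in the presence and in the absence of the inducing agent."""
    if survives_with_inducer and not survives_without_inducer:
        # Growth arrest (and hence protection from BrdU) requires induction.
        return "candidate GSE"
    if survives_with_inducer and survives_without_inducer:
        # Phenotype is insensitive to the inducer, so it is excluded.
        return "spontaneous BrdU resistance"
    return "not selected"


print(classify_clone(True, False))
print(classify_clone(True, True))
print(classify_clone(False, False))
```

Only the first outcome, survival that depends on the inducing agent, is consistent with a growth-inhibitory element expressed from the inducible vector.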

The applicability of this approach to the isolation of growth-inhibitory GSEs was first demonstrated by Pestov and Lau (1994, Proc. Natl. Acad. Sci. USA 91: 12549-12553). These workers used an IPTG-inducible plasmid expression vector to isolate cytostatic GSEs from a mixture of cDNA fragments from 19 murine genes associated with the G₀/G₁ transition. In this work, three of the genes in the mixture gave rise to growth-inhibitory GSEs (Pestov and Lau, 1994, Id.). In a subsequent study, Pestov et al. (1998, Oncogene 17: 3187-3197) used the same approach to isolate one full-length and one truncated cDNA clone with growth-inhibitory activity from a 40,000-clone library of nominally full-length mouse cDNA. However, the method disclosed in the art cannot be efficiently used for transducing a library of random fragments representing the total mRNA population from a mammalian cell such as a tumor cell because the method relies on plasmid expression vectors for library construction, and only a limited number of cells can be stably transfected by such libraries.

There remains a need in the art to discover novel genes and novel functions of known genes necessary for tumor cell growth, especially by using methods for identifying genes based on function. There is also a need in the art to identify targets for therapeutic drug treatment, particularly targets for inhibiting tumor cell growth, and to develop compounds that inhibit the identified targets and thereby inhibit tumor cell growth.

SUMMARY OF THE INVENTION

The present invention identifies genes that are targets for developing drugs for the treatment of cancer by inhibiting tumor cell growth. Such genes are identified as disclosed herein through expression selection of genetic suppressor elements (GSEs) that inhibit the growth of tumor cells in vitro. This selection has revealed multiple genes, some of which have been previously known to play a role in cell proliferation, whereas others were not known to be involved in cell proliferation prior to the instant invention; the latter genes constitute novel drug targets and are set forth in Table 3.

In a first embodiment, the invention provides a method for identifying a compound that inhibits growth of a mammalian cell, the method comprising the steps of:

- (a) culturing a cell in the presence or absence of the compound;
- (b) assaying the cell for expression or activity in the sample of one or a plurality of the genes set forth in Table 3; and
- (c) identifying the compound when expression or activity in the sample of at least one of the genes set forth in Table 3 is lower in the presence of the compound than in the absence of the compound.
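The decision rule in step (c) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the disclosed method: the function names, the dictionary layout, the gene labels `GENE_A` and `GENE_B`, and the example values are all hypothetical, and a practical screen would compare replicate measurements statistically rather than single values.

```python
def inhibited_genes(with_compound, without_compound):
    """Return the assayed genes whose expression or activity is lower
    in the presence of the compound than in its absence."""
    return sorted(
        gene
        for gene, treated in with_compound.items()
        if gene in without_compound and treated < without_compound[gene]
    )


def is_candidate(with_compound, without_compound):
    """A compound is identified when at least one assayed gene is inhibited."""
    return len(inhibited_genes(with_compound, without_compound)) > 0


# Hypothetical, unitless expression values for two assayed genes.
untreated = {"GENE_A": 1.00, "GENE_B": 1.00}
treated = {"GENE_A": 0.40, "GENE_B": 1.05}
print(inhibited_genes(treated, untreated))
print(is_candidate(treated, untreated))
```

In this hypothetical example only `GENE_A` is lower in the treated sample, which is sufficient under step (c) to identify the compound.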

In preferred embodiments, the cell is a mammalian cell, preferably a human cell, and most preferably a human tumor cell. In further preferred embodiments, gene inhibition is detected by hybridization with a nucleic acid complementary to the gene, biochemical assay for an activity of the gene or immunological reaction with an antibody specific for an antigen comprising the gene product. In a preferred embodiment, the cell is a recombinant cell in which a reporter gene is operably linked to a promoter from a cellular gene in Table 3, so that inhibition is detected as lower expression of the reporter gene in the presence of the compound than in the absence of the compound. In further preferred embodiments, the cell is assayed for cell growth in the presence and absence of the compound, to identify compounds that inhibit both cell growth and a gene identified in Table 3.

The invention also provides compounds that inhibit tumor cell growth that are identified by the methods of the invention, and pharmaceutical formulations of said compounds. The invention specifically provides peptides encoded by sense-oriented genetic suppressor elements of the invention. In addition the invention provides peptide mimetics comprising all or a portion of any of said peptides, peptido-, organo- or chemical mimetics thereof.

In a second embodiment, the invention provides a method for assessing efficacy of a treatment of a disease or condition relating to abnormal cell proliferation or tumor cell growth, comprising the steps of:

- (a) obtaining a biological sample comprising cells from an animal having a disease or condition relating to abnormal cell proliferation or tumor cell growth before treatment and after treatment with a compound that inhibits expression or activity of a gene identified in Table 3;
- (b) comparing expression or activity of at least one gene in Table 3 after treatment with the compound with expression or activity of said genes before treatment with the compound; and
- (c) determining that said treatment with the compound has efficacy for treating the disease or condition relating to abnormal cell proliferation or neoplastic cell growth if expression or activity of at least one gene in Table 3 is lower after treatment than before treatment.

In preferred embodiments, the cell is a mammalian cell, preferably a human cell, and most preferably a tumor cell.

In a third aspect, the invention provides a method for inhibiting tumor cell growth, the method comprising the steps of contacting a tumor cell with an effective amount of a compound that inhibits expression of a gene in Table 3.

In a fourth aspect, the invention provides a method for treating a disease or condition relating to abnormal cell proliferation or tumor cell growth, the method comprising the steps of administering to an animal having said disease or condition a therapeutically effective amount of a compound that inhibits expression of a gene in Table 3.

Pharmaceutically acceptable compositions effective according to the methods of the invention, comprising a therapeutically effective amount of a peptide or peptide mimetic of the invention capable of inhibiting tumor cell growth and a pharmaceutically acceptable carrier or diluent, are also provided.

Specific preferred embodiments of the present invention will become evident from the following more detailed description of certain preferred embodiments and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram illustrating the principles of genetic suppressor element technology.

FIG. 2 is a schematic diagram of the structure of the LNXCO3 retroviral vector.

FIG. 3 is a schematic diagram of the BrdU selection protocol.

FIG. 4 is a photograph of cell culture plates containing library-transduced cells subjected to BrdU suicide selection in the presence or in the absence of IPTG, immediately after G418 selection (top), after one round of BrdU suicide selection in the presence of IPTG (middle), or after two rounds of BrdU suicide selection in the presence of IPTG (bottom).

FIG. 5 is a bar diagram of the results of testing of cell populations transduced with individual GSEs for IPTG-dependent resistance to BrdU suicide, measured in triplicate and expressed as mean and standard deviation of the numbers of colonies surviving BrdU suicide selection in the presence and in the absence of IPTG. Sequences for the shown results are GSE (SEQ ID NO): GBC-1 (79), GBC-3 (94), STAT3 (205), STAT5b (211), RPL31 (192), GBC-11 (85), L1CAM (125), INTB5 (112), PKCeta (170), VWF (225), ZIN (228), HSPCA (103), CDC20 (37), PKC zeta (172), CDK10 (39), DAP3 (59), RPA3 (190), NFκB1 (157), HES6 (99), and MBD1 (142).

FIG. 6 is a bar diagram of the results of IPTG growth inhibition assays carried out with clonal cell lines transduced with individual GSEs, measured in triplicate and expressed as mean and standard deviation of the cell numbers after 7 days of culture in the presence and in the absence of IPTG. Sequences for the shown results are GSE (SEQ ID NO): HNRPF (101), HRMT1L2 (102), STAT5b (211), CCND1 (57), 28S RNA (17), RPL31 (192), CDK2 (40), AHRG (183), GBC-1 (79), L1CAM (125), NIN283 (158), MYL6 (155), DAP3 (59), TAF7 (215), STAT3 (205), IF1 (32), GBC-11 (85), LYN (138), c-KIT (48), GBC-3 (94), eIF-3 (62), PKCeta (170), EFNA1 (67), ATF4 (27), HNRPA2B1 (102), GBC-12 (86), INTB5 (112), BAM22 (35), FOS (43), FGFR1 (77), and KIAA1270 (123).

FIGS. 7A and 7B are photomicrographs illustrating the morphological effects of an L1CAM-derived GSE (SEQ ID NO 134) in a clonal IPTG-inhibited cell line. FIG. 7A shows the effects on cell morphology of four-day treatment with IPTG. FIG. 7B shows evidence of mitotic catastrophe in IPTG-treated cells.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

This invention provides target genes involved in cell growth, preferably tumor cell growth, methods for identifying compounds that inhibit expression or activity of these genes and methods for specifically inhibiting tumor cell growth by inhibiting expression or activity of these genes. Preferably, the methods of the invention do not substantially affect normal cell growth.

This invention provides methods for identifying genes that are required for tumor cell growth. Such genes, which are potential targets for new anticancer drugs, are identified through expression selection of genetic suppressor elements (GSEs). GSEs are biologically active sense- or antisense-oriented cDNA fragments that inhibit the function of the gene from which they are derived. Expression of GSEs derived from genes involved in cell proliferation is expected to inhibit cell growth. According to the inventive methods, such GSEs are isolated by so-called “suicide selection” of cells the growth of which is inhibited under cell culture conditions in which growing cells are specifically killed. In a preferred embodiment the suicide selection protocol is bromodeoxyuridine (BrdU) suicide selection, in which cells are incubated with BrdU and then illuminated with bright light. Growing cells incorporate BrdU into chromosomal DNA, making the DNA sensitive to illumination with light, which specifically kills growing cells. GSEs are produced starting from a normalized (reduced-redundance) library of human cDNA fragments in an inducible retroviral vector. In preferred embodiments, the recipient cells are tumor cells, most preferably human tumor cells, for example breast carcinoma cells.

For the purposes of this invention, reference to “a cell” or “cells” is intended to be equivalent, and particularly encompasses in vitro cultures of mammalian cells grown and maintained as known in the art.

For the purposes of this invention, reference to “cellular genes” in the plural is intended to encompass a single gene as well as two or more genes. It will also be understood by those with skill in the art that effects of modulation of cellular gene expression, or reporter constructs under the transcriptional control of promoters derived from cellular genes, can be detected in a first gene and then the effect replicated by testing a second or any number of additional genes or reporter gene constructs. Alternatively, expression of two or more genes or reporter gene constructs can be assayed simultaneously within the scope of this invention.

Recombinant expression constructs can be introduced into appropriate mammalian cells as understood by those with skill in the art. Preferred embodiments of said constructs are produced in transmissible vectors, more preferably viral vectors and most preferably retrovirus vectors, adenovirus vectors, adeno-associated virus vectors, and vaccinia virus vectors, as known in the art. See, generally, MAMMALIAN CELL BIOTECHNOLOGY: A PRACTICAL APPROACH, (Butler, ed.), Oxford University Press: New York, 1991, pp. 57-84.

In additionally preferred embodiments, the recombinant cells of the invention contain a construct encoding an inducible retroviral vector comprising random cDNA fragments from total tumor cell mRNA, wherein the fragments are each under the transcriptional control of an inducible promoter. In more preferred embodiments, the inducible promoter is responsive to a trans-acting factor whose effects can be modulated by an inducing agent. The inducing agent can be any factor that can be manipulated experimentally, including temperature and, most preferably, the presence or absence of a chemical inducer. Preferably, the inducing agent is a chemical compound, most preferably a physiologically-neutral compound that is specific for the trans-acting factor. In the use of constructs comprising inducible promoters as disclosed herein, expression of the random cDNA fragments from the recombinant expression construct is mediated by contacting the recombinant cell with an inducing agent that induces transcription from the inducible promoter or by removing an agent that inhibits transcription from such promoter. A variety of inducible promoters and cognate trans-acting factors are known in the prior art, including heat shock promoters that can be activated by increasing the temperature of the cell culture, and more preferably promoter/factor pairs such as the tet promoter and fusions thereof with mammalian transcription factors (as are disclosed in U.S. Pat. Nos. 5,654,168, 5,851,796, and 5,968,773), and the bacterial lac promoter of the lactose operon and its cognate lacI repressor protein. In a preferred embodiment, the recombinant cell expresses the lacI repressor protein and a recombinant expression construct encoding the random cDNA fragments under the control of a promoter comprising one or a multiplicity of lac-responsive elements, wherein expression of the fragments can be induced by contacting the cells with the physiologically-neutral inducing agent, isopropylthio-β-galactoside (IPTG). In this preferred embodiment, the lacI repressor is encoded by a recombinant expression construct identified as 3'SS (commercially available from Stratagene, La Jolla, Calif.).

The invention also provides recipient cell lines suitable for selection of growth-inhibitory GSEs. In preferred embodiments, the cell lines are human breast, lung, colon and prostate carcinoma cells, modified to comprise a trans-acting factor such as the lac repressor and further to express a retroviral receptor cognate to the tropism of the retroviral vector in which the library is constructed. In a preferred embodiment, the cells are modified to express the bacterial lac operon repressor, lacI (to allow for IPTG-inducible gene expression) and to express the ecotropic mouse retroviral receptor (to enable high-efficiency infection with ecotropic recombinant retroviruses). In alternative preferred embodiments, the cells are telomerase-immortalized normal human fibroblasts and retinal pigment and mammary epithelial cells that have been modified to express lacI and the mouse ecotropic retrovirus receptor.

The invention utilizes modifications of methods of producing genetic suppressor elements (GSEs) for identifying tumor cell growth controlling genes. These DNA fragments are termed “GSE” herein to designate both sense- and antisense-oriented gene fragments that can inhibit or modify the function of the target gene when expressed in a cell. Both types of functional GSEs can be generated by random fragmentation of the DNA of the target gene and identified by function-based selection of fragments that confer the desired cellular phenotype such as cell growth inhibition. Such function-based GSE selection makes it possible to develop genetic inhibitors for the selected targets, identify protein functional domains, and identify genes involved in various complex phenotypes.

A generalized scheme of GSE selection is shown in FIG. 1. Originally developed using a model bacterial system (see U.S. Pat. No. 5,217,889, incorporated by reference), this method has been adapted for use in mammalian cells. Because less than 1% of random fragments derived from a typical cDNA have GSE activity, the size of expression libraries required for GSE selection is much larger than the corresponding size of libraries that can be used for function-based selection of full-length cDNAs. Retroviral vectors are used to deliver such large libraries into mammalian cells, because they provide a non-stressful delivery system that can be used for stable transduction into a very high fraction (up to 100%) of recipient cells. In the preparation of these retroviral-based libraries, packaging cell lines are used, most preferably human 293-based packaging cell lines, such as BOSC23 (Pear et al., 1993, Proc. Natl. Acad. Sci. USA 90: 8392-8396), which provide efficient and uniform retrovirus packaging after transient transfection (Gudkov and Roninson, 1997, in METHODS IN MOLECULAR BIOLOGY: cDNA LIBRARY PROTOCOLS, Cowell and Austin, eds. (Totowa, N.J.: Humana Press), pp. 221-240). Additionally, large-scale expression selection required modifications in conventional retroviral vectors. The retroviral vectors used to produce the normalized tumor libraries of the invention carry one constitutively expressing and one inducible promoter, which minimizes the problem of promoter interference under non-inducing conditions. Preferred embodiments of the modified retroviral vectors of the invention express the bacterial neomycin resistance gene (neo, selectable in mammalian cells with G418) from an LTR promoter in the retrovirus. The vectors also contain a multiple cloning site 3′ to the selectable marker gene and adjacent to a regulatable promoter comprising promoters from cytomegalovirus (CMV) or Rous sarcoma virus (RSV) LTR containing 2-4 bacterial lac operator sequences. The regulatable promoter is cloned in the anti orientation to the retroviral LTR. A diagram of the topography of one of these viruses, LNXCO3, is shown in FIG. 2. In alternative embodiments, the neo gene is exchanged for a gene encoding green fluorescent protein (Kandel et al., 1997, Somat. Cell Genet. 23: 325-340) or firefly luciferase (Chang et al., 1999, Oncogene 18: 4808-4818). As a positive control for growth inhibition, an embodiment of LNXCO3 was used that expressed human p21, a CDK inhibitor known to strongly inhibit tumor cell growth (see International Patent Application, Publication No. WO01/38532, incorporated by reference).

The invention provides a normalized cDNA fragment library from a mixture of poly(A)+ RNA preparations from one or a multiplicity of human cell lines, derived from different types of cancer. This normalized library is prepared in a vector, preferably a retroviral vector and most preferably a retroviral vector containing sequences permitting regulated expression of cDNA fragments cloned therein. In a preferred embodiment, the vector is the retroviral vector LNXCO3, comprising a promoter inducible by isopropyl-β-thio-galactoside (IPTG), a physiologically neutral agent.

The invention provides methods for isolating growth-inhibitory GSEs from a normalized cDNA fragment library, representing most of the expressed genes in a human tumor cell. As provided herewith, normalized cDNA fragment libraries contain on the order of 5×10⁷ clones (Gudkov et al., 1994, Proc. Natl. Acad. Sci. USA 91: 3744-3748; Levenson et al., 1999, Somat. Cell Molec. Genet. 25: 9-26), corresponding to >1,000 cDNA fragments per gene. Selection of individual GSEs from a library of this size requires a procedure with high sensitivity and low background, most preferably BrdU suicide selection. The principle of BrdU suicide selection is illustrated in FIG. 3. In preferred embodiments, the GSEs are expressed under the control of an inducible promoter, most preferably a promoter that is inducible by a physiologically neutral agent (such as IPTG), provided that the growth inhibitor is induced prior to the addition of BrdU. Following BrdU selection, the inducer is washed from the culture and cells infected with growth-inhibitory GSEs begin to proliferate, thus providing colonies of cells harboring selected GSEs.
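The figure of more than 1,000 fragments per gene can be checked with simple arithmetic. The sketch below assumes, for illustration only, that the library clones are spread evenly over the genes and uses the genome-wide gene-count estimates of 30,000 and 45,000 cited in the background section; because only a subset of genes is expressed in any one cell, these counts give a conservative lower bound. The function name `fragments_per_gene` is hypothetical.

```python
def fragments_per_gene(library_clones, gene_count):
    """Average number of library clones per gene, assuming an even
    distribution of clones over genes (an idealization of normalization)."""
    return library_clones / gene_count


LIBRARY_CLONES = 5e7  # order of magnitude stated in the text

for gene_count in (30_000, 45_000):
    average = fragments_per_gene(LIBRARY_CLONES, gene_count)
    print(f"{gene_count} genes: about {average:.0f} clones per gene")
```

Both estimates exceed 1,000 clones per gene, consistent with the statement above.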

BrdU suicide is not the only technique that can be used to select growth-inhibitory genes or GSEs. In one alternative approach, cells are labeled with a fluorescent dye that integrates into the cell membrane and is redistributed between daughter cells with each round of cell division. As a result, cells that have divided the smallest number of times after labeling show the highest fluorescence and can be isolated by FACS (Maines et al., 1995, Cell Growth Differ. 6: 665-671). It is also possible to isolate cells that die upon the addition of the inducer, by collecting floating dead cells or isolating apoptotic cells on the basis of altered staining with DNA-binding fluorescent dyes. These methods have been used to isolate GSEs from single-gene cDNA fragment libraries prepared from the MDR1 gene (Zuhn, 1996, Ph.D. Thesis, Department of Genetics, University of Illinois at Chicago, Chicago, Ill.) or from BCL2 (U.S. Pat. No. 5,789,389, incorporated by reference). There are no theoretical problems with any of these approaches, and all of them work to enrich for growth-inhibitory elements in low-complexity libraries. The only disadvantage of these alternatives when compared with BrdU selection is that they have higher spontaneous background rates that may prevent rare clones from being selected from an exceedingly complex normalized library. Thus, BrdU selection is the preferred embodiment of the inventive methods.
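The relationship between fluorescence and division number in the dye-dilution approach can be sketched as follows. The sketch assumes, as an idealization not stated in the text, that the dye is split equally between the two daughter cells and is not otherwise lost, so that fluorescence halves at each division; the function name and the example intensities are hypothetical.

```python
import math


def estimated_divisions(initial_fluorescence, current_fluorescence):
    """Estimate the number of divisions since labeling, assuming the
    per-cell fluorescence halves with every division."""
    if initial_fluorescence <= 0 or current_fluorescence <= 0:
        raise ValueError("fluorescence values must be positive")
    return math.log2(initial_fluorescence / current_fluorescence)


# Hypothetical intensities in arbitrary units.
print(estimated_divisions(1000.0, 1000.0))  # undivided, growth-arrested cell
print(estimated_divisions(1000.0, 125.0))   # cell that has divided repeatedly
```

Under this assumption the brightest cells are those with the fewest estimated divisions, which is the population collected by FACS.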

Prior art methods (Pestov and Lau, 1994, Id.) for adapting GSE technology to identify growth-inhibitory GSEs were of limited utility when applied to total tumor cell cDNA. The prior art methods cannot be efficiently used for transducing a library representing the total mRNA population from a mammalian cell such as a tumor cell because they rely on plasmid expression vectors for library construction, and only a limited number of cells can be stably transfected by such libraries. To overcome this limitation, the invention provides a set of inducible retroviral vectors that are regulated by IPTG through the bacterial LacI repressor. This inducible system provides comparable levels of induction among most of the infected cells. The induced levels of expression can be finely regulated by using different doses of IPTG.

The methods of the invention are exemplified herein by use of this IPTG-inducible retroviral system to generate a normalized cDNA library from human breast cancer cells. This library was used to select GSEs that induce growth arrest in a breast carcinoma cell line. Using this approach, more than 90 genes were identified that were enriched by BrdU suicide selection. Many of these GSEs were shown to have a growth-inhibiting effect when re-introduced into tumor cells. Included in the genes identified using the inventive methods are known oncogenes, some of which have been specifically associated with breast cancer, as well as other genes with a known role in cell proliferation. Many of the identified genes, however, had no known function or were not previously known to play a role in cell cycle progression. The latter genes and their products represent therefore novel targets for cancer treatment. Furthermore, some of the genes giving rise to the GSEs that inhibited the proliferation of breast carcinoma cells appear to be inessential for normal cell growth, since homozygous knockout of these genes does not prevent the development of adult mice.

The invention provides methods for cloning unknown genes containing GSEs identified using GSE libraries and negative growth selection methods of the invention. In the practice of this aspect of the methods of the invention, GSEs with no homology to known human genes in the NCBI database are used to clone unknown genes by any technique known in the art.

In a preferred embodiment, genomic DNA is isolated from the two-step selected library-transduced cells and used as a template for PCR, using vector-derived sequences flanking the inserts as primers. The PCR-amplified mixture of inserts from the selected cells is recloned into a vector. In further preferred embodiments, the vector is a TA cloning vector from Invitrogen Life Technologies that facilitates direct cloning of PCR products. Plasmid clones from the library of selected fragments are sequenced by high-throughput DNA sequencing using vector-derived sequences flanking the inserts as primers. The sequences of growth-inhibitory GSEs are used as query for the BLAST homology search in the NCBI nr database to identify genes that gave rise to the selected GSE fragments.
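The step of recovering each insert from a sequencing read before the BLAST search can be sketched as follows. This is an illustrative sketch only: the flank sequences passed in are placeholders for the actual vector-derived sequences, which are not given here, and the helper names are hypothetical. Because GSEs may be oriented in either direction, the reverse complement is provided as well.

```python
# Complement table for DNA, including the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def extract_insert(read, left_flank, right_flank):
    """Return the sequence between two vector flanks, or None if a flank is missing."""
    read = read.upper()
    left = read.find(left_flank.upper())
    if left == -1:
        return None
    start = left + len(left_flank)
    right = read.find(right_flank.upper(), start)
    if right == -1:
        return None
    return read[start:right]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]
```

The extracted insert, rather than the raw read, would then be submitted as the query so that vector sequence does not produce spurious database matches.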

In cases where no match can be found in the database, a pair of oppositely directed primers is designed according to the GSE sequence. cDNA from the same human cell lines from which the normalized GSE library is derived is used as template. Rapid Amplification of cDNA Ends (RACE) is performed using techniques known in the art to capture the missing parts of the cDNA (Frohman et al., 1988, Proc. Natl. Acad. Sci. USA 85: 8998-9002; also see U.S. Pat. Nos. 5,578,467 and 5,334,515, incorporated by reference). Full-length cDNA of the unknown gene can be obtained by assembling the RACE products with the GSE clone. In a preferred embodiment, the GSE is used to BLAST search the NCBI human EST database. The longest corresponding EST is obtained from the I.M.A.G.E. Consortium (distributed by American Type Culture Collection or Research Genetics) and sequence verified. ORF Finder from NCBI is used to identify a putative open reading frame from the GSE, which helps to determine whether the cDNA fragment lacks the 3′ and/or the 5′ portion. The RACE primers are designed according to the extended cDNA sequence based on the EST sequence to amplify the end segments.
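The reasoning behind the open reading frame check can be sketched in a few lines. This is not the NCBI ORF Finder algorithm; it is a simplified illustration that scans only the three forward frames of a sense-oriented fragment, and the function and key names are hypothetical. A stop-free stretch that runs to the start of the fragment suggests that 5′ sequence may be missing, and one that runs to the end without a stop codon suggests that 3′ sequence may be missing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_stretches(seq, frame):
    """Return (start, end, ends_with_stop) for each stop-free stretch in one frame."""
    seq = seq.upper()
    stretches = []
    start = frame
    last = frame + 3 * ((len(seq) - frame) // 3)  # end of the last complete codon
    for pos in range(frame, last, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > start:
                stretches.append((start, pos, True))
            start = pos + 3
    if last > start:
        stretches.append((start, last, False))
    return stretches


def summarize_longest(seq):
    """Describe the longest stop-free stretch over the three forward frames."""
    best = None
    for frame in range(3):
        for start, end, ends_with_stop in open_stretches(seq, frame):
            if best is None or end - start > best["end"] - best["start"]:
                best = {
                    "frame": frame,
                    "start": start,
                    "end": end,
                    # No in-frame stop upstream: the frame may continue 5' of the fragment.
                    "may_lack_5prime": start == frame,
                    # No stop codon reached: the frame may continue 3' of the fragment.
                    "may_lack_3prime": not ends_with_stop,
                }
    return best
```

The two flags indicate in which direction RACE primers would be needed; a fragment flagged on both sides would call for both 5′ and 3′ RACE.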

Alternatively, a GSE with no homology to known human genes in the NCBI database is PCR-amplified using primers derived from the end sequences of said GSE. The PCR product is then used as a probe to screen a cDNA library constructed from the same human cell lines from which the GSE library is derived. Positive clones that hybridize to said probe are sequenced to identify a putative open reading frame. In cases where the cDNA is not full-length, a RACE experiment is performed as described hereinabove.

The invention provides methods for measuring gene expression or activity of the gene products corresponding to GSEs identified using GSE libraries and negative growth selection methods of the invention. In the practice of this aspect of the methods of the invention, gene expression or gene product activity is assayed in cells in the presence or absence of a compound to determine whether the compound inhibits expression or activity of such a gene or gene product. In preferred embodiments, gene expression is assayed using any technique known in the art, such as comparison of northern blot hybridization to cellular mRNA using a detectably-labeled probe (as disclosed, for example, in Sambrook et al., 2001, MOLECULAR CLONING: A LABORATORY MANUAL, 3rd ed., Cold Spring Harbor Laboratory Press: N.Y.), or by in vitro amplification methods, such as quantitative reverse transcription-polymerase chain reaction (RT-PCR) assays as disclosed by Noonan et al. (1990, Proc. Natl. Acad. Sci. USA 87: 7160-7164), or by western blotting using antibodies specific for the gene product (Sambrook et al., 2001, Id.). Gene product activity is assayed using assays specific for each gene product, such as immunoassay using antibodies specific for said gene products or biochemical assay of gene product function.

Alternatively, gene expression is assayed using recombinant expression constructs having a promoter from a gene corresponding to GSEs identified using GSE libraries and negative growth selection methods of the invention, wherein the promoter is operably linked to a reporter gene. The reporter gene is then used as a sensitive and convenient indicator of the effects of test compounds on gene expression, and enables compounds that inhibit expression or activity of genes required for cell growth, preferably tumor cell growth, to be easily identified. Host cells for these constructs include any cell expressing the corresponding growth-promoting gene. Reporter genes useful in the practice of this aspect of the invention include but are not limited to firefly luciferase, Renilla luciferase, chloramphenicol acetyltransferase, beta-galactosidase, green fluorescent protein, and alkaline phosphatase.
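How a reporter readout might be converted into a measure of inhibition is sketched below. The sketch assumes a two-reporter layout in which the promoter of interest drives one reporter and a second, constitutively expressed reporter corrects for cell number and transfection efficiency; this normalization scheme and the function names are assumptions made for illustration and are not prescribed by the methods described herein.

```python
def normalized_reporter(test_signal, control_signal):
    """Divide the promoter-driven signal by the control reporter signal."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return test_signal / control_signal


def percent_inhibition(treated, untreated):
    """Percent reduction in normalized reporter activity caused by a compound."""
    if untreated <= 0:
        raise ValueError("untreated activity must be positive")
    return 100.0 * (1.0 - treated / untreated)
```

For example, a normalized activity of 0.5 with compound against 2.0 without compound corresponds to 75% inhibition of reporter expression.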

The invention provides peptides encoded by some of the GSEs of the invention that have been identified using the GSE-negative growth selection methods disclosed herein. Such peptides are presented in Table 5 and in the Sequence Listing as SEQ ID NOS. 229-314. Some of these peptides are derived from proteins that were previously known to play a role in cell proliferation, and others from proteins that were first assigned such a role in the instant invention. All of the identified peptides, however, are novel inhibitors of tumor cell proliferation. Also provided are related compounds within the understanding of those with skill in the art, such as chemical mimetics, organomimetics or peptidomimetics. As used herein, the terms “mimetic,” “peptide mimetic,” “peptidomimetic,” “organomimetic” and “chemical mimetic” are intended to encompass peptide derivatives, peptide analogues and chemical compounds having an arrangement of atoms in a three-dimensional orientation that is equivalent to that of a peptide encoded by a GSE of the invention. It will be understood that the phrase “equivalent to” as used herein is intended to encompass compounds having substitution of certain atoms or chemical moieties in said peptide with moieties having bond lengths, bond angles and arrangements thereof in the mimetic compound that produce the same or sufficiently similar arrangement or orientation of said atoms and moieties to have the biological function of the peptide GSEs of the invention. In the peptide mimetics of the invention, the three-dimensional arrangement of the chemical constituents is structurally and/or functionally equivalent to the three-dimensional arrangement of the peptide backbone and component amino acid sidechains in the peptide, resulting in such peptido-, organo- and chemical mimetics of the peptides of the invention having substantial biological activity.
These terms are used according to the understanding in the art, as illustrated for example by Fauchere, 1986, Adv. Drug Res. 15: 29; Veber & Freidinger, 1985, TINS p. 392; and Evans et al., 1987, J. Med. Chem. 30: 1229, incorporated herein by reference.

It is understood that a pharmacophore exists for the biological activity of each peptide GSE of the invention. A pharmacophore is understood in the art as comprising an idealized, three-dimensional definition of the structural requirements for biological activity. Peptido-, organo- and chemical mimetics can be designed to fit each pharmacophore with current computer modeling software (computer aided drug design). Said mimetics are produced by structure-function analysis, based on the positional information from the substituent atoms in the peptide GSEs of the invention.

Peptides as provided by the invention can be advantageously synthesized by any of the chemical synthesis techniques known in the art, particularly solid-phase synthesis techniques, for example, using commercially-available automated peptide synthesizers. The mimetics of the present invention can be synthesized by solid phase or solution phase methods conventionally used for the synthesis of peptides (see, for example, Merrifield, 1963, J. Amer. Chem. Soc. 85: 2149-54; Carpino, 1973, Acc. Chem. Res. 6: 191-98; Birr, 1978, ASPECTS OF THE MERRIFIELD PEPTIDE SYNTHESIS, Springer-Verlag: Heidelberg; THE PEPTIDES: ANALYSIS, SYNTHESIS, BIOLOGY, Vols. 1, 2, 3, 5, (Gross & Meinhofer, eds.), Academic Press: New York, 1979; Stewart et al., 1984, SOLID PHASE PEPTIDE SYNTHESIS, 2nd. ed., Pierce Chem. Co.: Rockford, Ill.; Kent, 1988, Ann. Rev. Biochem. 57: 957-89; and Gregg et al., 1990, Int. J. Peptide Protein Res. 55: 161-214, which are incorporated herein by reference in their entirety.)

The use of solid phase methodology is preferred. Briefly, an N-protected C-terminal amino acid residue is linked to an insoluble support such as divinylbenzene cross-linked polystyrene, polyacrylamide resin, Kieselguhr/polyamide (pepsyn K), controlled pore glass, cellulose, polypropylene membranes, acrylic acid-coated polyethylene rods or the like. Cycles of deprotection, neutralization and coupling of successive protected amino acid derivatives are used to link the amino acids from the C-terminus according to the amino acid sequence. For some synthetic peptides, an FMOC strategy using an acid-sensitive resin may be used. Preferred solid supports in this regard are divinylbenzene cross-linked polystyrene resins, which are commercially available in a variety of functionalized forms, including chloromethyl resin, hydroxymethyl resin, paraacetamidomethyl resin, benzhydrylamine (BHA) resin, 4-methylbenzhydrylamine (MBHA) resin, oxime resins, 4-alkoxybenzyl alcohol resin (Wang resin), 4-(2′,4′-dimethoxyphenylaminomethyl)-phenoxymethyl resin, 2,4-dimethoxybenzhydryl-amine resin, and 4-(2′,4′-dimethoxyphenyl-FMOC-amino-methyl)-phenoxyacetamidonorleucyl-MBHA resin (Rink amide MBHA resin). In addition, acid-sensitive resins also provide C-terminal acids, if desired. A particularly preferred protecting group for alpha amino acids is base-labile 9-fluorenylmethoxy-carbonyl (FMOC).

Suitable protecting groups for the side chain functionalities of amino acids chemically compatible with BOC (t-butyloxycarbonyl) and FMOC groups are well known in the art. When using FMOC chemistry, the following protected amino acid derivatives are preferred: FMOC-Cys(Trit), FMOC-Ser(But), FMOC-Asn(Trit), FMOC-Leu, FMOC-Thr(Trit), FMOC-Val, FMOC-Gly, FMOC-Lys(Boc), FMOC-Gln(Trit), FMOC-Glu(OBut), FMOC-His(Trit), FMOC-Tyr(But), FMOC-Arg(PMC (2,2,5,7,8-pentamethylchroman-6-sulfonyl)), FMOC-Arg(BOC)₂, FMOC-Pro, and FMOC-Trp(BOC). The amino acid residues can be coupled by using a variety of coupling agents and chemistries known in the art, such as direct coupling with DIC (diisopropylcarbodiimide), DCC (dicyclohexylcarbodiimide), BOP (benzotriazolyl-N-oxytrisdimethylaminophosphonium hexafluorophosphate), PyBOP (benzotriazole-1-yl-oxy-tris-pyrrolidinophosphonium hexafluorophosphate), PyBrOP (bromo-tris-pyrrolidinophosphonium hexafluorophosphate); via preformed symmetrical anhydrides; via active esters such as pentafluorophenyl esters; or via preformed HOBt (1-hydroxybenzotriazole) active esters or by using FMOC-amino acid fluorides and chlorides or by using FMOC-amino acid-N-carboxy anhydrides. Activation with HBTU (2-(1H-benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate) or HATU (2-(1H-7-aza-benzotriazole-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate) in the presence of HOBt or HOAt (7-azahydroxybenzotriazole) is preferred.

The solid phase method can be carried out manually, although automated synthesis on a commercially available peptide synthesizer (e.g., Applied Biosystems 431A or the like; Applied Biosystems, Foster City, Calif.) is preferred. In a typical synthesis, the first (C-terminal) amino acid is loaded on the chlorotrityl resin. Successive deprotection (with 20% piperidine/NMP (N-methylpyrrolidone)) and coupling cycles according to ABI FastMoc protocols (ABI user bulletins 32 and 33, Applied Biosystems) are used to build the whole peptide sequence. Double and triple coupling, with capping by acetic anhydride, may also be used.
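Because solid-phase synthesis proceeds from the C-terminus, the order in which protected derivatives are coupled is the reverse of the written sequence. The sketch below illustrates this bookkeeping using only the preferred FMOC derivatives named above; the function name is hypothetical, FMOC-Arg(PMC) is arbitrarily chosen over FMOC-Arg(BOC)₂ for arginine, and residues for which no preferred derivative is listed are rejected rather than guessed.

```python
# One-letter code -> preferred FMOC derivative, as listed in the text above.
PREFERRED_DERIVATIVES = {
    "C": "FMOC-Cys(Trit)", "S": "FMOC-Ser(But)", "N": "FMOC-Asn(Trit)",
    "L": "FMOC-Leu", "T": "FMOC-Thr(Trit)", "V": "FMOC-Val",
    "G": "FMOC-Gly", "K": "FMOC-Lys(Boc)", "Q": "FMOC-Gln(Trit)",
    "E": "FMOC-Glu(OBut)", "H": "FMOC-His(Trit)", "Y": "FMOC-Tyr(But)",
    "R": "FMOC-Arg(PMC)", "P": "FMOC-Pro", "W": "FMOC-Trp(BOC)",
}


def coupling_schedule(sequence):
    """List derivatives in order of addition for a sequence written N- to C-terminus."""
    schedule = []
    for residue in reversed(sequence.upper()):
        if residue not in PREFERRED_DERIVATIVES:
            raise KeyError(f"no preferred derivative listed for residue {residue!r}")
        schedule.append(PREFERRED_DERIVATIVES[residue])
    return schedule
```

For the hypothetical tripeptide Gly-Lys-Leu, the first entry (FMOC-Leu) is the residue loaded on the resin, and each later entry corresponds to one deprotection and coupling cycle.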

The synthetic mimetic peptide is cleaved from the resin and deprotected by treatment with TFA (trifluoroacetic acid) containing appropriate scavengers. Many such cleavage reagents, such as Reagent K (0.75 g crystalline phenol, 0.25 mL ethanedithiol, 0.5 mL thioanisole, 0.5 mL deionized water, 10 mL TFA) and others, can be used. The peptide is separated from the resin by filtration and isolated by ether precipitation. Further purification may be achieved by conventional methods, such as gel filtration and reverse phase HPLC (high performance liquid chromatography). Synthetic peptide mimetics according to the present invention may be in the form of pharmaceutically acceptable salts, especially base-addition salts including salts of organic bases and inorganic bases. The base-addition salts of the acidic amino acid residues are prepared by treatment of the peptide with the appropriate organic or inorganic base, according to procedures well known to those skilled in the art, or the desired salt may be obtained directly by lyophilization out of the appropriate base.
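The Reagent K composition given above is stated for 10 mL of TFA. As a simple worked illustration, the sketch below scales that recipe proportionally to another TFA volume; linear scaling and the function name are assumptions made for convenience, and no particular cleavage volume is prescribed herein.

```python
# Reagent K as stated in the text, per 10 mL TFA.
REAGENT_K = {
    "phenol_g": 0.75,
    "ethanedithiol_mL": 0.25,
    "thioanisole_mL": 0.5,
    "water_mL": 0.5,
    "TFA_mL": 10.0,
}


def scale_reagent_k(tfa_mL):
    """Scale every component in proportion to the requested TFA volume."""
    if tfa_mL <= 0:
        raise ValueError("TFA volume must be positive")
    factor = tfa_mL / REAGENT_K["TFA_mL"]
    return {name: round(amount * factor, 3) for name, amount in REAGENT_K.items()}
```

A call with 20 mL of TFA doubles each component, giving 1.5 g phenol, 0.5 mL ethanedithiol, 1.0 mL thioanisole and 1.0 mL water.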

Generally, those skilled in the art will recognize that peptides as described herein may be modified by a variety of chemical techniques to produce compounds having essentially the same activity as the unmodified peptide, and optionally having other desirable properties. For example, carboxylic acid groups of the peptide may be provided in the form of a salt of a pharmaceutically-acceptable cation. Amino groups within the peptide may be in the form of a pharmaceutically-acceptable acid addition salt, such as the HCl, HBr, acetic, benzoic, toluene sulfonic, maleic, tartaric and other organic salts, or may be converted to an amide. Thiols can be protected with any one of a number of well-recognized protecting groups, such as acetamide groups. Those skilled in the art will also recognize methods for introducing cyclic structures into the peptides of this invention so that the native binding configuration will be more nearly approximated. For example, a carboxyl terminal or amino terminal cysteine residue can be added to the peptide, so that when oxidized the peptide will contain a disulfide bond, thereby generating a cyclic peptide. Other peptide cyclizing methods include the formation of thioethers and carboxyl- and amino-terminal amides and esters.

Specifically, a variety of techniques are available for constructing peptide derivatives and analogues with the same or similar desired biological activity as the corresponding peptide compound but with more favorable activity than the peptide with respect to solubility, stability, and susceptibility to hydrolysis and proteolysis. Such derivatives and analogues include peptides modified at the N-terminal amino group, the C-terminal carboxyl group, and/or changing one or more of the amido linkages in the peptide to a non-amido linkage. It will be understood that two or more such modifications can be coupled in one peptide mimetic structure (e.g., modification at the C-terminal carboxyl group and inclusion of a —CH₂— carbamate linkage between two amino acids in the peptide).

Amino terminus modifications include alkylating, acetylating, adding a carbobenzoyl group, and forming a succinimide group. Specifically, the N-terminal amino group can then be reacted to form an amide group of the formula RC(O)NH— where R is alkyl, preferably lower alkyl, and is added by reaction with an acid halide, RC(O)Cl, or an acid anhydride. Typically, the reaction can be conducted by contacting about equimolar or excess amounts (e.g., about 5 equivalents) of an acid halide with the peptide in an inert diluent (e.g., dichloromethane) preferably containing an excess (e.g., about 10 equivalents) of a tertiary amine, such as diisopropylethylamine, to scavenge the acid generated during reaction. Reaction conditions are otherwise conventional (e.g., room temperature for 30 minutes). Alkylation of the terminal amino to provide for a lower alkyl N-substitution followed by reaction with an acid halide as described above will provide for an N-alkyl amide group of the formula RC(O)NR—. Alternatively, the amino terminus can be covalently linked to a succinimide group by reaction with succinic anhydride. An approximately equimolar amount or an excess of succinic anhydride (e.g., about 5 equivalents) is used and the terminal amino group is converted to the succinimide by methods well known in the art including the use of an excess (e.g., ten equivalents) of a tertiary amine such as diisopropylethylamine in a suitable inert solvent (e.g., dichloromethane), as described in Wollenberg et al., U.S. Pat. No. 4,612,132, which is incorporated herein by reference in its entirety. It will also be understood that the succinic group can be substituted with, for example, C₂— through C₆— alkyl or —SR substituents, which are prepared in a conventional manner to provide for substituted succinimide at the N-terminus of the peptide.
Such alkyl substituents are prepared by reaction of a lower olefin (C₂— through C₆— alkyl) with maleic anhydride in the manner described by Wollenberg et al., supra, and —SR substituents are prepared by reaction of RSH with maleic anhydride where R is as defined above. In another advantageous embodiment, the amino terminus is derivatized to form a benzyloxycarbonyl-NH— or a substituted benzyloxycarbonyl-NH— group. This derivative is produced by reaction with approximately an equivalent amount or an excess of benzyloxycarbonyl chloride (CBZ-Cl) or a substituted CBZ-Cl in a suitable inert diluent (e.g., dichloromethane) preferably containing a tertiary amine to scavenge the acid generated during the reaction. In yet another derivative, the N-terminus comprises a sulfonamide group formed by reaction with an equivalent amount or an excess (e.g., 5 equivalents) of R—S(O)₂Cl in a suitable inert diluent (e.g., dichloromethane) to convert the terminal amine into a sulfonamide, where R is alkyl and preferably lower alkyl. Preferably, the inert diluent contains excess tertiary amine (e.g., ten equivalents) such as diisopropylethylamine, to scavenge the acid generated during reaction. Reaction conditions are otherwise conventional (e.g., room temperature for 30 minutes). Carbamate groups are produced at the amino terminus by reaction with an equivalent amount or an excess (e.g., 5 equivalents) of R—OC(O)Cl or R—OC(O)OC₆H₄—p—NO₂ in a suitable inert diluent (e.g., dichloromethane) to convert the terminal amine into a carbamate, where R is alkyl, preferably lower alkyl. Preferably, the inert diluent contains an excess (e.g., about 10 equivalents) of a tertiary amine, such as diisopropylethylamine, to scavenge any acid generated during reaction. Reaction conditions are otherwise conventional (e.g., room temperature for 30 minutes).
Urea groups are formed at the amino terminus by reaction with an equivalent amount or an excess (e.g., 5 equivalents) of R—N═C═O in a suitable inert diluent (e.g., dichloromethane) to convert the terminal amine into a urea (i.e., RNHC(O)NH—) group where R is as defined above. Preferably, the inert diluent contains an excess (e.g., about 10 equivalents) of a tertiary amine, such as diisopropylethylamine. Reaction conditions are otherwise conventional (e.g., room temperature for about 30 minutes).
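The equivalents quoted in the amino-terminus modifications above translate directly into reagent amounts once the quantity of peptide is known. The following sketch shows the arithmetic for the exemplary values of about 5 equivalents of derivatizing reagent and about 10 equivalents of tertiary amine; the function names are hypothetical, and molar masses are supplied by the user rather than assumed.

```python
def reagent_amounts(peptide_mmol, reagent_equiv=5.0, amine_equiv=10.0):
    """Return mmol of derivatizing reagent and tertiary amine for a peptide amount."""
    if peptide_mmol <= 0:
        raise ValueError("peptide amount must be positive")
    return {
        "reagent_mmol": peptide_mmol * reagent_equiv,
        "tertiary_amine_mmol": peptide_mmol * amine_equiv,
    }


def mass_mg(amount_mmol, molar_mass_g_per_mol):
    """Convert mmol to mg (mmol x g/mol = mg)."""
    return amount_mmol * molar_mass_g_per_mol
```

For 0.1 mmol of peptide, the exemplary equivalents correspond to 0.5 mmol of reagent and 1.0 mmol of tertiary amine; a reagent with a molar mass of 100 g/mol would therefore be weighed out at 50 mg.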

In preparing peptide mimetics wherein the C-terminal carboxyl group is replaced by an ester (e.g., —C(O)OR where R is alkyl and preferably lower alkyl), resins used to prepare the peptide acids are employed, and the side chain protected peptide is cleaved with base and the appropriate alcohol, e.g., methanol. Side chain protecting groups are then removed in the usual fashion by treatment with hydrogen fluoride to obtain the desired ester. In preparing peptide mimetics wherein the C-terminal carboxyl group is replaced by the amide —C(O)NR₃R₄, a benzhydrylamine resin is used as the solid support for peptide synthesis. Upon completion of the synthesis, hydrogen fluoride treatment to release the peptide from the support results directly in the free peptide amide (i.e., the C-terminus is —C(O)NH₂). Alternatively, use of the chloromethylated resin during peptide synthesis coupled with reaction with ammonia to cleave the side chain protected peptide from the support yields the free peptide amide, and reaction with an alkylamine or a dialkylamine yields a side chain protected alkylamide or dialkylamide (i.e., the C-terminus is —C(O)NRR₁, where R and R₁ are alkyl and preferably lower alkyl). Side chain protection is then removed in the usual fashion by treatment with hydrogen fluoride to give the free amides, alkylamides, or dialkylamides.

In another alternative embodiment, the C-terminal carboxyl group or a C-terminal ester can be induced to cyclize by displacement of the —OH or the ester (—OR) of the carboxyl group or ester respectively with the N-terminal amino group to form a cyclic peptide. For example, after synthesis and cleavage to give the peptide acid, the free acid is converted in solution to an activated ester by an appropriate carboxyl group activator such as dicyclohexylcarbodiimide (DCC), for example, in methylene chloride (CH₂Cl₂), dimethyl formamide (DMF), or mixtures thereof. The cyclic peptide is then formed by displacement of the activated ester with the N-terminal amine. Cyclization, rather than polymerization, can be enhanced by use of very dilute solutions according to methods well known in the art.

Peptide mimetics as understood in the art and provided by the invention are structurally similar to the paradigm peptide encoded by each of the sense-oriented GSEs of the invention, but have one or more peptide linkages optionally replaced by a linkage selected from the group consisting of: —CH₂NH—, —CH₂S—, —CH₂CH₂—, —CH═CH— (in both cis and trans conformers), —COCH₂—, —CH(OH)CH₂—, and —CH₂SO—, by methods known in the art and further described in the following references: Spatola, 1983, in CHEMISTRY AND BIOCHEMISTRY OF AMINO ACIDS, PEPTIDES, AND PROTEINS, (Weinstein, ed.), Marcel Dekker: New York, p. 267; Spatola, 1983, Peptide Backbone Modifications 1: 3; Morley, 1980, Trends Pharm. Sci. pp. 463-468; Hudson et al., 1979, Int. J. Pept. Prot. Res. 14: 177-185; Spatola et al., 1986, Life Sci. 38: 1243-1249; Hann, 1982, J. Chem. Soc. Perkin Trans. 1: 307-314; Almquist et al., 1980, J. Med. Chem. 23: 1392-1398; Jennings-White et al., 1982, Tetrahedron Lett. 23: 2533; Szelke et al., 1982, European Patent Application, Publication No. EP045665A; Holladay et al., 1983, Tetrahedron Lett. 24: 4401-4404; and Hruby, 1982, Life Sci. 31: 189-199, each of which is incorporated herein by reference. Such peptide mimetics may have significant advantages over polypeptide embodiments, including, for example: being more economical to produce, having greater chemical stability or enhanced pharmacological properties (such as half-life, absorption, potency, efficacy, etc.), reduced antigenicity, and other properties.

Mimetic analogs of the tumor-inhibiting peptides of the invention may also be obtained using the principles of conventional or rational drug design (see, Andrews et al., 1990, Proc. Alfred Benzon Symp. 28: 145-165; McPherson, 1990, Eur. J. Biochem. 189: 1-24; Hol et al., 1989a, in MOLECULAR RECOGNITION: CHEMICAL AND BIOCHEMICAL PROBLEMS, (Roberts, ed.); Royal Society of Chemistry; pp. 84-93; Hol, 1989b, Arzneim-Forsch. 39: 1016-1018; Hol, 1986, Angew. Chem. Int. Ed. Engl. 25: 767-778, the disclosures of which are herein incorporated by reference).

In accordance with the methods of conventional drug design, the desired mimetic molecules are obtained by randomly testing molecules whose structures have an attribute in common with the structure of a “native” peptide. The quantitative contribution that results from a change in a particular group of a binding molecule can be determined by measuring the biological activity of the putative mimetic in comparison with the tumor-inhibiting activity of the peptide. In a preferred embodiment of rational drug design, the mimetic is designed to share an attribute of the most stable three-dimensional conformation of the peptide. Thus, for example, the mimetic may be designed to possess chemical groups that are oriented in a way sufficient to cause ionic, hydrophobic, or van der Waals interactions that are similar to those exhibited by the tumor-inhibiting peptides of the invention, as disclosed herein.

The preferred method for performing rational mimetic design employs a computer system capable of forming a representation of the three-dimensional structure of the peptide, such as those exemplified by Hol, 1989a, ibid.; Hol, 1989b, ibid.; and Hol, 1986, ibid. Molecular structures of the peptido-, organo- and chemical mimetics of the peptides of the invention are produced according to those with skill in the art using computer-assisted design programs commercially available in the art. Examples of such programs include SYBYL 6.5®, HQSAR™, and ALCHEMY 2000™ (Tripos); GALAXY™ and AM2000™ (AM Technologies, Inc., San Antonio, Tex.); CATALYST™ and CERIUS™ (Molecular Simulations, Inc., San Diego, CA); CACHE PRODUCTS™, TSAR™, AMBER™, and CHEM-X™ (Oxford Molecular Products, Oxford, Calif.) and CHEMBUILDER3D™ (Interactive Simulations, Inc., San Diego, Calif.).

The peptido-, organo- and chemical mimetics designed from the peptides disclosed herein using, for example, art-recognized molecular modeling programs are produced using conventional chemical synthetic techniques, most preferably designed to accommodate high throughput screening, including combinatorial chemistry methods. Combinatorial methods useful in the production of the peptido-, organo- and chemical mimetics of the invention include phage display arrays, solid-phase synthesis and combinatorial chemistry arrays, as provided, for example, by SIDDCO, Tucson, Ariz.; Tripos, Inc.; Calbiochem/Novabiochem, San Diego, Calif.; Symyx Technologies, Inc., Santa Clara, Calif.; Medichem Research, Inc., Lemont, Ill.; Pharm-Eco Laboratories, Inc., Bethlehem, Pa.; or N. V. Organon, Oss, Netherlands. Combinatorial chemistry production of the peptido-, organo- and chemical mimetics of the invention is carried out according to methods known in the art, including but not limited to techniques disclosed in Terrett, 1998, COMBINATORIAL CHEMISTRY, Oxford University Press, London; Gallop et al., 1994, “Applications of combinatorial technologies to drug discovery. 1. Background and peptide combinatorial libraries,” J. Med. Chem. 37: 1233-51; Gordon et al., 1994, “Applications of combinatorial technologies to drug discovery. 2. Combinatorial organic synthesis, library screening strategies, and future directions,” J. Med. Chem. 37: 1385-1401; Look et al., 1996, Bioorg. Med. Chem. Lett. 6: 707-12; Ruhland et al., 1996, J. Amer. Chem. Soc. 118: 253-4; Gordon et al., 1996, Acc. Chem. Res. 29: 144-54; Thompson & Ellman, 1996, Chem. Rev. 96: 555-600; Fruchtel & Jung, 1996, Angew. Chem. Int. Ed. Engl. 35: 17-42; Pavia, 1995, “The Chemical Generation of Molecular Diversity”, Network Science Center, www.netsci.org; Adnan et al., 1995, “Solid Support Combinatorial Chemistry in Lead Discovery and SAR Optimization,” Id.; Davies and Briant, 1995, “Combinatorial Chemistry Library Design using Pharmacophore Diversity,” Id.; Pavia, 1996, “Chemically Generated Screening Libraries: Present and Future,” Id.; and U.S. Pat. Nos. 5,880,972 to Horlbeck; 5,463,564 to Agrafiotis et al.; 5,331,573 to Balaji et al.; and 5,573,905 to Lerner et al.

The invention also provides methods for using the genes identified herein (particularly the genes set forth in Table 3) to screen compounds to identify inhibitors of expression or activity of said genes. In the practice of this aspect of the methods of the invention, cells expressing a gene required for cell growth, particularly a gene identified in Table 3, are assayed in the presence and absence of a test compound, and test compounds that reduce expression or activity of the gene or gene product are identified thereby. Additionally, the assays can be performed under suicide selection conditions, wherein compounds that inhibit cell growth by inhibiting expression or activity of the gene select for survival of the cells. In alternative embodiments, reporter gene constructs of the invention are used, wherein expression of the reporter gene is reduced in the presence but not the absence of the test compound.
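The comparison of cells assayed in the presence and absence of each test compound can be sketched as a simple filter. This is an illustrative sketch only: the replicate layout, the compound names, and the cutoff of 50% of the untreated level are hypothetical choices and are not values specified by the methods described herein.

```python
from statistics import mean


def call_hits(readings, max_ratio=0.5):
    """Return compounds whose mean treated/untreated expression ratio is at or below max_ratio.

    `readings` maps a compound name to a pair of replicate lists:
    (values measured with compound, values measured without compound).
    """
    hits = {}
    for name, (treated, untreated) in readings.items():
        ratio = mean(treated) / mean(untreated)
        if ratio <= max_ratio:
            hits[name] = ratio
    return hits
```

A compound that lowers the mean expression signal from 100 to 25 has a ratio of 0.25 and is retained as a candidate inhibitor, whereas a compound leaving the signal unchanged is not.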

The methods of the invention are useful for identifying compounds that inhibit the growth of tumor cells, most preferably human tumor cells. The invention also provides the identified compounds and methods for using the identified compounds to inhibit tumor cell growth, most preferably human tumor cell growth. Exemplary compounds include neutralizing antibodies that interfere with gene product activity; antisense oligonucleotides, developed either as GSEs according to the methods of the invention or identified by other methods known in the art; ribozymes; triple-helix oligonucleotides; and “small molecule” inhibitors of gene expression or activity, preferably said small molecules that specifically bind to the gene product or to regulatory elements responsible for mediating expression of a gene in Table 3. It is recognized by one skilled in the art that a gene of the present invention can be used to identify biological pathways that contain the protein encoded by such a gene. Any member of such pathways may be used to identify compounds that inhibit the growth of tumor cells.

The invention also provides embodiments of the compounds identified by the methods disclosed herein as pharmaceutical compositions. The pharmaceutical compositions of the present invention can be manufactured in a manner that is itself known, e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes.

Pharmaceutical compositions for use in accordance with the present invention thus can be formulated in conventional manner using one or more physiologically acceptable carriers comprising excipients and auxiliaries that facilitate processing of the active compounds into preparations that can be used pharmaceutically. Proper formulation is dependent upon the route of administration chosen.

Non-toxic pharmaceutical salts include salts of acids such as hydrochloric, phosphoric, hydrobromic, sulfuric, sulfinic, formic, toluenesulfonic, methanesulfonic, nitric, benzoic, citric, tartaric, maleic, hydroiodic, alkanoic such as acetic, HOOC—(CH₂)ₙ—CH₃ where n is 0-4, and the like. Non-toxic pharmaceutical base addition salts include salts of bases such as sodium, potassium, calcium, ammonium, and the like. Those skilled in the art will recognize a wide variety of non-toxic pharmaceutically acceptable addition salts.

For injection, tumor cell growth-inhibiting compounds identified according to the methods of the invention can be formulated in appropriate aqueous solutions, such as physiologically compatible buffers such as Hank's solution, Ringer's solution, or physiological saline buffer. For transmucosal and transcutaneous administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.

For oral administration, the compounds can be formulated readily by combining the active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be obtained using a solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethylcellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents can be added, such as cross-linked polyvinylpyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions can be used, which can optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments can be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.

Pharmaceutical preparations that can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers can be added. All formulations for oral administration should be in dosages suitable for such administration. For buccal administration, the compositions can take the form of tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit can be determined by providing a valve to deliver a metered amount. Capsules and cartridges of e.g., gelatin for use in an inhaler or insufflator can be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch.

The compounds can be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection can be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions can take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and can contain formulatory agents such as suspending, stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form. Additionally, suspensions of the active compounds can be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions can contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension can also contain suitable stabilizers or agents that increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. Alternatively, the active ingredient can be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds can also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.

In addition to the formulations described previously, the compounds can also be formulated as a depot preparation. Such long acting formulations can be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds can be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a cosolvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and an aqueous phase. The cosolvent system can be the VPD co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic administration. Naturally, the proportions of a co-solvent system can be varied considerably without destroying its solubility and toxicity characteristics. Furthermore, the identity of the co-solvent components can be varied: for example, other low-toxicity nonpolar surfactants can be used instead of polysorbate 80; the fraction size of polyethylene glycol can be varied; other biocompatible polymers can replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides can substitute for dextrose.
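
The composition arithmetic for this co-solvent system can be checked with a short sketch. The percentages are those stated above; the function names, the example batch size, and the assumption that a 1:1 dilution means equal, additive volumes are illustrative only and are not part of the disclosure.

```python
# Illustrative sketch only: names and batch size are assumptions.
VPD_PERCENT_WV = {  # grams of component per 100 mL of VPD, as stated in the text
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # 5% dextrose in water, used as the diluent


def grams_for_batch(volume_ml):
    """Grams of each component to weigh out before making up to volume in ethanol."""
    return {name: pct * volume_ml / 100.0 for name, pct in VPD_PERCENT_WV.items()}


def vpd_5w_percent_wv():
    """Final % w/v of each solute after diluting VPD 1:1 with 5% dextrose in water.

    Assumes equal volumes that add without contraction (a simplification).
    """
    final = {name: pct / 2.0 for name, pct in VPD_PERCENT_WV.items()}
    final["dextrose"] = DEXTROSE_PERCENT_WV / 2.0
    return final


batch = grams_for_batch(250)   # hypothetical 250 mL batch of VPD
diluted = vpd_5w_percent_wv()  # composition of VPD:5W
```

Under these assumptions, VPD:5W contains half the stated concentration of each VPD component plus 2.5% w/v dextrose.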

Alternatively, other delivery systems for hydrophobic pharmaceutical compounds can be employed. Liposomes and emulsions are well known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as dimethylsulfoxide also can be employed, although usually at the cost of greater toxicity. Additionally, the compounds can be delivered using a sustained-release system, such as semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. Various sustained-release materials have been established and are well known by those skilled in the art. Sustained-release capsules can, depending on their chemical nature, release the compounds for a few weeks up to over 100 days. Depending on the chemical nature and the biological stability of the therapeutic reagent, additional strategies for protein and nucleic acid stabilization can be employed.

The pharmaceutical compositions also can comprise suitable solid or gel phase carriers or excipients. Examples of such carriers or excipients include but are not limited to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and polymers such as polyethylene glycols.

The compounds of the invention can be provided as salts with pharmaceutically compatible counterions. Pharmaceutically compatible salts can be formed with many acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, phosphoric, hydrobromic, sulfinic, formic, toluenesulfonic, methanesulfonic, nitric, benzoic, citric, maleic, hydroiodic, alkanoic such as acetic, HOOC-(CH₂)ₙ-CH₃ where n is 0-4, and the like. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. Non-toxic pharmaceutical base addition salts include salts of bases such as sodium, potassium, calcium, ammonium, and the like. Those skilled in the art will recognize a wide variety of non-toxic pharmaceutically acceptable addition salts.

Pharmaceutical compositions of the compounds of the present invention can be formulated and administered through a variety of means, including systemic, localized, or topical administration. Techniques for formulation and administration can be found in “Remington's Pharmaceutical Sciences,” Mack Publishing Co., Easton, Pa. The mode of administration can be selected to maximize delivery to a desired target site in the body. Suitable routes of administration can, for example, include oral, rectal, transmucosal, transcutaneous, or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections.

Alternatively, one can administer the compound in a local rather than systemic manner, for example, via injection of the compound directly into a specific tissue, often in a depot or sustained release formulation.

Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an amount effective to achieve their intended purpose. More specifically, a therapeutically effective amount means an amount effective to prevent development of or to alleviate the existing symptoms of the subject being treated. Determination of the effective amounts is well within the capability of those skilled in the art, especially in light of the detailed disclosure provided herein.

For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays, as disclosed herein. For example, a dose can be formulated in animal models to achieve a circulating concentration range that includes the EC₅₀ (effective dose for 50% increase) as determined in cell culture, i.e., the concentration of the test compound which achieves a half-maximal inhibition of tumor cell growth. Such information can be used to more accurately determine useful doses in humans.

It will be understood, however, that the specific dose level for any particular patient will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, sex, diet, time of administration, route of administration, and rate of excretion, drug combination, the severity of the particular disease undergoing therapy and the judgment of the prescribing physician.

Preferred compounds of the invention will have certain pharmacological properties. Such properties include, but are not limited to oral bioavailability, low toxicity, low serum protein binding and desirable in vitro and in vivo half-lives. Assays may be used to predict these desirable pharmacological properties. Assays used to predict bioavailability include transport across human intestinal cell monolayers, including Caco-2 cell monolayers. Serum protein binding may be predicted from albumin binding assays. Such assays are described in a review by Oravcová et al. (1996, J. Chromat. B 677: 1-27). Compound half-life is inversely proportional to the frequency of dosage of a compound. In vitro half-lives of compounds may be predicted from assays of microsomal half-life as described by Kuhnz and Gieschen (1998, DRUG METABOLISM AND DISPOSITION, Vol. 26, pp. 1120-1127).

Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio between LD₅₀ and ED₅₀. Compounds that exhibit high therapeutic indices are preferred. The data obtained from these cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. (See, e.g. Fingl et al., 1975, in “The Pharmacological Basis of Therapeutics”, Ch. 1, p. 1).
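
The therapeutic index defined above is a simple ratio, and a minimal sketch of how candidate compounds might be ranked by it follows. The compound names and dose values are hypothetical placeholders, not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the dose lethal to 50% of the population to the dose
    therapeutically effective in 50% of the population (same units)."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_1": (200.0, 10.0),
    "compound_2": (150.0, 50.0),
}

# Compounds with higher therapeutic indices are preferred, so sort descending.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Here `compound_1` has an index of 20 and `compound_2` an index of 3, so `compound_1` ranks first.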

Dosage amount and interval can be adjusted individually to provide plasma levels of the active moiety that are sufficient to maintain tumor cell growth-inhibitory effects. Usual patient dosages for systemic administration range from 100-2000 mg/day. Stated in terms of patient body surface areas, usual dosages range from 50-910 mg/m²/day. Usual average plasma levels should be maintained within 0.1-1000 μM. In cases of local administration or selective uptake, the effective local concentration of the compound cannot be related to plasma concentration.

The following Examples are intended to further illustrate certain preferred embodiments of the invention and are not limiting in nature.

EXAMPLES

1. Production of Normalized Tumor Library from MCF-7 Human Breast Cancer Cells

A normalized cDNA fragment library was generated from the MCF-7 breast carcinoma cell line (estrogen receptor positive, wild-type for p53; ATCC Accession No. HTB22, American Type Culture Collection, Manassas, Va.). Poly(A)+ RNA from MCF-7 cells was used to prepare a population of normalized cDNA fragments through a modification of the procedure described in Gudkov and Roninson (1997). Briefly, RNA was fragmented by heating at 100° C. for 9 minutes. Double-stranded cDNA was generated from this heat-fragmented RNA using the Gibco Superscript kit with a reverse-transcription primer (5′-GGATCCTCACTCACTCANNNNNN-3′; SEQ ID NO. 1). This primer contains a random octamer sequence at its 3′ end for random priming, and it carries a tag (termed “stop adaptor” in its double-stranded form) that provides TGA stop codons in all three open reading frames, together with a BamHI restriction site. PCR assays were used to establish the presence of β2-microglobulin, β-actin and estrogen receptor mRNA sequences in this cDNA preparation. Double-stranded cDNA fragments were ligated to the following adaptor:

5′-GTACCTGAGTTATAGGATCCCTGCCATGCCATGCCATG-3′ (SEQ ID NO. 2)
3′-CCTAGGGACGGTACGGTACGGTAC-5′ (SEQ ID NO. 3)

The latter adaptor (“start adaptor”) contains translation start sites in all three frames, together with a BamHI site. The double-stranded cDNA was amplified by PCR with primers that anneal to the start and stop adaptors. Although the start adaptor is initially ligated at both ends of cDNA fragments, the PCR products were generated predominantly by the two different primers and contain the start adaptor only at the 5′ but not at the 3′ end. This desirable outcome is explained by the “PCR suppression effect”, due to PCR inhibition by panhandle-like structures formed upon renaturation of sequences flanked by an inverted repeat (Siebert et al., 1995, Nucleic Acids Res. 23: 1087-1088).
Furthermore, any residual start adaptors at the 3′ ends were subsequently removed by BamHI digestion prior to cloning. The amplified cDNA fragment population was again tested for the presence of β2-microglobulin, β-actin and estrogen receptor sequences. This procedure produced a population of randomly initiating and terminating double-stranded cDNA fragments (100-400 bp size), which are tagged by different adaptors at the ends corresponding to the 5′ and 3′ direction of the original mRNA. The 5′ adaptor contains translation initiation codons in three open reading frames, and the 3′ adaptor contains stop codons in all three reading frames. Such fragments direct the synthesis of peptides derived from the parental protein when cloned in sense orientation, or give rise to antisense RNA molecules when cloned in antisense orientation.
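
The three-frame design of the two adaptors can be checked directly from the sequences given above. The following sketch is illustrative only: the helper names are assumptions, and the stop tag is taken as the fixed 5′ portion of SEQ ID NO. 1 with the random 3′ positions omitted.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def frames_with_codon(seq, codon):
    """Return the set of reading frames (0, 1, 2) in which the codon occurs."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] == codon}


# Fixed 5' portion of the reverse-transcription primer (SEQ ID NO. 1).
STOP_TAG = "GGATCCTCACTCACTCA"
# Top strand of the start adaptor (SEQ ID NO. 2).
START_ADAPTOR = "GTACCTGAGTTATAGGATCCCTGCCATGCCATGCCATG"
BAMHI_SITE = "GGATCC"

# The primer is antisense to the mRNA, so stop codons are read on its
# reverse complement (the sense strand of the cloned fragment).
stop_frames = frames_with_codon(reverse_complement(STOP_TAG), "TGA")
start_frames = frames_with_codon(START_ADAPTOR, "ATG")
```

Both sets come out as `{0, 1, 2}`, and both sequences contain the BamHI recognition site, in agreement with the description of the start and stop adaptors.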

The cDNA fragment mixture was subjected to normalization, through a modification of the procedure of Patanjali et al. (1991, Proc. Natl. Acad. Sci. USA 88: 1943-1947), based on C₀t fractionation. Normalization was achieved by reannealing portions of denatured cDNA for 24, 48, 72, or 96 hours. Single-stranded products were separated from re-annealed double stranded DNA by hydroxyapatite chromatography. Normalization of cDNA fragments was tested by Southern hybridization with probes corresponding to genes expressed to different levels in MCF-7 cells and performed with each single-stranded fraction. This analysis indicated that the content of β-actin, an abundant mRNA species, decreased over normalization time, with the lowest content found at the 96 hr time point. Conversely, a moderately-abundant cDNA sequence, c-MYC and a low-abundant cDNA sequence, MDR1 (which was undetectable in MCF-7 cDNA prior to normalization) increased their levels to those comparable with β-actin by 96 hr, suggesting that the 96 hr fraction was the best-normalized. To confirm the normalization of the 96 hr fraction, this DNA was digested (on a small scale) with BamHI, ligated into a plasmid vector and transformed into E. coli (Top10) by electroporation. Colony hybridization analysis was performed on nitrocellulose filters to which 10,000 colonies were plated, using radiolabeled probes for different genes. The following signal numbers per filter were obtained: β-actin, 3 signals; MDR1, 3 signals; c-MYC, 2 signals; c-FOS, 2 signals. These results indicated that the sequences from the tested genes are found on average in 1 of 3,000-5,000 clones of this library, and also confirmed that the 96 hr fraction was normalized.
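
The stated frequency of roughly 1 in 3,000-5,000 clones follows from the colony hybridization counts above. A minimal sketch of that arithmetic, using only the figures given in the text (variable names are illustrative):

```python
COLONIES_PER_FILTER = 10_000

# Hybridization signals per filter, as reported for the 96 hr fraction.
signals_per_filter = {
    "beta-actin": 3,
    "MDR1": 3,
    "c-MYC": 2,
    "c-FOS": 2,
}

# Average number of clones screened per positive clone, for each probe.
clones_per_hit = {
    gene: COLONIES_PER_FILTER / count
    for gene, count in signals_per_filter.items()
}

lowest = min(clones_per_hit.values())   # about 3,333 clones per hit
highest = max(clones_per_hit.values())  # 5,000 clones per hit
```

The near-equal representation of an abundant transcript (β-actin) and a rare one (MDR1) is what indicates normalization.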

The normalized cDNA fraction was amplified by PCR and ligated into IPTG-inducible retroviral vector LNXCO3 (Chang and Roninson, 1996, Gene 183: 137-142). The ligation produced a library of approximately 50 million clones. Percent recombination in this library was assessed by PCR of the DNA from bacterial colonies, using primers that flank the insertion site of LNXCO3. The number of clones containing an insert was 131/150 or 87%. Most of the inserts ranged in size from 100 to 300 bp. For further characterization of the library, a fraction of the inserts were recloned into the pcDNA3 vector. The insert sequences of 69 randomly picked clones in pcDNA3 were determined using a high-throughput DNA sequencer, and analyzed for homology to known gene sequences in the public-domain database. Fifty-two of the inserts matched no known genes, 16 corresponded to different human genes, and one sequence was found to be of bacterial origin. This normalized MCF-7 cDNA fragment library was used to select growth-inhibitory GSEs in breast carcinoma cells.

2. Production of Breast Cancer Recipient Cells

The normalized tumor library described in Example 1 was prepared from MCF-7 human breast carcinoma cells. As recipient cells for GSE selection, a different breast carcinoma cell line, MDA-MB-231 (ATCC Accession No. HTB26) was chosen. This line represents a more malignant class of breast cancers relative to MCF-7: it is estrogen receptor-negative and p53-deficient. The choice of different cell lines as the source of RNA and as the recipient was aimed at isolating growth-inhibitory GSEs that are more likely to be effective against different types of breast cancer.

MDA-MB231 cells were first rendered susceptible to infection with ecotropic retroviruses, which can be readily generated at a high titer using convenient packaging cell lines, and are not infectious to humans or unmodified human cells. MDA-MB-231 cells were infected with amphotropic recombinant virus that carries the gene for the murine ecotropic receptor in retroviral vector LXIHis (Levenson et al., 1998, Hum. Gene Ther. 9: 1233-1236), and the infected cell population was selected with histidinol. The susceptibility of the selected cells to infection with ecotropic retroviruses was determined by infecting such cells with an ecotropic retrovirus LXSE (Kandel et al., 1997, Id.) that carries the gene for the Green Fluorescent Protein (GFP). Over 86% of LXSE-infected cells were positive for GFP fluorescence (as determined by flow cytometry), indicating a correspondingly high infection rate. These cells were next transfected with the 3′SS plasmid (Stratagene) that carries the LacI repressor (Fieck et al., 1992, Nucleic Acids Res. 20: 1785-1791) and the hygromycin resistance marker, and stable transfectants were selected with hygromycin. The selected transfectants were subcloned, and 33 single-cell clones were individually tested for IPTG-regulated expression of a LacI-inhibited promoter. This testing was carried out by transient transfection of the cell clones with pCMVI3luc plasmid (Stratagene) that expresses luciferase from the LacI-regulated CMV promoter. As a positive control, the same assay was carried out on a previously characterized well-regulated fibrosarcoma cell line HT1080 3′SS6 (Chang and Roninson, 1996, Id.; Chang et al., 1999, Id.). Three of the tested clones showed the induction of luciferase expression in the presence of IPTG at a level similar to that of HT1080 3′SS6.

These clones were further tested by the following assays. The first assay was infection with LXSE ecotropic retrovirus, followed by FACS analysis of GFP fluorescence, to determine the susceptibility to ecotropic infection. The second assay was ecotropic retroviral transduction with IPTG-regulated retrovirus LNLucCO3 (Chang and Roninson, 1996), followed by G418 selection and testing for IPTG inducibility of luciferase expression. The third assay was the infection with IPTG-regulated ecotropic retrovirus LNp21CO3 (Chang et al., 1999, Id.), which carries the cell cycle inhibitor p21 (a positive control for an IPTG-inducible genetic inhibitor), followed by BrdU suicide selection (described below) in the presence and in the absence of IPTG. Based on the results of these assays, a cell line called MDA-MB231 3′SS31 was selected as being optimal for growth-inhibitory GSE selection. This cell line showed about 80% infectability with ecotropic retroviruses, approximately 10-fold inducibility by IPTG (which is higher than the concurrently determined value for HT1080 3′SS6) and over 20-fold increase in clonogenic survival of BrdU suicide upon infection with LNp21CO3.

3. Isolation of Tumor Cell Growth Inhibiting Genetic Suppressor Elements

The MCF-7 derived normalized tumor library in the LNXCO3 vector was transduced into MDA-MB231 3′SS31 cell line by ecotropic retroviral transduction using the BOSC23 packaging cell line (Pear et al., 1993, Id.), as described in Roninson et al. (1998, Methods Enzymol. 292: 225-248). Two hundred million (2×10⁸) recipient cells were infected and selected with G418. The infection rate (as determined by the frequency of G418-resistant colonies) was 36%. Eighty million (8×10⁷) G418-selected infectants were subjected to selection for IPTG-dependent resistance to BrdU suicide, as follows. Cells were plated at 10⁶ cells per P150 and treated with 50 μM IPTG for 36 hrs, then with 50 μM IPTG and 50 μM BrdU for 48 hrs. Cells were thereafter incubated with 10 μM Hoechst 33342 for 3 hrs and illuminated with fluorescent white light for 15 min on a light box, to destroy the cells that grew and incorporated BrdU in the presence of IPTG. Cells were then washed twice with phosphate-buffered saline and allowed to recover in G418-containing medium without IPTG or BrdU for 7-10 days. The surviving cells were then subjected to a second step of BrdU selection under the same conditions. Control plates were selected in the absence of IPTG, and representative plates were stained to count the colonies; these results are shown in FIG. 4. The number of surviving colonies after the second step of selection in the presence of IPTG was approximately three times higher than the corresponding number in the absence of IPTG. In contrast, control cells infected with an insert-free LNXCO3 vector showed no difference in BrdU survival in the presence or in the absence of IPTG. As a positive control, cells were infected with p21-expressing LNp21CO3, but the number of survivors in the presence of IPTG was too high to count. These results demonstrated that the frequency of library-infected cells that survived BrdU suicide selection increased in an IPTG-dependent manner, consistent with successful selection of IPTG-inducible growth-inhibitory GSEs.
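
The scale of this selection can be worked out from the figures given in Examples 1 and 3. The sketch below is illustrative only; the variable names are assumptions, and the library coverage figure is a derived quantity that the text does not itself state.

```python
CELLS_INFECTED = 200_000_000      # recipient cells exposed to virus
INFECTION_RATE_PERCENT = 36       # from the frequency of G418-resistant colonies
LIBRARY_CLONES = 50_000_000       # approximate library size from Example 1
CELLS_SELECTED = 80_000_000       # G418-selected infectants put through BrdU selection
CELLS_PER_P150 = 1_000_000        # plating density per P150 plate

# Number of independently transduced cells.
transduced_cells = CELLS_INFECTED * INFECTION_RATE_PERCENT // 100

# Average number of transduced cells per library clone (derived, not stated).
library_coverage = transduced_cells / LIBRARY_CLONES

# Plates needed for one round of BrdU suicide selection at the stated density.
plates_needed = CELLS_SELECTED // CELLS_PER_P150
```

With these inputs, about 7.2×10⁷ cells were transduced, corresponding to roughly 1.4 transduced cells per library clone, and each selection step with IPTG requires 80 P150 plates.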

Genomic DNA was isolated from the two-step selected library-transduced cells and used as a template for PCR, using vector-derived sequences flanking the inserts as primers. The PCR-amplified mixture of inserts from the selected cells was recloned into LNXCO3 vector and close to 3,000 randomly picked plasmid clones from the library of selected fragments were sequenced by high-throughput DNA sequencing by PPD Discovery, Inc., Menlo Park, Calif. 1482 clones containing human cDNA fragments were identified among these sequences by BLAST homology search using the NCBI database and analyzed to identify genes that gave rise to the selected cDNA fragments. Ninety-three genes were found to give rise to two or more of the sequenced clones, indicating the enrichment for such genes in the selected library, with 67 genes represented by three or more clones. Forty-nine of the enriched genes were represented by two or more non-identical sequences. The sequences of the enriched clones are provided in Table 4 and the Sequence Listing. Many of these clones encode peptides derived from the corresponding gene products. The sequences of these growth-inhibitory peptides are provided in Table 5 and in the Sequence Listing as SEQ ID NOS. 229-314. The enriched genes with the corresponding accession numbers, as well as the numbers of selected clones and different sequences derived from each gene are listed in Table 1. Table 2 lists enriched genes previously known to be involved in cell proliferation, and Table 3 lists enriched genes that were not previously known to be involved in cell proliferation.
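
The enrichment tally described above (genes with two or more clones, three or more clones, or two or more non-identical sequences) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the function name is hypothetical, and the gene names and sequence labels are toy placeholders rather than data from Table 1.

```python
from collections import defaultdict


def summarize_enrichment(clones):
    """Tally sequenced clones per gene.

    clones: iterable of (gene, insert_sequence) pairs, one per sequenced
    clone that was assigned to a gene by homology search.
    """
    inserts_by_gene = defaultdict(list)
    for gene, insert in clones:
        inserts_by_gene[gene].append(insert)

    # A gene counts as enriched if it gave rise to two or more clones.
    enriched = {g: s for g, s in inserts_by_gene.items() if len(s) >= 2}
    return {
        "enriched": sorted(enriched),
        "three_or_more_clones": sorted(
            g for g, s in enriched.items() if len(s) >= 3
        ),
        "two_or_more_distinct": sorted(
            g for g, s in enriched.items() if len(set(s)) >= 2
        ),
    }


# Toy input: GENE_A has three clones (two distinct inserts), GENE_B has two
# identical clones, and GENE_C has a single clone.
toy_clones = [
    ("GENE_A", "insert_1"), ("GENE_A", "insert_1"), ("GENE_A", "insert_2"),
    ("GENE_B", "insert_3"), ("GENE_B", "insert_3"),
    ("GENE_C", "insert_4"),
]
summary = summarize_enrichment(toy_clones)
```

Genes recovered as several non-identical fragments are the more persuasive hits, since identical clones could arise from PCR amplification of a single selected insert.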

The following criteria were used for assigning genes to Table 2 or Table 3. The function of each gene was first confirmed according to the corresponding entry in the LocusLink database of NCBI. On the basis of this information, genes that are essential for basic cell functions (such as general transcription or translation), and genes known to play a role in cell cycle progression or carcinogenesis were excluded from Table 3 and assigned to Table 2. The functions of the other genes were then investigated through a database search of the art, using all the common names of the gene listed in LocusLink as keywords for the search. Through this analysis, additional genes were assigned to Table 2 by the following criteria: (i) if overexpression of the gene, alone or in combinations, was shown to promote neoplastic transformation or cell immortalization; (ii) if inhibition of the gene function or expression was shown to produce cell growth inhibition or cell death; (iii) if homozygous knockout of the gene was shown to be embryonic lethal in mammals; or (iv) if the gene was found to be activated through genetic changes (such as gene amplification, rearrangement or point mutations) in a substantive fraction of any type of cancers. Genes that did not satisfy any of the above criteria were then assigned to Table 3.
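
The two-stage assignment rule above can be expressed as a small decision function. This sketch only restates the criteria in code form; the class and field names are hypothetical, and deciding whether a given criterion is met still depends on the database and literature review described in the text.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Findings for one enriched gene (all field names are illustrative)."""
    essential_basic_function: bool = False         # e.g. general transcription or translation
    cell_cycle_or_carcinogenesis: bool = False     # known role per the gene database entry
    overexpression_transforms: bool = False        # criterion (i)
    inhibition_blocks_growth: bool = False         # criterion (ii)
    knockout_embryonic_lethal: bool = False        # criterion (iii)
    activated_in_cancers: bool = False             # criterion (iv)


def assign_table(evidence):
    """Return 2 for genes with a known role in proliferation, otherwise 3."""
    # Stage 1: function recorded in the gene database entry.
    if evidence.essential_basic_function or evidence.cell_cycle_or_carcinogenesis:
        return 2
    # Stage 2: criteria (i) to (iv) from the literature search.
    if (
        evidence.overexpression_transforms
        or evidence.inhibition_blocks_growth
        or evidence.knockout_embryonic_lethal
        or evidence.activated_in_cancers
    ):
        return 2
    return 3


# Hypothetical examples: one gene amplified in cancers, one with no evidence.
amplified_gene = GeneEvidence(activated_in_cancers=True)
uncharacterized_gene = GeneEvidence()
```

A gene reaches Table 3 only when every criterion is unmet, which is why Table 3 is described as listing genes not previously known to be involved in cell proliferation.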

4. Analysis of Tumor Cell Growth Inhibiting Genetic Suppressor Elements

Individual selected clones representative of enriched genes have been analyzed by functional testing for GSE activity. Results of these assays are summarized in Table 1. The principal assay involves the transduction of individual putative GSE clones (in the LNXCO3 vector) into MDA-MB-231-3′SS31 cells, followed by G418 selection of infected populations (for the neo gene of LNXCO3) and testing the transduced populations for IPTG-dependent survival of BrdU suicide. The latter assay was carried out as follows. Infected cells (200,000 per P100, in triplicate) were treated with 50 μM IPTG for 72 hrs, then with 50 μM IPTG and 50 μM BrdU for 48 hrs. A parallel set of cells was treated in the same way but without IPTG (in triplicate). Cells were then illuminated with white light and allowed to recover in the absence of BrdU and IPTG for 12-14 days. Results are expressed as the average number of colonies per P100, with standard deviations. In each set of assays, insert-free LNXCO3 vector was used as a negative control. As a positive control, LNXCO3 vector expressing CDK inhibitor p21 was used, but this control consistently gave excessively positive values of surviving colonies. Alternative positive controls comprised a GSE derived from a proliferation-associated transcription factor Stat3, which produced moderate but reproducibly positive results in multiple assays. Table 1 lists the results of this assay (IPTG-dependent survival of BrdU suicide) as positive (“A” in Functional Assays column) if t-test analysis of the difference in the number of colonies surviving in the presence and in the absence of IPTG provides a significance value of P<0.05. Results of this analysis on a subset of positive GSEs are shown in FIG. 5.
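
The scoring rule for this assay (triplicate plates with and without IPTG, positive if P<0.05) can be sketched as below. The text does not specify which form of t-test was used, so this example assumes an unpaired, equal-variance, two-tailed test and additionally requires more colonies with IPTG than without; the colony counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical value of Student's t for P = 0.05 with 4 degrees of
# freedom (3 + 3 - 2), i.e. triplicate plates in each condition.
T_CRITICAL_DF4 = 2.776


def pooled_t(with_iptg, without_iptg):
    """Student's t statistic for two independent samples, pooled variance."""
    n1, n2 = len(with_iptg), len(without_iptg)
    pooled_var = (
        (n1 - 1) * variance(with_iptg) + (n2 - 1) * variance(without_iptg)
    ) / (n1 + n2 - 2)
    return (mean(with_iptg) - mean(without_iptg)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def scores_positive(with_iptg, without_iptg):
    """True if IPTG significantly increases survival of BrdU suicide."""
    if len(with_iptg) != 3 or len(without_iptg) != 3:
        raise ValueError("critical value is only valid for triplicate plates")
    return pooled_t(with_iptg, without_iptg) > T_CRITICAL_DF4


# Hypothetical colony counts per P100 (not data from Table 1 or FIG. 5).
strong_gse = scores_positive([60, 72, 66], [20, 26, 23])
no_effect = scores_positive([30, 22, 26], [24, 28, 20])
```

In this toy case the first clone scores positive (t is about 11.1) and the second does not (t is about 0.6). The sketch does not handle plates with identical counts in both conditions, where the pooled variance would be zero.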

The assay for IPTG-dependent survival of BrdU suicide was performed for GSEs derived from 38 genes with positive results. Several infected cell populations that scored positive in this assay were also tested by a more stringent assay for direct growth inhibition by IPTG. None of the tested populations, however, showed significant growth inhibition by IPTG. A similar result (positivity in BrdU selection but not in the growth inhibition assay) was reported by Pestov et al. (1998, Id.) for a weak growth-inhibitory cDNA clone encoding a ubiquitin-conjugation enzyme. To determine whether increased BrdU survival in such cell populations reflects the heterogeneity of GSE expression and function among the infected cells, multiple (10 or more) clonal cell lines were generated from a subset of infected populations and tested for the ability to be growth-inhibited by IPTG. Through this process, IPTG-inhibited cell lines containing GSEs from 19 of the enriched genes were produced. The genes that scored positive by this assay are indicated in Table 1 (“B” in Functional Assays column). In contrast to these GSE-containing cell lines, cells transduced with an insert-free LNXCO3 vector showed no growth inhibition in the presence of IPTG. Results of IPTG growth inhibition assays with positive cell lines are shown in FIG. 6.

Putative GSEs from 7 of the tested genes gave a greatly diminished yield of G418-resistant infectants, relative to cells infected with the control LNXCO3 virus or with other tested clones. When the resulting small populations of G418-resistant cells infected with these clones were expanded and tested for IPTG-dependent survival of BrdU suicide, almost all of these populations produced negative results. Remarkably, most of the genes in this category (“C” in Functional Assays column of Table 1) are known to be important positive regulators of cell growth (JUN B, INT-2, MCM-3 replication protein, delta and eta isoforms of protein kinase C) and therefore are expected to give rise to growth-inhibitory GSEs. Since LNXCO3 vector is known to provide substantial basal expression in the absence of IPTG (Chang and Roninson, 1996), it seems likely that this group may include the strongest functional GSEs, which inhibit cell growth even in the absence of IPTG. Altogether, GSEs from a total of 51 genes have so far been confirmed by functional assays (IPTG-dependent survival of BrdU suicide or IPTG-dependent growth inhibition) or a putative positive criterion (decreased apparent infection rate).

The genes shown in Table 2 are known to be positive regulators of the cell growth or neoplastic transformation. These include genes directly involved in cell cycle progression (such as CCN D1 and CDK2) or DNA replication (e.g. PCNA, RPA3 or MCM-3), growth factors (e.g. INT-2/FGF-3 and TDGF1) and growth factor receptors (e.g. FGFR1, C-KIT), transcription factors known to be positive regulators of cell proliferation (e.g. STAT3, c-FOS, NFκB-1), several proliferation-associated signal transduction proteins, such as three isoforms of PKC (the primary target of tumor promoters) and three integrin proteins, as well as several ribosomal components required for protein synthesis. The enriched genes include many known protooncogenes, such as JunB and c-FOS (which gave rise to two of three growth-inhibitory GSEs isolated by Pestov and Lau (1994, Id.) from a 19-gene library in NIH 3T3 cells), a FOS-related gene, INT-2, c-KIT, LYN B (YES protooncogene), MET, RAN (a member of RAS family), several growth-promoting genes that are known to be amplified in cancers (CCN D1, CDK2, FGFR1), and several genes reported to be overexpressed in cancers. Some of the enriched genes have specific associations with breast cancer, including INT-2, originally identified as a mammary oncogene (Peters et al., 1984, Nature 309: 273-275), CCN D1 and FGFR1 found to be amplified in a substantial minority of breast cancers (Barnes and Gillett, 1998, Breast Cancer Res Treat. 52: 1-15; Jacquemier et al., 1994, Int. J. Cancer 59: 373-378), and HSPCA, which was shown to be expressed in all the tested breast cancers (143 total) at a higher level than in non-malignant breast tissue (Jameel et al., 1992, Int. J. Cancer 50: 409-415). The abundance of such genes among the selected sequences provides strong validation of this approach to the elucidation of positive growth regulators in breast carcinoma cells.

The genes in Table 3 have no known function in growth regulation. These genes encode several transcription factors, proteins involved in signal transduction or cell adhesion, a number of proteins involved in RNA transport or protein trafficking and processing, a group of genes with miscellaneous other functions that are not related to cell growth, and 10 genes, the functions of which are presently unknown.

Of special interest, at least three of the genes in Table 3 appear to be inessential for growth of normal cells, since homozygous knockout of these genes in mice does not prevent the development of adult animals (except for some limited developmental abnormalities). These genes include L1CAM (Dahme et al., 1997, Nat. Genet. 17: 346-349), ICAM2 (Gerwin et al., 1999, Immunity 10: 9-19), and von Willebrand factor (Denis et al., 1998, Proc Natl Acad Sci USA 95: 9524-9529). The effect of GSEs derived from these genes on breast carcinoma cells suggests that inhibition of such “inessential” genes may have a desirable tumor-specific or tissue-specific antiproliferative effect.

A striking example of an apparently inessential gene enriched in the selected library, which has been independently identified as a highly promising target for breast cancer treatment, is provided by HSPCA (included in Table 2). The basic function of this gene, which belongs to a heat shock-responsive family of chaperone proteins that play a role in refolding of mature proteins, does not indicate that it should be required for cell growth. HSPCA, however, was found to play a role in stabilizing several proteins that are involved in oncogenic pathways, including Raf, Met, steroid receptors, and members of the HER kinase family, and to serve as the target of the antitumor antibiotic geldanamycin (Stebbins et al., 1997, Cell 89: 239-250). The HSPCA-inhibiting geldanamycin analog 17-AAG has been shown to arrest the growth of breast carcinoma cell lines (including MDA-MB-231; Munster et al., 2001, Cancer Res. 61: 2945-2952) and to sensitize such cells to chemotherapy-induced apoptosis (Munster et al., 2001, Clin Cancer Res 7: 2228-2236); 17-AAG is currently in clinical trials. The example of HSPCA suggests that other apparently inessential genes identified by GSE selection are likely to provide similarly promising targets for cancer treatment. Some of these potential novel targets are described in more detail in the next section.

5. Potential Novel Drug Targets.

Several of the selected genes warrant consideration as potential novel targets for cancer drug development. Non-limiting examples are as follows.

L1CAM. L1 cell adhesion molecule (L1CAM) is represented in the set of growth-inhibiting GSEs by eight sense-oriented and four antisense-oriented GSEs. L1CAM is a 200-220 kDa type I membrane glycoprotein of the immunoglobulin superfamily expressed in neural, hematopoietic and certain epithelial cells. The non-neuronal (shortened) form of L1CAM is expressed highly in melanoma, neuroblastoma, and other tumor cell types, including breast. L1CAM is found not only in membrane-bound form but also in the extracellular matrix of brain and tumor cells. Soluble L1CAM directs the migration of glioma cells, and one anti-L1CAM antibody was found to inhibit this migration (Izumoto et al., 1996, Cancer Res. 56: 1440-1444). Such an antibody might be useful as an initial prototype agent to validate L1CAM as a cancer drug target.

As a cell surface molecule, L1CAM should be easily accessible to different types of drugs. FIGS. 7A and 7B illustrate morphological effects of an L1CAM-derived GSE in a clonal IPTG-inhibited cell line. Four-day treatment with IPTG drastically altered cell morphology, with the cells developing lamellipodia and apparent focal adhesion plaques (FIG. 7A). This effect suggests that the IPTG-induced GSE affects cell adhesion, as would have been expected from targeting L1CAM. GSE induction not only arrested cell growth but also induced mitotic catastrophe in 15-20% of IPTG-treated cells. Mitotic catastrophe is a major form of tumor cell death (Chang et al., 1999, Id.), which is characterized by abnormal mitotic figures and formation of cells with multiple micronuclei (FIG. 7B). The ability of a GSE to induce mitotic catastrophe is a good general indication for the potential promise of a GSE-inhibited target.

Human L1CAM gene is mutated in patients with a severe X-linked neurological syndrome (CRASH: corpus callosum hypoplasia, retardation, aphasia, spastic paraplegia and hydrocephalus). L1CAM “knockout” (−/−) mice develop to adulthood and appear superficially normal (slightly smaller than adults), but they have a shortened lifespan due to CRASH-like neurological deficits, which may be related to a decrease in neurite outgrowth (Dahme et al., 1997, Id.). These observations suggest that targeting L1CAM in an adult cancer patient should not have major toxicity outside of the nervous system, where most drugs will not penetrate due to the blood-brain barrier. Furthermore, it is quite likely that the neurological effects result only from a lack of L1CAM during embryonic development and would not develop from L1CAM inhibition in an adult.

ICAM2. The intercellular cell adhesion molecule-2 (ICAM2) is represented in the set of growth-inhibiting GSEs by two sense-oriented and one antisense-oriented GSE. ICAM2 has many similarities to L1CAM and is also inessential for the growth of normal cells (Gerwin et al., 1999, Id.). Anti-ICAM2 antibodies, for example, are attractive possibilities for prototype drugs.

NIN283. This gene has recently been described (Araki et al., 2001, J. Biol. Chem. 276: 34131-34141) as being induced in Schwann cells upon nerve injury and termed NIN283. Induction of NIN283 is a part of injury response of Schwann cells, which then act to promote the growth of the injured nerve. NIN283 is also induced by nerve growth factor (NGF). Like L1CAM, NIN283 is expressed primarily in the brain. It is localized to lysosomes, is highly conserved in evolution (with identifiable homologs in Drosophila and C. elegans), and contains a unique combination of a single zinc finger and a RING finger motif. Based on these structural features and localization, Araki et al. (2001, Id.) speculated that NIN283 may be involved in ubiquitin-mediated protein modification and degradation. With this putative function in protein modification, stress inducibility and evolutionary conservation, NIN283 appears analogous to the above-discussed HSPCA.

Here, this gene was found to give rise to one of the strongest functionally active GSEs in breast carcinoma growth-inhibition assays. The available information on functional domains of NIN283 should be useful in structure-based rational design of small molecule inhibitors of this interesting protein.

ATF4. Activating transcription factor 4 gave rise to the most highly enriched antisense GSE in these selection assays. Homozygous knockout of ATF4 results in only minor developmental abnormalities (in the eye lens; Tanaka et al., 1998, Genes Cells 3: 801-810; Hettmann et al., 2000, Dev. Biol. 222: 110-123), indicating that this factor is not essential for normal cell growth. The results disclosed herein implicate ATF4 in breast cancer cell proliferation and are strengthened by reports in the art that ATF4 expression and function are augmented by heregulin β1, a factor that stimulates the growth of breast cancer cells (Talukder et al., 2000, Cancer Res. 60: 276-281).

Zinedin. Zinedin is a recently described calmodulin-binding protein with a WD repeat domain, which is preferentially expressed in the brain (Castets et al., 2000, J. Biol. Chem. 275: 19970-19977). This expression pattern suggests that zinedin-targeting drugs are unlikely to have an effect on any normal proliferating cells. An antisense-oriented GSE derived from zinedin, however, was found herein to inhibit breast carcinoma cell growth, both by the IPTG-dependent BrdU suicide assay and by the ability to give rise to an IPTG-inhibited cell line. Structural analysis of zinedin indicates specific domains that apparently mediate its interactions with calmodulin and caveolin (Castets et al., Id.). Structure-based targeting of these domains, as well as screening based on the interference with zinedin-calmodulin interactions, can be used as strategies for developing zinedin-targeting drugs.

Novel genes. Several genes identified by this selection have no known function, no significant homologies with known genes or identifiable functional domains. These results provide the first functional evidence for such genes. One of the most highly enriched and functionally active GSEs is designated GBC-1 (Growth of Breast Carcinoma 1). The translated protein sequence of GBC-1 matches a partial sense-oriented sequence of a hypothetical unnamed protein (accession No. XP_031920). The GBC-1 GSE encodes a helical-repeat peptide. The strong growth-inhibitory activity of this GSE suggests that molecules derived from or mimicking this peptide are likely to have antitumor activity. The GBC-1 peptide disclosed herein can be regarded as a prototype drug, the structure of which can be used to direct rational design of a synthetic compound.

Among other novel genes identified in the instant invention, two genes, designated herein GBC-3 (Growth of Breast Carcinoma 3) and GBC-11 (Growth of Breast Carcinoma 11), are the most highly enriched, and their GSEs show strong functional activity. Cell lines that comprise these GSEs and that are efficiently growth-inhibited by treatment with IPTG are useful for characterizing the cellular effects of GBC-3 or GBC-11 inhibition. GBC-3 matches an otherwise uncharacterized EST, AA443027, and maps to chromosome 3q29; GBC-11 maps to chromosome 14 and does not match any known cDNA sequences. According to “Virtual Northern” analysis carried out using the NCBI SAGE database (http://www.ncbi.nlm.nih.gov/SAGE/sagevn.cgi), GBC-3 appears to be expressed at a very low level in all cell types, suggesting that it may be an easy target to inhibit.

6. In Vivo Testing of Test Compounds

The efficacy of inhibiting expression or activity of the genes set forth in Table 3 is tested in vivo as follows.

Cells (1-2×10⁶) expressing an IPTG-inducible GSE of the invention that inhibits expression or activity of a gene in Table 3 are injected into a mouse as a xenograft, most preferably in one flank of the mouse so that tumor growth can be visually monitored. IPTG-regulated gene expression in mouse xenografts of MDA-MB-231 breast carcinoma has been demonstrated in the art, for example by Lee et al. (1997, Biotechniques 23: 1062-1068) and the experiments described herein can be performed substantially as described by Lee et al. but using the GSE-containing tumor cells of the invention. Conveniently, GSE-naïve tumor cells are injected in the opposite flank in each mouse. Two sets of injected mice are housed and maintained in parallel, with one set of mice having feed supplemented with IPTG at a concentration as taught by Lee et al. and the other set of mice not receiving IPTG supplemented food. Emergent tumors are observed on the mice under humane animal care conditions until the extent of tumor cell growth is life-threatening or inhumane. Biopsy samples are taken and the tumors measured and weighed after animal sacrifice to determine differences between the GSE-expressing and non-GSE-expressing tumors in each mouse and between mice fed IPTG and mice without IPTG supplementation.

IPTG-fed mice will bear one tumor of naïve xenograft cells whose growth is unaffected by IPTG. These tumors will be substantially identical in size to both the naïve xenograft cell tumors and the GSE-containing xenograft cell tumors in mice not fed IPTG. In contrast, the tumor produced from the GSE-containing xenograft cells in mice fed IPTG will be substantially smaller than the other tumors. Biopsy will show proliferating tumor cells in both the naïve and the GSE-containing xenograft cell tumors in mice not fed IPTG and in the naïve xenograft cell tumors from IPTG-fed mice, and quiescent or dying cells in the GSE-containing xenograft tumors from IPTG-fed mice.

These results demonstrate that inhibition of expression or activity of genes set forth in Table 3 inhibits tumor cell growth in vivo.
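The four-way comparison described in this example (GSE-containing versus naïve cells, with and without IPTG feeding) can be summarized numerically. The following Python sketch is illustrative only: the function name, the group labels and the choice of a simple ratio of mean tumor weights are assumptions made here for illustration and are not part of the disclosed method. It takes measured tumor weights for the four groups and returns the mean weight of the IPTG-induced, GSE-containing tumors relative to the average of the three groups in which no GSE-mediated inhibition is expected.

```python
from statistics import mean

# Hypothetical group labels for the 2x2 design described above:
# cell type ("GSE" or "naive") x diet ("IPTG" or "none").
CONTROL_GROUPS = [("naive", "IPTG"), ("naive", "none"), ("GSE", "none")]
TEST_GROUP = ("GSE", "IPTG")


def summarize_xenograft_groups(weights):
    """Return per-group mean tumor weights and the test/control ratio.

    weights maps (cell_type, diet) to a list of tumor weights in any
    single consistent unit. All four groups must be present and non-empty.
    """
    means = {group: mean(values) for group, values in weights.items()}
    # Reference level: average of the three groups in which the GSE is
    # absent or not induced, so no growth inhibition is expected.
    reference = mean(means[group] for group in CONTROL_GROUPS)
    ratio = means[TEST_GROUP] / reference
    return means, ratio
```

A ratio well below 1 would correspond to the outcome described above, in which only the GSE-containing tumors in IPTG-fed mice are substantially smaller; a proper statistical test would still be needed before drawing conclusions from real measurements.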

It should be understood that the foregoing disclosure emphasizes certain specific embodiments of the invention and that all modifications or alternatives equivalent thereto are within the spirit and scope of the invention as set forth in the appended claims.

TABLE 1 Genes Enriched among 1482 Sequences of Clones Containing cDNA Inserts in the Selected Library
Gene | Accession # | #Sequences (s/as) | # clones | Functional Assays*
ATF4 | NM_001675.1 | 5(as) | 369 | A
STAT5b | NM_012448.1 | 4(s), 4(as) | 152 | A, B
GBC-1 | NM_031221.1 | 2(s) | 70 | A, B
ARHG | NM_001665.1 | 5(s), 1(as) | 43 | A
VWF | NM_000552.2 | 6(s), 5(as) | 39 | B
MCM3 | NM_002388.2 | 3(s), 4(as) | 38 | C
18S RNA | K03432.1 | 8(s), 4(as) | 33 | A
ITGB5 | NM_002213.1 | 4(s), 1(as) | 30 | A, B
HSPCA | NM_005348.1 | 2(s) | 27 | B
STAT3 | NM_003150.1 | 4(s), 3(as) | 25 | A, B
L1CAM | NM_000425.2 | 8(s), 4(as) | 20 | A, B
28S RNA | M27830.1 | 3(s) | 17 | A
C-FOS | NM_005252.2 | 3(s), 3(as) | 17 | A
C-KIT | NM_021099.2 | 4(s), 2(as) | 12 | A
FEN1 | NM_004111.3 | 2(s), 2(as) | 12 | A
GBC-3 | AA443027 | 1(s) | 12 | A, B
NIN283 | NM_032268 | 1(s) | 11 | A
ADPRT | NM_001618 | 1(s), 1(as) | 10 |
CCN D1 | NM_001758.1 | 2(s), 2(as) | 9 | A
CDC20 | NM_001255 | 1(as) | 9 | B
EFNA1 | NM_004428 | 1(s), 3(as) | 9 | A
KIAA1270 | XM_044835 | 1(as) | 9 | A
RPL31 | NM_013403.1 | 2(s) | 9 | A, B
7SL | X04248.1 | 4(s), 1(as) | 8 | C
ENO1 | NM_001428 | 2(s) | 8 |
GSTP | NM_000852 | 2(s) | 8 |
ICAM2 | NM_000873 | 2(s), 1(as) | 8 |
INT-2/FGF3 | NM_005247 | 2(s) | 8 | C
LYN | NM_002350 | 2(as) | 8 | A
RPS24 | NM_001026 | 1(s), 1(as) | 8 |
FGFR1 | NM_000604.2 | 2(s), 1(as) | 6 | A
HES6 | XM_043579 | 1(s) | 6 | B
PKC zeta | NM_002744 | 2(s), 1(as) | 6 | B
RAN | NM_006325 | 1(s) | 6 |
RPA3 | NM_002947.1 | 1(s) | 6 | A
ZIN | NM_013403.1 | 1(as) | 6 | A, B
TAF7 | NM_005642 | 1(s) | 6 | A
AP1B1/BAM22 | NM_001127.1 | 2(s) | 5 | A
HNRPF | NM_004966 | 1(s) | 5 | A
HNRPMT | AF222689 | 1(s) | 5 | A
NFkB-1 | NM_003998.1 | 1(as) | 5 | A, B
NR3C1 | NM_000176 | 1(s) | 5 | A
PKC delta | NM_006254.1 | 2(s), 1(as) | 5 | C
BAG-1 | NM_004323.2 | 2(s) | 4 | A
GBC-11 | W84777 | 1(s) | 4 | A, B
HNRPA2B1 | NM_002137 | 1(s) | 4 | A
IF1 | NM_016311.1 | 1(s) | 4 | A
ITGA4 | NM_000885 | 1(s), 1(as) | 4 |
JunB | NM_002229.1 | 1(s) | 4 | C
GRP58 | NM_005313.1 | 1(s), 1(as) | 4 |
PKC eta | NM_006255.1 | 3(s), 1(as) | 4 | A, B, C
PSMB7 | NM_002799 | 1(s) | 4 |
RAB2L | NM_004761 | 1(s) | 4 |
RPL35 | NM_004632.1 | 2(as) | 4 | C
CDK2 | NM_001798.1 | 2(s) | 3 | A
DAP-3 | NM_004632.1 | 2(as) | 3 | A, B
EIF-3 | NM_003750 | 3(s) | 3 | A
GBC-12 | | 1(s) | 3 | A
IGF2R | NM_000876 | 2(s) | 3 |
KIFC1 | XM_042626 | 1(as) | 3 |
MET | NM_031517 | 2(s), 1(as) | 3 |
PCNA | NM_002592 | 1(s) | 3 |
PPP2R1B | NM_002716 | 2(as) | 3 |
RAB5B | NM_002868.1 | 1(s), 1(as) | 3 |
TDGF1 | NM_003212 | 1(as) | 3 |
ARFAPTIN1 | NM_014447 | 1(as) | 2 |
CDK10 | NM_003674 | 2(s) | 2 | B
CREB1 | NM_004379 | 1(s) | 2 |
EDF-1 | NM_003792 | 1(s) | 2 |
FLJ10006 | XM_041928 | 1(as) | 2 |
FLJ13052 | NM_023018 | 1(s) | 2 |
FOSL2 | NM_005253.1 | 1(s), 1(as) | 2 |
GBC-13 | | 1(s) | 2 |
GBC-14 | AL557138 | 1(s) | 2 |
GBC-15 | BE079876 | 1(s) | 2 |
GBC-16 | | 1(s) | 2 |
GBC-17 | | 1(s) | 2 |
GBC-18 | | 1(s) | 2 |
GNAS | M21139 | 1(as) | 2 |
IL4R | NM_000418 | 1(as) | 2 |
ITGA3 | NM_002204 | 1(as) | 2 |
MAP2K2 | NM_002755 | 2(as) | 2 |
MBD-1 | NM_015847 | 1(s), 1(as) | 2 | B
MCM-6 | NM_005915 | 1(s) | 2 |
MYL6 | NM_021019 | 2(s) | 2 | A
NUMA1 | NM_006185 | 1(s) | 2 |
PC4 | NM_006713 | 1(s) | 2 |
RAD23A | NM_005053 | 1(s) | 2 |
REL | NM_002908 | 1(s) | 2 |
RPA1 | NM_002945 | 1(as) | 2 |
RPL12 | NM_000976 | 1(s) | 2 |
RPS29 | NM_001032 | 1(s) | 2 |
SQSTM1 | NM_003900 | 1(s) | 2 |
*A, confirmed by BrdU suicide assay; B, gave rise to cell line inhibited by IPTG; C, low infection rate
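The "#Sequences (s/as)" column of Table 1 records, for each gene, how many distinct GSE sequences were recovered in sense (s) and antisense (as) orientation, written for example as "8(s), 4(as)" for L1CAM. As a minimal sketch of how such entries could be tabulated programmatically, the Python fragment below converts one such entry into a pair of integers. The function name and the regular expression are illustrative assumptions and are not part of the original disclosure.

```python
import re

# Matches entries such as "8(s)", "4(as)" or "2 (s)"; "as" is tried before
# "s" so that antisense counts are not misread as sense counts.
_COUNT = re.compile(r"(\d+)\s*\(\s*(as|s)\s*\)")


def parse_orientation_counts(cell):
    """Return (sense, antisense) counts from a '#Sequences (s/as)' entry."""
    counts = {"s": 0, "as": 0}
    for number, orientation in _COUNT.findall(cell):
        counts[orientation] += int(number)
    return counts["s"], counts["as"]
```

Applied to the L1CAM entry, the function returns eight sense and four antisense sequences, in agreement with the description of L1CAM in the text; an entry listing only one orientation yields zero for the other.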

TABLE 2 Enriched Genes Previously Implicated in Cell Proliferation
Gene | Accession No. | #Sequences (s/as) | # clones | Description | Association with cancer
CCN D1 | NM_001758 | 2(s), 2(as) | 9 | Cyclin, G1/S transition | Amplified in cancers
CDK2 | NM_001798 | 2(s) | 3 | Cyclin-dependent kinase, S-phase | Amplified in cancers
PCNA | NM_002592 | 1(s) | 3 | DNA replication | Upregulated in cancers
RPA3 | NM_002947 | 1(s) | 6 | DNA replication, excision repair |
RPA1 | NM_002945 | 1(as) | 2 | DNA replication |
MCM3 | NM_002388 | 3(s), 4(as) | 38 | DNA replication |
MCM6 | NM_005915 | 1(s) | 2 | DNA replication |
FEN1 | NM_004111 | 2(s), 2(as) | 12 | DNA replication and repair |
CDC20 | NM_001255 | 1(as) | 9 | CDC2-related kinase, mitosis |
NUMA1 | NM_006185 | 1(s) | 2 | Nuclear reassembly in late mitosis |
RAN | NM_006325 | 1(s) | 6 | Small GTPase, mitosis | Ras family
CDK10 | NM_003674 | 2(s) | 2 | Cell cycle, G2/M |
C-KIT | NM_021099 | 4(s), 2(as) | 12 | Growth factor receptor, oncogene | Protooncogene
EFN A1 | NM_004428 | 1(s), 3(as) | 9 | Receptor tyrosine kinase ligand | RAS pathway regulator
LYN | NM_002350 | 2(as) | 8 | Tyrosine kinase | YES protooncogene
INT-2/FGF-3 | NM_005247 | 2(s) | 8 | Fibroblast growth factor | Mammary oncogene
FGFR1 | NM_000604 | 2(s), 1(as) | 6 | Fibroblast growth factor receptor, tyrosine kinase | Amplified in breast cancers
IGF2R | NM_000876 | 2(s) | 3 | Insulin-like growth factor 2 receptor | Mutated in breast cancers
TDGF1 | NM_003212 | 1(as) | 3 | Teratocarcinoma derived growth factor 1 (EGF family) | Overexpressed in teratocarcinomas
MET | NM_031517 | 2(s), 1(as) | 3 | Hepatocyte growth factor receptor | Protooncogene
IL4R | NM_000418 | 1(as) | 2 | Interleukin-4 receptor |
STAT3 | NM_003150 | 4(s), 3(as) | 25 | Transcription factor (proliferation) | Upregulated in breast ca
STAT5b | NM_012448 | 4(s), 4(as) | 152 | Transcription factor (proliferation) |
C-FOS | NM_005252 | 3(s), 3(as) | 17 | AP-1 component | Protooncogene
NFκB-1 | NM_003998 | 1(as) | 5 | Stress, apoptosis, paracrine activities |
TAF7 | NM_005642 | 1(s) | 6 | Transcription initiation factor |
PC4 | NM_006713 | 1(s) | 2 | General positive coactivator of transcription |
CREB1 | NM_004379 | 1(s) | 2 | Transcription factor, regulates expression of cAMP-inducible genes including Cyclin A |
JUNB | NM_002229 | 1(s) | 4 | AP-1 component | Protooncogene
FOSL2 | NM_005253 | 1(s), 1(as) | 2 | AP-1 component | FOS-related
REL | NM_002908 | 1(s) | 2 | Transcription factor | Protooncogene
ADPRT | NM_001618 | 1(s), 1(as) | 10 | Poly (ADP ribosyl) transferase |
PKC zeta | NM_002744 | 2(s), 1(as) | 6 | Serine/threonine protein kinase | Stimulated by tumor promoters
PKC delta | NM_006254 | 2(s), 1(as) | 5 | Serine/threonine protein kinase | Stimulated by tumor promoters
PKC eta | NM_006255 | 3(s), 1(as) | 4 | Serine/threonine protein kinase | Stimulated by tumor promoters
MAP2K2 | NM_002755 | 2(as) | 2 | MAP kinase kinase | Implicated in medulloblastoma metastasis
GRP58 | NM_005313 | 1(s), 1(as) | 4 | Membrane signal transduction |
PPP2R1B | NM_002716 | 2(as) | 3 | Protein phosphatase 2 regulatory subunit β |
BAG1 | NM_004323 | 2(s) | 4 | Apoptosis inhibitor (Bcl-2 family) | Overexpressed in cancers
DAP3 | NM_004632 | 2(as) | 3 | Positive/negative apoptosis regulator | Overexpressed in gliomas
ITGA4 | NM_000885 | 1(s), 1(as) | 4 | Cell adhesion, signal transduction | Involved in Src pathway
ITGA3 | NM_002204 | 1(as) | 2 | Cell adhesion, signal transduction | Involved in colorectal cancer growth
ITGB5 | NM_002213 | 4(s), 1(as) | 30 | Cell adhesion, signal transduction | Correlates with invasiveness in gastric ca
ARHG | NM_001665 | 5(s), 1(as) | 43 | Small GTPase, cytoskeletal reorganization | Ras family, contributes to Ras transforming activity
GNAS complex | M21139 | 1(as) | 2 | G-protein alpha subunit s, knockout is embryonic lethal |
HSPCA | NM_005348 | 2(s) | 27 | Chaperone, protein folding | Overexpressed in breast ca, activates tyrosine kinases
EIF-3 | NM_003750 | 3(s) | 3 | Translation initiation factor |
RPL31 | NM_013403 | 2(s) | 9 | Ribosomal protein L31 |
RPL35 | NM_004632 | 2(as) | 4 | Ribosomal protein L35 |
RPL12 | NM_000976 | 1(s) | 2 | Ribosomal protein L12 |
RPS29 | NM_001032 | 1(s) | 2 | Ribosomal protein S29 |
RPS24 | NM_001026 | 1(s), 1(as) | 8 | Ribosomal protein S24 |
18S RNA | K03432.1 | 8(s), 4(as) | 33 | Ribosomal RNA |
28S RNA | M27830 | 3(s) | 17 | Ribosomal RNA |
7SL | X04248 | 4(s), 1(as) | 8 | RNA component of signal recognition particle |

TABLE 3 Enriched Genes That Have Not Been Previously Implicated in Cell Proliferation
Gene | Accession No. | #Sequences (s/as) | # clones | Description | Association with cancer
Transcription factors
ATF4 | NM_001675 | 5(as) | 369 | Activating transcription factor | Induced in breast ca by heregulin
HES6 | XM_043579 | 1(s) | 6 | Transcription co-factor, differentiation inducer |
NR3C1 | NM_000176 | 1(s) | 5 | Glucocorticoid receptor |
EDF1 | NM_003792 | 1(s) | 2 | Transcription factor, stimulates endothelial cell growth, represses endothelial cell differentiation |
MBD1 | NM_015847 | 1(s), 1(as) | 2 | Methylated DNA binding protein, transcription inhibitor |
RNA transport
HRPMT1L2 | NM_001536 | 1(s) | 5 | Hnrp arginine methyltransferase |
HNRPF | NM_004966 | 1(s) | 5 | Heterogeneous nuclear ribonucleoprotein F |
HNRPA2B1 | NM_002137 | 1(s) | 4 | Heterogeneous nuclear ribonucleoprotein A2/B1 |
Signal transduction and cell adhesion
ZIN | NM_013403 | 1(as) | 6 | Calmodulin-binding WD repeat protein |
Arfaptin1 | NM_014447 | 1(as) | 2 | Similar to POR1 GTP-binding protein; may act in cellular membrane ruffling and formation of lamellipodia |
L1CAM | NM_000425 | 8(s), 4(as) | 20 | Cell adhesion, neural |
ICAM2 | NM_000873 | 2(s), 1(as) | 8 | Cell adhesion, intercellular |
Intracellular transport
AP1B1/BAM22 | NM_001127 | 2(s) | 5 | Clathrin-associated adaptor protein |
RAB2L | NM_004761 | 1(s) | 4 | Small GTPase, intracellular transport | Ras family
KIFC1 | XM_042626 | 1(as) | 3 | Intracellular trafficking |
Rab5B | NM_002868 | 1(s), 1(as) | 3 | Small GTPase, vesicle transport | Ras family
Protein processing
NIN283 | NM_032268 | 1(s) | 11 | ubiquitin-mediated protein modification |
PSMB7 | NM_002799 | 1(s) | 4 | Proteasome subunit β7 |
SQSTM1 | NM_003900 | 1(s) | 2 | Sequestosome 1; ubiquitin-mediated protein degradation |
RAD23A | NM_005053 | 1(s) | 2 | Nucleotide excision repair, ubiquitin-mediated protein degradation |
Other
VWF | NM_000552 | 6(s), 5(as) | 39 | Blood clotting |
GSTP | NM_000852 | 2(s) | 8 | Xenobiotic metabolism |
ENO1 | NM_001428 | 2(s) | 8 | Glycolysis |
IF1 | NM_016311 | 1(s) | 4 | Inhibitor of Fo/F1 mitochondrial ATPase |
MYL6 | NM_021019 | 2(s) | 2 | Contractility |
FLJ13052 | NM_023018 | 1(s) | 2 | NAD kinase (predicted) |
GBC-14 | AL557138 | 1(s) | 2 | similar to tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide |
KIAA1270 | XM_044835 | 1(as) | 9 | Alanyl-tRNA synthetase homolog |
IGF2R | NM_000876 | 2(s) | 3 | Insulin-like growth factor 2 receptor | Mutated in breast cancers
Unknown function
GBC-1 | NM_031221 | 2(s) | 70 | Contains helical repeat peptide |
FLJ10006 | XM_041928 | 1(as) | 2 | |
GBC-3 | AA443027 | 1(s) | 12 | HC 3q29 |
GBC-11 | | 1(s) | 4 | HC 14 |
GBC-12 | | 1(s) | 3 | HC 1 |
GBC-13 | | 1(s) | 2 | |
GBC-15 | BE079876 | 1(s) | 2 | |
GBC-16 | | 1(s) | 2 | |
GBC-17 | | 1(s) | 2 | |
GBC-18 | | 1(s) | 2 | |

TABLE 4 Nucleotide Sequences of GSEs SEQ Gene/Acces- No. of Orien- ID sion No. Clones tation NO Sequence 18S RNA 1 AS 4 1089 gccgctagaggtgaaattccttggaccggcgcaagacggaccagagcgaaagcatttgccaagaatgtt K03432.1 ttcattaatcaagaacgaaagtcggaggttcgaagacgatcagataccgtcgtagttccgaccataaac gatgccgaccggcgatgcggcggcgttattcccatgacccgccgg 1271 2 AS 5 1413 ccggacacggacaggattgacagattgatagctctttctcgattccgtgggtggtggtgcatggccgtt cttagttggtggagcgatttgtctggttaattccgataacgaacgaga 1529 6 S 6 177 caaagattaagccatgcatgtctaagtacgcacggccggtacagtgaaactgcgaatggctcattaaat cagttatggttcctttggtcgct 268 7 S 7 1414 cggacacggacaggattgacagattgatagctctttctcgattccgtgggtggtggtgcatggccgttc 1482 4 AS 8 154 ctgccagtagcatatgcttgtctcaaagattaagccatgcatgtctaagtacgcacggccggtac 218 1 AS 9 199 taagtacgcacggccggtacagtgaaactgcgaatggctcattaaatcagttatggt 255 2 S 10 570 cggagagggagcctgagaaacggctaccacatccaaggaaggca 613 3 S 11 177 caaagattaagccatgcatgtctaagtacgcacggccggta 217 1 S 12 1040 cggaactgaggccatgattaagagggacggccggg 1074 1 S 13 1433 cagattgatagctctttctcgattccgtgggtggt 1467 1 S 14 224 aactgcgaatggctcattaaatcagttatggttcctttggtcgct 268 4 S 15 185 aagccatgcatgtctaagtacgcacggccg 214 28S RNA 10 S 16 83 ccctactgatgatgtgttgttgccatggtaatcctgctcagtacgagaggaaccgcaggttcagacatt M27830.1 tggtgtatgtgcttggctgaggagccaatggggcgaacgtaccatctgt 200 4 S 17 1 gaattcaccaagcgttggattgttcacccactaatagggaacgtgagct 49 3 S 18 136 cgcaggttcagacatttggtgtatgtg 162 7SL RNA 3 S 19 29 cccagctactcgggaggctgaggctggaggatcgcttgagtccaggagttctgggctgtagtgcgctat X04248.1 gccgatcgggtgtccgcactaagttcggcatcaatatgg 136 1 S 20 70 ccaggagttctgggctgtagtgcgctatgccgatcgggtgtccgcactaagttcggcatcaatatggt 137 3 S 21 144 ccgggagcgggggaccaccaggttgcctaaggaggggtga 183 9 AS 22 24 gtagtcccagctactcgggaggctgaggctggaggatcgcttga 67 3 S 23 153 ggggaccaccaggttgcctaaggaggggtga 183 ADPRT 9 S 24 2736 gctgtggcacgggtctaggaccaccaactttgctgggatcctgtcccagggtcttcggatagccc NM_001618 cgcctgaagcgcccgtgacag 2821 1 AS 25 2422 gaccctcccctgagcagactgtaggccacctcgatgtccagcaggttgtcaagcatttcc 
accttggcctgcacactgtctgc 2504 ARFAPTIN1 2 AS 26 26 ttcacactgaccaaccgccgaggacagtcggaccggcgacctctcaacccagcc 79 NM_014447 ATF4 359 AS 27 833 acaccttcgaattaagcacattcctcgattccagcaaagcaccgcaacatgaccgaaatgagcttcctg NM_001675.1 agcagcg 909 6 AS 28 833 gacaccttcgaattaagcacattcctcgattccagcaaagcaccgcaaca 883 2 AS 29 838 ----ccttagaattaagcacattcctcgattccagcaaagcgccgcaacatgacggaaa 893 1 AS 30 843 ---------gaattaagcactttcctcgagtccagcaaagccccgca------------ 880 1 AS 31 864 cgctgctcagcaagctctgttcggtcatgttgcggtgctttgctgg 909 IF1 4 S 32 13 ccagcagcaatggcagtgacggcgttggcggcgcggacgtggcttggcgtgtggggc 69 NM_016311.1 BAG1 3 S 33 434 ccgggacgaggagtcgacccggagcgaggaggtgaccagggaggaaatggcggcagctgggctcaccgt NM_004323.2 gactgtcacccacagc 518 1 S 34 461 ggaggtgaccagggaggaaatggcggcagctgggctcaccgtgactgtcacccacagc 518 AP1B1 5 S 35 275 gccaagagtcagcctgacatggccattatggccgtcaacacctttgtgaaggactgtgagga 336 NM_001127.1 1 S 36 286 gcctgacatggccattatggccgtcaacacctttgtgaaggactgtgag 334 CDC20 4 AS 37 1001 gccagggacaccatgctacggccttgacagccccttgatgctgggtgaatgtctgcagaggaa NM_001255 cccagccaccctctccaggagcactgggccacacattgaccaagttatcattaccaccactggccaaat gtcgtccatctggggcccagcgcagcccacacacttcctggctgtggccactcagtgtggccacatggt gttctgct 1209 CDK10 1 S 38 1159 gccccagccacctccgagggccagagcatgcgctgtaaacc 1199 NM_003674 1 S 39 1734 ctaccaggagagccctgggctggaggctgagctgcatccctgctccccacatggaggacccaa caggaggccgtggctctgatgctgagcgaagct 1829 CDK-2 2 S 40 322 agatctctctgcttaaggagcttaaccatcctaatattgtcaagctg 368 NM_001798.1 1 S 41 645 tacacccatgaggtggtgaccctgtggtaccgagctcctgaaatcctcctgggctgca 702 c-FOS 1 AS 42 347 cactgccatctcgaccagtccggacctgcagtggctggtgcagcccgccctcgtctcctctgtggcccc NM_005252.2 atcgcagaccagagcccctcaccctttcggagtccccgccccc 458 1 AS 43 246 cactcacccgcagactccttctccagcatgggctcgcctgtcaacgcgcaggacttctgcacggacctg gcc 317 12 S 44 57 agcgaacgagcagtgaccgtgctcctacccagctctgcttcacagcgcccacctgtctccgcccct 122 1 S 45 1342 gcccgagctggtgcattacagagaggagaaacacatcttccctagagggttcctgtagacc taggg 1407 1 AS 46 717 
gaggcagggtgaaggcctcctcagactccggggtggcaacctctggcaggcccccagtcagatca agggaagccacagacatctcttctgggaagcccaggtcatcagggatcttgcaggcgggtcggtgagct gccaggatgaactctagtttttccttctcctt 882 1 S 47 596 taagatggctgcagccaaatgccgcaaccggagga 630 c-KIT 2 AS 48 2448 gcgatttcgggctagccagagacatcaggaatgattcgaattacgtggtcaaaggaaatgcacgactgc NM_021099.2 ccgtgaagtggatggcaccagagagcattttcagctgcg 2555 4 AS 49 2632 cccagggatgccggtcgactccaagttctacaagatgatcaaggaaggcttccggatggtcagcccgga gcacgcgcctgccgaaatgtatgacgtcatgaagacttgctgggacg 2747 2 S 50 3466 aacggggcatcggaagtctggtcacgctaagaagaccgaggctgagaaggaacaagccaggggaagcgt ga 3536 1 S 51 4650 gctggtttggaggtcctgtggtcatgtacgagactgtcaccagttaccgcgctctgtttgaaacatgtc 4718 2 S 52 3508 tgagaaggaacaagccaggggaagcgtgaacaatgatgctctgctctgggctgccgctcgggcttct gtacaactgacctggttt 3592 1 S 53 3595 gaacaagccagggaagcgtgaacaatgatgctctgctctgggctgccgctcgggcttctgtacaac tgacctggtttctc 3515 CREB1 2 S 54 199 aagcccagccacagattgccacattagcccaggtatctatgccagcagctcatgcaacatcatctg NM_004379 264 CCND1 6 S 55 311 tgcggaagatcgtcgccacctggatgctggaggtctgcgaggaacagaagtgcgaggaggaggtcttcc NM_001758.1 cgctggccatgaactacctggaccgcttcctgtcgctgg 418 1 S 56 935 agaacatggaccccaaggccgcc 957 2 AS 57 331 tggatgctggaggtctgcgaggaacagaagtgcgaggaggaggtcttc ccgctggccatgaactacctggaccgcttcctg 411 1 58 406 cacagcttctcggccgtcagggggatggtctccttcatcttagaggccacgaacatgcaagtggccc ccagcagctgcaggcggctctttttcacgggctccagcgacaggaa 518 DAP3 2 AS 59 1249 gcggcactgtgcctacctctaagccaagatcacagcatgtgaggaagacagtggacatctgctttatgc NM_004632.1 tggacccagtaagatgaggaagtcgggcagtacacaggaagaggagccaggcccttgtacctatgggat tggacaggactgcagttggctctggacctgc 1417 1 AS 60 1259 gcctacctctaagccaagatcacagcatgtgaggaagacagtggacatctgctttatgctggacccagt aagatgaggaagtcgggcagtacacaggaagaggagccaggcccttgtacctatgggattggacaggac tgcagttggctctggacctgc 1417 EDF1 2 S 61 97 ggccaaatccaagcaggctatcttagcggcacagagacgaggaggagat 145 NM_003792 eIF-3 1 S 62 3259 ggcgaggaggcgctgatgatgagcgatcatcctggcgtaatgctgatgatgaccggggtcccaggcgag NM_003750 ggttggatga 3337 1 S 63 40 
gcagcgttgggcccatgcaggacgc 64 1 S 64 269 cagcttcaggcagaaacagaaccaa 293 ENO-1 7 S 65 5 agatctcgccggctttacgttcacctcggtgtctgcagcaccctccgcttcctctcctaggcgacg 70 NM_001428 1 S 66 11 cgccggctttacgttcacctcggtgtctgcagcaccctccgcttcct 57 EFNA1 5 S 67 228 cgcactatgaagatcactctgtggcagacgctgccatggagcagtacatactgtacctggtggagca NM_004428 tgaggagtaccagctgt 311 2 AS 68 517 tgctgcaagtctcttctcctgtggattgacatgggcctgaggactgtgagtgattttgcca 577 1 AS 69 1183 tggcacagcccccctgctggcacagctctggggagtgctgccccaggatgggagagaatgcagtacctg gctacaaacttctctgtggcagctccacagatgaggtctt 1291 1 AS 70 467 gacagtcaccttcaacctcaagcagcggtcttcatgctggtggatggg 514 FEN1 5 S 71 634 gccacagctcaagtcaggcgagctggccaaacgcagtgagcggcgggctgaggcagagaagcagctgca NM_004111.3 gcaggctcaggctgctgg 720 4 AS 72 841 ggcagaggccagctgtgctgccctggtgaaggctggcaaagtctatgctgcggctaccga 900 2 AS 73 634 gccacagctcaagtcaggcgagctggccaaacgcagtgagcggc 677 1 S 74 651 gcgacctggacaaacgcattgagcggcggcctgaggcagagaagcagctgtatcatgctcaagctgctg g 720 FGFR1 1 S 76 2004 ggtaacagtgtctgctgactccagtgcatccatgaactctggggttcttctggttcggccatcacggct NM_000604.2 ctcctccagtgggactcccatgctagcaggggtctctgagtatgagcttcccgaagaccctcgc gggagctgcctcgggacagactggtcttaggc 2169 1 AS 77 2844 ggaggaacttttcaagctgctgaaggagggtcaccgcatggacaagcccagtaactgcaccaacgagct gtacatgatgatgcgggactgctggcatgcagtgccctcacagagacccaccttcaagcagctggt 2978 4 S 78 1930 ggtaccaagaagagtgacttccacagccagatggctgtgcacaagctggccaagagcatcctctgcgca gacaggtaacagtgtctgctgactccagtg 2029 GBC-1 68 S 79 876 tcctcacatcccagacgatgggcggccaggcagagacgctcctcacttcccagacggggtagcggccg XM_031920 943 2 S 80 876 tcctcacatcccagacgatgggcggccaggcagagacgctcctcacttcccag 928 FLJ10006 2 AS 81 1010 agaaagtgaggaccctcaggaggctgcaggccagtgagtcagcaaatgaagagattcccgaaccccgaa XM_041928 tcagtgattcggaaagtgaggatcc 1102 FLJ13052 2 S 82 2508 ctaacacagcgagggactcaacacgctgattctcctcctgcctctcccg 2556 NM_023018 FOSL2 1 S 83 708 ggcggggctggacaatgcccagcgctctgtctcaagcccatcagcattgctgggggcttctacggtgag NM_005253.1 gatcccc 784 1 AS 84 881 ggtgactcctgctccaggacgctaggataggtga 848 GBC-11 4 AS 
85 437 cagagccccaaaacgctgggcagagttgacaggacccaaatgctaaagttgtggaggg 378 W84777 GBC-12 4 S 86 tggggagacccggagacggtggctggggtgtcctcagcccgggagagctgagtcagccgcgccccgcac acagcatacttaggagccaaggacttggacctcgcttctcgccggtacgcga GBC-13 3 S 87 acccctggnaacatggnaaatataaaacaacttggtgtttttgaaaaaccgcaaagcgttatggtgtgg atgtaacacaggggtgtggtgt GBC-14 2 S 88 176 tggaggaaaccccgtgtctgcggagcggctgtagcctgtgagcagcgagatccagggacag 236 AL557138 GBC-15 2 S 89 107 cagctacccagaagtctgaggcaggagaaatgctggaacccgggaggcagagg 159 BE079876 GBC-16 2 S 90 Cagcgatccgtccagcagatgacgaatatcgacggccatttccggcataccgagctgttgcataatgcc cgcagactgtgct GBC-17 2 S 91 Cggaagagctcacaatgctcatttcgcgtctcgctcgggtgttgtgctgttctttaatactgtgggcaa ttcaggtgtgtcgcttagaaaacggaggtactcaatggagtcctcaacaatgaggggccctgttcatgg ctttgtgttggccgttcgttccacatgttctt GBC-18 2 S 92 Cgatgattattttcttggcaaagtttttagcagaacgtcaaaaattgattacatcttttaaacgtggtt tattaccggc GBC-2 1 93 1 agagcgaggcgtgaagtccacacgcccagccccgtcgcagtgtggttgccgagcaaggctacgtctgcg gcgcgtgcggta 81 GBC-3 12 5 94 4 ccgggatgaagtgacccagcagaaataccagagaccggagacggaatggcccagggtcagcctccaccc AA443027 ggaaccggaggatgcagcgaagacgtctc 101 GBC-4 1 AS 95 87 cctcgctcaggattgcttcccgcggtgcctcccgcggctgcacggaaggccacgaaccgacaacttgca AV710590 cagcagccatcttttct 1 GNAS 2 AS 96 44 cgcgcgcagctccccgcccctcgagccgaggccgagggggctgatggccgccgccgggccgag 106 NM_000516 GSTP 7 S 97 275 ggaccagcaggaggcagccctggtggacatggtgaatgacggcgtggaggacctccgctgca 336 NM_000852 1 S 98 670 tgcctggctgcgtttcccctgctctcagcatatgtggggcgcctcagcgcccggcccaagctca aggccttcctggcctcccctgagtacgtgaacttccccatcaatggcaacgggaaacagtgagg gttg gg 537 HES6 6 S 99 935 gcagggcagcccctggtaaccagcccagtcaggccccagccccgtttcttaagaaacttttaggg XM_043579 accctgcagctctg 1013 HNRPA2B1 4 S 100 826 cggaccaggaccaggaagtaactttagaggaggatctgatggatatggcagtggacgtggattt NM_002137 ggggatggctata 902 HNRPF 5 S 101 1000 caggcctggaaaggatgaggcctggtgcctacagcacaggctacgggggctacgaggagtacagt NM_004966 ggcctcagtgatggctacggcttcaccaccgacctgttcgggagagacctcagctactgtctctccgga atgtatgaccacagatacgccgac 1157 HRMT1L2 5 S 102 
2707 ggtgcgggtgaagatggcggcagccgaggccgcgaactgcat 2748 NM_001536 HSPCA 24 S 103 1554 caaggaccaggtagctaactcagcctttgtggaacgtcttcggaaacatggc 1605 NM_005348.1 3 S 104 1553 ccaaggaccaggtagctaactcagcctttgtggaac 1588 ICAM2 5 S 105 12 ggcagcccttggctggtccctgcgagcccgtggagactgccagagatgtcctctttcggtta NM_000873 caggaccctgactgtggccctcttcaccctgatctgctg 112 2 S 106 705 gagcctgtgtcggacagccagatggtcatcatagtcacggtggtgtcggtgttgctgtccctgt 768 1 AS 107 745 gccgctcactccccgtaggtgcccatccgctgctggcgcaagtgctggccgaagatgaagcaga gcaggacagatgtcacgaacagggacagcaacaccgacacca 850 IGF2R 2 S 108 903 gaagctggtgcgcaaggacaggcttgtcctgagttacgt 941 NM_000876 1 S 109 1571 gcggtgccaccgacgggnaagaagcgctatgacctgtccgcgctggtccgccatgcagaacc 1631 IL4R 2 AS 110 1178 ctcctcctcctcacactccaccgggngcctcaaacaactccacacatcgcaccacgctgatgctct NM_000418 ctggccagaggactgtcttgctgatctccactgggcaccatgctgattttccagagcc 1300 INTB5 25 S 111 67 tggggctctgcgcgctcctgccccggctcgcaggtctcaacatatgcactagtggaagtgccacctcat NM_002213.1 gtgaagaatgtctgctaatccacccaaaatgtgcctggtgctccaaagaggacttcggaagcc 198 2 S 112 2088 ccaaggactgcgtcatgatgttcacctatgtggagctccccagtgggaagtccaacctgaccgtcctc agggagccagagtgtggaaacacccccaacgccatgaccatcctcct 2203 1 S 113 1722 ggccatggcgagtgtcactgcggggaatgcaagtgccatgcaggttacatcggggacaactgtaactgc tcgacagacatcagcaca 1808 1 S 114 2118 gtggagctccccagtgggaagtccaacctgaccgtcctcagggagccagagtgtggaaacacccccaa cgccatgaccatcctcctggctg 2208 1 AS 115 2047 tgaaagatgaccaggaggctgtgctatgtttctaca 2082 ITGA3 2 AS 118 1993 tgggcgtcctccccggagcgctccgaggtccgggtgttcgtcacgttgatgctcaggagcaattt NM_002204 ccggacgtctctgctgtactggagcctg 2085 ITGA4 2 S 119 1188 ggcgcgaacccggcccccgaaggccgccgtccgggagacggtgatgctgttgctgtgcctgggggt NM_000885 cccgaccggccgcccctacaacgt 1276 1 AS 120 2797 tgtgttctacagttagcttctctgctggacacctgtatgcttcnctgtaatca 2848 JunB 1 S 121 306 cgggatacggccgggcccctggtggcctctctctacacgactacaaac 353 NM_002229.1 1 S 122 322 ccctggtggcctctctctacacgactac 349 KIAA1270 9 AS 123 1591 cctgtccaagaggaggccacagcgctggcctttccccacggaggccactgctgtcccgtcctctgt XM_044835 
atacagttgcaacacctgggcctcacaggt 1683 KIFC1 3 AS 124 2193 tctggatccgtcttcacttcctgttggcctgagcagtaccaataacacactggttcaccttggaggcaa XM_042626 2125 L1CAM 1 AS 125 4465 ttggggacccaggagacgacacttggatgttgtgtggtgggtaccgaaggcagcgtgtgtatggagctc NM_000425.2 ctgaaagccggccatggggtgggc 4392 1 AS 126 2457 caggcaatccctgagctggaaggcattgaaatcctcaactcaagtgccgtgctggtcaagtggcggccg gtggacctggcccaggtcaagggccacctccgcggatacaatg 2568 2 S 127 1389 agtgttcagtggctggacgaggatgggacaacagtgcttcaggacgaacgcttcttcccctatgccaat gggaccctgggcattcgagacctccaggccaatgacac 1495 2 AS 128 1518 gccaatgaccaaaacaatgttaccatcatggctaacctgaaggttaaagatgcaactcagatcactcag gggccccgcagcacaatcgagaagaaaggttccaggg 1623 1 S 129 666 accaggaccatcattcagaaggaacccattgacctccgggtcaaggccaccaacagcatgattgacagg aagccgcgcctgctcttccccaccaactccagcagccacc 774 1 S 130 591 ggcaacctctactttgccaatgtgctcacctccgacaaccactcagactacatctgccacgcccacttc ccaggcaccaggaccatcatt 680 1 S 131 253 ccaaggaagagctgggtgtgaccgtgtaccagtcgccccact 294 1 S 132 1367 ggccttcggagcgcctgtgcccagtgttcagtggctggacgaggatgggacaacagtgctt 1427 1 S 133 729 gacaggaagccgcgcctgctcttccccaccaactccagcagccacctggtg 779 12 S 134 94 aatatgaaggacaccatgtgatggagccacctgtcatcac 133 1 AS 135 2889 cccctggatgaggggggcaaggggcaact 2917 7 S 136 94 aatatgaaggacaccatgtgatggagc 120 LYN 1 AS 137 1243 tacatcatcaccgagttcatggctaagggtagtttgctggatttcctcaagagtgatgaaggtggcaag NM_002350 gtgctgctgcccaagctcattgacttctcggcccagattgca 1353 4 AS 138 1208 ggctgtacgctgtggtcaccaaggaggagcccatctacatcatcaccg 1255 PSMB7 4 S 139 595 caagaatctggtgagcgaagccatcgcagctggcatcttcaacgacctgggc 647 NM_002799 MAP2K2 1 AS 140 435 tcatcgtctttgagttcgccgaccttggctttctgggtgag 475 NM_030662.1 1 AS 141 881 ccgctccggagccatgtaggagcgcgtgcccacgaaggagttggccatggagtctatgagct ggccgctcaccccgaagtcacacagcttgatctcc 977 MBD1 1 S 142 2829 cctcgtgccgaattcttggcctcgagggccaaattccctatagtgagtcgtattaaattcg 2889 NM_015847 1 AS 143 2846 tttaatacgactcactatagggaatttggccctcgaggcc 2885 MCM3 3 AS 144 2207 cactccaaagacggcagactcacaggagaccaaggaatcccagaaagtggagttgagtgaatccaggtt NM_002388.2 
gaaggcattcaaggtggccctcttggatgtgttccgggaagctcatgcgcagtcaatcggcatgaatcg cctcacagaatccatcaaccgggacagcgaagagcccttctcttcagttg 2394 6 S 145 1597 tgcccttgggtagtgctgtggatatcctggccacagatgatcccaactttagccaggaagatcagcagg acacccagat 1675 14 AS 146 1707 accaagaagaaaaaggagaagatggtgagtgcagcattcatgaagaagtacatccatgtggccaaaatc atcaagcc 1783 4 AS 147 1597 tgcccttgggtagtgctgtggatatcctggccacagatgatcccaactttagccaggaagatcagcagg acacccagat 1675 6 S 148 2410 tgagcaagatgcaggatgacaatcaggtcatggtgtctgag 2450 1 AS 149 2400 acccaagttcggagacgaggcctcctcagatgaggaagatgatgccctcagacaccatgacctgatt gtcatcctgcatcttgctcagagcaacctg 2496 1 S 150 2799 agcagtggctcatccgccctacttcccatcccacacaaacccaattgtaaataacatatgacttcgt gagtacttttggg 2721 MCM6 2 S 151 2127 gccctgctcctgtgaacgggatcaatggctacaatgaagacataaatcaagagtctgctcccaaagcc NM_005915 2194 MYL6 1 S 155 35 gtcaagatgtgtgacttcaccgaagaccagaccgcagagttcaaggaggccttccagctgtttgaccga NM_021019 acag 107 1 S 156 54 ccgaagaccagaccgcagagttcaaggaggccttccagctgtttgaccgaacaggtgatggc aagatcctgtacagccagtg 135 NFkB1 5 AS 157 1 ggccaccggagcggcccggcgacgatcgctgacagcttcccctgcc 46 NM_003998.1 NIN283 11 S 158 1116 ggcaccccttctgcactgacttccagatatggttctcccttcctccctgaggacaccaaattg NM_032268 gatgagagcaagtttgagagaag 1202 NR3C1 5 S 159 511 gcaaacctcatatgtcgaccagtgttccagagaaccccaagagttcagcatccactgctgtgt NM_000176 ctgctgcccccacagagaaggagtt 599 NUMA1 2 S 160 4197 ggagctgacctcacaggctgagcgtgcggaggagctgggccaagaattgaaggcgtggc 4255 NM_006185 GRP58 3 S 161 1166 caatctgaagagatacctgaagtctgaacctatcccagagagcaatgatgggcctgtgaaggtagtggt NM_005313.1 agc 1237 1 AS 162 1084 ttagcagttctgatagcaacaacaggaatctctccagcagtgctctccaagtgagtgagcggccgc 1034 PC4 2 S 163 93 tgctccagaaaaacctgtaaagagacaaaagacaggtgagacttcgagagccctg 147 NM_006713 PCNA 3 S 164 1 ccgctacaggcaggcgggaaggaggaaagtctagctggtttcggcttcaggagcctcaga NM_002592 gcgagcgggcgaacgtcgcgacgacgggctgagacct 97 PKC delta 1 S 165 897 gcggcatcaaccagaagcttttggctgaggccttgaaccaagtcacccagagagcctccc NM_006254.1 ggagatcagactcagcctcctcagagcctgttgggatatatcagggtttcgagaagaagaccggagtt 
1024 1 S 166 667 gatcatcggcagatgcactggcaccgcggccaacagccgggacactatattccagaaaga acgcttcaacatcgacatgccgcaccgcttcaaggttcacaactacatg 775 3 AS 167 1935 cacccagagactacagtaactttgaccaggagttcctgaacgagaaggcgcgcctctcctacagcg 2000 PKC eta 1 S 168 327 tgggccagaccagcaccaagcagaagaccaacaaacccacgtacaacgaggagttttgcgctaacgtca NM_006255.1 ccgacggcggccacctcgagttg 418 1 S 169 383 tgcgctaacgtcaccgacggcggccacctcgagttggccgtcttccacgagacccccctgggctacgac cacttcgtggccaactgcaccctgcagttccaggagct 486 1 AS 170 371 aacgaggagttttgcgctaacgtcaccgacggcggccacctcgagttggccgtcttccacgagaccccc ctgggc 445 1 S 171 362 cccacgtacaacgaggagttttgcgctaa 390 PKCZETA 4 S 172 386 acggccacctcttccaagccaagcgctttaacaggagagcgtactgcggtcagtgcagcg 445 NM_002744 1 S 173 163 ccgctcaccctcaagtgggtggacagcgaaggtgacccttgcacggtgtcctcccagatgg agctggaagaggctttccgcctggcccgtcagtgcagggatgaaggcctcatcattcatg 283 1 AS 174 842 gacgtactcaatgaccaggaacaaccgacttgtcgtctggaagcaggagtgtaatccgacca ggaaggggttgctggatgcctgctcaaacacgtgcttctctgtctgtacccagtcaatatcctcgccat catgcaccagctctttcttcaccactt 999 PPP2R1B 2 AS 175 504 acggaattgctgtctgatttctgctttaacagcatttgatgccctgggatagcaaacgctg aacaaac NM_002716 cacatgc 578 1 AS 176 805 aggacccatggctttctggagctctgaaaatctgtcagccaccatatagcgaacgcgcca agatttatcttctgctgcttgtcgaagtg 893 RAB2L 4 S 177 871 gtcacacagtttaacaaggtggcaggggcagtggttagttctgtcctgggggctacttcc NM_004761 actggagagggacctggggaggtgaccatacggcc 965 RAB5B 2 S 178 834 aacaccaggcagctgttccgactggcctcct 864 NM_002868.1 1 AS 179 1345 gggcggaggtggaggtgcagggtcaactgtggctctgta 1383 RAD23A 2 S 180 1351 gcctgctcanagaagctggcaggactgggaggcgacagatgggcccctcttggcctctgtcccagctct NM_005053 1419 RAN 6 S 181 750 ggatggtgacctgtgagaatgaagctggagcccagcgtcagaagtctagttttataggcagctgtcc NM_006325 816 REL 2 S 182 1727 tgaatcttgaaaacccctcatgtaattcagtgttagacccaagagacttgagacagctccatc NM_002908 agatgtcctcttccagtatgtcagcaggcgccaattccaatactactgcccattgtttcacaatcagat gcatttgagggatctgacttcagttgtgcagataacagcatgataaatg 1906 AHRG 36 S 183 518 aggagcagagccaggcgcccatcacaccgcagcagggccaggcactcgcgaaacagatcc 
NM_001665.1 acgctgtgcgctacctcgaatgctcagccctgcaacaggatggtgtcaaggaagtgttcgccgag gctgtccgggctgtgctc 660 2 AS 184 377 ccattgccagtccgccgtcctatgagaacgtgcggcacaagtggcatccagaggtgtgccacca ctgccctgatgtgcccatcctgctggtgggcaccaagaaggacctgagagcccagcctgacaccctacg gc 511 1 S 185 518 aggagcagagccaggcgcccatcacaccgcagcagggccaggcact 563 2 S 186 273 ggcaatggagaaacagatgacgaaaacgttggtctgagggtaggagagtgtacggaggcggtcatact cctcctggcccgcagtgtcccacaggttcaggttcactttgcgc 384 1 S 187 516 caccatcctgttgcagggctgagcattcgaggtagcgcacagcgtggatctgcttggccagtgcctggc cctgctgcggtgtgatgggcgcctggccctgctccttg 622 1 S 188 541 gagcacagcccggacagcctcggcgagctattccttggctccatcgtgttgcaggggtggcgtcctagg tagcgcgcagcgtggatatgctcggccagtgcatggccctgatgcggtgt 660 RPA1 2 AS 189 2163 tggagaagcaaaaacctagttacataatttacttcatggtctgcagttagggtcagtgactta NM_002945 cgacataattcctgcttgatgataatgaaattgacagaagcctgaaggctgagtgagtga 2285 RPA3 6 S 190 8 agccgcagtcttggaccataatcatgg 34 NM_002947.1 RPL12 2 S 191 24 ggccaaggtgcaacttccttcggtcgtcccgaatccgggttcatccgacaccagccgcctcca NM_000976 ccatgccgccgaagttcgaccccaacga 114 RPL31 9 S 192 28 tggcgagaagaaaaagggccgttctgccatcaacgaagtggtaacccgagaat 80 NM_013403.1 1 S 193 44 ggccgttctgccatcaacgaagtggtaacccgagaat 80 RPL35 2 AS 194 12 ggcggcttgtgcagcaatggccaagatcaaggctcgagatct 53 NM_004632.1 1 AS 195 12 ggcggcttgtgcagcaatggccaagatcaaggc 44 RPS24 4 AS 196 351 gccagcaccaacattggcctttgcagtccccctgactttcttcattctgttcttgcgttcct ttcgtt NM_001026 gct 421 4 s 197 373 cagaatgaagaaagtcagggggactgcaaaggccaatgttggtgctggcaaaaag 427 RPS29 2 S 198 4 ttacctcgttgcactgctgagagcaagatgggtcaccagcagctgtactggagcca 59 NM_001032 SQSTM1 2 S 199 1278 ggcagcaaaacaagtgacatgaagggagggtccctgtgtgtgtgtgc 1324 NM_003900 STAT3 11 5 200 2288 gagagccaggagcatcctgaagctgacccaggtagcgctgccccatacctgaagaccaagttta NM_003150.1 tctgtgtgacaccaacgacctgcagcaataccattgacctgccgatgtccccccgc 2407 7 AS 201 2111 aagacccagatccagtccgtggaaccatacacaaagcagcagctgaacaacatgtcatttgctgaaatc atcatgggctataagatcatggatgctaccaatatcctg 2218 2 S 202 667 
ggatgtccggaagagagtgcaggatctagaacagaaaatgaaagtggtagagaatctcca ggatgactttgatttcaactataaaaccctcaagagtc 764 2 S 203 431 ttcctgcaagagtcgaatgttctctatcagcacaatctacgaagaatcaagcagtttcttcagagcagg tatcttgagaagccaatggagattgcccggattgtggcccggtgcc 545 1 AS 204 834 agatgctcactgcgctggaccagatgcggagaagcatcgtgagtgagctggcggggcttttgtcagcga tggagtacgtgcagaa 918 1 S 205 413 gaccagcagtatagccgcttcctgcaagagtcgaatgttctctatca 459 1 AS 206 935 gagctggctgactggaagaggcggcaacagatggagtacgtgcagaa 980 STAT5b 102 AS 207 287 tcttgataatccacaggagaacattaaggccacccagctcctggagggcctggtgcag NM_012448.1 gagctgcagaagaaggcagaacaccaggtgggggaagatgggttttt 391 1 AS 208 303 gagaacattaaggccacccagctcctggagggcctggcgcaggagctgcagaacaaggcacaacaccag gagggggaagatg 384 3 S 209 1941 aacaagcagcaggcccacgacctgctcatcaacaagccagatgggaccttcctgctgcgcttcagcgac tcggaaatcgggggcatcaccattgcttggaagtttga 2047 36 S 210 1409 aaacgaatcaagaggtctgaccgccgtggtgcagagtcggtcacggaagagaagttcacaatcttgttt gactcacagttcagtgttggtggaaatgagctggt 1513 3 AS 211 287 tcttgataatcctcaggaggccattaagcccacccagctcatgaagggcatggtgcagtagctgcagaa gaagagcagaactccaggtgggggaagatgggttt 389 1 AS 212 287 tcttgataatccacaggagaacattaaggccacccagctcctggaggg 334 2 S 213 1467 acaatcctgtttgaatcccagttcagtgttggtggaaatgagctggt 1513 1 S 214 1484 ccagttcagtgttggtggaaatgagctggt 1513 TAF7 6 S 215 65 cgagctgcgcctctcggcaagatttcgcgctgaccatcccgggccctttcatcactaatcggt 127 (TFIID) NM_005642 TDGF1 3 AS 216 57 ggtcgtagcagaagcaggagcaaggcgtccaggggaaactggagggctt 105 NM_003212 VWF 8 S 217 3646 ccagcatggcaaggtggtgacctggaggacggccacattgtgcccccagagctgcgaggagaggaatct NM_000552.2 ccgggagaacgggtatgagtgtgagtggcgctataacagctgtgcacctgcctg 3768 3 AS 218 4687 ccttgcccctgaagcccctcctcctactctgcccccccacatggcacaagtcactgtgggcccggggct cttgggggtttcgaccctggggcccaagaggaactccatggttctggatgtggcgttc 4813 3 S 219 1124 gcccggacctgtgcccaggagggaatggtgctgtacggctggaccgaccacagcgcgtgcagcccagtg tgccctgctggtatg 1207 2 S 220 7776 agtgctgtggaaggtgcctgccatctgcctgtgaggtggtgactggctcaccgcggggggactcccagt cttcctg 7851 2 S 221 5082 
tggtcagccagggtgaccgggagcaggcgcccaacctggtctacatggtcaccggaaatcctg 5144 3 S 222 6003 agtgccacaccgtgacttgccagccagatggccagaccttgctgaagagtcatcgggtcaactgt 6067 1 AS 223 4725 acatggcacaagtcactgtgggcccggggctcttgggggtttcgaccctggggcccaagaggaactcca tggttctggatg 4805 2 S 224 4376 tccaccagcgaggtcttgaaatacacactgttccaaatcttcagcaagatcgaccgccctgaagc 4440 1 AS 225 7818 ctggctcaccgcggggggactcccagtcttcctggaagagtgtcggctcccagtggg 7874 1 AS 226 1380 accctcccggcacctccctctctcgagactgcaacacctgcatttgccgaaacagcc 1436 2 AS 227 8762 agctgcatgggtgcctgctgctgcc 8786 ZIN 6 AS 228 1782 ctcagtggccttcaccagcaccgagcctgcccacatcgtggcctccttccgctctggcgacaccgtctt NM_013403.1 gtatgacatggaggttggcagtgccctcctcacgctggagtcccggggcagcagcggtccaaccca 1916

TABLE 5
Peptides encoded by sense-oriented GSEs

Gene       GSE     Peptide  Location in      Sequence
           SEQ ID  SEQ ID   Parent Protein
           NO      NO       (AA Residues)
ADPRT      24      229      860-887          LWHGSRTTNFAGILSQGLRIAPPEAPVT
IF1        32      230      1-16             MAVTALAARTWLGVWG
BAG1       33      231      53-80            RDEESTRSEEVTREEMAAAGLTVTVTHS
BAG1       34      232      62-80            EVTREEMAAAGLTVTVTHS
AP1B1      35      233      76-97            YAKSQPDMAIMAVNTFVKDCED
AP1B1      36      234      81-96            PDMAIMAVNTFVKDCE
CDK10      38      235      347-360          APATSEGQSKRCKP
CDK2       40      236      51-66            EISLLKELNHPNIVKL
CDK2       41      237      159-177          YTHEVVTLWYRAPEILLGC
c-FOS      45      238      362-378          PELVHYREEKHVFPQRF
c-FOS      47      239      148-158          KMAAAKCRNRR
CREB1      54      240      27-49            VQAQPQIATLAQVSMPAAHATSS
CCND1      55      241      56-91            MRKIVATWMLEVCEEQKCEEEVFPLAMNYLDRFLSL
EDF1       61      242      22-37            AKSKQAILAAQRRGGD
EIF1       62      243      1050-1063        RGGADDERSSWRNA
EFNA1      67      244      53-79            HYEDHSVADAAMEQYILYLVEHEEYQL
FEN1       71      245      90-101           PQLKSGELAKRS
FGFR1      76      246      427-470          VTVSADSSASMNSGVLLVRPSRLSSSGTPMLAGVSEYELPEDPR
FGFR1      78      247      402-421          GTKKSDFHSQMAVHKLAKSI
GBC1       79      248      36-54            LTSQTMGGQAETLLTSQKG
FOS2L      83      249      246-261          IKPISIAGGFYGEEPL
GSTP       97      250      83-102           DQQEAALVDMVNDGVEDLRC
GSTP       98      251      170-210          CLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
GBC-3      94      252                       WMDGRDEVTQQKYQRPETEWPRVSLHPEPEDAAKTSLSE
HES6       99      253      874-948          RAAPGNQPSQAPAPFLKKLLGTLQL
HNRPA2B1   100     254      786-866          ISDQDQEVTLEEDLMDMAVDVDLGMAI
HNRPF      101     255      226-278          AGLERMRPGAYSTGYGGYEEYSGLSDGYGFTTDLFGRDLSYCLSGMYDHRYGD
HRMT1L2    102     256      2701-2748        GVGAGEDGGSRGRELH
HSPCA      103     257      499-515          KDQVANSAFVERLRKHG
ICAM2      105     258      1-19             MSSFGYRTLTVALFTLICC
ICAM2      106     259      216-229          YEPVSDSQMVIIVT
IGF2R      108     260      253-265          KLVRKDRLVLSYV
IGF2R      109     261      481-496          KKRYDLSALVRHAEPE
INTB5      111     262      12-56            LLGLCALLPRLAGLNICTSGSATSCEECLLIHPKCAWCSKEDFGS
INTB5      112     263      688-724          KDCVMMFTYVELPSGKSNLTVLREPECGNTPNAMTIL
INTB5      113     264      457-485          GHGECHCGECKCHAGYIGDNCNCSTDIST
INTB5      114     265      697-726          VELPSGKSNLTVLREPECGNTPNAMTILLA
ITGA4      119     266      18-41            PEAAVRETVMLLLCLGVPTGRPYN
JUNB       121     267      19-34            GYGRAPGGLSLHDYKL
JUNB       122     268      24-32            PGGLSLHDY
L1CAM      127     269      457-491          SVQWLDEDGTTVLQDERFFPYANGTLGIRDLQAND
L1CAM      129     270      216-251          TRTIIQKEPIDLRVKATNSMIDRKPRLLFPTNSSSH
L1CAM      130     271      191-220          GNLYFANVLTSDNHSDYICHAHFPGTRTII
L1CAM      132     272      450-469          AFGAPVPSVQWLDEDGTTVL
L1CAM      134     273      25-39            EYEGHHVMEPPVITE
L1CAM      131     274      79-91            KEELGVTVYQSPH
L1CAM      133     275      237-253          DRKPRLLFPTNSSSHLV
PSMB7      139     276      193-211          EAKNLVSEAIAAGIFNDLG
MCM3       145     277      519-543          PLGSAVDILATDDPNFSQEDQQDTQ
MCM3       148     278      789-802          LSKMQDDNQVMVSE
MCM6       151     279      690-711          PAPVNGINGYNEDINQESAPKA
MET        154     280      1253-1317        YSVHNKTGAKLPVKWMALESLQTQKFTTKSDVWSFGVVLWELMTRGAPPYPDVNTFDITVYLLQG
MYL6       155     281      2-23             MCDFTEDQTTEFKEAFQLFDRT
MYL6       156     282      7-32             EDQTTEFKEAFQLFDRTGDGKILYNQ
NR3C1      159     283      132-155          STSVPENPKSSASTAVSAAPTEKE
NUMA1      160     284      1314-1332        ELTSQAERAEELGQELKAW
GRP58      161     285      360-382          NLKRYLKSEPIPESNDGPVKVVV
PC4        163     286      32-49            APEKPVKKQKTGETSRAL
PKC delta  165     287      281-322          GINQKLLAEALNQVTQRASRRSDSASSEPVGIYQGFEKKTGV
PKC delta  166     288      204-239          IIGRCTGTAANSRDTIFQKERFNIDMPHRFKVHNYM
PKC eta    168     289      55-84            GQTSTKQKTNKPTYNEEFCANVTDGGHLEL
PKC eta    169     290      73-106           CANVTDGGHLELAVFHETPLGYDHFVANCTLQFQE
PKC zeta   172     291      130-148          GHLFQAKRFNRRAYCGQCS
PKC zeta   173     292      55-94            PLTLKWVDSEGDPCTVSSQMELEEAFRLARQCRDEGLIIH
RAB2L      177     293      291-321          VTQFNKVAGAVVSSVLGATSTGEGPGEVTIR
REL        182     294      518-553          NLENPSCNSVLDPRDLRQLHQMSSSSMSAGANSNTT
AHRG       183     295      131-177          EQSQAPITPQQGQALAKQIHAVRYLECSALQQDGVKEVFAEAVRAVL
AHRG       186     296      49-85            GWMEEQSQAPITPQQGQALE
AHRG       187     297      130-164          KEQSQAPITPQQGQALAKQIHAVRYLECSALQQDG
AHRG       188     298      138-155          TPQQGQALAKQIHAVRYL
RPL12      191     299      209-228          SRIRVHLTPAASTMLPKFNP
RPL31      192     300      8-24             GEKKKGRSAINEVVTRE
RPS24      197     301      113-130          WMDGRMKKVRGTAKANVGAGKK
STAT3      200     302      6150-729         ESQEHPEADPGSAAPYLKTKFICVTPTTCSNTIDLPMSPR
STAT3      202     303      90-181           DVRKRVQDLEQKMKVVENLQDDFDFNYKTLKS
STAT3      203     304      71-108           FLQESNVLYQHNLRRIKQFLQSRYLEKPMEIARIVARC
STAT3      205     305      65-79            DQQYSRFLQESNVLY
STAT5      209     306      599-633          NKQQAHDLLINKPDGTFLLRFSDSEIGGITIAWKF
STAT5      210     307      422-455          KRIKRSDRRGAESVTEEKFTILFESQFSVGGNEL
STAT5      213     308      441-455          TILFESQFSVGGNEL
VWF        217     309      1113-1152        QHGKVVTWRTATLCPQSCEERNLRENGYECEWRYNSCAPA
VWF        219     310      272-299          ARTCAQEGMVLYGWTDHSACSPVCPAGM
VWF        220     311      2490-2513        CCGRCLPSACEVVTGSPRGDSQSS
VWF        221     312      1592-1611        VSQGDREQAPNLVYMVTGNP
VWF        222     313      1899-1919        CHTVTCQPDGQTLLKSHRVNC
VWF        224     314      1356-1376        STSEVLKYTLFQIFSKIDRPE
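
The relationship between a sense-oriented GSE and its encoded peptide can be illustrated with a minimal Python sketch. It is not part of the original disclosure: the function name `translate`, the use of the standard genetic code, and the reading-frame offset are assumptions made for illustration. Under those assumptions, translating the RPL31 GSE fragment (SEQ ID NO 192) one base in from its 5' end reproduces the peptide listed for SEQ ID NO 300.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(fragment, frame=0):
    """Translate a sense-strand cDNA fragment into a peptide string.

    `frame` is the number of leading bases to skip (0, 1, or 2).
    Codons containing ambiguous bases (e.g. "n") are reported as "X".
    Hypothetical helper for illustration only.
    """
    seq = "".join(fragment.split()).upper()  # drop whitespace from wrapped listings
    peptide = []
    for i in range(frame, len(seq) - 2, 3):
        peptide.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(peptide)


# RPL31 GSE fragment as listed for SEQ ID NO 192 (positions 28-80).
RPL31_GSE = "tggcgagaagaaaaagggccgttctgccatcaacgaagtggtaacccgagaat"

# The offset of 1 is inferred by matching the Table 5 peptide; it is not stated in the text.
print(translate(RPL31_GSE, frame=1))  # prints GEKKKGRSAINEVVTRE
```

The other two frames of the same fragment give different peptides, so the frame has to be chosen by comparison with the parent protein sequence rather than assumed.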

Claims

1. One or a plurality of target genes for identifying compounds that inhibit tumor cell growth, wherein inhibition of expression of at least one of said genes or inhibition of protein activity of its gene product inhibits growth of the tumor cell, wherein said genes are selected from the group consisting of the genes set forth in Table 3.

2. A plurality of target genes according to claim 1, comprising genes encoding L1CAM, ICAM2, Zinedin, or von Willebrand factor.

3. A pattern of gene expression inhibition or inhibition of protein activity of the gene product of a plurality of said genes as set forth in Table 3, wherein detecting said pattern in a tumor cell in response to contacting the cell with a compound is used to identify said compound as an inhibitor of tumor cell growth.

4. A pattern according to claim 3, wherein the pattern comprises genes encoding L1CAM, ICAM2, Zinedin, or von Willebrand factor.

5. A panel of oligonucleotides comprising sequences specific for a plurality of target genes for identifying compounds that inhibit tumor cell growth according to claim 1, wherein said genes are selected from the group consisting of the genes set forth in Table 3.

6. The panel of claim 5, wherein said oligonucleotides comprising sequences specific for the genes of said panel are immobilized on a solid substrate.

7. The panel of claim 6, wherein said solid substrate is a microchip or forms a microarray.

8. The panel of claim 5, comprising oligonucleotides specific for genes encoding L1CAM, ICAM2, Zinedin, or von Willebrand factor.