Novel human hydrolase family members and uses thereof

Info

Publication number: 20040214758
Type: Application
Filed: Jul 11, 2002
Publication Date: Oct 28, 2004
Inventors: Rachel E. Meyers (Newton, MA), Maria Alexandra Glucksmann (Lexington, MA), Rory A. J. Curtis (Framingham, MA), Laura A. Rudolph-Owen (Jamaica Plain, MA)
Application Number: 10193452

Abstract

The invention provides isolated nucleic acid molecules, designated 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, and 46508 nucleic acid molecules, which encode novel human hydrolase family members. The invention also provides antisense nucleic acid molecules, recombinant expression vectors containing 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 nucleic acid molecules, host cells into which the expression vectors have been introduced, and nonhuman transgenic animals in which a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 gene has been introduced or disrupted. The invention still further provides isolated 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 proteins, fusion proteins, antigenic peptides and anti-26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508 antibodies. Diagnostic methods utilizing compositions of the invention are also provided.

Description

RELATED APPLICATIONS

[0001] This application is a continuation-in-part and claims priority to U.S. application Ser. No. 09/816,664, filed Mar. 23, 2001, which claims the benefit of U.S. Provisional Application Ser. No. 60/191,973, filed Mar. 24, 2000; and U.S. application Ser. No. 09/841,880, filed Apr. 24, 2001, which claims the benefit of U.S. Provisional Application Ser. No. 60/199,559, filed Apr. 25, 2000; and U.S. application Ser. No. 09/862,556, filed May 22, 2001, and International Application Serial No. PCT/US01/16424, filed May 22, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/206,036, filed May 22, 2000; and U.S. application Ser. No. 09/861,165, filed May 18, 2001, and International Application Serial No. PCT/US01/16014, filed May 18, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/205,442, filed May 19, 2000; and U.S. application Ser. No. 09/875,353, filed Jun. 6, 2001, and International Application Serial No. PCT/US01/18335, filed Jun. 6, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/209,949, filed Jun. 6, 2000; and U.S. application Ser. No. 09/896,578, filed Jun. 29, 2001, and International Application Serial No. PCT/US01/20880, filed Jun. 29, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/214,948, filed Jun. 29, 2000; and U.S. application Ser. No. 09/911,150, filed Jul. 23, 2001, and International Application Serial No. PCT/US01/23153, filed Jul. 23, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/220,008, filed Jul. 21, 2000; and U.S. application Ser. No. 09/911,317, filed Jul. 23, 2001, and International Application Serial No. PCT/US01/23160, filed Jul. 23, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/220,040, filed Jul. 21, 2000; and U.S. application Ser. No. 09/934,323, filed Aug. 21, 2001, and International Application Serial No. PCT/US01/26091, filed Aug. 21, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/226,774, filed Aug. 21, 2000; and U.S. application Ser. No. 09/963,959, filed Sep. 25, 2001, and International Application Serial No. PCT/US01/29962, filed Sep. 25, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/235,033, filed Sep. 25, 2000; and U.S. application Ser. No. 09/971,490, filed Oct. 5, 2001, and International Application Serial No. PCT/US01/31674, filed Oct. 5, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/238,170, filed Oct. 5, 2000; and U.S. application Ser. No. 10/071,275, filed Feb. 7, 2002, and International Application Serial No. PCT/US02/03793, filed Feb. 7, 2002, which claim the benefit of U.S. Provisional Application Ser. No. 60/267,054, filed Feb. 7, 2001; and U.S. application Ser. No. 09/888,911, filed Jun. 25, 2001, and International Application Serial No. PCT/US01/19967, filed Jun. 25, 2001, which claim the benefit of U.S. Provisional Application Ser. No. 60/213,688, filed Jun. 23, 2000, the contents of which are incorporated herein by reference.

BACKGROUND OF THE 26443 AND 46873 INVENTION

[0002] Asparaginase is an enzyme that catalyzes the hydrolysis of asparagine to aspartic acid and ammonia. Saccharomyces cerevisiae expresses two forms of asparaginase: L-asparaginase I, a cytoplasmic enzyme that is synthesized constitutively, and asparaginase II, a cell wall mannan protein localized external to the cell membrane, which plays a role in hydrolysis of exogenous asparagine and uptake of aspartic acid. The two enzymes are biochemically and genetically distinct.

[0003] Because some lymphoid tumor cells are deficient in L-asparagine synthetase and cannot synthesize sufficient L-asparagine, asparagine is, for these cells, an essential amino acid. Therefore, asparagine depletion by administration of asparaginase rapidly results in decreased protein synthesis, followed by a decrease in DNA and RNA synthesis, and ultimately cell death.

SUMMARY OF THE 26443 AND 46873 INVENTION

[0004] The present invention is based, in part, on the discovery of novel asparaginases, referred to herein as “26443” and “46873” nucleic acid and protein molecules. The nucleotide sequences of cDNAs encoding 26443 and 46873 are shown in SEQ ID NO:1 and SEQ ID NO:4, respectively, and the amino acid sequences of the 26443 and 46873 polypeptides are shown in SEQ ID NO:2 and SEQ ID NO:5, respectively. In addition, the nucleotide sequences of the coding regions of 26443 and 46873 are depicted in SEQ ID NO:3 and SEQ ID NO:6, respectively.

[0005] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 26443 or 46873 protein or polypeptide, e.g., a biologically active portion of the 26443 or 46873 protein. In a preferred embodiment, the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5. In other embodiments, the invention provides isolated 26443 or 46873 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______, wherein the nucleic acid encodes a full length 26443 or 46873 protein or a biologically active fragment thereof.

[0006] In a related aspect, the invention further provides nucleic acid constructs that include a 26443 or 46873 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 26443 or 46873 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 26443 or 46873 nucleic acid molecules and polypeptides.

[0007] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 26443 or 46873-encoding nucleic acids.

[0008] In still another related aspect, isolated nucleic acid molecules that are antisense to a 26443 or 46873 encoding nucleic acid molecule are provided.

[0009] In another aspect, the invention features 26443 or 46873 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 26443- or 46873-mediated or related disorders. In another embodiment, the invention provides 26443 or 46873 polypeptides having a 26443 or 46873 activity. Preferred polypeptides are 26443 or 46873 proteins including at least one asparaginase domain, and, preferably, having a 26443 or 46873 activity, e.g., a 26443 or 46873 activity as described herein.

[0010] In other embodiments, the invention provides 26443 or 46873 polypeptides, e.g., a 26443 or 46873 polypeptide having the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively; the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______ or Accession Number ______, wherein the nucleic acid encodes a full length 26443 or 46873 protein or an active fragment thereof.

[0011] In a related aspect, the invention further provides nucleic acid constructs that include a 26443 or 46873 nucleic acid molecule described herein.

[0012] In a related aspect, the invention provides 26443 or 46873 polypeptides or fragments operatively linked to non-26443 or -46873 polypeptides to form fusion proteins.

[0013] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably, specifically bind 26443 or 46873 polypeptides.

[0014] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 26443 or 46873 polypeptides or nucleic acids.

[0015] In still another aspect, the invention provides a process for modulating 26443 or 46873 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 26443 or 46873 polypeptides or nucleic acids, such as metabolic diseases and conditions involving aberrant or deficient oxidation of long- and medium-chain fatty acids.

[0016] The invention also provides assays for determining the activity of, or the presence or absence of, 26443 or 46873 polypeptides or nucleic acid molecules in a biological sample, including for the purpose of disease diagnosis.

[0017] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 26443 or 46873 polypeptide or nucleic acid molecule, including for the purpose of disease diagnosis.

[0018] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 26443 or 46873 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 26443 or 46873 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 26443 or 46873 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
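The array described in this paragraph can be pictured as a mapping from positionally distinct addresses to unique capture probes. The following is a minimal sketch of that data structure only; the class names, the (row, column) address convention, and the probe labels are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Address = Tuple[int, int]  # hypothetical (row, column) address on the array


@dataclass(frozen=True)
class CaptureProbe:
    name: str    # illustrative label, not a real reagent
    kind: str    # "nucleic_acid" or "polypeptide"
    target: str  # molecule the probe recognizes, e.g. "26443"


class ProbeArray:
    """Two-dimensional array in which every address holds a unique capture probe."""

    def __init__(self) -> None:
        self._probes: Dict[Address, CaptureProbe] = {}

    def place(self, address: Address, probe: CaptureProbe) -> None:
        # Each address is positionally distinguishable, so it can be used once.
        if address in self._probes:
            raise ValueError(f"address {address} is already occupied")
        # Each address must carry a unique capture probe.
        if probe in self._probes.values():
            raise ValueError(f"probe {probe.name} is already on the array")
        self._probes[address] = probe

    def addresses_for(self, target: str) -> List[Address]:
        """Return the addresses whose probe recognizes the given target."""
        return sorted(a for a, p in self._probes.items() if p.target == target)


array = ProbeArray()
array.place((0, 0), CaptureProbe("probe-A", "nucleic_acid", "26443"))
array.place((0, 1), CaptureProbe("antibody-B", "polypeptide", "46873"))
print(array.addresses_for("26443"))
```

Detecting binding of a sample would then amount to recording a signal per address and looking up which probe sits there.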

[0019] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] FIGS. 1A-1B depict a cDNA sequence (SEQ ID NO:1) and predicted amino acid sequence (SEQ ID NO:2) of human 26443. The methionine-initiated open reading frame of human 26443 (without the 5′ and 3′ untranslated regions) starts at nucleotide 91 and continues through to nucleotide 1344 of SEQ ID NO:1 (coding sequence also shown in SEQ ID NO:3).
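As a quick arithmetic check on the open reading frame coordinates given above, the span from nucleotide 91 through nucleotide 1344 can be converted to a length in nucleotides and codons. The sketch below assumes the coordinates are 1-based and inclusive; whether the final codon is a stop codon is not stated in this paragraph, so the code reports codons rather than a protein length. The function name is illustrative.

```python
from typing import Dict


def orf_summary(start: int, end: int) -> Dict[str, int]:
    """Summarize an ORF given 1-based, inclusive nucleotide coordinates."""
    length_nt = end - start + 1
    if length_nt <= 0 or length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return {"length_nt": length_nt, "codons": length_nt // 3}


# Human 26443 ORF: nucleotides 91 through 1344 of SEQ ID NO:1.
print(orf_summary(91, 1344))
```

Under these assumptions the span is 1254 nucleotides, or 418 codons.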

[0021] FIG. 2 depicts a hydropathy plot of human 26443. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 26443 are indicated. Polypeptides of the invention include 26443 fragments that include: all or part of a hydrophobic sequence (e.g., a sequence above the dashed line); or all or part of a hydrophilic fragment (e.g., a fragment below the dashed line). Other fragments include a cysteine or a glycosylation site.
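A hydropathy trace of the kind described above is commonly computed as a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale, in which positive values are hydrophobic; the scale choice, the window size of 9, and the toy sequence are assumptions made for illustration and are not taken from the figure, and the sequence shown is not SEQ ID NO:2.

```python
from typing import List

# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> List[float]:
    """Mean hydropathy of each full window along the sequence."""
    seq = sequence.upper()
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def cysteine_positions(sequence: str) -> List[int]:
    """1-based positions of cysteine residues (the short vertical marks)."""
    return [i for i, aa in enumerate(sequence.upper(), start=1) if aa == "C"]


toy_sequence = "MKTLLLCAVIIGSRDEEKKC"  # hypothetical sequence for illustration
profile = hydropathy_profile(toy_sequence)
print([round(value, 2) for value in profile])
print(cysteine_positions(toy_sequence))
```

Windows whose mean lies above zero would plot above the dashed line, and those below zero would plot beneath it, assuming the dashed line marks a score of zero.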

[0022] FIG. 3 depicts a series of plots summarizing an analysis of the primary and secondary protein structure of a human asparaginase. The particular algorithm used for each plot is indicated at the right hand side of each plot. The following plots are depicted: Garnier-Robson plots providing the predicted location of alpha-, beta-, turn and coil regions (Garnier et al. (1978) J. Mol. Biol. 120:97); Chou-Fasman plots providing the predicted location of alpha-, beta-, turn and coil regions (Chou and Fasman (1978) Adv. In Enzymol. Mol. 47:45-148); Kyte-Doolittle hydrophilicity/hydrophobicity plots (Kyte and Doolittle (1982) J. Mol. Biol. 157:105-132); Eisenberg plots providing the predicted location of alpha- and beta-amphipathic regions (Eisenberg et al. (1982) Nature 299:371-374); a Karplus-Schulz plot providing the predicted location of flexible regions (Karplus and Schulz (1985) Naturwissenschaften 72:212-213); a plot of the antigenic index (Jameson-Wolf) (Jameson and Wolf (1988) CABIOS 4:121-136); and a surface probability plot (Emini algorithm) (Emini et al. (1985) J. Virol. 55:836-839). The numbers corresponding to the amino acid sequence of human 26443 are indicated.

[0023] FIG. 4 depicts an alignment of the asparaginase domain of human 26443 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:7), while the lower amino acid sequence corresponds to amino acids 38 to 345 of SEQ ID NO:2.

[0024] FIG. 5 depicts a cDNA sequence (SEQ ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) of human 46873. The methionine-initiated open reading frame of human 46873 (without the 5′ and 3′ untranslated regions) starts at nucleotide 134 and continues through to nucleotide 1057 of SEQ ID NO:4 (coding sequence also shown in SEQ ID NO:6).

[0025] FIG. 6 depicts a hydropathy plot of human 46873. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 46873 are indicated. Polypeptides of the invention include 46873 fragments that include: all or part of a hydrophobic sequence (e.g., a sequence above the dashed line); or all or part of a hydrophilic fragment (e.g., a fragment below the dashed line).

[0026] FIG. 7 depicts a series of plots summarizing an analysis of the primary and secondary protein structure of a human asparaginase. The particular algorithm used for each plot is indicated at the right hand side of each plot. The following plots are depicted: Garnier-Robson plots providing the predicted location of alpha-, beta-, turn and coil regions (Garnier et al. (1978) J. Mol. Biol. 120:97); Chou-Fasman plots providing the predicted location of alpha-, beta-, turn and coil regions (Chou and Fasman (1978) Adv. In Enzymol. Mol. 47:45-148); Kyte-Doolittle hydrophilicity/hydrophobicity plots (Kyte and Doolittle (1982) J. Mol. Biol. 157:105-132); Eisenberg plots providing the predicted location of alpha- and beta-amphipathic regions (Eisenberg et al. (1982) Nature 299:371-374); a Karplus-Schulz plot providing the predicted location of flexible regions (Karplus and Schulz (1985) Naturwissenschaften 72:212-213); a plot of the antigenic index (Jameson-Wolf) (Jameson and Wolf (1988) CABIOS 4:121-136); and a surface probability plot (Emini algorithm) (Emini et al. (1985) J. Virol. 55:836-839). The numbers corresponding to the amino acid sequence of human 46873 are indicated.

[0027] FIG. 8 depicts an alignment of the asparaginase domain of human 46873 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:7), while the lower amino acid sequence corresponds to amino acids 1 to 302 of SEQ ID NO:5.

[0028] FIG. 9 depicts a hydropathy plot of human 61833. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 61833 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 30 to 50, from about 276 to 293, and from about 323 to 341 of SEQ ID NO:11; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 300 to 310 of SEQ ID NO:11.

[0029] FIGS. 10A-10B depict an alignment of the pyridoxal-dependent decarboxylase domain of human 61833 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:13), while the lower amino acid sequence corresponds to amino acids 41 to 401 of SEQ ID NO:11.

[0030] FIG. 11 depicts a hydropathy plot of human 26493. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 26493 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of from about amino acid residue 85 to 101, and from about 350 to 360 of SEQ ID NO:17; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid residue 360 to 370 of SEQ ID NO:17; or a sequence which includes a Cys, or an N-glycosylation site.

[0031] FIG. 12 depicts an alignment of the mutT domain of human 26493 with consensus amino acid sequences derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence for a mutT domain (SEQ ID NO:19), while the lower amino acid sequence corresponds to amino acids 122 to 251 of SEQ ID NO:17.

[0032] FIG. 13 depicts a hydropathy plot of human 58224. The SNF2 domain and the C-terminal helicase domain are indicated. The numbers corresponding to the amino acid sequence of human 58224 (SEQ ID NO:23) are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of 255-265 of SEQ ID NO:23; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of 150-160 of SEQ ID NO:23; a sequence which includes a Cys, or a glycosylation site.

[0033] FIGS. 14A-14B depict an alignment of the SNF2 N-terminal domain of human 58224 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:25), while the lower amino acid sequence corresponds to amino acids 226 to 577 of SEQ ID NO:23.

[0034] FIG. 14C depicts an alignment of the helicase conserved C-terminal domain of human 58224 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:26), while the lower amino acid sequence corresponds to amino acids 629 to 712 of SEQ ID NO:23.

[0035] FIG. 15 depicts a hydropathy plot of human 46980. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46980 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1 to 29, from about 187 to 211, and from about 675 to 696 of SEQ ID NO:28; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 80 to 100, or 440 to 450 of SEQ ID NO:28. Also indicated are an extracellular domain from about amino acid 43 to 674 of SEQ ID NO:28, a transmembrane domain from about amino acid 675 to 696 of SEQ ID NO:28, an intracellular domain from about amino acid 697 to 816 of SEQ ID NO:28, and a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.
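The domain boundaries listed above for human 46980 can be treated as 1-based, inclusive intervals and checked for consistency. The sketch below is illustrative only: the Region type and helper names are hypothetical, and the coordinates are copied from this paragraph, where they are stated as approximate ("from about").

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlap(a: Region, b: Region) -> int:
    """Number of residues shared by two regions (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def are_contiguous(regions: List[Region]) -> bool:
    """True if, sorted by start, each region begins right after the previous one."""
    ordered = sorted(regions, key=lambda r: r.start)
    return all(nxt.start == prev.end + 1 for prev, nxt in zip(ordered, ordered[1:]))


# Approximate boundaries as given for SEQ ID NO:28.
extracellular = Region("extracellular", 43, 674)
transmembrane = Region("transmembrane", 675, 696)
intracellular = Region("intracellular", 697, 816)
carboxylesterase = Region("carboxylesterase", 25, 590)

print(are_contiguous([extracellular, transmembrane, intracellular]))
print(overlap(carboxylesterase, extracellular))
print(overlap(carboxylesterase, transmembrane))
```

With these coordinates the three topological segments tile residues 43 to 816 without gaps, and the carboxylesterase domain shares 548 residues with the extracellular domain and none with the transmembrane domain.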

[0036] FIGS. 16A-16C depict an alignment of the carboxylesterase domain of human 46980 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:30), while the lower amino acid sequence corresponds to amino acids 25 to 590 of SEQ ID NO:28.

[0037] FIGS. 17A-17B depict a BLAST alignment of a human 46980 polypeptide with a rat neuroligin 3 (SEQ ID NO:31).

[0038] FIG. 18 depicts a hydropathy plot of human 32225. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 32225 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 71 to 88, from about 135 to 157, and from about 186 to 199 of SEQ ID NO:34; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 106 to 123, from about 220 to 236, and from about 299 to 316 of SEQ ID NO:34.

[0039] FIG. 19 depicts an alignment of the hydrolase domain of human 32225 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:36), while the lower amino acid sequence corresponds to amino acids 95 to 338 of SEQ ID NO:34.

[0040] FIGS. 20A-20C depict BLAST alignments of the α/β hydrolase domain of human 32225 with consensus amino acid sequences derived from ProDom families PD349163, PD021903, and PD034252 (see ProDomain Release 2001.1; http://www.toulouse.inra.fr/prodom.html). The BLAST algorithm identifies multiple local alignments between the consensus amino acid sequence and human 32225. (A) The lower sequence is the consensus amino acid sequence for ProDom PD349163 (SEQ ID NO:37), while the upper amino acid sequence corresponds to a first fragment of the α/β hydrolase domain of human 32225, about amino acids 8 to 69 of SEQ ID NO:34. (B) The lower sequence is the consensus amino acid sequence for ProDom PD021903 (SEQ ID NO:38), while the upper amino acid sequence corresponds to a second fragment of the α/β hydrolase domain of human 32225, about amino acids 187 to 277 of SEQ ID NO:34. (C) The lower sequence is the consensus amino acid sequence for ProDom PD034252 (SEQ ID NO:39), while the upper amino acid sequence corresponds to a third fragment of the α/β hydrolase domain of human 32225, about amino acids 280 to 342 of SEQ ID NO:34.

[0041] FIG. 21 depicts a hydropathy plot of human 47508. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 47508 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 151 to 172, from about 215 to 232, and from about 377 to 393 of SEQ ID NO:42; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 19 to 35, from about 245 to 263, and from about 286 to 310 of SEQ ID NO:42.

[0042] FIG. 22 depicts an alignment of the histone deacetylase domain of human 47508 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:44), while the lower amino acid sequence corresponds to amino acids 83 to 392 of SEQ ID NO:42.

[0043] FIGS. 23A-23C depict BLAST alignments of portions of the histone deacetylase domain of human 47508 with representative amino acid sequences derived from ProDomains No. 345193, 001400, and 021448 (ProDomain Release 2000.1; http://www.toulouse.inra.fr/prodom.html). The BLAST algorithm identifies multiple local alignments between the consensus amino acid sequence and human 47508. In FIG. 23A, the lower sequence is the representative amino acid sequence of ProDomain 345193 (SEQ ID NO:45), while the upper amino acid sequence corresponds to an N-terminal portion of the histone deacetylase domain of human 47508, about amino acid residues 71 to 115 of SEQ ID NO:42. In FIG. 23B, the lower sequence is the representative amino acid sequence of ProDomain 001400 (SEQ ID NO:46), while the upper amino acid sequence corresponds to a central portion of the histone deacetylase domain of human 47508, about amino acid residues 120 to 258 of SEQ ID NO:42. In FIG. 23C, the lower sequence is the representative amino acid sequence of ProDomain 021448 (SEQ ID NO:47), while the upper amino acid sequence corresponds to a C-terminal portion of the histone deacetylase domain of human 47508, about amino acid residues 251 to 372 of SEQ ID NO:42.

[0044] FIG. 24 depicts a hydropathy plot of human 56939. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The numbers corresponding to the amino acid sequence of human 56939 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence of from about amino acid 78 to 83, from about 100 to 104, and from about 223 to 231 of SEQ ID NO:49; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 110 to 115, from about 323 to 330, and from about 339 to 349 of SEQ ID NO:49.

[0045] FIG. 25 depicts an alignment of the acyl-CoA thioesterase domain of human 56939 with a consensus amino acid sequence derived from the ProDomain family PD0006914. The lower sequence is the consensus amino acid sequence (SEQ ID NO:51), while the upper sequence corresponds to amino acids 1 to 415 of SEQ ID NO:49 (SEQ ID NO:52).

[0046] FIG. 26 depicts a hydropathy plot of human 33410. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 33410 are indicated. Polypeptides of the invention include 33410 fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 60 to 72, from about 260 to 277, and from about 780 to 793; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 330 to 350, from about 480 to 505, and from about 695 to 720; and/or a sequence which includes a cysteine, or a glycosylation site.

[0047] FIG. 27 depicts an alignment of the carboxylesterase domain of human 33410 with a consensus amino acid sequence derived from a hidden Markov model (HMM). The upper sequence is the consensus amino acid sequence (SEQ ID NO:56), while the lower amino acid sequence corresponds to amino acids 42 to 601 of SEQ ID NO:54.

[0048] FIGS. 28A-28B depict an alignment of the rat neuroligin-2 amino acid sequence (SEQ ID NO:57) and the human 33410 amino acid sequence (SEQ ID NO:54). The location of the transmembrane domain in the rat neuroligin-2 and 33410 amino acid sequences is indicated as “TM1”.

[0049] FIGS. 29A-29B depict an alignment of the partial human KIAA1366 amino acid sequence (Genbank Accession Number AB037787; SEQ ID NO:58) and the human 33410 amino acid sequence (SEQ ID NO:54).

[0050] FIG. 30 depicts a hydropathy plot of human 33521. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 33521 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 722 to 730, from about 883 to 891, and from about 966 to 975, of SEQ ID NO:62; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 741 to 750, from about 756 to 762, and from about 1363 to 1372, of SEQ ID NO:62; a sequence which includes a Cys, or a glycosylation site.

[0051] FIG. 31A depicts an alignment of the first PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:64), while the lower amino acid sequence corresponds to amino acids 507 to 620 of SEQ ID NO:62.

[0052] FIG. 31B depicts an alignment of the RBD domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:65), while the lower amino acid sequence corresponds to amino acids 810 to 853 of SEQ ID NO:62.

[0053] FIG. 31C depicts an alignment of the PDZ domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:66), while the lower amino acid sequence corresponds to amino acids 890 to 975 of SEQ ID NO:62.

[0054] FIG. 31D depicts an alignment of the Rho GEF domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:67), while the lower amino acid sequence corresponds to amino acids 1103 to 1292 of SEQ ID NO:62.

[0055] FIG. 31E depicts an alignment of the second PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:68), while the lower amino acid sequence corresponds to amino acids 1353 to 1455 of SEQ ID NO:62.

[0056] FIG. 32A depicts an alignment of the first PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:69), while the lower amino acid sequence corresponds to amino acids 507 to 622 of SEQ ID NO:62.

[0057] FIG. 32B depicts an alignment of the RBD domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:70), while the lower amino acid sequence corresponds to amino acids 810 to 881 of SEQ ID NO:62.

[0058] FIG. 32C depicts an alignment of the PDZ domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:71), while the lower amino acid sequence corresponds to amino acids 900 to 976 of SEQ ID NO:62.

[0059] FIG. 32D depicts an alignment of the Rho GEF domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:72), while the lower amino acid sequence corresponds to amino acids 1103 to 1292 of SEQ ID NO:62.

[0060] FIG. 32E depicts an alignment of the second PH domain of human 33521 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from SMART. The upper sequence is the consensus amino acid sequence (SEQ ID NO:73), while the lower amino acid sequence corresponds to amino acids 1326 to 1457 of SEQ ID NO:62.

[0061] FIG. 33 depicts a hydropathy plot of human 23479 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 23479 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 100 to 110, from about amino acid 295 to 310, and from about amino acid 920 to 930, of SEQ ID NO:75; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid 275 to 290, from about amino acid 530 to 550, and from about amino acid 640 to 650, of SEQ ID NO:75.

[0062] FIG. 34A depicts an alignment of the first ubiquitin carboxyl-terminal hydrolase domain (UCH-1) of human 23479 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:83), while the lower amino acid sequence corresponds to amino acids 296-327 of SEQ ID NO:75.

[0063] FIG. 34B depicts an alignment of the second ubiquitin carboxyl-terminal hydrolase domain (UCH-2) of human 23479 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:84), while the lower amino acid sequence corresponds to amino acids 546-640 of SEQ ID NO:75.

[0064] FIG. 35 depicts a hydropathy plot of human 48120 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 48120 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1040 to 1055 of SEQ ID NO:78; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence from about amino acid 120 to 155, from about amino acid 680 to 700, and from about amino acid 770 to 800, of SEQ ID NO:78.

[0065] FIG. 36A depicts an alignment of the first ubiquitin carboxyl-terminal hydrolase domain (UCH-1) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:83), while the lower amino acid sequence corresponds to amino acids 162 to 193 of SEQ ID NO:78.

[0066] FIG. 36B depicts an alignment of the second ubiquitin carboxyl-terminal hydrolase domain (UCH-2) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:84), while the lower amino acid sequence corresponds to amino acids 580 to 649 of SEQ ID NO:78.

[0067] FIG. 36C depicts an alignment of the ubiquitin associated (UBA) domain of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:85), while the lower amino acid sequence corresponds to amino acids 20 to 61 of SEQ ID NO:78.

[0068] FIG. 36D depicts an alignment of the ubiquitin interaction motif (UIM) domain of human 48120 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:86), while the lower amino acid sequence corresponds to amino acids 96 to 113 of SEQ ID NO:78.

[0069] FIG. 37 depicts a hydropathy plot of human 46689 polypeptide. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46689 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 1 to 23, from about 133 to 145, and from about 150 to 168 of SEQ ID NO:81; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 34 to 51, from about 333 to 347, and from about 438 to 449 of SEQ ID NO:81.

[0070] FIG. 38 depicts an alignment of the α/β hydrolase domain of human 46689 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:87), while the lower amino acid sequence corresponds to about amino acid residues 186 to 419 of SEQ ID NO:81.

[0071] FIG. 39 depicts a BLAST alignment of the α/β hydrolase domain of human 46689 with a consensus amino acid sequence derived from a ProDom family PD007763 (Release 2001.1; http://www.toulouse.inra.fr/prodom.html). The lower sequence is the consensus amino acid sequence (SEQ ID NO:88), while the upper amino acid sequence corresponds to the α/β hydrolase domain of human 46689 along with some flanking sequence, about amino acid residues 97 to 424 of SEQ ID NO:81.

[0072] FIG. 40 depicts a hydropathy plot of human 80091. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (Cys) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence of human 80091 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequences of about amino acids 188 to 205, about 540 to 550, about 700 to 725, about 817 to 825, and about 980 to 995 of SEQ ID NO:95; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequences of about amino acids 315 to 339, about 530 to 539, about 680 to 695, and about 1185 to 1220 of SEQ ID NO:95; a sequence which includes a Cys, or a glycosylation site.

[0073] FIG. 41A depicts an alignment of the ubiquitin carboxy-terminal hydrolase-1 (UCH-1) domain of human 80091 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:96), while the lower amino acid sequence corresponds to amino acids 447 to about 478 of SEQ ID NO:95.

[0074] FIG. 41B depicts an alignment of the ubiquitin carboxy-terminal hydrolase-2 (UCH-2) domain of human 80091 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO:97), while the lower amino acid sequence corresponds to amino acids 1219 to 1279 of SEQ ID NO:95.

[0075] FIGS. 42A-42C depict a cDNA sequence (SEQ ID NO:101) and predicted amino acid sequence (SEQ ID NO:102) of human 46508. The coding sequence (without the 5′ and 3′ untranslated regions), which extends from the initiator methionine of the open reading frame of human 46508 to the termination codon of SEQ ID NO:101, is also indicated (shown as SEQ ID NO:103).

[0076] FIG. 43 depicts a hydropathy plot of human 46508. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. Numbers corresponding to positions in the amino acid sequence of human 46508 are indicated. Polypeptides of the invention include fragments which include: all or part of a hydrophobic sequence, i.e., a sequence above the dashed line, e.g., the sequence from about amino acid 60 to 70, from about 86 to 102, and from about 189 to 195 of SEQ ID NO:102; all or part of a hydrophilic sequence, i.e., a sequence below the dashed line, e.g., the sequence of from about amino acid 77 to 85, and from about 217 to 224 of SEQ ID NO:102; a sequence which includes a Cys, or a glycosylation site, of SEQ ID NO:102.

[0077] FIG. 44 depicts an alignment of the peptidyl-tRNA hydrolase domain of human 46508 with a consensus amino acid sequence derived from a hidden Markov model (HMM) from PFAM. The upper sequence is the consensus amino acid sequence (SEQ ID NO: 104), while the lower amino acid sequence corresponds to amino acids 44 to 221 of SEQ ID NO:102.
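The hydropathy plots described in the figures above can be illustrated with a short sketch. The figure descriptions do not state which hydropathy scale or window size was used, so the Kyte-Doolittle scale and the nine-residue window below are assumptions, and the function names are hypothetical. The sketch computes a sliding-window average (values above a chosen baseline being relatively hydrophobic, values below it relatively hydrophilic) and lists cysteine positions, which the plots mark with short vertical lines.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the source does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of each full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def cysteine_positions(sequence):
    """Return 1-based positions of cysteine residues."""
    return [i + 1 for i, aa in enumerate(sequence.upper()) if aa == "C"]
```

Each value in the returned profile corresponds to one window, so a plot would typically place it at the window's central residue. The window size trades resolution against smoothness and would need to match whatever was used to generate the original figures.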

DETAILED DESCRIPTION OF 26443 AND 46873

[0078] The human 26443 sequence (FIG. 1; SEQ ID NO: 1), which is approximately 1888 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1254 nucleotides (SEQ ID NO:3, and nucleotides 91-1344 of SEQ ID NO:1). The coding sequence encodes a 418 amino acid protein (SEQ ID NO:2).

[0079] Human 26443 contains a predicted asparaginase domain from about amino acids 38 to 345 of SEQ ID NO:2.

[0080] The 26443 protein also includes the following domains: a predicted N-glycosylation site (PFAM Accession PS00001) located at about amino acid residues 225-228 of SEQ ID NO:2; two predicted glycosaminoglycan attachment sites (PFAM Accession PS00002) located at about amino acid residues 7-10 and 289-292 of SEQ ID NO:2; a predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PFAM Accession PS00004) located at about amino acid residues 217-220 of SEQ ID NO:2; five predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 24-26, 33-35, 186-188, 221-223 and 346-348 of SEQ ID NO:2; six predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 6-9, 24-27, 33-36, 116-119, 221-224 and 381-384 of SEQ ID NO:2; and eight predicted N-myristoylation sites (PS00008) from about amino acids 4-9, 77-82, 100-105, 126-131, 228-233, 242-247, 336-341 and 397-402 of SEQ ID NO:2.

[0081] The human 46873 sequence (FIG. 5; SEQ ID NO:4), which is approximately 1358 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 924 nucleotides (SEQ ID NO:6, and nucleotides 134-1057 of SEQ ID NO:4). The coding sequence encodes a 308 amino acid protein (SEQ ID NO:5).
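The coordinates quoted above for 26443 and 46873 can be checked with simple arithmetic: an inclusive nucleotide range from start to end spans end − start + 1 nucleotides, and dividing by three gives the number of codons. The sketch below is only an illustrative consistency check with hypothetical function names. The text does not state whether the termination codon lies inside the quoted ranges, so the code makes no assumption on that point and simply compares codon counts with the stated protein lengths.

```python
def cds_length(start, end):
    """Length in nucleotides of an inclusive, 1-based coordinate range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in the range; the length must be a multiple of 3."""
    length = cds_length(start, end)
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3


# Coordinates and protein lengths as quoted in the description.
CODING_REGIONS = {
    "26443": {"range": (91, 1344), "protein_length": 418},
    "46873": {"range": (134, 1057), "protein_length": 308},
}


def check_regions(regions=CODING_REGIONS):
    """Return {gene: True/False} for whether codon count equals protein length."""
    return {
        gene: codon_count(*info["range"]) == info["protein_length"]
        for gene, info in regions.items()
    }
```

Both quoted ranges come out consistent with the stated lengths: 1254 nucleotides for 418 residues and 924 nucleotides for 308 residues.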

[0082] Human 46873 contains a predicted asparaginase domain from about amino acids 1 to 302 of SEQ ID NO:5.

[0083] The 46873 protein also includes the following domains: one predicted Protein Kinase C phosphorylation site (PS00005) at about amino acids 141-143 of SEQ ID NO:5; five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 43-46, 71-74, 80-83, 243-246 and 303-306 of SEQ ID NO:5; and eight predicted N-myristoylation sites (PS00008) from about amino acids 26-31, 50-55, 66-71, 90-95, 156-161, 167-172, 187-192 and 214-219 of SEQ ID NO:5.

[0084] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0085] Plasmids containing the nucleotide sequence encoding human 26443 and 46873 (clones “Fbh26443FL” and “Fbh46873FL,” respectively) were deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Numbers ______ or ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112. Table 1 contains a summary of sequence information for 26443 and 46873.

TABLE 1
Summary of Sequence Information for Asparaginase Polypeptides

GENE    cDNA          ORF           Polypeptide   FIG   ATCC Accession No.
26443   SEQ ID NO:1   SEQ ID NO:3   SEQ ID NO:2   1     —
46873   SEQ ID NO:4   SEQ ID NO:6   SEQ ID NO:5   5     —

[0086] The 26443 and 46873 proteins contain a significant number of structural characteristics in common with members of the asparaginase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0087] 26443 and 46873 polypeptides or 26443 and 46873 family members can include an “asparaginase domain” or regions homologous with an “asparaginase domain”.

[0088] As used herein, the term “asparaginase domain” refers to a protein domain having an amino acid sequence of about 50 to 600 amino acids, preferably about 150 to 450 amino acid residues, more preferably about 300 to 310 amino acids. An asparaginase domain typically includes two conserved threonine residues that play a role in the catalytic properties of asparaginases. The first is typically located in the N-terminal extremity of the protein, while the second is located at the end of the first third of the amino acid sequence. Consensus patterns for asparaginases are as follows: [LIVM]-x(2)-T-G-G-T-[IV]-[AGS] (SEQ ID NO:8), wherein the second T is an active site residue, and G-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM] (SEQ ID NO:9), wherein the first T is an active site residue. Preferably, an “asparaginase domain” includes an amino acid sequence of about 250 to 400 amino acid residues in length and having a bit score for the alignment of the sequence to the asparaginase domain (HMM) of at least 75. More preferably, an asparaginase domain includes at least about 50 to 600 amino acids, even more preferably about 150 to 400 amino acids, or even most preferably, 300-310 amino acids, and has a bit score for the alignment of the sequence to the asparaginase domain (HMM) of at least 75, 100, 200, 300, 400 or greater. Asparaginase domains (HMM) have been assigned PFAM Accession PF00710 and PFAM Accession PF01112 (http://genome.wustl.edu/Pfam/html). An alignment of the asparaginase domain (SEQ ID NO:7, corresponding to amino acids 38 to 345 of SEQ ID NO:2) of human 26443 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 4. An alignment of the asparaginase domain (SEQ ID NO:7, corresponding to amino acids 1 to 302 of SEQ ID NO:5) of human 46873 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 8.
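The two consensus patterns quoted above are written in PROSITE-style notation, which maps directly onto regular expressions: a bracketed group lists the residues allowed at a position, x matches any residue, and a number in parentheses is a repeat count. The following sketch is one way such a pattern could be translated and used to scan a sequence. It handles only the notation elements that appear in these two patterns; the function names and the short test peptides are hypothetical and are not taken from the sequences of the invention.

```python
import re

# Consensus patterns as quoted in the description.
ASPARAGINASE_PATTERNS = {
    "SEQ ID NO:8": "[LIVM]-x(2)-T-G-G-T-[IV]-[AGS]",
    "SEQ ID NO:9": "G-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM]",
}

# One pattern element: x, a single residue, or a bracketed residue set,
# optionally followed by a repeat count such as (2).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = match.groups()
        core = "." if core == "x" else core
        parts.append(core + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_sites(sequence, pattern):
    """Return (start, end, matched_text) tuples with 1-based inclusive positions."""
    regex = re.compile(prosite_to_regex(pattern))
    return [
        (m.start() + 1, m.end(), m.group())
        for m in regex.finditer(sequence.upper())
    ]
```

For example, `find_sites(protein, ASPARAGINASE_PATTERNS["SEQ ID NO:8"])` would report where the first consensus pattern occurs in a protein sequence supplied by the caller. Note that `finditer` reports non-overlapping matches only, which is adequate for fixed-length patterns such as these.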

[0089] In a preferred embodiment, a 26443 or 46873 polypeptide or protein has an “asparaginase domain” or a region which includes at least about 50-600, more preferably about 150-450 or 300-310 amino acid residues, and having at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “asparaginase domain,” e.g., the asparaginase domain of human 26443 or 46873 (e.g., residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5, respectively).

[0090] To identify the presence of an “asparaginase domain” in a 26443 or 46873 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “asparaginase domain” in the amino acid sequence of human 26443 and 46873 at about residues 38-345 of SEQ ID NO:2 (see FIG. 4) and 1-302 of SEQ ID NO:5 (see FIG. 8), respectively.

[0091] As the 26443 or 46873 polypeptides of the invention may modulate 26443- or 46873-mediated activities, they may be useful as, or for, developing novel diagnostic and therapeutic agents for 26443- or 46873-mediated or related disorders, as described below.

[0092] As used herein, a “26443 or 46873 activity”, “biological activity of 26443 or 46873” or “functional activity of 26443 or 46873”, refers to an activity exerted by a 26443 or 46873 protein, polypeptide or nucleic acid molecule on, e.g., a 26443- or 46873-responsive cell or on a 26443 or 46873 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 26443 or 46873 activity is a direct activity, such as an association with a 26443 or 46873 target molecule. A “target molecule” or “binding partner” is a molecule with which a 26443 or 46873 protein binds or interacts in nature. In an exemplary embodiment, a “target molecule” is, e.g., an asparagine. A 26443 or 46873 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 26443 or 46873 protein with a 26443 or 46873 ligand. For example, the 26443 or 46873 proteins of the present invention can have one or more of the following activities: (1) catalyzes the hydrolysis of asparagine to aspartic acid and ammonia; (2) regulates cellular amounts of asparagine; (3) regulates the cellular amounts of aspartic acid; (4) regulates cellular amounts of ammonia; and (5) antagonizes or inhibits, e.g., competitively or noncompetitively, any of activities 1-4.

[0093] Based on the above-described sequence similarities, the 26443 or 46873 molecules of the present invention are predicted to have biological activities similar to those of asparaginase family members. Asparaginase enzymes assist in the hydrolysis of asparagine to aspartic acid and ammonia. Thus, the 26443 or 46873 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., the amount of asparagine (and likewise, aspartic acid) in a cell.

[0094] The 26443 or 46873 protein may be involved in disorders characterized by aberrant activity of the cells in which it is expressed. Since asparaginase enzymes are typically found in most cells in bacteria, fungi, plants and mammals, e.g., cells that contain or metabolize asparagine, it is likely that 26443 or 46873 proteins may also be expressed in such cells. Therefore, altered expression and/or activity of a 26443 or 46873 molecule can lead to defects in the metabolism of asparagine and/or aspartic acid.

[0095] The 26443 or 46873 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders, hematopoietic disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[0096] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0097] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0098] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0099] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0100] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0101] The 26443 or 46873 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[0102] Asparaginases are generally more effective in treating acute lymphoblastic leukemia and lymphosarcomas than other forms of leukemia or solid tumors, since remissions of these types of cancers are invariably of short duration. Whereas most normal tissues synthesize L-asparagine in amounts sufficient for their metabolic needs, certain neoplastic tissues, primarily acute lymphoblastic leukemia (ALL) and lymphosarcoma cells, require an exogenous source of asparagine (i.e., from nearby host tissues). Administration of L-asparaginase enzymatically catalyzes the hydrolysis of asparagine to aspartic acid and ammonia, which deprives the malignant cells of the asparagine from extracellular fluid and eventually results in cell death. Clinical use of asparaginase from, e.g., Escherichia coli or Erwinia chrysanthemi, oftentimes results in hypersensitive immune responses after multiple administrations. Since the two asparaginase enzymes from E. coli and E. chrysanthemi do not exhibit any cross-reactivity, the two enzymes can be used in a treatment regimen to reduce or avoid the hypersensitivity response.

[0103] Additionally, asparaginases can be administered in combination with other traditional or experimental cancer treatments. Asparaginases can be combined with a treatment modality which inhibits cell proliferation, e.g., cytotoxic agents, e.g., agents with diverse structures and mechanisms of action, including but not limited to, antimicrotubule agents, topoisomerase I inhibitors, topoisomerase II inhibitors, antimetabolites, mitotic inhibitors, alkylating agents, intercalating agents, agents capable of interfering with a signal transduction pathway (e.g., protein kinase C inhibitors, e.g., anti-hormones, e.g., antibodies against growth factor receptors), agents that promote apoptosis and/or necrosis, biological response modifiers (e.g., interferons, interleukins, tumor necrosis factors), and radiation.

[0104] The 26443 or 46873 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:2 or SEQ ID NO:5, respectively, are collectively referred to as “polypeptides or proteins of the invention” or “26443 or 46873 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “26443 or 46873 nucleic acids”. 26443 or 46873 molecules refer to 26443 or 46873 nucleic acids, polypeptides, and antibodies.

[0105] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0106] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0107] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and non-aqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified. Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO: 1 or 3, corresponds to a naturally-occurring nucleic acid molecule.
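The four sets of hybridization conditions listed above lend themselves to a small lookup structure. The sketch below is merely one hypothetical way of recording them; the class and function names are not part of the description. It encodes the stated rule that very high stringency conditions apply unless otherwise specified, and it records the minimum wash temperature for each level (for low stringency the text gives at least 50° C., optionally raised to 55° C.).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    hybridization: str    # hybridization buffer and temperature, as stated
    wash: str             # wash buffer, as stated
    wash_temp_c: int      # minimum wash temperature in degrees Celsius
    washes: str           # number of washes, as stated


# Values transcribed from the conditions numbered 1) to 4) in the description.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 °C", "0.2x SSC, 0.1% SDS", 50, "two"),
    "medium": Stringency("6x SSC at about 45 °C", "0.2x SSC, 0.1% SDS", 60, "one or more"),
    "high": Stringency("6x SSC at about 45 °C", "0.2x SSC, 0.1% SDS", 65, "one or more"),
    "very high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 °C", "0.2x SSC, 1% SDS", 65, "one or more"),
}

# Stated default: very high stringency is used unless otherwise specified.
DEFAULT_LEVEL = "very high"


def conditions_for(level=None):
    """Look up conditions, falling back to the stated default level."""
    key = DEFAULT_LEVEL if level is None else level.lower()
    if key not in CONDITIONS:
        raise ValueError(f"unknown stringency level: {level!r}")
    return CONDITIONS[key]
```

Laid out this way, the first three levels differ only in wash temperature and wash count, while the very high stringency level also changes the hybridization buffer and the SDS concentration of the wash.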

[0108] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[0109] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include an open reading frame encoding a 26443 or 46873 protein, preferably a mammalian 26443 or 46873 protein, and can further include non-coding regulatory sequences and introns.

[0110] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means a preparation of 26443 or 46873 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-26443 or -46873 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-26443 or -46873 chemicals. When the 26443 or 46873 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0111] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 26443 or 46873 (e.g., the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the asparaginase domain, are predicted to be particularly unamenable to alteration.

[0112] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 26443 or 46873 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 26443 or 46873 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 26443 or 46873 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
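The side chain families listed above can be expressed as a small lookup, shown here only as an illustrative sketch. The function names `shared_families` and `is_conservative` are hypothetical and not part of the disclosure; the family memberships are taken directly from the preceding paragraph, written with standard one-letter amino acid codes.

```python
# Side chain families as listed in the paragraph above (one-letter codes).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),             # lysine, arginine, histidine
    "acidic": set("DE"),             # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Return the names of every family that contains both residues."""
    original, replacement = original.upper(), replacement.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """True when two different residues share at least one family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))


if __name__ == "__main__":
    for pair in (("K", "R"), ("D", "E"), ("K", "D"), ("Y", "F")):
        print(pair, is_conservative(*pair), shared_families(*pair))
```

Because some residues appear in more than one family (e.g., valine is both nonpolar and beta-branched), the sketch treats a substitution as conservative when any one family is shared.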

[0113] As used herein, a “biologically active portion” of a 26443 or 46873 protein includes a fragment of a 26443 or 46873 protein that participates in an interaction between a 26443 or 46873 molecule and a non-26443 or -46873 molecule. Biologically active portions of a 26443 or 46873 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 26443 or 46873 protein, e.g., the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively, which include fewer amino acids than the full length 26443 or 46873 proteins, and exhibit at least one activity of a 26443 or 46873 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 26443 or 46873 protein, e.g., asparaginase. A biologically active portion of a 26443 or 46873 protein can be a polypeptide that is, for example, 50, 100, 200 or more amino acids in length. Biologically active portions of a 26443 or 46873 protein can be used as targets for developing agents that modulate a 26443- or 46873-mediated activity, e.g., asparaginase.

[0114] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0115] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 26443 amino acid sequence of SEQ ID NO:2 having 418 amino acid residues, at least 125, preferably at least 167, more preferably at least 209, even more preferably at least 251, and even more preferably at least 293, 334, 376 or 418 amino acid residues are aligned; when aligning a second sequence to the 46873 amino acid sequence of SEQ ID NO:5 having 308 amino acid residues, at least 92, preferably at least 123, more preferably at least 154, even more preferably at least 185, and even more preferably at least 216, 246, 277 or 308 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
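The residue counts recited above follow from applying each percentage to the full length of the reference sequence. The sketch below assumes reference lengths of 418 residues for SEQ ID NO:2 and 308 residues for SEQ ID NO:5 (the amino acid spans given for the two proteins elsewhere in this description) and assumes rounding to the nearest whole residue; the function name `aligned_length_thresholds` is hypothetical.

```python
# Percentages of the reference length recited in the paragraph above.
PERCENTAGES = (30, 40, 50, 60, 70, 80, 90, 100)


def aligned_length_thresholds(reference_length: int) -> dict[int, int]:
    """Map each percentage to a minimum number of aligned residues.

    Integer arithmetic (round half up) is used instead of floats so the
    result does not depend on floating-point representation.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return {pct: (reference_length * pct + 50) // 100 for pct in PERCENTAGES}


if __name__ == "__main__":
    # Lengths are assumptions taken from the residue ranges in this description.
    for name, length in (("26443, SEQ ID NO:2", 418), ("46873, SEQ ID NO:5", 308)):
        print(name, aligned_length_thresholds(length))
```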

[0116] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if the molecule is within the sequence identity limits of a claim) is using a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
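For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched in a few lines. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score and a single linear gap penalty rather than a substitution matrix with separate gap and length weights, and it reports percent identity as identical columns divided by total alignment columns, which is only one of several conventions in use. All function and parameter names are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ACGT", "ACT")
    print(s, x, y, percent_identity(x, y))
```

A production comparison would substitute a scoring matrix and affine gap penalties such as those recited above; the dynamic programming recurrence and traceback keep the same overall shape.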

[0117] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1988)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0118] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 26443 or 46873 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 26443 or 46873 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0119] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0120] “Subject”, as used herein, can refer to a mammal, e.g., a human, or to an experimental animal or disease model. The subject can also be a non-human animal, e.g., a horse, cow, goat, or other domestic animal.

[0121] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0122] Various aspects of the invention are described in further detail below.

[0123] Isolated Nucleic Acid Molecules of 26443 and 46873

[0124] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 26443 or 46873 polypeptide described herein, e.g., a full-length 26443 or 46873 protein or a fragment thereof, e.g., a biologically active portion of a 26443 or 46873 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 26443 or 46873 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0125] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:1, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 26443 protein (i.e., “the coding region”, from nucleotides 91-1344 of SEQ ID NO:1), as well as 5′ untranslated sequences (nucleotides 1-90 of SEQ ID NO:1) and 3′ untranslated sequences (nucleotides 1345-1888 of SEQ ID NO:1). Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:1 (e.g., nucleotides 91-1344, corresponding to SEQ ID NO:3) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to the mature protein from about amino acid 1 to amino acid 418 of SEQ ID NO:2.

[0126] In another embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:4, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46873 protein (i.e., “the coding region”, from nucleotides 134-1057 of SEQ ID NO:4), as well as 5′ untranslated sequences (nucleotides 1-133 of SEQ ID NO:4) and 3′ untranslated sequences (nucleotides 1058-1358 of SEQ ID NO:4). Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:4 (e.g., nucleotides 134-1057, corresponding to SEQ ID NO:6) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to the mature protein from about amino acid 1 to amino acid 308 of SEQ ID NO:5.
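The nucleotide coordinates recited for the two cDNAs can be cross-checked arithmetically. The sketch below is illustrative only: the `Transcript` class and `summarize` function are hypothetical names, and the coordinates are copied from the two preceding paragraphs. It confirms that the three regions tile each sequence without gaps and that each coding region divides evenly into codons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    five_prime_utr: tuple[int, int]   # 1-based, inclusive
    coding_region: tuple[int, int]
    three_prime_utr: tuple[int, int]


def span_length(span: tuple[int, int]) -> int:
    start, end = span
    return end - start + 1


def summarize(t: Transcript) -> dict[str, int]:
    """Check that the regions are contiguous and report their sizes."""
    if t.five_prime_utr[1] + 1 != t.coding_region[0]:
        raise ValueError("5' UTR and coding region are not contiguous")
    if t.coding_region[1] + 1 != t.three_prime_utr[0]:
        raise ValueError("coding region and 3' UTR are not contiguous")
    cds = span_length(t.coding_region)
    if cds % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return {
        "total_nt": t.three_prime_utr[1] - t.five_prime_utr[0] + 1,
        "cds_nt": cds,
        "codons": cds // 3,
    }


# Coordinates as recited above for SEQ ID NO:1 and SEQ ID NO:4.
TRANSCRIPTS = (
    Transcript("26443 (SEQ ID NO:1)", (1, 90), (91, 1344), (1345, 1888)),
    Transcript("46873 (SEQ ID NO:4)", (1, 133), (134, 1057), (1058, 1358)),
)

if __name__ == "__main__":
    for t in TRANSCRIPTS:
        print(t.name, summarize(t))
```

The codon counts obtained this way (418 and 308) match the amino acid ranges recited for SEQ ID NO:2 and SEQ ID NO:5.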

[0127] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, thereby forming a stable duplex.

[0128] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0129] 26443 or 46873 Nucleic Acid Fragments

[0130] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 26443 or 46873 protein, e.g., an immunogenic or biologically active portion of a 26443 or 46873 protein. A fragment can comprise nucleotides 202 to 1125 of SEQ ID NO:1, which encodes an asparaginase domain of human 26443, or nucleotides 134 to 1039 of SEQ ID NO:4, which also encodes an asparaginase domain of human 46873. The nucleotide sequence determined from the cloning of the 26443 or 46873 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 26443 or 46873 family members, or fragments thereof, as well as 26443 or 46873 homologues, or fragments thereof, from other species.
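The fragment coordinates above can be derived from the amino acid range of a domain and the first nucleotide of the coding region. The following sketch assumes 1-based, inclusive coordinates and uses the coding region start positions (91 for SEQ ID NO:1, 134 for SEQ ID NO:4) and the asparaginase domain residue ranges (38-345 of SEQ ID NO:2, 1-302 of SEQ ID NO:5) given elsewhere in this description; the function name is hypothetical.

```python
def residues_to_nucleotides(cds_start: int, first_residue: int, last_residue: int) -> tuple[int, int]:
    """Convert an amino acid range to the nucleotide range that encodes it.

    All positions are 1-based and inclusive. cds_start is the position of
    the first nucleotide of the first codon.
    """
    if not 1 <= first_residue <= last_residue:
        raise ValueError("residue range must satisfy 1 <= first <= last")
    start = cds_start + 3 * (first_residue - 1)  # first base of first codon
    end = cds_start + 3 * last_residue - 1       # third base of last codon
    return start, end


if __name__ == "__main__":
    print(residues_to_nucleotides(91, 38, 345))   # 26443 asparaginase domain
    print(residues_to_nucleotides(134, 1, 302))   # 46873 asparaginase domain
```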

[0131] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof that are at least 200, preferably 300 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0132] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a nucleic acid fragment can include a sequence corresponding to an asparaginase domain.

[0133] In a preferred embodiment, the fragment is at least 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 nucleotides in length.

[0134] 26443 or 46873 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______, or of a naturally occurring allelic variant or mutant of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or as Accession Number ______.

[0135] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10 bases from, a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0136] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an asparaginase domain (corresponding to residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5).

[0137] In another embodiment, a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 26443 or 46873 sequence. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. E.g., primers suitable for amplifying all or a portion of a domain or region described herein, e.g., any of the following regions, are provided: an asparaginase domain corresponding to residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5.

[0138] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0139] A nucleic acid fragment encoding a “biologically active portion of a 26443 or 46873 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______, which encodes a polypeptide having a 26443 or 46873 biological activity (e.g., the biological activities of the 26443 or 46873 proteins described herein), expressing the encoded portion of the 26443 or 46873 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 26443 or 46873 protein. For example, a nucleic acid fragment encoding a biologically active portion of 26443 or 46873 includes an asparaginase domain, e.g., amino acid residues 38 to 345 of SEQ ID NO:2 or amino acid residues 1 to 302 of SEQ ID NO:5. A nucleic acid fragment encoding a biologically active portion of a 26443 or 46873 polypeptide may comprise a nucleotide sequence that is greater than 300 or more nucleotides in length (e.g., greater than about 400 nucleotides in length).

[0140] In preferred embodiments, a nucleic acid fragment of 26443 includes a nucleotide sequence which is at least about 300, at least about 353 (e.g., 355, 375, 400), at least about 400 (e.g., 500, 600, 700, 800), at least about 457 (e.g., 460, 500, 600, 700), or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0141] In a preferred embodiment, a nucleic acid fragment of 26443 includes a nucleotide sequence comprising nucleotides 183-842, 459-842, 1195-1244, or 1644-1888 of SEQ ID NO: 1, or a portion thereof, wherein each fragment hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. In another preferred embodiment, a nucleic acid fragment of 26443 includes a nucleotide sequence comprising nucleotides 1-842 of SEQ ID NO: 1, or a portion thereof, wherein each portion is about 183 nucleotides or longer and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO: 1, or SEQ ID NO:3, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0142] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than AI793006, AA262517, R89654, or C07777.

[0143] In preferred embodiments, a nucleic acid fragment of 46873 includes a nucleotide sequence which is at least about 300, 400, 500, 560 (e.g., 570, 580, 590, 600), at least about 662 (e.g., 665, 666, 667, 668, 670, 680, 690, 700), or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:4, or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0144] In a preferred embodiment, a nucleic acid fragment of 46873 includes a nucleotide sequence of SEQ ID NO:4 or 6, or a portion thereof; or a portion of the 46873 sequence comprising nucleotides 1-680, 1-686, 1-692 or 1-785 of SEQ ID NO:4, or a portion thereof, wherein each fragment hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:4, or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[0145] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than AI879995, AI928914, AW131805, or AI978667.

[0146] 26443 or 46873 Nucleic Acid Variants

[0147] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 26443 or 46873 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:2 or SEQ ID NO:5. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0148] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.

[0149] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0150] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the sequence in ATCC Accession Number ______ or Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 2%, 5%, 10% or 20% of the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
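The counting rule described above, in which looped-out positions and mismatches both count as differences, can be illustrated on a pair of sequences that have already been aligned. The sketch below is a minimal illustration with hypothetical names; it assumes gaps are written as "-" and, as a further assumption, expresses the percentage relative to the number of nucleotides in the subject sequence (gap characters excluded).

```python
def count_differences(aligned_subject: str, aligned_reference: str) -> int:
    """Count alignment columns where the two sequences differ.

    A column with a gap in either sequence (a looped-out position) and a
    column with two different bases (a mismatch) each count as one difference.
    """
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for s, r in zip(aligned_subject, aligned_reference) if s != r)


def percent_difference(aligned_subject: str, aligned_reference: str) -> float:
    """Differences as a percentage of the subject's nucleotide count."""
    subject_length = sum(1 for s in aligned_subject if s != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * count_differences(aligned_subject, aligned_reference) / subject_length


if __name__ == "__main__":
    # Toy aligned pair: one looped-out position and one mismatch.
    subject, reference = "ACGT-A", "ACCTGA"
    print(count_differences(subject, reference), percent_difference(subject, reference))
```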

[0151] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5 or a fragment of those sequences. Nucleic acid molecules encoding such polypeptides can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 26443 or 46873 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 26443 or 46873 gene.

[0152] Preferred variants include those that are correlated with asparaginase activity.

[0153] Allelic variants of 26443 or 46873, e.g., human 26443 or 46873, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 26443 or 46873 protein within a population that maintain the ability to function as an asparaginase. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:2 or SEQ ID NO:5, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 26443 or 46873, e.g., human 26443 or 46873, protein within a population that do not have the ability to function as an asparaginase. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0154] Moreover, nucleic acid molecules encoding other 26443 or 46873 family members and, thus, which have a nucleotide sequence which differs from the 26443 or 46873 sequences of SEQ ID NO: 1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or Accession Number ______ are intended to be within the scope of the invention.

[0155] Antisense Nucleic Acid Molecules, Ribozymes and Modified 26443 or 46873 Nucleic Acid Molecules

[0156] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 26443 or 46873. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 26443 or 46873 coding strand, or to only a portion thereof (e.g., the coding region of human 26443 or 46873 corresponding to SEQ ID NO:3 or SEQ ID NO:6, respectively). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 26443 or 46873 (e.g., the 5′ or 3′ untranslated regions).

[0157] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 26443 or 46873 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 26443 or 46873 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 26443 or 46873 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
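As an illustration of how an antisense oligonucleotide spanning the translation start site might be derived, the sketch below extracts the −10 to +10 window from a sense-strand sequence and returns its reverse complement. The sequence used is a made-up toy example and not any sequence of the invention; the function names are hypothetical. The sketch assumes the common numbering convention in which the first base of the start codon is +1, the base immediately upstream is −1, and there is no position 0, so the window holds 20 nucleotides.

```python
# DNA complement table; lower-case input is preserved as lower-case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def start_site_window(sequence: str, start_codon_pos: int, upstream: int = 10, downstream: int = 10) -> str:
    """Return the sense-strand window from -upstream to +downstream.

    start_codon_pos is the 1-based position of the first base of the
    start codon (position +1 in the assumed convention).
    """
    begin = start_codon_pos - 1 - upstream   # 0-based index of position -upstream
    end = start_codon_pos - 1 + downstream   # exclusive index after position +downstream
    if begin < 0 or end > len(sequence):
        raise ValueError("window extends beyond the sequence")
    return sequence[begin:end]


if __name__ == "__main__":
    # Toy sense strand: 12 upstream bases followed by a start codon at position 13.
    toy_sense = "TTGACCGCCACC" + "ATGGCTTCCAAAGG"
    window = start_site_window(toy_sense, start_codon_pos=13)
    antisense = reverse_complement(window)
    print(window)
    print(antisense)
```

The returned oligonucleotide is written 5′ to 3′ and is complementary to the sense-strand window; a different window size can be chosen through the `upstream` and `downstream` parameters.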

[0158] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine-substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0159] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 26443 or 46873 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0160] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0161] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 26443- or 46873-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 26443 or 46873 cDNA disclosed herein (i.e., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:4 or SEQ ID NO:6), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 26443- or 46873-encoding mRNA. See, e.g., Cech et al., U.S. Pat. No. 4,987,071; and Cech et al., U.S. Pat. No. 5,116,742. Alternatively, 26443 or 46873 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0162] 26443 or 46873 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 26443 or 46873 gene (e.g., the 26443 or 46873 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 26443 or 46873 genes in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14(12):807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
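
By way of illustration only, candidate sites for conventional (non-switchback) triple helix formation could be located by scanning one strand of a regulatory region for uninterrupted purine stretches. The sketch below assumes a minimum run length chosen by the user; the default of 10 and the function name are hypothetical and are not values taught herein.

```python
import re


def purine_runs(seq: str, min_len: int = 10):
    """Return (start, end, run) for each uninterrupted purine (A/G) stretch.

    Positions are 1-based and inclusive. min_len is an assumed threshold.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start() + 1, m.end(), m.group())
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up sequence, not a disclosed promoter or enhancer.
    print(purine_runs("CTAGGAGAAGGTCC", min_len=5))
```

A pyrimidine stretch on the opposite strand corresponds to each purine run found, so scanning one strand is sufficient for this purpose.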

[0163] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[0164] A 26443 or 46873 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[0165] PNAs of 26443 or 46873 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 26443 or 46873 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[0166] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[0167] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region that is complementary to a 26443 or 46873 nucleic acid of the invention. One complementary region has a fluorophore, and the other, a quencher, such that the molecular beacon is useful for quantitating the presence of the 26443 or 46873 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[0168] Isolated 26443 or 46873 Polypeptides

[0169] In another aspect, the invention features an isolated 26443 or 46873 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-26443 or -46873 antibodies. 26443 or 46873 protein can be isolated from cells or tissue sources using standard protein purification techniques. 26443 or 46873 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[0170] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when the polypeptide is expressed in a native cell.

[0171] In a preferred embodiment, a 26443 or 46873 polypeptide has one or more of the following characteristics:

[0172] (i) it has the ability to catalyze the hydrolysis of asparagine to aspartic acid and ammonia;

[0173] (ii) it has the ability to regulate the cellular levels of asparagine, aspartic acid and ammonia;

[0174] (iii) it has the ability to inhibit or decrease the availability of asparagine in tumors;

[0175] (iv) it has a molecular weight (e.g., deduced molecular weight), amino acid composition or other physical characteristic of a 26443 or a 46873 polypeptide, e.g., a polypeptide having a sequence shown in SEQ ID NO:2 or SEQ ID NO:5;

[0176] (v) it has an overall sequence similarity of at least 60%, preferably at least 70%, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO:2 or SEQ ID NO:5;

[0177] (vi) it has an asparaginase domain which is preferably about 60%, 70%, 80%, 90% or 95% homologous to amino acid residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5; or

[0178] (vii) it has at least 70%, preferably 80%, more preferably 90%, and most preferably 100% of the cysteines found in the amino acid sequence of the native protein.

[0179] In a preferred embodiment, the 26443 or 46873 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:2 or SEQ ID NO:5, respectively. In one embodiment, the protein differs by at least one, but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO:2 or SEQ ID NO:5 by at least one residue but less than 20%, 15%, 10% or 5% of the residues. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the asparaginase domain. In another preferred embodiment one or more differences are in the asparaginase domain.
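
The counting rule in the parenthetical above can be sketched as follows, assuming the two sequences have already been aligned for maximum homology by a separate tool and that gaps are written as "-". The function names, the gap symbol, and the choice of the reference length as the denominator for the percentage are assumptions made for illustration.

```python
def count_differences(aligned_ref: str, aligned_variant: str, gap: str = "-") -> int:
    """Count mismatches and looped-out (gapped) positions between two
    pre-aligned sequences of equal length."""
    if len(aligned_ref) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for ref_res, var_res in zip(aligned_ref.upper(), aligned_variant.upper()):
        if ref_res == gap and var_res == gap:
            continue  # column empty in both sequences; nothing to compare
        if ref_res != var_res:
            differences += 1  # mismatch, insertion, or deletion
    return differences


def percent_different(aligned_ref: str, aligned_variant: str, gap: str = "-") -> float:
    """Differences as a percentage of the reference residues (assumed denominator)."""
    ref_len = sum(1 for res in aligned_ref if res != gap)
    if ref_len == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_variant, gap) / ref_len


if __name__ == "__main__":
    # Made-up aligned pair: one insertion, one mismatch, one deletion.
    print(count_differences("MKT-LLA", "MKTALQ-"))
    print(percent_different("MKT-LLA", "MKTALQ-"))
```

A variant would then be compared against the stated limits, e.g., fewer than 15, 10 or 5 differing residues, or less than 20%, 15%, 10% or 5% of the residues.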

[0180] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 26443 or 46873 proteins differ in amino acid sequence from SEQ ID NO:2 or SEQ ID NO:5, respectively, yet retain biological activity.

[0181] In one embodiment, the protein includes an amino acid sequence that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:2 or SEQ ID NO:5.

[0182] A 26443 or 46873 protein or fragment is provided which varies from the sequence of SEQ ID NO:2 or SEQ ID NO:5, respectively, in non-essential regions (e.g., transmembrane domains) by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:2 or SEQ ID NO:5 in catalytic regions (e.g., the asparaginase domain). (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments, the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[0183] In one embodiment, a biologically active portion of a 26443 or 46873 protein includes an asparaginase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 26443 or 46873 protein.

[0184] Particularly preferred 26443 or 46873 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:2 or 5. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:2 or 5 are termed substantially identical.

[0185] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 1, 3, 4 or 6 are termed substantially identical.

[0186] 26443 or 46873 Chimeric or Fusion Proteins

[0187] In another aspect, the invention provides 26443 or 46873 chimeric or fusion proteins. As used herein, a 26443 or 46873 “chimeric protein” or “fusion protein” includes a 26443 or 46873 polypeptide linked to a non-26443 or -46873 polypeptide. A “non-26443 or -46873 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 26443 or 46873 protein, e.g., a protein which is different from the 26443 or 46873 protein and which is derived from the same or a different organism. The 26443 or 46873 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein of a 26443 or 46873 amino acid sequence. In a preferred embodiment, a 26443 or 46873 fusion protein includes at least one (or two) biologically active portion of a 26443 or 46873 protein. The non-26443 or -46873 polypeptide can be fused to the N-terminus or C-terminus of the 26443 or 46873 polypeptide.

[0188] The fusion protein can include a moiety that has a high affinity for a ligand. For example, the fusion protein can be a GST-26443 or -46873 fusion protein in which the 26443 or 46873 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 26443 or 46873. Alternatively, the fusion protein can be a 26443 or 46873 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 26443 or 46873 can be increased through use of a heterologous signal sequence.

[0189] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[0190] The 26443 or 46873 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 26443 or 46873 fusion proteins can be used to affect the bioavailability of a 26443 or 46873 substrate. 26443 or 46873 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 26443 or 46873 protein; (ii) mis-regulation of the 26443 or 46873 gene; and (iii) aberrant post-translational modification of a 26443 or 46873 protein.

[0191] Moreover, the 26443- or 46873-fusion proteins of the invention can be used as immunogens to produce anti-26443 or -46873 antibodies in a subject, to purify 26443 or 46873 ligands and in screening assays to identify molecules that inhibit the interaction of 26443 or 46873 with a 26443 or 46873 substrate.

[0192] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 26443- or 46873-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 26443 or 46873 protein.
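
A simple check that a fusion moiety is linked in-frame can be sketched as below. The sketch assumes the fusion moiety coding sequence is supplied without its own stop codon, followed by any linker and then the insert; the function names and example sequences are hypothetical and do not correspond to any particular commercial vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str):
    """Split a sequence into complete codons, ignoring any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(tag_cds: str, linker: str, insert_cds: str) -> bool:
    """True if the insert continues the reading frame of the fusion moiety
    and no stop codon occurs before the final codon of the fused sequence."""
    upstream = (tag_cds + linker).upper()
    insert = insert_cds.upper()
    if len(upstream) % 3 != 0 or len(insert) % 3 != 0:
        return False  # a frameshift would occur at the junction or within the insert
    fused = codons(upstream + insert)
    return not any(codon in STOP_CODONS for codon in fused[:-1])


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(is_in_frame_fusion("ATGAAA", "GGATCC", "ATGGCTTAA"))
    print(is_in_frame_fusion("ATGAAA", "GGATC", "ATGGCTTAA"))
```

The first call reports an in-frame junction; the second fails because the five-base linker shifts the reading frame of the insert.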

[0193] Variants of 26443 or 46873 Proteins

[0194] In another aspect, the invention also features a variant of a 26443 or 46873 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 26443 or 46873 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 26443 or 46873 protein. An agonist of the 26443 or 46873 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 26443 or 46873 protein. An antagonist of a 26443 or 46873 protein can inhibit one or more of the activities of the naturally occurring form of the 26443 or 46873 protein by, for example, competitively modulating a 26443- or 46873-mediated activity of a 26443 or 46873 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 26443 or 46873 protein.

[0195] Variants of a 26443 or 46873 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 26443 or 46873 protein for agonist or antagonist activity.

[0196] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 26443 or 46873 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 26443 or 46873 protein.

[0197] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[0198] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 26443 or 46873 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[0199] Cell based assays can be exploited to analyze a variegated 26443 or 46873 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 26443 or 46873 in a substrate-dependent manner. The transfected cells are then contacted with 26443 or 46873 and the effect of the expression of the mutant on the 26443 or 46873 substrate can be detected, e.g., by measuring the amount of asparagine and/or aspartic acid and ammonia. Plasmid DNA can then be recovered from the cells that score for inhibition, or alternatively, potentiation of the mutant by the 26443 or 46873 substrate, and the individual clones further characterized.

[0200] In another aspect, the invention features a method of making a 26443 or 46873 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 26443 or 46873 polypeptide. The method includes: altering the sequence of a 26443 or 46873 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[0201] In another aspect, the invention features a method of making a fragment or analog of a 26443 or 46873 polypeptide having a biological activity of a naturally occurring 26443 or 46873 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 26443 or 46873 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[0202] Anti-26443 or -46873 Antibodies

[0203] In another aspect, the invention provides an anti-26443 or -46873 antibody. The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. Examples of immunologically active portions of immunoglobulin molecules include F(ab) and F(ab′)2 fragments which can be generated by treating the antibody with an enzyme such as pepsin.

[0204] The antibody can be a polyclonal, monoclonal, recombinant, e.g., a chimeric or humanized, fully human, non-human, e.g., murine, or single chain antibody. In a preferred embodiment it has effector function and can fix complement. The antibody can be coupled to a toxin or imaging agent.

[0205] In a preferred embodiment, the antibody fails to bind an Fc receptor, e.g., it is an isotype which does not bind to an Fc receptor, or has been modified, e.g., by deletion or other mutation, such that it does not have a functional Fc receptor binding region.

[0206] A full-length 26443 or 46873 protein, or an antigenic peptide fragment of 26443 or 46873 can be used as an immunogen or can be used to identify anti-26443 or -46873 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 26443 or 46873 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5, respectively, and encompasses an epitope of 26443 or 46873. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[0207] Fragments of 26443 or 46873 which include, for example, residues 22-40, 48-65, 193-202, 203-230 or 345-361 of SEQ ID NO:2 or residues 12-26, 55-65 or 139-170 of SEQ ID NO:5, respectively, can be used to make, e.g., antibodies against hydrophilic regions of the 26443 or 46873 protein or used as immunogens or to characterize the specificity of an antibody. Similarly, a fragment of 26443 or 46873 which includes, for example, residues 40-50, 75-90, 230-241 or 337-349 of SEQ ID NO:2 or residues 43-55, 90-105 or 138-170 of SEQ ID NO:5, respectively, can be used to make an antibody against a hydrophobic region of the 26443 or 46873 protein; a fragment of 26443 or 46873 which includes residues 1-100, 50-150, 100-200, 150-250, 200-300, 250-350, 300-400 or 350-418 of SEQ ID NO:2 or residues 1-100, 50-150, 100-200, 150-250, 200-300 or 250-308 of SEQ ID NO:5, respectively, can be used to make an antibody against a non-transmembrane (i.e., matrix, cytosolic or lumen) region of the 26443 or 46873 protein; and a fragment of 26443 or 46873 which includes residues 38-345 of SEQ ID NO:2 or residues 1-302 of SEQ ID NO:5, respectively, can be used to make an antibody against the asparaginase domain of the 26443 or 46873 protein.
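
Residue ranges such as those recited above are 1-based and inclusive. A minimal sketch of extracting such a fragment from a protein sequence, and of checking it against a minimum antigenic peptide length of 8 residues, is given below; the function names are hypothetical and the example sequence is a made-up placeholder rather than SEQ ID NO:2 or SEQ ID NO:5.

```python
def fragment(protein_seq: str, start: int, end: int) -> str:
    """Return residues start..end of a protein sequence (1-based, inclusive)."""
    if not (1 <= start <= end <= len(protein_seq)):
        raise ValueError("residue range lies outside the sequence")
    return protein_seq[start - 1:end]


def meets_minimum_length(peptide: str, min_len: int = 8) -> bool:
    """True if the peptide has at least min_len residues."""
    return len(peptide) >= min_len


if __name__ == "__main__":
    placeholder = "MKTAYIAKQR"  # hypothetical 10-residue sequence
    piece = fragment(placeholder, 3, 6)
    print(piece, meets_minimum_length(piece))
```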

[0208] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[0209] Preferred epitopes encompassed by the antigenic peptide are regions of 26443 or 46873 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 26443 or 46873 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 26443 or 46873 protein and are thus likely to constitute surface residues useful for targeting antibody production. For example, residues 25-45, 55-65, 190-205 and 212-225 of the 26443 protein and residues 15-25 and 140-160 of the 46873 protein have a high probability of being localized on the surface of the respective proteins based on an Emini surface probability plot.
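
The Emini analysis itself is not reproduced here. As a related but different illustration, hydrophilic stretches can be flagged with a sliding-window average of the Kyte-Doolittle hydropathy scale, in which more negative values indicate more hydrophilic residues. The window size of 7, the cutoff of -1.5, and the function names are assumptions chosen for the sketch, and the result is not a substitute for an Emini surface probability plot.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(protein_seq: str, window: int = 7):
    """Mean hydropathy of each full window along the sequence."""
    seq = protein_seq.upper()
    return [
        sum(KYTE_DOOLITTLE[res] for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(protein_seq: str, window: int = 7, cutoff: float = -1.5):
    """Return (start, end) of windows whose mean score is at or below cutoff.

    Positions are 1-based and inclusive; the cutoff is an assumed value.
    """
    scores = window_hydropathy(protein_seq, window)
    return [(i + 1, i + window) for i, score in enumerate(scores) if score <= cutoff]


if __name__ == "__main__":
    # Made-up sequence: a lysine-rich stretch followed by an isoleucine-rich one.
    print(hydrophilic_windows("KKKKKKKIIIIIII"))
```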

[0210] In a preferred embodiment, the antibody can bind to a 26443 or 46873 protein intracellularly. In another embodiment, the antibody binds to a 26443 or 46873 protein extracellularly.

[0211] In a preferred embodiment the antibody binds an epitope on any domain or region on 26443 or 46873 proteins described herein.

[0212] In preferred embodiments an antibody can be made by immunizing with purified 26443 or 46873 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions, cytoplasmic fractions.

[0213] Antibodies that bind only native 26443 or 46873 protein, only denatured or otherwise non-native 26443 or 46873 protein, or that bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be determined by identifying antibodies that bind to native but not denatured 26443 or 46873 proteins.

[0214] Chimeric, humanized, but most preferably, completely human antibodies are desirable for applications which include repeated administration, e.g., therapeutic treatment (and some diagnostic applications) of human patients.

[0215] The anti-26443 or -46873 antibody can be a single-chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D., et al. Ann N Y Acad Sci 1999 Jun. 30;880:263-80; and Reiter, Y. Clin Cancer Res 1996 February;2(2):245-52). The single-chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 26443 or 46873 protein.

[0216] An anti-26443 or -46873 antibody (e.g., monoclonal antibody) can be used to isolate 26443 or 46873 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-26443 or -46873 antibody can be used to detect 26443 or 46873 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-26443 or -46873 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[0217] The invention also includes a nucleic acid that encodes an anti-26443 or -46873 antibody, e.g., an anti-26443 or -46873 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[0218] The invention also includes cell lines, e.g., hybridomas, which make an anti-26443 or -46873 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-26443 or -46873 antibody.

[0219] 26443 and 46873 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[0220] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[0221] A vector can include a 26443 or 46873 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 26443 or 46873 proteins, mutant forms of 26443 or 46873 proteins, fusion proteins, and the like).

[0222] The recombinant expression vectors of the invention can be designed for expression of 26443 or 46873 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[0223] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[0224] Purified fusion proteins can be used in 26443 or 46873 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 26443 or 46873 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).

[0225] One strategy used to maximize recombinant protein expression in E. coli is to express the protein in a host strain with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
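
The codon replacement strategy can be sketched as a synonymous recoding step: each codon is translated with the standard genetic code and, where a preferred codon is supplied for that amino acid, replaced by it. In the sketch below the preferred-codon map is a small placeholder and the function names are hypothetical; an actual map would be taken from a codon usage table for the chosen host strain.

```python
from itertools import product

# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Placeholder preferences for illustration only; not a real usage table.
EXAMPLE_PREFERRED = {"L": "CTG", "K": "AAA"}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    return "".join(GENETIC_CODE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def recode(cds: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = GENETIC_CODE[codon]
        recoded.append(preferred.get(amino_acid, codon))
    return "".join(recoded)


if __name__ == "__main__":
    original = "ATGTTAAAGTAA"  # made-up coding sequence: Met-Leu-Lys-stop
    optimized = recode(original, EXAMPLE_PREFERRED)
    print(optimized, translate(original) == translate(optimized))
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged, which can be confirmed by translating both sequences.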

[0226] The 26443 or 46873 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector, or a vector suitable for expression in mammalian cells.

[0227] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[0228] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[0229] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes, see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986.

[0230] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 26443 or 46873 nucleic acid molecule within a recombinant expression vector or a 26443 or 46873 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell, but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0231] A host cell can be any prokaryotic or eukaryotic cell. For example, a 26443 or 46873 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[0232] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[0233] A host cell of the invention can be used to produce (i.e., express) a 26443 or 46873 protein. Accordingly, the invention further provides methods for producing a 26443 or 46873 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 26443 or 46873 protein has been introduced) in a suitable medium such that a 26443 or 46873 protein is produced. In another embodiment, the method further includes isolating a 26443 or 46873 protein from the medium or the host cell.

[0234] In another aspect, the invention features a cell or purified preparation of cells that include a 26443 or 46873 transgene, or which otherwise misexpress 26443 or 46873. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 26443 or 46873 transgene, e.g., a heterologous form of a 26443 or 46873, e.g., a gene derived from humans (in the case of a non-human cell). The 26443 or 46873 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 26443 or 46873, e.g., a gene for which expression is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 26443 or 46873 alleles or for use in drug screening.

[0235] In another aspect, the invention features a human cell, e.g., a tumor cell, transformed with a nucleic acid that encodes a subject 26443 or 46873 polypeptide.

[0236] Also provided are cells, e.g., human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 26443 or 46873 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 26443 or 46873 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 26443 or 46873 gene. For example, an endogenous 26443 or 46873 gene, e.g., a gene that is “transcriptionally silent”, e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques, such as targeted homologous recombination, can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published May 16, 1991.

[0237] 26443 and 46873 Transgenic Animals

[0238] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 26443 or 46873 protein and for identifying and/or evaluating modulators of 26443 or 46873 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal include a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 26443 or 46873 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[0239] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 26443 or 46873 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 26443 or 46873 transgene in its genome and/or expression of 26443 or 46873 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 26443 or 46873 protein can further be bred to other transgenic animals carrying other transgenes.

[0240] 26443 or 46873 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[0241] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[0242] Uses of 26443 and 46873

[0243] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[0244] The isolated nucleic acid molecules of the invention can be used, for example, to express a 26443 or 46873 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 26443 or 46873 mRNA (e.g., in a biological sample) or a genetic alteration in a 26443 or 46873 gene, and to modulate 26443 or 46873 activity, as described further below. The 26443 or 46873 proteins can be used to treat disorders characterized by insufficient or excessive production of a 26443 or 46873 substrate or production of 26443 or 46873 inhibitors. In addition, the 26443 or 46873 proteins can be used to screen for naturally occurring 26443 or 46873 substrates, to screen for drugs or compounds which modulate 26443 or 46873 activity, as well as to treat disorders characterized by insufficient or excessive production of 26443 or 46873 protein or production of 26443 or 46873 protein forms which have decreased, aberrant or unwanted activity compared to 26443 or 46873 wild type protein (e.g., altered cellular levels of asparagine and/or aspartic acid and ammonia). Moreover, the anti-26443 or -46873 antibodies of the invention can be used to detect and isolate 26443 or 46873 proteins, regulate the bioavailability of 26443 or 46873 proteins, and modulate 26443 or 46873 activity.

[0245] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 26443 or 46873 polypeptide is provided. The method includes: contacting the compound with the subject 26443 or 46873 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 26443 or 46873 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with a subject 26443 or 46873 polypeptide. It can also be used to find natural or synthetic inhibitors of a subject 26443 or 46873 polypeptide. Screening methods are discussed in more detail below.

[0246] 26443 and 46873 Screening Assays

[0247] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 26443 or 46873 proteins, have a stimulatory or inhibitory effect on, for example, 26443 or 46873 expression or 26443 or 46873 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 26443 or 46873 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 26443 or 46873 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[0248] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 26443 or 46873 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 26443 or 46873 protein or polypeptide or a biologically active portion thereof.

[0249] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries [libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive] (see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[0250] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[0251] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[0252] In one embodiment, an assay is a cell-based assay in which a cell that expresses a 26443 or 46873 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 26443 or 46873 activity is determined. Determining the ability of the test compound to modulate 26443 or 46873 activity can be accomplished by monitoring, for example, cellular asparagine levels. The cell, for example, can be of mammalian origin, e.g., a tumor cell.

[0253] The ability of the test compound to modulate 26443 or 46873 binding to a compound, e.g., a 26443 or 46873 substrate, or to bind to 26443 or 46873 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 26443 or 46873 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 26443 or 46873 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 26443 or 46873 binding to a 26443 or 46873 substrate in a complex. For example, compounds (e.g., 26443 or 46873 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[0254] The ability of a compound (e.g., a 26443 or 46873 substrate) to interact with 26443 or 46873 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 26443 or 46873 without the labeling of either the compound or the 26443 or 46873. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 26443 or 46873.

[0255] In yet another embodiment, a cell-free assay is provided in which a 26443 or 46873 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 26443 or 46873 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 26443 or 46873 proteins to be used in assays of the present invention include fragments that participate in interactions with non-26443 or -46873 molecules, e.g., fragments with high surface probability scores.

[0256] Soluble and/or membrane-bound forms of isolated proteins (e.g., 26443 or 46873 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[0257] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[0258] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
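
As an illustrative sketch of the distance dependence noted above, the Förster relation E = 1/(1 + (r/R0)^6) is commonly used to relate transfer efficiency E to the donor-acceptor separation r, where R0 is the separation at which efficiency is one half. This relation, the function names `fret_efficiency` and `distance_from_efficiency`, and the example value of R0 are assumptions introduced here for illustration; the present description does not recite a particular formula.

```python
def fret_efficiency(r: float, r0: float) -> float:
    """Transfer efficiency for separation r and Forster radius r0 (same units)."""
    if r <= 0 or r0 <= 0:
        raise ValueError("r and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(efficiency: float, r0: float) -> float:
    """Invert the relation: separation implied by a measured efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Hypothetical Forster radius of 5.0 nm, used only for this example.
R0_NM = 5.0
for r_nm in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r_nm:4.1f} nm  ->  E = {fret_efficiency(r_nm, R0_NM):.3f}")
```

Because efficiency falls off with the sixth power of separation, the acceptor signal is strong only when the labeled molecules are in close proximity, which is consistent with using acceptor emission as an indicator of binding.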

[0259] In another embodiment, determining the ability of the 26443 or 46873 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[0260] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[0261] It may be desirable to immobilize either 26443 or 46873, an anti-26443 or -46873 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 26443 or 46873 protein, or interaction of a 26443 or 46873 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase-26443 or -46873 fusion proteins or glutathione-S-transferase-target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 26443 or 46873 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 26443 or 46873 binding or activity determined using standard techniques.

[0262] Other techniques for immobilizing either a 26443 or 46873 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 26443 or 46873 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[0263] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[0264] In one embodiment, this assay is performed utilizing antibodies reactive with 26443 or 46873 protein or target molecules but which do not interfere with binding of the 26443 or 46873 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 26443 or 46873 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 26443 or 46873 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 26443 or 46873 protein or target molecule.

[0265] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., Trends Biochem Sci 1993 August;18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., J Mol Recognit 1998 Winter; 11(1-6):141-8; Hage, D. S., and Tweed, S. A. J Chromatogr B Biomed Sci Appl 1997 Oct. 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[0266] In a preferred embodiment, the assay includes contacting the 26443 or 46873 protein or biologically active portion thereof with a known compound which binds 26443 or 46873 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 26443 or 46873 protein, wherein determining the ability of the test compound to interact with a 26443 or 46873 protein includes determining the ability of the test compound to preferentially bind to 26443 or 46873 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[0267] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 26443 or 46873 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 26443 or 46873 protein through modulation of the activity of a downstream effector of a 26443 or 46873 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[0268] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
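
The comparison of control and test reaction mixtures described above can be illustrated with a short Python sketch that expresses complex formation in the presence of a test compound as a percentage of the control. The function names, the background correction, and the 50-point selectivity gap are hypothetical choices made for this example only; the description does not prescribe any particular calculation or cutoff.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the untreated control."""
    span = control_signal - background
    if span <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (control_signal - test_signal) / span


def is_mutant_selective(inhibition_mutant: float, inhibition_normal: float,
                        min_gap: float = 50.0) -> bool:
    """Flag compounds that disrupt the mutant complex far more than the normal one.

    min_gap is an assumed, illustrative threshold in percentage points.
    """
    return (inhibition_mutant - inhibition_normal) >= min_gap


# Hypothetical plate readings (arbitrary units).
normal = percent_inhibition(control_signal=1100.0, test_signal=1000.0, background=100.0)
mutant = percent_inhibition(control_signal=1100.0, test_signal=350.0, background=100.0)
print(f"normal: {normal:.1f}%  mutant: {mutant:.1f}%  "
      f"selective: {is_mutant_selective(mutant, normal)}")
```

A compound scoring high against the mutant target gene product but low against the normal product would correspond to the case, noted above, of a compound that disrupts interactions of mutant but not normal target gene products.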

[0269] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[0270] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[0271] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[0272] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[0273] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[0274] In yet another aspect, the 26443 or 46873 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 26443 or 46873 (“26443- or 46873-binding proteins” or “26443- or 46873-bp”) and are involved in 26443 or 46873 activity. Such 26443- or 46873-bps can be activators or inhibitors of signals by the 26443 or 46873 proteins or 26443 or 46873 targets as, for example, downstream elements of a 26443- or 46873-mediated signaling pathway.

[0275] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 26443 or 46873 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 26443 or 46873 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 26443- or 46873-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 26443 or 46873 protein.

[0276] In another embodiment, modulators of 26443 or 46873 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 26443 or 46873 mRNA or protein evaluated relative to the level of expression of 26443 or 46873 mRNA or protein in the absence of the candidate compound. When expression of 26443 or 46873 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 26443 or 46873 mRNA or protein expression. Alternatively, when expression of 26443 or 46873 mRNA or protein is less (e.g., statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 26443 or 46873 mRNA or protein expression. The level of 26443 or 46873 mRNA or protein expression can be determined by methods described herein for detecting 26443 or 46873 mRNA or protein.
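The comparison described above can be sketched in code. The following is a minimal illustration only, assuming replicate expression measurements are available with and without the candidate compound. The function names, the exact permutation test, and the 0.05 cutoff are assumptions made for the example; the text does not prescribe a particular statistical test.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(treated, untreated):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    hits = total = 0
    # Enumerate every way of relabeling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels (hypothetical helper)."""
    if exact_permutation_p(treated, untreated) >= alpha:
        return "no effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per group there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70 (about 0.029); fewer replicates could never reach the 0.05 cutoff in this sketch.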

[0277] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 26443 or 46873 protein can be confirmed in vivo, e.g., in an animal such as an animal model for aberrant fatty acid oxidation.

[0278] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 26443 or 46873 modulating agent, an antisense 26443 or 46873 nucleic acid molecule, a 26443- or 46873-specific antibody, or a 26443- or 46873-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[0279] 26443 and 46873 Detection Assays

[0280] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 26443 or 46873 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[0281] 26443 and 46873 Chromosome Mapping

[0282] The 26443 or 46873 nucleotide sequences or portions thereof can be used to map the location of the 26443 or 46873 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 26443 or 46873 sequences with genes associated with disease.

[0283] Briefly, 26443 or 46873 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 26443 or 46873 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 26443 or 46873 sequences will yield an amplified fragment.

[0284] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
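The way a hybrid panel assigns a gene to a chromosome can be illustrated with a short concordance calculation. This is a sketch under stated assumptions: the panel contents, the hybrid names, and the function names are hypothetical, and real panels can give ambiguous or discordant results that this example does not handle.

```python
def concordance_scores(panel, pcr_positive):
    """Fraction of hybrids whose PCR result agrees with each chromosome's presence.

    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment was seen.
    """
    chromosomes = set().union(*panel.values())
    scores = {}
    for chrom in chromosomes:
        agree = sum(
            (chrom in retained) == pcr_positive[hybrid]
            for hybrid, retained in panel.items()
        )
        scores[chrom] = agree / len(panel)
    return scores


def best_chromosomes(panel, pcr_positive):
    """Chromosomes with the highest concordance (ties are all returned)."""
    scores = concordance_scores(panel, pcr_positive)
    top = max(scores.values())
    return sorted(c for c, s in scores.items() if s == top)


# Hypothetical four-hybrid panel and PCR screening results.
panel = {"H1": {"1", "7"}, "H2": {"7", "X"}, "H3": {"1", "X"}, "H4": {"12"}}
results = {"H1": True, "H2": True, "H3": False, "H4": False}
print(best_chromosomes(panel, results))  # ['7']
```

Only chromosome 7 is present in every PCR-positive hybrid and absent from every PCR-negative one, so it is the single perfectly concordant assignment in this toy panel.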

[0285] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 26443 or 46873 to a chromosomal location.

[0286] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[0287] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to non-coding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[0288] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[0289] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 26443 or 46873 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
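The filtering logic in the preceding paragraph can be expressed as a simple set operation. The sketch below is illustrative only: the variant encoding as (position, reference base, observed base) tuples and the function name are assumptions, and a variant passing this filter would still need the confirmation steps described above.

```python
def candidate_mutations(affected, unaffected):
    """Variants seen in at least one affected individual and in no unaffected one.

    Both arguments map an individual's identifier to a set of variants,
    each encoded here (hypothetically) as (position, reference_base, observed_base).
    """
    in_affected = set().union(*affected.values())
    in_unaffected = set().union(*unaffected.values())
    return sorted(in_affected - in_unaffected)


# Hypothetical data: position 250 varies in both groups, so it behaves like a polymorphism.
affected = {"A1": {(101, "G", "A"), (250, "C", "T")}, "A2": {(101, "G", "A")}}
unaffected = {"U1": {(250, "C", "T")}, "U2": set()}
print(candidate_mutations(affected, unaffected))  # [(101, 'G', 'A')]
```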

[0290] 26443 and 46873 Tissue Typing

[0291] 26443 or 46873 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
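An in silico digest shows why a sequence difference at a restriction site changes the band pattern. This is a minimal sketch: the two short allele sequences are invented for the example, the sequence is treated as linear and single-stranded, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar enzyme.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Fragment lengths after cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases into the site at which the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    dna = dna.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b carries a point change that destroys the second site.
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
print(digest(allele_a))  # [5, 12, 7]
print(digest(allele_b))  # [5, 19]
```

The loss of one site merges two fragments into one, which is the length polymorphism that would appear as a different band on a blot.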

[0292] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 26443 or 46873 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.
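Deriving a primer pair from the two ends of a known sequence can be sketched as follows. The function names and the default primer length of 20 bases are assumptions for illustration; the example ignores melting temperature, secondary structure, and specificity, all of which a real primer design would have to consider.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(sequence, length=20):
    """Forward primer from the 5' end and reverse primer from the 3' end.

    The forward primer copies the first `length` bases of the given strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and extends back toward the 5' end.
    """
    seq = sequence.upper()
    if len(seq) < 2 * length:
        raise ValueError("sequence too short for two non-overlapping primers")
    return seq[:length], reverse_complement(seq[-length:])
```

For example, `end_primers(seq, length=18)` returns a pair whose forward member equals `seq[:18]` and whose reverse member, when reverse-complemented again, equals `seq[-18:]`.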

[0293] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the non-coding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the non-coding regions, fewer sequences are necessary to differentiate individuals. The non-coding sequences of SEQ ID NO:1 or SEQ ID NO:4 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a non-coding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:3 or SEQ ID NO:6 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[0294] If a panel of reagents from 26443 or 46873 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[0295] Use of Partial 26443 or 46873 Sequences in Forensic Biology

[0296] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[0297] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to non-coding regions of SEQ ID NO:1 or SEQ ID NO:4 (e.g., fragments derived from the non-coding regions of SEQ ID NO:1 or SEQ ID NO:4 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[0298] The 26443 or 46873 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue, e.g., a tissue containing organelles having asparaginase. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 26443 or 46873 probes can be used to identify tissue by species and/or by organ type.

[0299] In a similar fashion, these reagents, e.g., 26443 or 46873 primers or probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Predictive Medicine of 26443 and 46873

The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[0300] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene that encodes asparaginase.

[0301] Such disorders include, e.g., a disorder associated with the misexpression of an asparaginase; or a metabolic disorder, e.g., a disorder involving inappropriate cellular asparagine levels.

[0302] The method includes one or more of the following:

[0303] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 26443 or 46873 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[0304] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 26443 or 46873 gene;

[0305] detecting, in a tissue of the subject, the misexpression of the 26443 or 46873 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[0306] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 26443 or 46873 polypeptide.

[0307] In preferred embodiments the method includes ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 26443 or 46873 gene; an insertion of one or more nucleotides into the gene; a point mutation, e.g., a substitution of one or more nucleotides of the gene; or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[0308] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO: 1 or 3 or naturally occurring mutants thereof, or 5′ or 3′ flanking sequences naturally associated with the 26443 or 46873 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[0309] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 26443 or 46873 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 26443 or 46873.

[0310] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[0311] In preferred embodiments the method includes determining the structure of a 26443 or 46873 gene, an abnormal structure being indicative of risk for the disorder.

[0312] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 26443 or 46873 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[0313] Diagnostic and Prognostic Assays of 26443 and 46873

[0314] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 26443 or 46873 molecules and for identifying variations and mutations in the sequence of 26443 or 46873 molecules.

[0315] Expression Monitoring and Profiling:

[0316] The presence, level, or absence of 26443 or 46873 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 26443 or 46873 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 26443 or 46873 protein such that the presence of 26443 or 46873 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 26443 or 46873 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 26443 or 46873 gene; measuring the amount of protein encoded by the 26443 or 46873 gene; or measuring the activity of the protein encoded by the 26443 or 46873 gene.

[0317] The level of mRNA corresponding to the 26443 or 46873 gene in a cell can be determined by both in situ and in vitro formats.

[0318] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 26443 or 46873 nucleic acid, such as the nucleic acid of SEQ ID NO: 1, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 26443 or 46873 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[0319] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 26443 or 46873 gene.

[0320] The level of mRNA in a sample that is encoded by one of 26443 or 46873 can be evaluated with nucleic acid amplification, e.g., by RT-PCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
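The primer definition above, a pair about 10 to 30 nucleotides long flanking a region of about 50 to 200 nucleotides, can be checked against a template in a few lines. This is an illustrative sketch only: the function names are hypothetical, matching is exact-string matching on one strand, and mismatches or multiple binding sites are not considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanked_region(template, forward, reverse):
    """Template segment lying between the two primer sites, or None if either is absent.

    The forward primer is searched for as given; the reverse primer anneals to
    the template strand, so its reverse complement is searched for downstream.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    fwd_end = fwd_start + len(forward)
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start == -1:
        return None
    return template[fwd_end:rev_start]


def within_stated_ranges(template, forward, reverse):
    """True if primer lengths are 10-30 nt and the flanked region is 50-200 nt."""
    region = flanked_region(template, forward, reverse)
    if region is None:
        return False
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and 50 <= len(region) <= 200
```

The numeric limits are taken directly from the ranges stated in the text, which are given there as approximate ("about"), so a practical check would likely allow some tolerance.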

[0321] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 26443 or 46873 gene being analyzed.

[0322] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 26443 or 46873 mRNA, or genomic DNA, and comparing the presence of 26443 or 46873 mRNA or genomic DNA in the control sample with the presence of 26443 or 46873 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 26443 or 46873 transcript levels.

[0323] A variety of methods can be used to determine the level of protein encoded by 26443 or 46873. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[0324] The detection methods can be used to detect 26443 or 46873 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 26443 or 46873 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 26443 or 46873 proteins include introducing into a subject a labeled anti-26443 or -46873 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-26443 or -46873 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[0325] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 26443 or 46873 proteins, and comparing the presence of 26443 or 46873 proteins in the control sample with the presence of 26443 or 46873 proteins in the test sample.

[0326] The invention also includes kits for detecting the presence of 26443 or 46873 in a biological sample. For example, the kit can include a compound or agent capable of detecting 26443 or 46873 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 26443 or 46873 protein or nucleic acid.

[0327] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[0328] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein-stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[0329] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with mis-expressed or aberrant or unwanted 26443 or 46873 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[0330] In one embodiment, a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity is identified. A test sample is obtained from a subject and 26443 or 46873 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 26443 or 46873 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[0331] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disorder involving aberrant or unwanted 26443 or 46873 expression or activity.

[0332] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 26443 or 46873 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 26443 or 46873 (e.g., other genes associated with a 26443- or 46873-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
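A data record of the kind described above might be laid out as a relational table along the following lines. This is a sketch under stated assumptions: SQLite (through Python's standard sqlite3 module) is used here in place of the Oracle or Sybase environments named in the text, and the table name, column names, and values are all hypothetical.

```python
import sqlite3


def build_expression_db(records):
    """Create an in-memory table of expression records and load the given rows.

    Each row is (sample_id, subject, diagnosis, treatment, gene, expression_level).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject          TEXT,           -- subject the sample was derived from
            diagnosis        TEXT,
            treatment        TEXT,
            gene             TEXT NOT NULL,  -- e.g. '26443', '46873', or another gene
            expression_level REAL NOT NULL,
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


def levels_for_gene(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene, ordered by sample."""
    cursor = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cursor.fetchall()
```

Storing one row per sample and gene lets the same table hold values for genes other than 26443 or 46873 without changing the schema.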

[0333] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 26443 or 46873 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder involving aberrant or unwanted 26443 or 46873 expression or activity in a subject. The method can be used to monitor a treatment for a disorder involving aberrant or unwanted 26443 or 46873 expression or activity in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[0334] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 26443 or 46873 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
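The favorable-evaluation criterion in the preceding paragraph reduces to comparing two similarities. The sketch below assumes that each profile is a list of expression values in the same gene order and that similarity is measured by Euclidean distance; both are assumptions for the example, since the paragraph does not fix a similarity measure. The function name is hypothetical.

```python
import math


def evaluated_favorably(contacted, uncontacted, target):
    """True if the contacted-cell profile is closer to the target than the uncontacted one.

    All three profiles must list values for the same genes in the same order.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)


# Hypothetical three-gene profiles.
target = [1.0, 1.0, 1.0]
uncontacted = [3.0, 0.2, 1.0]
contacted = [1.5, 0.8, 1.0]
print(evaluated_favorably(contacted, uncontacted, target))  # True
```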

[0335] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 26443 or 46873 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
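The metric mentioned above, the length of the difference vector between two profiles, is the Euclidean distance. A minimal sketch of steps c) and d) follows; the function names and the example profiles are hypothetical, and other statistical measures could be substituted.

```python
import math


def difference_vector(a, b):
    """Element-wise difference between two profiles of equal dimension."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return [x - y for x, y in zip(a, b)]


def distance(a, b):
    """Length of the difference vector (Euclidean distance)."""
    return math.sqrt(sum(d * d for d in difference_vector(a, b)))


def most_similar_reference(subject, references):
    """Name of the reference profile with the smallest distance to the subject profile."""
    return min(references, key=lambda name: distance(subject, references[name]))


# Hypothetical two-gene profiles.
references = {"normal": [0.0, 0.0], "disorder": [3.0, 5.0]}
print(distance([3.0, 4.0], references["normal"]))          # 5.0
print(most_similar_reference([3.0, 4.0], references))      # disorder
```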

[0336] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[0337] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 26443 or 46873 expression.

[0338] 26443 and 46873 Arrays and Uses Thereof

[0339] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 26443 or 46873 molecule (e.g., a 26443 or 46873 nucleic acid or a 26443 or 46873 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to addresses of the plurality can be disposed on the array.

[0340] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 26443 or 46873 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 26443 or 46873. Each address of the subset can include a capture probe that hybridizes to a different region of a 26443 or 46873 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 26443 or 46873 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 26443 or 46873 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 26443 or 46873 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).
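A subset of addresses carrying overlapping probes for different regions, or probes for hypothetical variants, can be enumerated programmatically. The sketch below is illustrative only: the probe length, step size, and function names are assumptions, and the probes are returned as subsequences of the given strand (and would therefore hybridize to the complementary strand).

```python
def tiling_regions(target, probe_length=25, step=5):
    """Overlapping probe sequences taken at regular offsets along the target.

    Returns (start_offset, probe) pairs. If step does not divide evenly,
    the last few bases of the target may not be covered.
    """
    target = target.upper()
    if probe_length > len(target):
        raise ValueError("probe is longer than the target sequence")
    return [
        (start, target[start:start + probe_length])
        for start in range(0, len(target) - probe_length + 1, step)
    ]


def single_base_variants(probe, position):
    """The three probes that differ from `probe` by one substitution at `position`."""
    original = probe[position]
    return [
        probe[:position] + base + probe[position + 1:]
        for base in "ACGT"
        if base != original
    ]


print(tiling_regions("ACGTACGTAC", probe_length=4, step=3))
# [(0, 'ACGT'), (3, 'TACG'), (6, 'GTAC')]
print(single_base_variants("ACGT", 1))
# ['AAGT', 'AGGT', 'ATGT']
```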

[0341] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[0342] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 26443 or 46873 polypeptide or fragment thereof. The polypeptide can be a naturally occurring interaction partner of 26443 or 46873 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-26443 or -46873 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[0343] In another aspect, the invention features a method of analyzing the expression of 26443 or 46873. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 26443 or 46873 molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[0344] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 26443 or 46873. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 26443 or 46873. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
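By way of illustration only, the grouping of co-regulated genes can be sketched as an agglomerative (hierarchical) clustering of per-tissue expression levels. The sketch assumes average linkage, a distance of one minus the Pearson correlation, and a fixed distance cutoff; none of these choices is prescribed by the specification. The gene names GENE_X and GENE_Y and all expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def hierarchical_clusters(profiles, max_distance):
    """Merge the two closest clusters until the closest pair exceeds max_distance."""
    clusters = [[gene] for gene in sorted(profiles)]

    def distance(c1, c2):
        # Average linkage: mean of (1 - r) over all cross-cluster gene pairs.
        pairs = [(g1, g2) for g1 in c1 for g2 in c2]
        total = sum(1.0 - pearson(profiles[g1], profiles[g2]) for g1, g2 in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        if distance(clusters[i], clusters[j]) > max_distance:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(clusters)


def co_regulated_with(target, profiles, max_distance=0.2):
    """Genes that fall in the same cluster as the target gene."""
    for cluster in hierarchical_clusters(profiles, max_distance):
        if target in cluster:
            return [gene for gene in cluster if gene != target]
    return []


# Hypothetical expression levels measured in four tissues.
tissue_profiles = {
    "26443": [1.0, 5.0, 9.0, 2.0],
    "GENE_X": [2.0, 6.0, 10.0, 3.0],
    "GENE_Y": [9.0, 5.0, 1.0, 8.0],
    "46873": [8.0, 4.0, 2.0, 9.0],
}
print(co_regulated_with("26443", tissue_profiles))
```

A k-means or Bayesian method, as also named above, could be used in place of the hierarchical procedure shown here.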

[0345] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 26443 or 46873 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[0346] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of genes other than the target gene can be ascertained and counteracted.

[0347] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 26443- or 46873-associated disease or disorder; and processes, such as a cellular transformation associated with a 26443- or 46873-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 26443- or 46873-associated disease or disorder.

[0348] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 26443 or 46873) that could serve as a molecular target for diagnosis or therapeutic intervention.

[0349] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 26443 or 46873 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 26443 or 46873 polypeptide or fragment thereof. For example, multiple variants of a 26443 or 46873 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0350] The polypeptide array can be used to detect a 26443 or 46873 binding compound, e.g., an antibody in a sample from a subject with specificity for a 26443 or 46873 polypeptide or the presence of a 26443- or 46873-binding protein or ligand.

[0351] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 26443 or 46873 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[0352] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 26443 or 46873 or from a cell or subject in which a 26443- or 46873-mediated response has been elicited, e.g., by contact of the cell with 26443 or 46873 nucleic acid or protein, or administration to the cell or subject of 26443 or 46873 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 26443 or 46873 (or does not express as highly as in the case of the 26443 or 46873 positive plurality of capture probes) or from a cell or subject in which a 26443- or 46873-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 26443 or 46873 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[0353] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 26443 or 46873 or from a cell or subject in which a 26443- or 46873-mediated response has been elicited, e.g., by contact of the cell with 26443 or 46873 nucleic acid or protein, or administration to the cell or subject of 26443 or 46873 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 26443 or 46873 (or does not express as highly as in the case of the 26443 or 46873 positive plurality of capture probes) or from a cell or subject in which a 26443- or 46873-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[0354] In another aspect, the invention features a method of analyzing 26443 or 46873, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 26443 or 46873 nucleic acid or amino acid sequence; and comparing the 26443 or 46873 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26443 or 46873.
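By way of illustration only, the comparison of a query sequence with a collection of sequences can be sketched as a ranking by a similarity ratio. The sketch uses the general-purpose `difflib.SequenceMatcher` of the Python standard library as a stand-in; it is not a biological alignment tool and is not the comparison method of the specification. The sequences and their names are hypothetical and are not taken from the sequence listing.

```python
from difflib import SequenceMatcher


def similarity(query, candidate):
    """Similarity ratio in [0, 1]; 1.0 means the two strings are identical."""
    # autojunk=False disables the heuristic that ignores frequent characters,
    # which would otherwise distort results for four-letter alphabets.
    return SequenceMatcher(None, query, candidate, autojunk=False).ratio()


def rank_collection(query, collection):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, similarity(query, seq)) for name, seq in collection.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical query and collection.
query_sequence = "ATGGCCAAGTTC"
sequence_collection = {
    "seq_identical": "ATGGCCAAGTTC",
    "seq_one_change": "ATGGCCACGTTC",
    "seq_unrelated": "TTTTTTTTTTTT",
}
for name, score in rank_collection(query_sequence, sequence_collection):
    print(name, round(score, 3))
```

In practice a dedicated alignment program would be used against a nucleic acid or protein sequence database; the ranking step would remain the same.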

[0355] Detection of 26443 and 46873 Variations or Mutations

[0356] The methods of the invention can also be used to detect genetic alterations in a 26443 or 46873 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by mis-regulation in 26443 or 46873 protein activity or nucleic acid expression, such as a disorder involving aberrant or unwanted 26443 or 46873 expression or activity. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 26443 or 46873 protein, or the mis-expression of the 26443 or 46873 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of: 1) a deletion of one or more nucleotides from a 26443 or 46873 gene; 2) an addition of one or more nucleotides to a 26443 or 46873 gene; 3) a substitution of one or more nucleotides of a 26443 or 46873 gene; 4) a chromosomal rearrangement of a 26443 or 46873 gene; 5) an alteration in the level of a messenger RNA transcript of a 26443 or 46873 gene; 6) aberrant modification of a 26443 or 46873 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 26443 or 46873 gene; 8) a non-wild type level of a 26443 or 46873 protein; 9) allelic loss of a 26443 or 46873 gene; and

[0357] 10) inappropriate post-translational modification of a 26443 or 46873 protein.

[0358] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE-PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 26443 or 46873 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 26443 or 46873 gene under conditions such that hybridization and amplification of the 26443 or 46873 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[0359] In another embodiment, mutations in a 26443 or 46873 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
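By way of illustration only, the comparison of restriction fragment lengths can be sketched in silico. The sketch assumes linear DNA, a single enzyme, and complete digestion. EcoRI, which recognizes GAATTC and cuts after the first G, is used as an example; the two sequences are hypothetical and are not taken from the sequence listing.

```python
def fragment_lengths(dna, site, cut_offset):
    """Sorted fragment lengths after cutting linear `dna` at every `site`.

    `cut_offset` is the number of bases into the recognition site at which
    the strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    cut_positions = []
    start = dna.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cut_positions + [len(dna)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def patterns_differ(sample, control, site, cut_offset):
    """True if the sample digest differs from the control digest."""
    return fragment_lengths(sample, site, cut_offset) != \
        fragment_lengths(control, site, cut_offset)


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

# Hypothetical control with two EcoRI sites; the sample carries a single
# base change (GAATTC -> GAGTTC) that destroys the second site.
control_dna = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
sample_dna = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"

print(fragment_lengths(control_dna, ECORI_SITE, ECORI_OFFSET))  # [5, 7, 12]
print(fragment_lengths(sample_dna, ECORI_SITE, ECORI_OFFSET))   # [5, 19]
print(patterns_differ(sample_dna, control_dna, ECORI_SITE, ECORI_OFFSET))
```

Loss of a site merges two control fragments into one longer sample fragment, which is the difference that would be observed by gel electrophoresis.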

[0360] In other embodiments, genetic mutations in 26443 or 46873 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 26443 or a 46873 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 26443 or 46873 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 26443 or 46873 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
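By way of illustration only, the first scanning step with sequential overlapping probes can be sketched as follows. The sketch models hybridization as an exact match between a probe and the sample at the same position, writes probes in the sense of the target rather than as complements, and assumes sample and control of equal length that differ by a single substitution. These are simplifying assumptions, and the sequences are hypothetical.

```python
def tiling_probes(reference, length, step=1):
    """Sequential overlapping probes as (start, probe_sequence) pairs."""
    return [(i, reference[i:i + length])
            for i in range(0, len(reference) - length + 1, step)]


def failing_probes(reference, sample, length):
    """Start positions of reference probes that do not match the sample."""
    return [start for start, probe in tiling_probes(reference, length)
            if sample[start:start + length] != probe]


def localize_change(reference, sample, length):
    """Interval (first, last) covered by every failing probe, or None.

    For a single substitution away from the sequence ends, the interval
    narrows to the changed position itself.
    """
    fails = failing_probes(reference, sample, length)
    if not fails:
        return None
    return (max(fails), min(fails) + length - 1)


# Hypothetical control sequence and a sample with an A-to-C change at index 7.
control_seq = "ATGGCCAAGTTCGA"
sample_seq = "ATGGCCACGTTCGA"

print(failing_probes(control_seq, sample_seq, 4))   # [4, 5, 6, 7]
print(localize_change(control_seq, sample_seq, 4))  # (7, 7)
```

Each base is covered by several overlapping probes, so a single change produces a run of failing probes whose common position identifies the changed base.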

[0361] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 26443 or 46873 gene and detect mutations by comparing the sequence of the sample 26443 or 46873 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[0362] Other methods for detecting mutations in the 26443 or 46873 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[0363] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 26443 or 46873 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[0364] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 26443 or 46873 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA: 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 26443 or 46873 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to the sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[0365] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[0366] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[0367] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
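By way of illustration only, the placement of the mutation of interest at the extreme 3′ end of a primer can be sketched as follows. The sketch writes forward primers in the sense of the strand they are identical to, and idealizes extension as occurring only on a perfect match that includes the 3′ terminal base; real reactions depend on conditions, as noted above. The sequences and the function names are hypothetical.

```python
def allele_specific_primers(wild_type, position, alt_base, length):
    """Return (wild_type_primer, mutant_primer).

    Both primers end exactly at `position`, so the allele-discriminating
    base is the 3' terminal base of each primer.
    """
    start = position - length + 1
    if start < 0:
        raise ValueError("position is too close to the 5' end for this length")
    shared_prefix = wild_type[start:position]
    return shared_prefix + wild_type[position], shared_prefix + alt_base


def is_extended(primer, sample, position):
    """Idealized model: extension only when the primer matches perfectly."""
    start = position - len(primer) + 1
    return start >= 0 and sample[start:position + 1] == primer


# Hypothetical wild-type sequence and a sample carrying A-to-C at index 7.
wild_type_seq = "ATGGCCAAGTTCGA"
mutant_sample = "ATGGCCACGTTCGA"

wt_primer, mut_primer = allele_specific_primers(wild_type_seq, 7, "C", 6)
print(wt_primer, mut_primer)
print(is_extended(wt_primer, mutant_sample, 7))   # False
print(is_extended(mut_primer, mutant_sample, 7))  # True
```

Running the two primers in parallel reactions therefore distinguishes the alleles by the presence or absence of an amplification product.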

[0368] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 26443 or 46873 nucleic acid.

[0369] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:1, 3, 4 or 6 or the complement of SEQ ID NO:1, 3, 4 or 6. Different locations can be different but overlapping or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[0370] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 26443 or 46873. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a bi-allelic or polymorphic locus.

[0371] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
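By way of illustration only, a set of four oligonucleotides differing only at the interrogation position can be sketched as follows. The sketch writes the oligonucleotides in the sense of the target rather than as complements and treats hybridization as an exact match; both are simplifying assumptions. The sequences and function names are hypothetical.

```python
def interrogation_set(target, position, flank):
    """Four oligonucleotides identical except at the interrogation position."""
    if position - flank < 0 or position + flank >= len(target):
        raise ValueError("flank extends beyond the target sequence")
    left = target[position - flank:position]
    right = target[position + 1:position + flank + 1]
    return {base: left + base + right for base in "ACGT"}


def call_base(oligo_set, sample, position, flank):
    """Return the base whose oligonucleotide matches the sample, or None."""
    window = sample[position - flank:position + flank + 1]
    for base, oligo in oligo_set.items():
        if oligo == window:
            return base
    return None


# Hypothetical target with the interrogation position at index 7.
target_seq = "ATGGCCAAGTTCGA"
variant_sample = "ATGGCCACGTTCGA"

oligos = interrogation_set(target_seq, 7, 3)
print(oligos)
print(call_base(oligos, target_seq, 7, 3))      # A
print(call_base(oligos, variant_sample, 7, 3))  # C
```

With differential labels on the four oligonucleotides, the label that gives a signal would indicate which nucleotide is present at the interrogation position.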

[0372] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 26443 or 46873 nucleic acid.

[0373] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 26443 or 46873 gene.

[0374] Use of 26443 or 46873 Molecules as Surrogate Markers

[0375] The 26443 or 46873 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 26443 or 46873 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 26443 or 46873 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker that correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[0376] The 26443 or 46873 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker that correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 26443 or 46873 marker) transcription or expression, the amplified marker may be in a quantity that is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-26443 or -46873 antibodies may be employed in an immune-based detection system for a 26443 or 46873 protein marker, or 26443- or 46873-specific radiolabeled probes may be used to detect a 26443 or 46873 mRNA marker. 
Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[0377] The 26443 or 46873 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 26443 or 46873 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 26443 or 46873 DNA may correlate with 26443 or 46873 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[0378] Pharmaceutical Compositions of 26443 and 46873

[0379] The nucleic acids and polypeptides, fragments thereof, as well as anti-26443 or -46873 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein, the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[0380] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[0381] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including an agent in the composition that delays absorption, for example, aluminum monostearate and gelatin.

[0382] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[0383] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[0384] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[0385] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[0386] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[0387] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[0388] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[0389] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
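
The ratio described above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and are not taken from the specification; the only assumption is that LD50 and ED50 are expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg/kg). A higher value indicates a wider margin
    between the effective dose and the lethal dose.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values for illustration only (mg/kg).
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
```

In this sketch a compound with a larger returned value would be preferred, consistent with the preference for high therapeutic indices stated above.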

[0390] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
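
One way an IC50 might be read off cell culture data is sketched below in Python. This is an illustrative assumption, not a method stated in the specification: inhibition is taken to be normalized so that the maximal effect is 1.0, the tested concentrations are positive and listed in ascending order, and the IC50 is estimated by linear interpolation on a log10 concentration scale between the two points that bracket 50% inhibition. The function name and data are hypothetical.

```python
import math
from typing import Sequence


def estimate_ic50(concentrations: Sequence[float],
                  inhibitions: Sequence[float]) -> float:
    """Estimate IC50 by interpolating on log10(concentration).

    concentrations: positive values in ascending order.
    inhibitions: fractional inhibition (0.0 to 1.0) at each concentration,
        assumed to be normalized to the maximal observed effect.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        # Find the first adjacent pair that brackets 50% inhibition.
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")
```

In practice a fitted dose-response curve would often be used instead of interpolation between two points; the sketch only shows what the half-maximal concentration means numerically.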

[0391] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
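
The weight-based arithmetic in the preceding paragraph can be sketched in Python as follows. The function name is hypothetical, and the bounds simply mirror the broadest ranges recited above (about 0.001 to 30 mg/kg, once per week for about 1 to 10 weeks); because the text says "about", the hard limits in the sketch are a simplification for illustration and are not clinical guidance.

```python
from typing import List


def weekly_dose_schedule(body_weight_kg: float,
                         dose_mg_per_kg: float,
                         weeks: int) -> List[float]:
    """Return the total dose in mg for each weekly administration.

    The range checks mirror the broadest ranges recited in the text;
    they are illustrative simplifications only.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0.001 <= dose_mg_per_kg <= 30:
        raise ValueError("dose outside the recited 0.001 to 30 mg/kg range")
    if not 1 <= weeks <= 10:
        raise ValueError("duration outside the recited 1 to 10 week range")
    per_administration_mg = body_weight_kg * dose_mg_per_kg
    return [per_administration_mg] * weeks


# Hypothetical subject: 70 kg, 5 mg/kg, once per week for 4 weeks.
schedule = weekly_dose_schedule(70.0, 5.0, 4)
```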

[0392] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[0393] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[0394] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
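
The "low dose first, then increase until an appropriate response is obtained" approach described above can be sketched in Python. Everything in the sketch is a hypothetical illustration: the function name, the multiplicative step, and the `response_ok` callback (standing in for whatever assessment the physician, veterinarian, or researcher applies) are assumptions and are not part of the specification.

```python
from typing import Callable, List


def escalate_dose(start_ug_per_kg: float,
                  max_ug_per_kg: float,
                  step_factor: float,
                  response_ok: Callable[[float], bool]) -> List[float]:
    """Return the doses tried (in micrograms per kilogram), lowest first.

    Escalation stops at the first dose for which response_ok returns True,
    or when the next dose would exceed max_ug_per_kg.
    """
    if start_ug_per_kg <= 0 or max_ug_per_kg < start_ug_per_kg:
        raise ValueError("invalid dose bounds")
    if step_factor <= 1:
        raise ValueError("step_factor must be greater than 1")
    tried: List[float] = []
    dose = start_ug_per_kg
    while dose <= max_ug_per_kg:
        tried.append(dose)
        if response_ok(dose):  # hypothetical assessment of the response
            break
        dose *= step_factor
    return tried


# Illustration: start at 1 microgram/kg, cap at 500 mg/kg (500,000 micrograms/kg),
# and pretend an adequate response first appears at 100 micrograms/kg.
doses_tried = escalate_dose(1.0, 500_000.0, 10.0, lambda d: d >= 100.0)
```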

[0395] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP, cisplatin)), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[0396] The conjugates of the invention can be used for modifying a given biological response, although the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, alpha-interferon, beta-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[0397] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate, as described by Segal in U.S. Pat. No. 4,676,980.

[0398] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[0399] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[0400] Methods of Treatment for 26443 and 46873

[0401] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 26443 or 46873 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[0402] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 26443 or 46873 molecules of the present invention or 26443 or 46873 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[0403] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 26443 or 46873 expression or activity, by administering to the subject a 26443 or 46873 or an agent which modulates 26443 or 46873 expression or at least one 26443 or 46873 activity. Subjects at risk for a disease that is caused or contributed to by aberrant or unwanted 26443 or 46873 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 26443 or 46873 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 26443 or 46873 aberrance, for example, a 26443 or 46873, 26443 or 46873 agonist or 26443 or 46873 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[0404] It is possible that some 26443 or 46873 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[0405] As discussed, successful treatment of 26443 or 46873 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 26443 or 46873 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[0406] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[0407] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[0408] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 26443 or 46873 expression is through the use of aptamer molecules specific for 26443 or 46873 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. Curr. Opin. Chem Biol. 1997, 1(1): 5-9; and Patel, D. J. Curr Opin Chem Biol 1997 June; 1(1):32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 26443 or 46873 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[0409] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 26443 or 46873 disorders. For a description of antibodies, see the Antibody section above.

[0410] In circumstances wherein injection of an animal or a human subject with a 26443 or 46873 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 26443 or 46873 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. Ann Med 1999;31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. Cancer Treat Res 1998;94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 26443 or 46873 protein. Vaccines directed to a disease characterized by 26443 or 46873 expression may also be generated in this fashion.

[0411] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al., 1993, Proc. Natl. Acad. Sci. USA 90:7889-7893).

[0412] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 26443 or 46873 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[0413] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 26443 or 46873 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 26443 or 46873 can be readily monitored and used in calculations of IC50.

[0414] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[0415] Another aspect of the invention pertains to methods of modulating 26443 or 46873 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 26443 or 46873 molecule or an agent that modulates one or more of the activities of 26443 or 46873 protein associated with the cell. An agent that modulates 26443 or 46873 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 26443 or 46873 protein (e.g., a 26443 or 46873 substrate or receptor), a 26443 or 46873 antibody, a 26443 or 46873 agonist or antagonist, a peptidomimetic of a 26443 or 46873 agonist or antagonist, or other small molecule.

[0416] In one embodiment, the agent stimulates one or more 26443 or 46873 activities. Examples of such stimulatory agents include active 26443 or 46873 protein and a nucleic acid molecule encoding 26443 or 46873. In another embodiment, the agent inhibits one or more 26443 or 46873 activities. Examples of such inhibitory agents include antisense 26443 or 46873 nucleic acid molecules, anti-26443 or 46873 antibodies, and 26443 or 46873 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 26443 or 46873 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) 26443 or 46873 expression or activity. In another embodiment, the method involves administering a 26443 or 46873 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 26443 or 46873 expression or activity.

[0417] Stimulation of 26443 or 46873 activity is desirable in situations in which 26443 or 46873 is abnormally downregulated and/or in which increased 26443 or 46873 activity is likely to have a beneficial effect. For example, stimulation of 26443 or 46873 activity is desirable in situations in which a 26443 or 46873 is downregulated and/or in which increased 26443 or 46873 activity is likely to have a beneficial effect. Likewise, inhibition of 26443 or 46873 activity is desirable in situations in which 26443 or 46873 is abnormally upregulated and/or in which decreased 26443 or 46873 activity is likely to have a beneficial effect.

[0418] Examples of other disorders which can be treated include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[0419] In addition, aberrant activity of a 26443 or 46873 polypeptide may adversely affect a muscle cell. Examples of disorders involving, for example, heart muscle, or “cardiovascular disorders”, include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart and/or coronary blood vessels. A cardiovascular disorder can be caused by a malfunction of the heart, an imbalance in arterial pressure or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include arrhythmias, myocardial infarction, hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease and cardiomyopathies. Additionally, skeletal muscle cells may be affected by aberrant activity of a 26443 or 46873 polypeptide. For instance, symptoms of a skeletal muscular disorder may include aching muscles, muscle cramps or muscle degeneracy.

[0420] Examples of liver disorders include, but are not limited to, disorders associated with an accumulation of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers; hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic); hepatic injury, such as portal hypertension or hepatic fibrosis; liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, e.g., A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome); liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder, such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[0421] Additionally, 26443 or 46873 may play an important role in overall metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[0422] Moreover, a 26443 or 46873 protein may regulate cellular amino acid levels (e.g., asparagine, aspartic acid). A defect or deficiency in a 26443 or 46873 polypeptide, therefore, may result in inappropriate levels of, e.g., asparagine and/or aspartic acid, thereby causing a variety of disorders, for example, neurological disorders. Examples of neural disorders include, but are not limited to, neurodegenerative disorders, e.g., Alzheimer's disease, dementias related to Alzheimer's disease (such as Pick's disease), Parkinson's and other Lewy diffuse body diseases, multiple sclerosis, amyotrophic lateral sclerosis, progressive supranuclear palsy, epilepsy, and Creutzfeldt-Jakob disease; psychiatric disorders, e.g., depression, schizophrenic disorders, Korsakoff's psychosis, mania, anxiety disorders, or phobic disorders; learning or memory disorders, e.g., amnesia or age-related memory loss; and neurological disorders, e.g., migraine. The ability to regulate or control the expression of a 26443 or 46873 protein may result in the ability to likewise regulate or control levels of amino acids, e.g., asparagine or aspartic acid, thereby providing a protective and/or therapeutic effect against, e.g., neurological disorders.

[0423] Thus, the 26443 or 46873 molecules can act as novel diagnostic targets and therapeutic agents for controlling defects resulting in metabolic deficiencies and/or improper amino acid levels, e.g., asparagine or aspartic acid.

[0424] Aberrant expression and/or activity of 26443 or 46873 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by 26443 or 46873 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 26443 or 46873 molecules may support different activities of bone resorbing osteoclasts, such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 26443 or 46873 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[0425] Additionally, 26443 or 46873 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 26443 or 46873 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of virus-infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 26443 or 46873 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer and lymphomas.

[0426] Additionally, 26443 or 46873 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[0427] 26443 and 46873 Pharmacogenomics

[0428] The 26443 or 46873 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 26443 or 46873 activity (e.g., 26443 or 46873 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 26443- or 46873-associated disorders (e.g., metabolic disorders or defects associated with fatty acid oxidation) associated with aberrant or unwanted 26443 or 46873 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 26443 or 46873 molecule or 26443 or 46873 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 26443 or 46873 molecule or 26443 or 46873 modulator.

[0429] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11): 983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as a single factor altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[0430] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
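The grouping of individuals into genetic categories by SNP pattern can be illustrated with a minimal Python sketch. The function name, subject identifiers, marker names, and allele calls below are hypothetical and serve only as an illustration; a practical analysis would use far more markers and an appropriate statistical test of association with drug response.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group subject IDs by their alleles at the selected bi-allelic markers."""
    groups = defaultdict(list)
    for subject_id, alleles in genotypes.items():
        # The pattern is the ordered tuple of alleles at the chosen markers.
        pattern = tuple(alleles[marker] for marker in marker_ids)
        groups[pattern].append(subject_id)
    return dict(groups)


# Hypothetical genotype calls: subject -> {marker -> allele}.
genotypes = {
    "subject_1": {"snp_a": "A", "snp_b": "G"},
    "subject_2": {"snp_a": "A", "snp_b": "G"},
    "subject_3": {"snp_a": "T", "snp_b": "G"},
}

groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, subjects in groups.items():
    print(pattern, subjects)
```

Each resulting group holds subjects sharing the same alleles at every selected marker; an observed drug response could then be compared between groups.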

[0431] Alternatively, a method termed the “candidate gene approach”, can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 26443 or 46873 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[0432] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 26443 or 46873 molecule or 26443 or 46873 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[0433] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 26443 or 46873 molecule or 26443 or 46873 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[0434] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 26443 or 46873 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 26443 or 46873 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[0435] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 26443 or 46873 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 26443 or 46873 gene expression, protein levels, or upregulate 26443 or 46873 activity, can be monitored in clinical trials of subjects exhibiting decreased 26443 or 46873 gene expression, protein levels, or downregulated 26443 or 46873 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 26443 or 46873 gene expression, protein levels, or downregulate 26443 or 46873 activity, can be monitored in clinical trials of subjects exhibiting increased 26443 or 46873 gene expression, protein levels, or upregulated 26443 or 46873 activity. In such clinical trials, the expression or activity of a 26443 or 46873 gene, and preferably, other genes that have been implicated in, for example, a 26443- or 46873-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[0436] 26443 or 46873 Informatics

[0437] The sequence of a 26443 or 46873 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 26443 or 46873 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 26443 or 46873 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[0438] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[0439] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[0440] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
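The two-table layout described above can be sketched as follows, using Python's built-in sqlite3 module purely as a stand-in for a relational database such as Sybase or Oracle. The table names, column names, identifier, residues, and annotation text are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
import sqlite3

# In-memory database used as a stand-in for a production relational database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequence (
        residues TEXT NOT NULL,      -- first column: the sequence itself
        seq_id   TEXT PRIMARY KEY    -- second column: the sequence identifier
    );
    CREATE TABLE annotation (
        seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
        descriptor TEXT NOT NULL,    -- annotation text
        start_pos  INTEGER NOT NULL, -- initial position (1-based, assumed)
        end_pos    INTEGER NOT NULL  -- ultimate position (inclusive, assumed)
    );
    """
)

# Hypothetical example rows; the residues are not a sequence of the invention.
conn.execute(
    "INSERT INTO sequence (residues, seq_id) VALUES (?, ?)",
    ("ATGGCCAAGTGA", "example_seq"),
)
conn.execute(
    "INSERT INTO annotation (seq_id, descriptor, start_pos, end_pos) "
    "VALUES (?, ?, ?, ?)",
    ("example_seq", "hypothetical start codon", 1, 3),
)


def annotated_regions(connection, seq_id):
    """Return (descriptor, annotated residues) pairs for one stored sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


regions = annotated_regions(conn, "example_seq")
print(regions)
```

Keeping annotations in a separate table keyed by the sequence identifier allows any number of annotations per sequence without altering the sequence table.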

[0441] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention that match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
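As a minimal sketch of such a search, the following Python function scans stored sequences for exact occurrences of a target motif. It is a plain substring scan, not BLAST, and the function name, identifiers, and residues are hypothetical placeholders.

```python
def find_motif(stored_sequences, target):
    """Return {identifier: [0-based start positions]} for exact target matches."""
    hits = {}
    for seq_id, residues in stored_sequences.items():
        positions = []
        start = residues.find(target)
        while start != -1:
            positions.append(start)
            # Advance by one residue so that overlapping matches are reported.
            start = residues.find(target, start + 1)
        if positions:
            hits[seq_id] = positions
    return hits


# Hypothetical stored sequences keyed by identifier.
stored_sequences = {
    "seq_one": "ATGGCCATGAAA",
    "seq_two": "CCCGGGTTT",
}

hits = find_motif(stored_sequences, "ATG")
print(hits)
```

An exact scan of this kind finds only perfect matches; a similarity search such as BLAST would additionally report approximate matches.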

[0442] Thus, in one aspect, the invention features a method of analyzing 26443 or 46873, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 26443 or 46873 nucleic acid or amino acid sequence; and comparing the 26443 or 46873 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26443 or 46873. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[0443] The method can include evaluating the sequence identity between a 26443 or 46873 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
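One simple way to evaluate sequence identity can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts a column with a gap in only one sequence as a mismatch; other identity conventions exist, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared alignment columns in which both residues are equal."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # a column with gaps in both sequences is not compared
        compared += 1
        if res_a == res_b:
            matches += 1  # a gap paired with a residue never matches
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pre-aligned sequences: four of the six columns are identical.
identity = percent_identity("ATGC-A", "ATGGTA")
print(round(identity, 2))
```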

[0444] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
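The relationship between target length and chance occurrence can be illustrated with a short Python calculation. It assumes, as a simplification, that database residues are independent and uniformly distributed, which real sequences are not; the function name and the database size are hypothetical.

```python
def expected_random_hits(target_length, alphabet_size, database_length):
    """Expected number of chance exact matches of one specific target.

    Assumes independent, uniformly distributed residues (a simplification).
    """
    # Number of positions at which a target of this length could start.
    windows = max(database_length - target_length + 1, 0)
    return windows / alphabet_size ** target_length


# Hypothetical database of one million nucleotides (alphabet of 4 bases).
short_target = expected_random_hits(6, 4, 1_000_000)
long_target = expected_random_hits(30, 4, 1_000_000)
print(short_target, long_target)
```

Under these assumptions a six-nucleotide target is expected to occur by chance a few hundred times in such a database, whereas a thirty-nucleotide target is expected far less than once.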

[0445] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[0446] Thus, the invention features a method of making a computer readable record of a 26443 or 46873 sequence that includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[0447] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 26443 or 46873 sequence, or record, in machine-readable form; comparing a second sequence to the 26443 or 46873 sequence; thereby analyzing a sequence. Comparison can include comparing the two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 26443 or 46873 sequence includes a sequence being compared. In a preferred embodiment the 26443 or 46873 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 26443 or 46873 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[0448] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, wherein the method comprises the steps of determining 26443 or 46873 sequence information associated with the subject and based on the 26443 or 46873 sequence information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[0449] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a disease associated with a 26443 or 46873 wherein the method comprises the steps of determining 26443 or 46873 sequence information associated with the subject, and based on the 26443 or 46873 sequence information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 26443 or 46873 sequence of the subject to the 26443 or 46873 sequences in the database to thereby determine whether the subject has a 26443- or 46873-associated disease or disorder, or a pre-disposition for such.

[0450] The present invention also provides in a network, a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, said method comprising the steps of receiving 26443 or 46873 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 26443 or 46873 and/or corresponding to a 26443- or 46873-associated disease or disorder (e.g., a disorder associated with aberrant or unwanted 26443 or 46873 expression or activity), and based on one or more of the phenotypic information, the 26443 or 46873 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0451] The present invention also provides a method for determining whether a subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder, said method comprising the steps of receiving information related to 26443 or 46873 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 26443 or 46873 and/or related to a 26443- or 46873-associated disease or disorder, and based on one or more of the phenotypic information, the 26443 or 46873 information, and the acquired information, determining whether the subject has a 26443- or 46873-associated disease or disorder or a pre-disposition to a 26443- or 46873-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0452] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 61833 Invention

[0453] Amino acid decarboxylases are vitamin-B6-dependent enzymes (B6 enzymes). Most amino acid decarboxylases use pyridoxal-5′-phosphate (pyridoxal-P) as a coenzyme. In mammals, amino acid decarboxylases are responsible for the biosynthesis of biogenic amines and polyamines. In bacteria, constitutive amino acid decarboxylases appear to fulfill similar biosynthetic functions, whereas the decarboxylation of amino acids by inducible biodegradative enzymes has been proposed to contribute to the regulation of pH both inside the cell and in its environment (Boeker and Snell, The Enzymes, 3rd edition, 6:217-253, 1972).

[0454] Decarboxylases can be subdivided into four groups that do not appear to be evolutionarily related. Group I includes glycine decarboxylase, which is part of a multienzyme complex. Group II contains glutamate, histidine, tyrosine and aromatic-L-amino acid decarboxylases. Group III includes prokaryotic ornithine, lysine, and arginine decarboxylases, while Group IV contains eukaryotic ornithine and arginine decarboxylase, and prokaryotic arginine decarboxylase and diaminopimelate decarboxylase.

[0455] The Group IV pyridoxal-dependent decarboxylases are particularly important due to their role in the synthesis of the polyamines putrescine, spermidine, and spermine. Within eukaryotic cells, the levels of polyamines have been positively correlated with cell cycle progression, and inhibition of polyamine synthesis causes cell growth arrest and cell death (Pegg (1986), Biochem J 234: 249-62; Tabor and Tabor (1984), Ann Rev Biochem 53: 749-90; Stefanelli et al. (2001), Biochem J 355:199-206). In addition, polyamines have been shown to decrease platelet aggregation and modulate the activity of voltage-gated sodium channels (de la Pena et al. (2000), Arch Med Res 31(6): 546-550; Huang and Moczydlowski (2001), Biophys J 80(3): 1262-79). Although putrescine, spermidine, and spermine appear to have slightly different activities, they are related by a common synthetic pathway. Putrescine can be converted into spermidine via the action of the aminopropyltransferase spermidine synthase, and a second aminopropyltransferase, spermine synthase, gives rise to spermine through the addition of another propylamine group to spermidine.

[0456] Eukaryotic ornithine decarboxylase plays a role in the biosynthesis of polyamines through its production of putrescine from the decarboxylation of ornithine. Consistent with the positive correlation between polyamine levels and cell growth, recent studies have shown that ornithine decarboxylase is upregulated in benign prostatic hyperplasia (BPH) samples taken from human prostates (Liu et al. (2000), Prostate 43(2): 83-7). Similarly, male Wistar rats that have been treated with 1,2-dimethylhydrazine (DMH), thereby leading to the formation of colonic premalignant aberrant crypt foci, and given an oral dose of ornithine produce higher levels of blood putrescine than control rats that have not been treated with DMH (Schleiffer et al. (2000), Cancer Detect Prev 24(6): 542-8). These studies indicate a correlation between increased ornithine decarboxylase expression and activity and the development of a hyperplastic cellular phenotype.

Summary of the 61833 Invention

[0457] The present invention is based, in part, on the discovery of a novel pyridoxal-dependent decarboxylase family member, referred to herein as “61833”. The nucleotide sequence of a cDNA encoding 61833 is shown in SEQ ID NO: 10, and the amino acid sequence of a 61833 polypeptide is shown in SEQ ID NO: 11 (see also Example 5). In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO: 12.

[0458] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 61833 protein or polypeptide, e.g., a biologically active portion of the 61833 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO: 11. In other embodiments, the invention provides isolated 61833 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:10, SEQ ID NO:12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 61833 protein or an active fragment thereof.

[0459] In a related aspect, the invention further provides nucleic acid constructs that include a 61833 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 61833 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 61833 nucleic acid molecules and polypeptides.

[0460] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 61833-encoding nucleic acids.

[0461] In still another related aspect, isolated nucleic acid molecules that are antisense to a 61833 encoding nucleic acid molecule are provided.

[0462] In another aspect, the invention features 61833 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 61833-mediated or -related disorders. In another embodiment, the invention provides 61833 polypeptides having a 61833 activity. Preferred polypeptides are 61833 proteins including at least one decarboxylase domain, e.g., a pyridoxal-dependent decarboxylase domain, e.g., an ornithine decarboxylase, and, preferably, having a 61833 activity, e.g., a 61833 activity as described herein.

[0463] In other embodiments, the invention provides 61833 polypeptides, e.g., a 61833 polypeptide having the amino acid sequence shown in SEQ ID NO: 11 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO: 11 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 10, SEQ ID NO: 12, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 61833 protein or an active fragment thereof.

[0464] In a related aspect, the invention provides 61833 polypeptides or fragments operatively linked to non-61833 polypeptides to form fusion proteins.

[0465] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 61833 polypeptides.

[0466] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 61833 polypeptides or nucleic acids.

[0467] In still another aspect, the invention provides a process for modulating 61833 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 61833 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation. The screened compounds can modulate the decarboxylase activity of a 61833 polypeptide.

[0468] The invention also provides assays for determining the activity of or the presence or absence of 61833 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[0469] In one aspect, the invention provides a method of evaluating a sample. The method includes: providing a sample; detecting a 61833 polypeptide or nucleic acid in the sample; and, optionally, comparing the level of expressed 61833 molecules to a reference sample. In one embodiment, an increased level of 61833 molecules is an indication that the sample includes cells in mitosis. In another embodiment, the level of 61833 molecules is an indication that a sample includes a proliferating cell, e.g., a proliferating colon, liver, lung, breast, or ovary cell.

[0470] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 61833-expressing cell, e.g., a hyper-proliferative 61833-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 61833 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[0471] In a preferred embodiment, the compound is an inhibitor of a 61833 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 61833 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[0472] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[0473] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 61833-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 61833 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[0474] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 61833 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 61833 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 61833 nucleic acid or polypeptide expression can be detected by any method described herein.

[0475] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 61833 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.
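By way of illustration only, the before-and-after comparison described above can be sketched as a fold-change calculation. The function names and the two-fold cut-off below are assumptions made for this sketch; the specification does not prescribe a particular statistic or threshold.

```python
import math

def log2_fold_change(before: float, after: float) -> float:
    """Log2 ratio of the post-treatment level to the pre-treatment level."""
    if before <= 0 or after <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(after / before)

def classify_change(before: float, after: float, min_abs_log2: float = 1.0) -> str:
    """Label the change; min_abs_log2=1.0 (two-fold) is an illustrative choice."""
    lfc = log2_fold_change(before, after)
    if lfc >= min_abs_log2:
        return "increase"
    if lfc <= -min_abs_log2:
        return "decrease"
    return "no clear change"

# Hypothetical measurements in arbitrary units.
example_label = classify_change(before=100.0, after=25.0)
```

In practice the cut-off would be chosen with regard to the assay's variability and replicate measurements.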

[0476] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent); and evaluating the expression of 61833 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 61833 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 61833 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[0477] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 61833 polypeptide or nucleic acid molecule, including for disease diagnosis.

[0478] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 61833 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 61833 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 61833 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
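The array described above can be modeled, purely as a sketch, as a mapping from positionally distinct addresses to unique capture probes. The class names, the (row, column) addressing scheme, and the probe labels are hypothetical and are used here only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CaptureProbe:
    kind: str    # e.g., "nucleic acid" or "polypeptide"
    target: str  # molecule the probe recognizes

class ProbeArray:
    """Two-dimensional array in which each address holds one unique probe."""

    def __init__(self, rows: int, cols: int):
        self.rows, self.cols = rows, cols
        self._addresses = {}  # maps (row, col) to a CaptureProbe

    def place(self, row: int, col: int, probe: CaptureProbe) -> None:
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("address outside the array")
        if (row, col) in self._addresses:
            raise ValueError("address already occupied")
        if probe in self._addresses.values():
            raise ValueError("probe already present at another address")
        self._addresses[(row, col)] = probe

    def addresses_for(self, target: str):
        """Addresses whose capture probe recognizes the given target."""
        return sorted(a for a, p in self._addresses.items() if p.target == target)

array = ProbeArray(rows=2, cols=2)
array.place(0, 0, CaptureProbe("nucleic acid", "61833"))
array.place(0, 1, CaptureProbe("polypeptide", "61833"))
array.place(1, 0, CaptureProbe("nucleic acid", "control"))
```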

[0479] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 61833

[0480] The human 61833 sequence (see SEQ ID NO: 10, as recited in Example 5), which is approximately 1937 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1383 nucleotides, including the termination codon. The coding sequence encodes a 460 amino acid protein (see SEQ ID NO: 11, as recited in Example 5).
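As a consistency check on the figures above, the coding-sequence length can be recomputed from the protein length. The sketch below assumes a single three-nucleotide termination codon; the variable and function names are illustrative and do not appear in the specification.

```python
# Figures quoted in paragraph [0480].
TRANSCRIPT_LENGTH_NT = 1937  # approximate length, including untranslated regions
CODING_LENGTH_NT = 1383      # coding sequence, including the termination codon
PROTEIN_LENGTH_AA = 460      # length of the encoded protein (SEQ ID NO: 11)

def coding_length(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein, optionally plus one stop codon."""
    return 3 * protein_length_aa + (3 if include_stop else 0)

expected_coding_nt = coding_length(PROTEIN_LENGTH_AA)
# Combined length of the 5' and 3' untranslated regions.
untranslated_nt = TRANSCRIPT_LENGTH_NT - CODING_LENGTH_NT
```

Here 3 × 460 + 3 = 1383, which agrees with the stated coding length and leaves about 554 nucleotides of untranslated sequence.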

[0481] Human 61833 contains the following regions or other structural features:

[0482] one pyridoxyl-dependent decarboxylase domain (PFAM Accession Number PF00278) located at about amino acid residues 41 to 401 of SEQ ID NO: 11;

[0483] one decarboxylase family 2 pyridoxal phosphate attachment site (PS00878) located at about amino acid residues 67 to 85 of SEQ ID NO:11;

[0484] one decarboxylase family 2 signature motif (PS00879) located at about amino acid residues 230 to 241 of SEQ ID NO: 11;

[0485] six predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 18 to 20, 149 to 151, 168 to 170, 174 to 176, 177 to 179, and 311 to 313 of SEQ ID NO: 11;

[0486] five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 6 to 9, 18 to 21, 34 to 37, 212 to 215, and 258 to 261 of SEQ ID NO:11;

[0487] one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acids 343 to 346 of SEQ ID NO: 11;

[0488] one predicted tyrosine kinase phosphorylation site (PS00007) located at about amino acid residues 344 to 352;

[0489] and seven predicted N-myristylation sites (PS00008) from about amino acids 29 to 34, 85 to 90, 103 to 108, 223 to 228, 237 to 242, 325 to 330, and 400 to 405 of SEQ ID NO:11.
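The features listed above can be collected, as a sketch, into a small table for computing span lengths and finding features that cover a given residue. Coordinates are copied from paragraphs [0482] to [0489]; the class, helper names, and shortened feature labels are assumptions made for this illustration.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1

def _sites(name, spans):
    """Build one Feature per (start, end) pair."""
    return [Feature(name, start, end) for start, end in spans]

FEATURES = (
    _sites("decarboxylase domain (PF00278)", [(41, 401)])
    + _sites("pyridoxal phosphate attachment site (PS00878)", [(67, 85)])
    + _sites("family 2 signature motif (PS00879)", [(230, 241)])
    + _sites("protein kinase C site (PS00005)",
             [(18, 20), (149, 151), (168, 170), (174, 176), (177, 179), (311, 313)])
    + _sites("casein kinase II site (PS00006)",
             [(6, 9), (18, 21), (34, 37), (212, 215), (258, 261)])
    + _sites("cAMP/cGMP-dependent kinase site (PS00004)", [(343, 346)])
    + _sites("tyrosine kinase site (PS00007)", [(344, 352)])
    + _sites("N-myristylation site (PS00008)",
             [(29, 34), (85, 90), (103, 108), (223, 228), (237, 242), (325, 330), (400, 405)])
)

def features_at(residue: int) -> list:
    """Names of all listed features whose span covers the residue."""
    return [f.name for f in FEATURES if f.start <= residue <= f.end]
```

For example, `features_at(344)` reports the decarboxylase domain together with the overlapping cAMP/cGMP-dependent kinase and tyrosine kinase sites.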

[0490] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0491] A plasmid containing the nucleotide sequence encoding human 61833 (clone “Fbh61833FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[0492] The 61833 protein contains a significant number of structural characteristics in common with members of the pyridoxyl-dependent decarboxylase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0493] A pyridoxyl-dependent decarboxylase family of proteins can be subdivided into four subgroups: group I includes glycine decarboxylases; group II includes glutamate, histidine, tyrosine, and aromatic-L-amino acid decarboxylases; group III includes prokaryotic ornithine and lysine decarboxylases as well as the prokaryotic biodegradative type of decarboxylases; and group IV includes eukaryotic ornithine and arginine decarboxylases as well as the prokaryotic biosynthetic type of arginine decarboxylases. These four subgroups have been described in, e.g., Sandmeier et al. (1994), Eur J Biochem 221(3): 997-1002, the contents of which are incorporated herein by reference. 61833 polypeptides have structural characteristics in common with the group IV pyridoxyl-dependent decarboxylases.

[0494] A 61833 polypeptide can include a “pyridoxyl-dependent decarboxylase domain” or regions homologous with a “pyridoxyl-dependent decarboxylase domain”.

[0495] As used herein, the term “pyridoxyl-dependent decarboxylase domain” includes an amino acid sequence of about 250 to 500 amino acid residues in length and having a bit score for the alignment of the sequence to the pyridoxyl-dependent decarboxylase domain profile (Pfam HMM) of at least 215. Preferably, a pyridoxyl-dependent decarboxylase domain includes at least about 300 to 450 amino acids, more preferably about 325 to 425 amino acid residues, or about 350 to 400 amino acids and has a bit score for the alignment of the sequence to the pyridoxyl-dependent decarboxylase domain (HMM) of at least 250, 300, 350, 375, 400, 425 or greater. The pyridoxyl-dependent decarboxylase domain (HMM) has been assigned the PFAM Accession Number PF00278 (http://genome.wustl.edu/Pfam/.html). An alignment of the pyridoxyl-dependent decarboxylase domain (amino acids 41 to 401 of SEQ ID NO:11) of human 61833 with a consensus amino acid sequence (SEQ ID NO:13) derived from a hidden Markov model is depicted in FIG. 10.
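The length and bit-score criteria of the foregoing definition can be expressed, as a minimal sketch, in a short predicate. The function names are illustrative, the word “about” is treated here as a hard bound for simplicity, and any bit score passed in is a hypothetical input, since the score of the 61833 alignment is not stated in this paragraph.

```python
# Length ranges from paragraph [0495], broadest first.
LENGTH_RANGES = [(250, 500), (300, 450), (325, 425), (350, 400)]
MIN_BIT_SCORE = 215

def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based span."""
    return last - first + 1

def meets_domain_definition(length_aa: int, bit_score: float) -> bool:
    """Broadest definition: about 250 to 500 residues and a bit score of at least 215."""
    low, high = LENGTH_RANGES[0]
    return low <= length_aa <= high and bit_score >= MIN_BIT_SCORE

def narrowest_range(length_aa: int):
    """Most preferred length range containing the length, or None if none does."""
    matching = [r for r in LENGTH_RANGES if r[0] <= length_aa <= r[1]]
    return matching[-1] if matching else None

# The human 61833 domain spans residues 41 to 401 of SEQ ID NO:11.
domain_length = span_length(41, 401)
```

With these ranges, the 361-residue span of human 61833 falls within the most preferred range of about 350 to 400 amino acids.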

[0496] In a preferred embodiment, a 61833 polypeptide or protein has a “pyridoxyl-dependent decarboxylase domain” or a region which includes at least about 250 to 500, more preferably about 325 to 425, or 350 to 400 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “pyridoxyl-dependent decarboxylase domain,” e.g., the pyridoxyl-dependent decarboxylase domain of human 61833 (e.g., residues 41 to 401 of SEQ ID NO:11).
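One simple way to compare a candidate region against the percentage thresholds above is sketched below. It assumes that “homology” is measured as percent identity over a pre-computed pairwise alignment and that gap columns are excluded from the denominator; both are assumptions of this sketch, and the short sequences are toy inputs rather than 61833 sequence.

```python
THRESHOLDS = (50, 60, 70, 80, 90, 95, 99, 100)

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over aligned columns that contain no gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

def highest_threshold_met(identity: float):
    """Largest listed threshold not exceeding the identity, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return met[-1] if met else None

toy_identity = percent_identity("ACDEFG", "ACDEYG")  # five of six columns match
```

Other conventions, such as counting gap columns as mismatches or scoring conservative substitutions, would give different percentages.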

[0497] To identify the presence of a “pyridoxyl-dependent decarboxylase” domain in a 61833 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “pyridoxyl-dependent decarboxylase” domain in the amino acid sequence of human 61833 at about residues 41 to 401 of SEQ ID NO:11 (see FIG. 10).

[0498] In one embodiment, a 61833 protein includes at least one decarboxylase family 2 pyridoxyl phosphate binding site, located at about amino acid residues 67 to 85 of SEQ ID NO: 11. As used herein, the term “decarboxylase family 2 pyridoxyl phosphate binding site” includes a sequence of at least 12 amino acid residues defined by the sequence: (F/Y)-(P/A)-X-K-(S/A/C/V)-(N/H/C/L/F/W)-X-X-X-X-(L/I/V/M/F)-(L/I/V/M/T/A)-X-X-(L/I/V/M/A)-X-X-X-(G/T/E) (SEQ ID NO: 14). A decarboxylase family 2 pyridoxyl phosphate binding site, as defined, can be involved in the binding of pyridoxyl-5′-phosphate, as well as the removal of a carboxyl group from an appropriate substrate, e.g., ornithine or an amino acid, e.g., arginine. More preferably, a decarboxylase family 2 pyridoxyl phosphate binding site includes 15, or even more preferably 19, amino acid residues which encompass the lysine (K) residue of position 4. Decarboxylase family 2 pyridoxyl phosphate binding sites have been described in, e.g., the PROSITE database (www.expasy.org; PS00878), the contents of which are hereby incorporated by reference.
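A sequence pattern written in the notation of SEQ ID NO: 14 can be translated mechanically into a regular expression and used to scan a protein sequence. The sketch below assumes that X denotes any residue and that alternatives are written as (A/B/...); the function names are illustrative, and the toy sequence was constructed to contain one match and is not taken from 61833.

```python
import re

BINDING_SITE_PATTERN = (
    "(F/Y)-(P/A)-X-K-(S/A/C/V)-(N/H/C/L/F/W)-X-X-X-X-(L/I/V/M/F)-"
    "(L/I/V/M/T/A)-X-X-(L/I/V/M/A)-X-X-X-(G/T/E)"
)

def pattern_to_regex(pattern: str) -> str:
    """Translate '(A/B)-X-K' style notation into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        if element == "X":
            parts.append("[A-Z]")  # any residue
        elif element.startswith("(") and element.endswith(")"):
            parts.append("[" + element[1:-1].replace("/", "") + "]")
        else:
            parts.append(element)  # single fixed residue
    return "".join(parts)

def find_sites(sequence: str, pattern: str):
    """Return 1-based inclusive (start, end) spans of all matches, including overlaps."""
    # A lookahead with a capture group lets overlapping matches be reported.
    regex = re.compile("(?=(" + pattern_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]

# Toy sequence: two flanking residues on each side of a constructed 19-residue match.
TOY_SEQUENCE = "MM" + "FPAKSNAAAALLAALAAAG" + "MM"
toy_sites = find_sites(TOY_SEQUENCE, BINDING_SITE_PATTERN)
# Position 4 of the pattern is the lysine, i.e. residue start + 3 of each match.
toy_lysines = [start + 3 for start, _ in toy_sites]
```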

[0499] In one embodiment, a 61833 protein includes at least one decarboxylase family 2 signature sequence, located at about amino acid residues 227 to 241 of SEQ ID NO: 11. As used herein, the term “decarboxylase family 2 signature sequence” includes a sequence of at least 8 amino acid residues defined by the sequence: (G/S)-X-X-(L/I/V/M/S/C/P)-X-X-(L/I/V/M/F)-(D/N/S)-(L/I/V/M/C/A)-G-G-G-(L/I/V/M/F/Y)-(G/S/T/P/C/E/Q) (SEQ ID NO: 15). A decarboxylase family 2 signature sequence, as defined, can be involved in the binding of a substrate molecule, e.g., ornithine or an amino acid, e.g., arginine, or a pyridoxyl-dependent decarboxylase inhibitor, e.g., α-difluoromethylornithine, as well as the removal of a carboxyl group from an appropriate substrate, e.g., ornithine or an amino acid, e.g., arginine. More preferably, a decarboxylase family 2 signature motif includes 11, or even more preferably 14, amino acid residues which encompass the glycine (G) residues of positions 10, 11, and 12. Decarboxylase family 2 signature motifs have been described in, e.g., the PROSITE database (www.expasy.org; PS00879), the contents of which are hereby incorporated by reference.
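The statement that the signature motif has 14 positions with invariant glycines at positions 10, 11, and 12 can be checked directly against the pattern of SEQ ID NO: 15. The following is a sketch with illustrative function and variable names; X is assumed to denote any residue.

```python
SIGNATURE_PATTERN = (
    "(G/S)-X-X-(L/I/V/M/S/C/P)-X-X-(L/I/V/M/F)-(D/N/S)-(L/I/V/M/C/A)-"
    "G-G-G-(L/I/V/M/F/Y)-(G/S/T/P/C/E/Q)"
)

def allowed_residues(pattern: str):
    """Per position, the set of allowed residues, or None where any residue is allowed."""
    positions = []
    for element in pattern.split("-"):
        if element == "X":
            positions.append(None)
        else:
            positions.append(set(element.strip("()").split("/")))
    return positions

positions = allowed_residues(SIGNATURE_PATTERN)

# 1-based positions at which exactly one residue is permitted.
invariant = {
    index: next(iter(residues))
    for index, residues in enumerate(positions, start=1)
    if residues is not None and len(residues) == 1
}
```

Parsed this way, the pattern has 14 positions, and the only invariant positions are the three glycines at positions 10 to 12.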

[0500] A 61833 family member can include at least one pyridoxyl-dependent decarboxylase domain. Furthermore, a 61833 family member can include: at least one decarboxylase family 2 pyridoxyl phosphate binding site that has the capacity to bind pyridoxyl 5′ phosphate; at least one decarboxylase family 2 signature motif that has the capacity to bind to a substrate molecule; at least one, two, three, four, five, preferably six predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, and preferably five predicted casein kinase II phosphorylation sites (PS00006); at least one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004); at least one predicted tyrosine kinase phosphorylation site (PS00007); and at least one, two, three, four, five, six, and preferably seven predicted N-myristylation sites (PS00008).

[0501] As the 61833 polypeptides of the invention may modulate 61833-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 61833-mediated or related disorders, as described below.

[0502] As used herein, a “61833 activity”, “biological activity of 61833” or “functional activity of 61833”, refers to an activity exerted by a 61833 protein, polypeptide or nucleic acid molecule. For example, a 61833 activity can be an activity exerted by 61833 in a physiological milieu on, e.g., a 61833-responsive cell or on a 61833 substrate, e.g., a protein substrate. A 61833 activity can be determined in vivo or in vitro. In one embodiment, a 61833 activity is a direct activity, such as an association with a 61833 target molecule. A “target molecule” or “binding partner” is a molecule with which a 61833 protein binds or interacts in nature.

[0503] In an exemplary embodiment, 61833 is an enzyme that removes carboxyl groups from appropriate substrate molecules, e.g., ornithine or amino acids, e.g., arginine. As used herein, a “pyridoxyl-dependent decarboxylase activity” refers to an activity that catalyzes the removal of a carboxyl group from an appropriate substrate.

[0504] A 61833 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 61833 protein with a 61833 receptor. The features of the 61833 molecules of the present invention can provide similar biological activities as pyridoxyl-dependent decarboxylase family members. For example, the 61833 proteins of the present invention can have one or more of the following activities: (1) pyridoxyl-dependent decarboxylase activity, e.g., for the removal of a carboxyl group from ornithine or an amino acid, e.g., arginine; (2) stimulation of polyamine levels within a cell, e.g., polyamines such as putrescine, spermidine, and spermine; (3) stimulation of cell growth and proliferation; (4) inhibition of cell death; (5) inhibition of platelet aggregation; (6) inhibition of atherosclerotic plaque formation; (7) modulation of voltage-gated sodium channels; or (8) modulation of neuronal behavior. Thus, the 61833 molecules can act as novel diagnostic targets and therapeutic agents for controlling (1) cellular proliferative or differentiative disorders, (2) cardiovascular disorders, and/or (3) disorders of the brain.

[0505] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0506] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0507] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0508] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0509] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0510] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[0511] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[0512] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[0513] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[0514] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[0515] Further examples of cancers or neoplastic conditions, in addition to the ones described above, include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma. Many such neoplastic conditions can progress to a metastatic state, e.g., resulting in tumor cells moving to new locations and forming metastatic tumors. The motility of such cells can depend on extracellular ligands, e.g., a ligand that is synthesized by a 61833 polypeptide.

[0516] 61833 molecules can also be useful as novel diagnostic targets and therapeutic agents for cardiovascular diseases. Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[0517] Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[0518] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and focal cerebral ischemia (infarction from obstruction of local blood supply), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[0519] The 61833 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of other conditions, in addition to the ones described above (see “Methods of Treatment” for additional examples).

[0520] The 61833 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 11 are collectively referred to as “polypeptides or proteins of the invention” or “61833 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “61833 nucleic acids.” 61833 molecules refer to 61833 nucleic acids, polypeptides, and antibodies.

[0521] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0522] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0523] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
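The four conditions listed above can be restated as a small lookup table. The following is a minimal sketch only; the class name, field names, and the helper `get_condition` are hypothetical and are introduced here purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StringencyCondition:
    """One hybridization/wash condition as listed in the text (illustrative container)."""
    hybridization: str
    wash: str
    wash_temp_c: int  # wash temperature in degrees Celsius


# Values transcribed from the paragraph above; keys are hypothetical labels.
STRINGENCY = {
    "low": StringencyCondition(
        "6x SSC at about 45 C",
        "two washes in 0.2x SSC, 0.1% SDS (may be raised to 55 C)",
        50,  # stated as "at least at 50 C"
    ),
    "medium": StringencyCondition(
        "6x SSC at about 45 C", "one or more washes in 0.2x SSC, 0.1% SDS", 60
    ),
    "high": StringencyCondition(
        "6x SSC at about 45 C", "one or more washes in 0.2x SSC, 0.1% SDS", 65
    ),
    "very_high": StringencyCondition(
        "0.5 M sodium phosphate, 7% SDS at 65 C",
        "one or more washes in 0.2x SSC, 1% SDS",
        65,
    ),
}

# The text names very high stringency as the condition to use unless otherwise specified.
DEFAULT_STRINGENCY = "very_high"


def get_condition(name: Optional[str] = None) -> StringencyCondition:
    """Return the named condition, falling back to the stated default."""
    return STRINGENCY[name or DEFAULT_STRINGENCY]
```

Calling `get_condition()` with no argument returns the very high stringency entry, mirroring the default stated in the text.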

[0524] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO: 10 or SEQ ID NO: 12, corresponds to a naturally-occurring nucleic acid molecule.

[0525] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein.

[0526] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 61833 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 61833 protein or derivative thereof.

[0527] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 61833 protein is at least 10% pure. In a preferred embodiment, the preparation of 61833 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-61833 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-61833 chemicals. When the 61833 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0528] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 61833 without abolishing or substantially altering a 61833 activity. Preferably the alteration does not substantially alter the 61833 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 61833, results in abolishing a 61833 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 61833 are predicted to be particularly unamenable to alteration.

[0529] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 61833 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 61833 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 61833 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:10 or SEQ ID NO:12, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
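The side chain families listed above can be expressed as sets of one-letter amino acid codes, so that a substitution can be checked for membership in a shared family. This is a minimal sketch; the function name `is_conservative` is hypothetical, and treating any shared family as sufficient is an assumption made here because several residues (e.g., threonine, tyrosine, histidine) appear in more than one family.

```python
# Side chain families as listed in the text, written with one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine, phenylalanine, tryptophan, histidine
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if two different residues share at least one side chain family."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # an unchanged residue is not a substitution
    return any(
        original in family and replacement in family
        for family in SIDE_CHAIN_FAMILIES.values()
    )
```

For example, `is_conservative("K", "R")` is true (both basic), whereas `is_conservative("D", "K")` is false (acidic versus basic).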

[0530] As used herein, a “biologically active portion” of a 61833 protein includes a fragment of a 61833 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 61833 molecule and a non-61833 molecule or between a first 61833 molecule and a second 61833 molecule (e.g., a dimerization interaction). Biologically active portions of a 61833 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 61833 protein, e.g., the amino acid sequence shown in SEQ ID NO:11, which include fewer amino acids than the full length 61833 proteins, and exhibit at least one activity of a 61833 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 61833 protein, e.g., pyridoxyl-dependent decarboxylase activity. A biologically active portion of a 61833 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 61833 protein can be used as targets for developing agents which modulate a 61833 mediated activity, e.g., pyridoxyl-dependent decarboxylase activity.

[0531] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0532] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[0533] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
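By way of illustration, the position-by-position comparison described above can be sketched for two sequences that have already been aligned. The sketch assumes one common convention, namely identical positions divided by the number of alignment columns, with every gap column counted in the denominator; the text does not fix a particular formula, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical (non-gap) columns / total alignment columns * 100,
    so each gap position lowers the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

For the aligned pair `ACGT-A` and `ACCTTA`, four of six columns are identical, giving about 66.7% under this convention.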

[0534] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
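For orientation, the global alignment approach of Needleman and Wunsch can be sketched in its textbook form. The sketch below is not the GAP program: it uses a simple match/mismatch score and a single linear gap penalty in place of a BLOSUM 62 or PAM250 matrix and separate gap and length weights, and all parameter values and the function name are illustrative assumptions.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Textbook global alignment with a linear gap penalty (illustrative scores).

    Returns (score, aligned_a, aligned_b), where '-' marks a gap.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two aligned strings returned here have equal length and can be compared column by column to obtain a percent identity.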

[0535] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0536] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 61833 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 61833 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0537] Particularly preferred 61833 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 11. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 11 are termed substantially identical.

[0538] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:10 or 12 are termed substantially identical.

[0539] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0540] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[0541] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0542] Various aspects of the invention are described in further detail below.

[0543] Isolated Nucleic Acid Molecules of 61833

[0544] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 61833 polypeptide described herein, e.g., a full-length 61833 protein or a fragment thereof, e.g., a biologically active portion of 61833 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 61833 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0545] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO: 10, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 61833 protein (i.e., “the coding region” of SEQ ID NO:10, as shown in SEQ ID NO: 12), as well as 5′untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:10 (e.g., SEQ ID NO: 12) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein, from about amino acid 41 to 401 of SEQ ID NO:11.

[0546] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:10 or 12, thereby forming a stable duplex.
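The notion of a complement of a nucleotide sequence can be illustrated with a short helper that returns the reverse complement of a DNA strand, i.e., the strand that would pair with it when both are read 5′ to 3′. This is a minimal sketch restricted to the four unmodified bases; the function name is hypothetical.

```python
# Watson-Crick pairing for the four standard DNA bases.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]!r}") from None
```

For example, the reverse complement of `ATGC` is `GCAT`.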

[0547] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO: 10 or SEQ ID NO: 12, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0548] 61833 Nucleic Acid Fragments

[0549] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:10 or 12. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 61833 protein, e.g., an immunogenic or biologically active portion of a 61833 protein. A fragment can comprise those nucleotides of SEQ ID NO: 10, which encode a pyridoxyl-dependent decarboxylase domain of human 61833. The nucleotide sequence determined from the cloning of the 61833 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 61833 family members, or fragments thereof, as well as 61833 homologues, or fragments thereof, from other species.

[0550] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, more preferably 300, most preferably 350 or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0551] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 61833 nucleic acid fragment can include a sequence corresponding to a pyridoxyl-dependent decarboxylase domain.

[0552] 61833 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:10 or SEQ ID NO:12, or of a naturally occurring allelic variant or mutant of SEQ ID NO:10 or SEQ ID NO:12.

[0553] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by less than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0554] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes the pyridoxyl-dependent decarboxylase domain, e.g., about nucleotides 442 to 1524 of SEQ ID NO: 10, or any other domain, region, or sequence described herein of human 61833.
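The nucleotide range quoted above (about 442 to 1524 of SEQ ID NO: 10) and the amino acid range given herein for the same domain (about 41 to 401 of SEQ ID NO:11) agree with one another if the coding region begins at nucleotide 322 of SEQ ID NO: 10. That start position is an inference drawn here from the two ranges and is treated as an assumption; the helper name below is likewise hypothetical.

```python
def aa_range_to_nt(aa_start: int, aa_end: int, cds_start: int) -> tuple:
    """Map a 1-based inclusive amino acid range to 1-based inclusive nucleotide positions.

    cds_start is the 1-based position of the first nucleotide of codon 1.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("invalid amino acid range")
    nt_start = cds_start + 3 * (aa_start - 1)  # first base of the first codon
    nt_end = cds_start + 3 * aa_end - 1        # last base of the last codon
    return nt_start, nt_end


# Assumption (inferred, not stated): the coding region starts at nucleotide 322.
ASSUMED_CDS_START = 322
domain_nt = aa_range_to_nt(41, 401, ASSUMED_CDS_START)
```

Under that assumption the domain spans 361 codons, or 1083 nucleotides, which matches the length of the range 442 to 1524.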

[0555] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 61833 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a pyridoxyl-dependent decarboxylase domain from about amino acid 41 to 401 of SEQ ID NO:11.

[0556] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0557] A nucleic acid fragment encoding a “biologically active portion of a 61833 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:10 or 12, which encodes a polypeptide having a 61833 biological activity (e.g., the biological activities of the 61833 proteins are described herein), expressing the encoded portion of the 61833 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 61833 protein. For example, a nucleic acid fragment encoding a biologically active portion of 61833 includes a pyridoxyl-dependent decarboxylase domain, e.g., amino acid residues about 41 to 401 of SEQ ID NO: 11. A nucleic acid fragment encoding a biologically active portion of a 61833 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[0558] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1383 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO: 10, or SEQ ID NO: 12. In a preferred embodiment, a nucleic acid includes at least one contiguous nucleotide from the region of about nucleotides 1 to 200, 100 to 250, 200 to 350, 322 to 441, 322 to 576, 442 to 576, 520 to 800, 520 to 1041, 800 to 1041, 1000 to 1200, 1000 to 1524, 1200 to 1524, 1525 to 1701, or 1700 to 1900 of SEQ ID NO:10.

[0559] 61833 Nucleic Acid Variants

[0560] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:10 or SEQ ID NO:12. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 61833 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:11. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0561] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. For example, the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
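The proportion of altered codons referred to above can be computed by comparing an original coding sequence with a codon-altered version of the same length, codon by codon. The following is a minimal sketch; the function name is hypothetical, and it assumes both sequences are in frame and encode the same number of codons.

```python
def fraction_codons_altered(original: str, variant: str) -> float:
    """Fraction of codons that differ between two in-frame coding sequences."""
    original, variant = original.upper(), variant.upper()
    if len(original) != len(variant):
        raise ValueError("sequences must have the same length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    n_codons = len(original) // 3
    altered = sum(
        1
        for k in range(0, len(original), 3)
        if original[k:k + 3] != variant[k:k + 3]
    )
    return altered / n_codons
```

For example, changing one codon of the three-codon sequence `ATGGCTAAA` to give `ATGGCCAAA` alters one third of the codons, which exceeds both the 10% and the 20% figures mentioned above.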

[0562] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0563] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:10 or 12, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
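The counting rule above, in which mismatches and “looped” out positions are both treated as differences, can be sketched for two sequences that have already been aligned for maximum homology. The function names are hypothetical, and taking the ungapped length of the first sequence as “the subject nucleic acid” is an assumption made for this illustration.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count alignment columns that differ; gap ('-') columns count as differences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def differs_within(aligned_a, aligned_b, max_count=None, max_fraction=None, gap="-"):
    """True if there is at least one difference and strictly fewer than each limit given.

    max_fraction is measured against the ungapped length of aligned_a
    (assumed here to be the subject nucleic acid).
    """
    n = count_differences(aligned_a, aligned_b)
    if n < 1:
        return False
    if max_count is not None and n >= max_count:
        return False
    if max_fraction is not None:
        subject_len = len(aligned_a.replace(gap, ""))
        if subject_len == 0:
            raise ValueError("subject sequence has no nucleotides")
        if n / subject_len >= max_fraction:
            return False
    return True
```

For example, two ten-nucleotide sequences with a single mismatch differ by less than 10 nucleotides and by less than 20% of the subject, but not by less than 5%.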

[0564] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO: 11 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO: 11 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 61833 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 61833 gene.

[0565] Preferred variants include those that are correlated with pyridoxyl-dependent decarboxylase activity.

[0566] Allelic variants of 61833, e.g., human 61833, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 61833 protein within a population that maintain the ability to bind pyridoxyl-5′-phosphate and a substrate molecule, e.g., ornithine or an amino acid, e.g., arginine. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:11, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 61833, e.g., human 61833, protein within a population that do not have the ability to catalyze the pyridoxyl-5′-phosphate-dependent removal of a carboxyl moiety from a substrate molecule, e.g., non-functional allelic variants may lack the ability to bind pyridoxyl-5′-phosphate or a substrate molecule, or they may lack decarboxylase activity. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 11, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0567] Moreover, nucleic acid molecules encoding other 61833 family members and, thus, which have a nucleotide sequence which differs from the 61833 sequences of SEQ ID NO:10 or SEQ ID NO: 12 are intended to be within the scope of the invention.

[0568] Antisense Nucleic Acid Molecules, Ribozymes and Modified 61833 Nucleic Acid Molecules

[0569] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 61833. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 61833 coding strand, or to only a portion thereof (e.g., the coding region of human 61833 corresponding to SEQ ID NO:12). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 61833 (e.g., the 5′ and 3′ untranslated regions).

[0570] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 61833 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 61833 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 61833 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
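As an illustration of an oligonucleotide complementary to the region surrounding the translation start site, the sketch below extracts the −10 to +10 window from a sense-strand sequence and returns its reverse complement. It assumes the usual numbering in which +1 is the first base of the start codon and −1 the base immediately upstream (no position 0), so the window is 20 nucleotides long; the sequence is given in DNA letters and all names are hypothetical.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_around_start(sense: str, start_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide covering positions -upstream..+downstream.

    start_index is the 0-based index of the first base of the start codon (+1).
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends beyond the sequence")
    return reverse_complement(sense[lo:hi])
```

For a toy sense strand `GGGGGGGGGGATGCCCTTTA` with the start codon at index 10, the function returns the 20-nucleotide oligonucleotide `TAAAGGGCATCCCCCCCCCC`.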

[0571] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0572] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 61833 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0573] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0574] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 61833-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 61833 cDNA disclosed herein (i.e., SEQ ID NO:10 or SEQ ID NO:12), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 61833-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 61833 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0575] 61833 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 61833 gene (e.g., the 61833 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 61833 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.
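As an illustration only, the following Python sketch shows one way to locate the stretches of purines or pyrimidines on one strand that conventional (non-switchback) triple helix targeting relies on. The function name, the example sequence, and the minimum run length are hypothetical choices made for this sketch; none of them is taken from the disclosure, and the example sequence is not a 61833 sequence.

```python
import re


def find_homopurine_homopyrimidine_runs(seq, min_len=10):
    """Return (start, end, kind, run) tuples for every stretch of only
    purines (A/G) or only pyrimidines (C/T) at least min_len bases long.

    start is 0-based and end is exclusive. min_len is an illustrative
    threshold, not a value stated in the text.
    """
    seq = seq.upper()
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    hits = []
    for match in pattern.finditer(seq):
        run = match.group()
        kind = "purine" if run[0] in "AG" else "pyrimidine"
        hits.append((match.start(), match.end(), kind, run))
    return hits


# Hypothetical regulatory-region fragment used only to exercise the function.
example = "TTCAGAGGAAGGAGCATC"
for start, end, kind, run in find_homopurine_homopyrimidine_runs(example, min_len=8):
    print(start, end, kind, run)
```

A region that returns no runs under a given threshold is the kind of target for which a switchback design, as described above, may be of interest.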

[0576] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[0577] A 61833 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[0578] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[0579] PNAs of 61833 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 61833 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[0580] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[0581] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 61833 nucleic acid of the invention and two mutually complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 61833 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.
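The two structural requirements of a molecular beacon stated above (a region complementary to the target, and two arms complementary to each other) can be checked with a short Python sketch. This is a simplified illustration under stated assumptions: sequences are written in the DNA alphabet, complementarity is required to be perfect, and the function names and example sequences are hypothetical and are not 61833 sequences.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_beacon_like(five_prime_arm, loop, three_prime_arm, target):
    """Check the two properties described in the text.

    1. The two arms are complementary to each other, so that they can
       bring the fluorophore and quencher together in a stem.
    2. The loop is complementary to some region of the target.

    Perfect complementarity is an assumption of this sketch.
    """
    stem_ok = reverse_complement(five_prime_arm) == three_prime_arm.upper()
    loop_ok = reverse_complement(loop) in target.upper()
    return stem_ok and loop_ok


# Hypothetical sequences used only to exercise the check.
target = "AAACCCGGGTTTACGT"
print(is_beacon_like("GCGAGC", "TAAACCCGGG", "GCTCGC", target))
```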

[0582] Isolated 61833 Polypeptides

[0583] In another aspect, the invention features an isolated 61833 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-61833 antibodies. 61833 protein can be isolated from cells or tissue sources using standard protein purification techniques. 61833 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[0584] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[0585] In a preferred embodiment, a 61833 polypeptide has one or more of the following characteristics:

[0586] (i) it has the ability to decarboxylate a substrate molecule in a pyridoxyl-dependent fashion;

[0587] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 61833 polypeptide, e.g., a polypeptide of SEQ ID NO:11;

[0588] (iii) it has an overall sequence similarity of at least 60%, preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO: 11;

[0589] (iv) it has a pyridoxyl-dependent decarboxylase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 41 to 401 of SEQ ID NO:11;

[0590] (v) it has a decarboxylase family 2 pyridoxyl phosphate attachment site, located at about amino acid residues 67 to 85 of SEQ ID NO:11; or

[0591] (vi) it has a decarboxylase family 2 signature motif, located at about amino acid residues 228 to 241.

[0592] In a preferred embodiment the 61833 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:11. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO: 11 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO: 11. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the pyridoxyl-dependent decarboxylase domain, e.g., about amino acid residues 41 to 401 of SEQ ID NO:11. In another preferred embodiment one or more differences are in the pyridoxyl-dependent decarboxylase domain.
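The counting rule in the preceding paragraph, in which both mismatches and “looped” out positions count as differences, can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned for maximum homology by some external tool and are supplied with gap characters. The function names, the gap symbol, and the choice of the query's residue count as the denominator are assumptions of this sketch, and the example sequences are hypothetical rather than drawn from SEQ ID NO:11.

```python
def count_differences(aligned_query, aligned_ref, gap="-"):
    """Count positions at which two pre-aligned sequences differ.

    Mismatches and gap-versus-residue positions (insertions or deletions)
    each count as one difference. Columns where both sequences have a gap
    are ignored.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for q, r in zip(aligned_query.upper(), aligned_ref.upper()):
        if q == gap and r == gap:
            continue
        if q != r:
            differences += 1
    return differences


def percent_differences(aligned_query, aligned_ref, gap="-"):
    """Differences as a percentage of the residues in the query.

    Using the query's residue count as the denominator is one reading of
    "the residues in it"; other conventions exist.
    """
    query_residues = sum(1 for c in aligned_query if c != gap)
    if query_residues == 0:
        raise ValueError("query contains no residues")
    return 100.0 * count_differences(aligned_query, aligned_ref, gap) / query_residues


# Hypothetical aligned fragments: one deletion and one substitution.
print(count_differences("MKT-LLV", "MKTALIV"))
print(round(percent_differences("MKT-LLV", "MKTALIV"), 1))
```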

[0593] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 61833 proteins differ in amino acid sequence from SEQ ID NO:11, yet retain biological activity.

[0594] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:11.

[0595] A 61833 protein or fragment is provided which varies from the sequence of SEQ ID NO: 11 in regions defined by amino acids about 1 to 460 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 11 in regions defined by about amino acid residues 67 to 85 and 228 to 241 of SEQ ID NO: 11. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[0596] In one embodiment, a biologically active portion of a 61833 protein includes a pyridoxyl-dependent decarboxylase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 61833 protein.

[0597] In a preferred embodiment, the 61833 protein has an amino acid sequence shown in SEQ ID NO: 11. In other embodiments, the 61833 protein is substantially identical to SEQ ID NO:11. In yet another embodiment, the 61833 protein is substantially identical to SEQ ID NO: 11 and retains the functional activity of the protein of SEQ ID NO:11, as described in detail in the subsections above.

[0598] 61833 Chimeric or Fusion Proteins

[0599] In another aspect, the invention provides 61833 chimeric or fusion proteins. As used herein, a 61833 “chimeric protein” or “fusion protein” includes a 61833 polypeptide linked to a non-61833 polypeptide. A “non-61833 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 61833 protein, e.g., a protein which is different from the 61833 protein and which is derived from the same or a different organism. The 61833 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 61833 amino acid sequence. In a preferred embodiment, a 61833 fusion protein includes at least one (or two) biologically active portion of a 61833 protein. The non-61833 polypeptide can be fused to the N-terminus or C-terminus of the 61833 polypeptide.

[0600] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-61833 fusion protein in which the 61833 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 61833. Alternatively, the fusion protein can be a 61833 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 61833 can be increased through use of a heterologous signal sequence.

[0601] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[0602] The 61833 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 61833 fusion proteins can be used to affect the bioavailability of a 61833 substrate. 61833 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 61833 protein; (ii) mis-regulation of the 61833 gene; and (iii) aberrant post-translational modification of a 61833 protein.

[0603] Moreover, the 61833-fusion proteins of the invention can be used as immunogens to produce anti-61833 antibodies in a subject, to purify 61833 ligands and in screening assays to identify molecules which inhibit the interaction of 61833 with a 61833 substrate.

[0604] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 61833-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 61833 protein.

[0605] Variants of 61833 Proteins

[0606] In another aspect, the invention also features a variant of a 61833 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 61833 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 61833 protein. An agonist of the 61833 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 61833 protein. An antagonist of a 61833 protein can inhibit one or more of the activities of the naturally occurring form of the 61833 protein by, for example, competitively modulating a 61833-mediated activity of a 61833 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 61833 protein.

[0607] Variants of a 61833 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 61833 protein for agonist or antagonist activity.

[0608] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 61833 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 61833 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[0609] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 61833 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 61833 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[0610] Cell based assays can be exploited to analyze a variegated 61833 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 61833 in a substrate-dependent manner. The transfected cells are then contacted with 61833 and the effect of the expression of the mutant on signaling by the 61833 substrate can be detected, e.g., by measuring the levels of a 61833 product, e.g., a polyamine, e.g., putrescine or spermidine. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 61833 substrate, and the individual clones further characterized.

[0611] In another aspect, the invention features a method of making a 61833 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 61833 polypeptide. The method includes: altering the sequence of a 61833 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[0612] In another aspect, the invention features a method of making a fragment or analog of a 61833 polypeptide having a biological activity of a naturally occurring 61833 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 61833 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[0613] Anti-61833 Antibodies

[0614] In another aspect, the invention provides an anti-61833 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[0615] The anti-61833 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[0616] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 KDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 KDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[0617] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 61833 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-61833 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[0618] The anti-61833 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[0619] Phage display and combinatorial methods for generating anti-61833 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[0620] In one embodiment, the anti-61833 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[0621] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[0622] An anti-61833 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[0623] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[0624] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 61833 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[0625] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
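The definition of a consensus sequence given above can be expressed as a short Python sketch. It assumes the family of sequences has already been aligned to equal length and contains no gap handling. Because the definition allows either residue to be used when two are equally frequent, the sketch picks the alphabetically first residue purely so that the output is reproducible; that tie-breaking rule, the function name, and the example sequences are assumptions of the sketch.

```python
from collections import Counter


def consensus_sequence(aligned_seqs):
    """Build a consensus from equal-length, pre-aligned sequences.

    Each position takes the most frequently occurring residue in that
    column. Ties are broken alphabetically, an arbitrary choice made
    here only for reproducibility.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        top = max(counts.values())
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)


# Hypothetical five-residue family used only to exercise the function.
print(consensus_sequence(["EVQLV", "QVQLV", "EVKLQ"]))
```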

[0626] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 61833 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[0627] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[0628] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[0629] In preferred embodiments an antibody can be made by immunizing with purified 61833 antigen, or a fragment thereof, e.g., a fragment described herein, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosolic fractions.

[0630] A full-length 61833 protein or antigenic peptide fragment of 61833 can be used as an immunogen or can be used to identify anti-61833 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 61833 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:11 and encompasses an epitope of 61833. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
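For illustration, candidate peptides of a chosen minimum length can be enumerated from a protein sequence with a simple sliding window, as in the Python sketch below. The function name and the short example sequence are hypothetical; the sequence is a placeholder and is not the 61833 sequence, and the sketch says nothing about whether a given window actually encompasses an epitope.

```python
def candidate_peptides(protein_seq, length=8):
    """List every contiguous peptide of the given length.

    Returns (start_residue, peptide) pairs, where start_residue uses
    1-based numbering as is conventional for protein sequences. A
    sequence shorter than the requested length yields an empty list.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [
        (i + 1, protein_seq[i:i + length])
        for i in range(len(protein_seq) - length + 1)
    ]


# Placeholder ten-residue sequence used only to exercise the function.
for start, peptide in candidate_peptides("MKTAYIAKQR", length=8):
    print(start, peptide)
```

The same function can be called with lengths of 10, 15, 20, or 30 to enumerate the longer peptides mentioned above.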

[0631] Fragments of 61833 which include residues about 30 to 50, about 276 to 293, or about 323 to 341 can be used to make antibodies against hydrophobic regions of the 61833 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 61833 which include residues about 300 to 310 can be used to make an antibody against a hydrophilic region of the 61833 protein; a fragment of 61833 which includes residues about 128 to 150, about 170 to 205, or about 370 to 401 can be used to make an antibody against the pyridoxyl-dependent decarboxylase region of the 61833 protein.

[0632] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[0633] Antibodies which bind only native 61833 protein, only denatured or otherwise non-native 61833 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 61833 protein.

[0634] Preferred epitopes encompassed by the antigenic peptide are regions of 61833 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 61833 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 61833 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[0635] In preferred embodiments antibodies can bind one or more of purified antigen or tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., cytosolic fractions.

[0636] The anti-61833 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 61833 protein.

[0637] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[0638] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[0639] In a preferred embodiment, an anti-61833 antibody alters (e.g., increases or decreases) the pyridoxyl-dependent decarboxylase activity of a 61833 polypeptide. For example, the antibody can bind at or in proximity to a pyridoxyl-5′-phosphate- or a substrate-binding site, e.g., to an epitope that includes a residue located from about 67 to 85 or 228 to 241 of SEQ ID NO:1.

[0640] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[0641] An anti-61833 antibody (e.g., monoclonal antibody) can be used to isolate 61833 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-61833 antibody can be used to detect 61833 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-61833 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[0642] The invention also includes a nucleic acid which encodes an anti-61833 antibody, e.g., an anti-61833 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[0643] The invention also includes cell lines, e.g., hybridomas, which make an anti-61833 antibody, e.g., an antibody described herein, and methods of using said cells to make a 61833 antibody.

[0644] 61833 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[0645] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[0646] A vector can include a 61833 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 61833 proteins, mutant forms of 61833 proteins, fusion proteins, and the like).

[0647] The recombinant expression vectors of the invention can be designed for expression of 61833 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[0648] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[0649] Purified fusion proteins can be used in 61833 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 61833 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[0650] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
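The codon-replacement strategy can be sketched as follows: each codon of a coding sequence is translated with the standard genetic code and then rewritten as a single preferred synonymous codon, so the encoded protein is unchanged. The preferred-codon table below is an illustrative assumption and is not taken from Wada et al.; a real design would draw on measured codon usage for the chosen host strain. The function names and the example sequence are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative (assumed) choice of one frequently used E. coli codon per residue.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(cds):
    """Translate a coding DNA sequence (length divisible by 3) to one-letter protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds):
    """Rewrite every codon as the preferred synonymous codon for its amino acid."""
    return "".join(PREFERRED[aa] for aa in translate(cds))


if __name__ == "__main__":
    original = "ATGCTAAGAGGATAG"  # hypothetical fragment: Met-Leu-Arg-Gly-stop
    optimized = recode(original)
    print(original, "->", optimized)
    print("protein unchanged:", translate(original) == translate(optimized))
```

Using a single codon per amino acid keeps the sketch short; practical designs often sample among several frequent codons to avoid repetitive sequence.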

[0651] The 61833 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[0652] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[0653] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[0654] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[0655] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[0656] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 61833 nucleic acid molecule within a recombinant expression vector or a 61833 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0657] A host cell can be any prokaryotic or eukaryotic cell. For example, a 61833 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[0658] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[0659] A host cell of the invention can be used to produce (i.e., express) a 61833 protein. Accordingly, the invention further provides methods for producing a 61833 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 61833 protein has been introduced) in a suitable medium such that a 61833 protein is produced. In another embodiment, the method further includes isolating a 61833 protein from the medium or the host cell.

[0660] In another aspect, the invention features a cell or purified preparation of cells which include a 61833 transgene, or which otherwise misexpress 61833. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 61833 transgene, e.g., a heterologous form of a 61833, e.g., a gene derived from humans (in the case of a non-human cell). The 61833 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 61833, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 61833 alleles or for use in drug screening.

[0661] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 61833 polypeptide.

[0662] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 61833 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 61833 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 61833 gene. For example, an endogenous 61833 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[0663] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 61833 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of a 61833 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 61833 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[0664] 61833 Transgenic Animals

[0665] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 61833 protein and for identifying and/or evaluating modulators of 61833 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 61833 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[0666] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 61833 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 61833 transgene in its genome and/or expression of 61833 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 61833 protein can further be bred to other transgenic animals carrying other transgenes.

[0667] 61833 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[0668] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[0669] Uses of 61833

[0670] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[0671] The isolated nucleic acid molecules of the invention can be used, for example, to express a 61833 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 61833 mRNA (e.g., in a biological sample) or a genetic alteration in a 61833 gene, and to modulate 61833 activity, as described further below. The 61833 proteins can be used to treat disorders characterized by insufficient or excessive production of a 61833 substrate or production of 61833 inhibitors. In addition, the 61833 proteins can be used to screen for naturally occurring 61833 substrates, to screen for drugs or compounds which modulate 61833 activity, as well as to treat disorders characterized by insufficient or excessive production of 61833 protein or production of 61833 protein forms which have decreased, aberrant or unwanted activity compared to 61833 wild type protein (e.g., a cellular proliferative or differentiative disorder, a cardiovascular disorder, or a disorder of the brain). Moreover, the anti-61833 antibodies of the invention can be used to detect and isolate 61833 proteins, regulate the bioavailability of 61833 proteins, and modulate 61833 activity.

[0672] The 61833 polypeptide of the invention can also be used in a method of modifying a compound, e.g., a method of decarboxylating a substrate compound. The method includes: providing a compound of interest, e.g., ornithine or an amino acid, e.g., arginine; providing a reaction-compatible solvent; combining the compound of interest, a 61833 polypeptide described herein, and pyridoxyl-5′-phosphate in the solvent; and maintaining the solvent mixture under conditions such that a carboxyl moiety is removed from the compound of interest. The method can further include isolating the compound of interest from the solvent. Such a method can be useful for synthesizing compounds, e.g., putrescine, in vitro for, e.g., laboratory or pharmaceutical use.

[0673] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 61833 polypeptide is provided. The method includes: contacting the compound with the subject 61833 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 61833 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 61833 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 61833 polypeptide. Screening methods are discussed in more detail below.

[0674] 61833 Screening Assays

[0675] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 61833 proteins, have a stimulatory or inhibitory effect on, for example, 61833 expression or 61833 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 61833 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 61833 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[0676] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 61833 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 61833 protein or polypeptide or a biologically active portion thereof.

[0677] In one embodiment, a 61833 protein can be purified from an animal tissue sample and then its activity can be assayed in vitro, as described in, e.g., Seely and Pegg (1983), Methods Enzymol. 94: 158-161, the contents of which are incorporated herein by reference.

[0678] Bacterially expressed, recombinant 61833 protein can be purified and assayed in vitro for activity as described, e.g., in Poulin et al. (1992), J. Biol. Chem. 267: 150-158, the contents of which are incorporated herein by reference.

[0679] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[0680] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[0681] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[0682] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 61833 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 61833 activity is determined. Determining the ability of the test compound to modulate 61833 activity can be accomplished by monitoring, for example, the generation of a 61833 product, e.g., putrescine. The cell, for example, can be of mammalian origin, e.g., human.

[0683] The ability of the test compound to modulate 61833 binding to a compound, e.g., a 61833 substrate, or to bind to 61833 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 61833 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 61833 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 61833 binding to a 61833 substrate in a complex. For example, compounds (e.g., 61833 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[0684] The ability of a compound (e.g., a 61833 substrate) to interact with 61833 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 61833 without the labeling of either the compound or the 61833. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 61833.

[0685] In yet another embodiment, a cell-free assay is provided in which a 61833 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 61833 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 61833 proteins to be used in assays of the present invention include fragments which participate in interactions with non-61833 molecules, e.g., fragments with high surface probability scores.

[0686] Soluble and/or membrane-bound forms of isolated proteins (e.g., 61833 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[0687] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[0688] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
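The distance dependence that makes energy transfer useful as a binding readout can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. This relation is textbook photophysics rather than something stated in the text above, and the R0 value and distances in the sketch are hypothetical numbers chosen only for illustration.

```python
def fret_efficiency(r, r0):
    """Energy transfer efficiency for donor-acceptor distance r and Forster radius r0.

    Both arguments must use the same length unit (e.g., nanometers).
    """
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Forster radius in nm for an assumed donor/acceptor pair
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

Because efficiency falls off with the sixth power of distance, a labeled pair that is bound (short separation) gives a much stronger acceptor signal than the same pair free in solution.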

[0689] In another embodiment, determining the ability of the 61833 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.
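Real-time binding signals of this kind are commonly interpreted with a simple 1:1 interaction model, sketched below. The model is an assumption brought in for illustration and is not prescribed by the text: the response during analyte injection is taken as R(t) = Req (1 - exp(-(ka C + kd) t)) with Req = Rmax C / (KD + C) and KD = kd / ka. All rate constants, concentrations, and function names are hypothetical.

```python
import math


def equilibrium_response(conc, kd_eq, rmax):
    """Steady-state response for analyte concentration conc (same units as kd_eq)."""
    return rmax * conc / (kd_eq + conc)


def association_response(t, conc, ka, kd, rmax):
    """Response at time t (s) during injection under a 1:1 binding model.

    ka is the association rate constant (1/(M*s)), kd the dissociation
    rate constant (1/s), conc the analyte concentration (M).
    """
    k_obs = ka * conc + kd
    r_eq = equilibrium_response(conc, kd / ka, rmax)
    return r_eq * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    ka, kd, rmax = 1.0e5, 1.0e-3, 100.0  # hypothetical values; KD = 10 nM
    conc = 1.0e-8                        # 10 nM analyte
    for t in (0, 60, 300, 1800):
        print(f"t = {t:5d} s -> R = {association_response(t, conc, ka, kd, rmax):6.2f} RU")
```

At an analyte concentration equal to KD the curve plateaus at half of Rmax, which is one practical way to read an affinity off a titration series.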

[0690] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[0691] It may be desirable to immobilize either 61833, an anti-61833 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 61833 protein, or interaction of a 61833 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/61833 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 61833 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 61833 binding or activity determined using standard techniques.

[0692] Other techniques for immobilizing either a 61833 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 61833 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[0693] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[0694] In one embodiment, this assay is performed utilizing antibodies reactive with 61833 protein or target molecules but which do not interfere with binding of the 61833 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 61833 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 61833 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 61833 protein or target molecule.

[0695] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[0696] In a preferred embodiment, the assay includes contacting the 61833 protein or biologically active portion thereof with a known compound which binds 61833 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 61833 protein, wherein determining the ability of the test compound to interact with a 61833 protein includes determining the ability of the test compound to preferentially bind to 61833 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[0697] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 61833 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 61833 protein through modulation of the activity of a downstream effector of a 61833 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[0698] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
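
By way of non-limiting illustration, the comparison of control and test reaction mixtures described above can be sketched in Python. The function names, the background-subtraction step, and the 50% threshold are assumptions made for the example only and are not part of the disclosed assay; the signal values are arbitrary units chosen for illustration.

```python
def percent_inhibition(control_signal, test_signal, background=0.0):
    """Percent reduction in complex formation caused by the test compound.

    control_signal: complex signal without test compound (or with placebo).
    test_signal:    complex signal with the test compound present.
    background:     signal with no complex formed (hypothetical blank).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control reaction shows no complex above background")
    return 100.0 * (1.0 - (test_signal - background) / control)


def selective_for_mutant(normal, mutant, threshold=50.0):
    """True if the compound disrupts the mutant complex but not the normal one.

    normal and mutant are (control_signal, test_signal) pairs; the
    threshold (percent inhibition) is an illustrative cut-off only.
    """
    return (percent_inhibition(*mutant) >= threshold
            and percent_inhibition(*normal) < threshold)


# Illustrative values only (arbitrary signal units).
normal_pair = (1000.0, 900.0)   # little loss of complex with normal product
mutant_pair = (1000.0, 250.0)   # strong loss of complex with mutant product
is_selective = selective_for_mutant(normal_pair, mutant_pair)
```

In this sketch a compound is flagged as mutant-selective only when both conditions hold, mirroring the normal-versus-mutant comparison described in the paragraph above.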

[0699] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[0700] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[0701] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[0702] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[0703] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[0704] In yet another aspect, the 61833 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 61833 (“61833-binding proteins” or “61833-bp”) and are involved in 61833 activity. Such 61833-bps can be activators or inhibitors of signals by the 61833 proteins or 61833 targets as, for example, downstream elements of a 61833-mediated signaling pathway.

[0705] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 61833 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 61833 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 61833-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 61833 protein.

[0706] In another embodiment, modulators of 61833 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 61833 mRNA or protein evaluated relative to the level of expression of 61833 mRNA or protein in the absence of the candidate compound. When expression of 61833 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 61833 mRNA or protein expression. Alternatively, when expression of 61833 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 61833 mRNA or protein expression. The level of 61833 mRNA or protein expression can be determined by methods described herein for detecting 61833 mRNA or protein.
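
By way of non-limiting illustration, the stimulator/inhibitor determination described above can be sketched in Python. The exact permutation test, the function names, the significance level of 0.05, and the replicate values are assumptions introduced for the example; the specification does not prescribe any particular statistical test.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in mean expression."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Illustrative replicate expression levels (arbitrary units).
with_compound = [9.8, 10.4, 10.1, 9.9]
without_compound = [5.0, 5.3, 4.9, 5.1]
label = classify_candidate(with_compound, without_compound)
```

With an exact permutation test of this kind, at least four replicates per condition are needed before a two-sided p-value below 0.05 is attainable, since three replicates per condition allow only 20 relabelings.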

[0707] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 61833 protein can be confirmed in vivo, e.g., in an animal such as an animal model for excessive or abnormal cellular proliferation or differentiation, or an animal having a cardiovascular or a neurological disorder.

[0708] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 61833 modulating agent, an antisense 61833 nucleic acid molecule, a 61833-specific antibody, or a 61833-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[0709] 61833 Detection Assays

[0710] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 61833 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[0711] 61833 Chromosome Mapping

[0712] The 61833 nucleotide sequences or portions thereof can be used to map the location of the 61833 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 61833 sequences with genes associated with disease.

[0713] Briefly, 61833 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 61833 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 61833 sequences will yield an amplified fragment.

[0714] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
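
By way of non-limiting illustration, the assignment of a gene to a chromosome from PCR results on a somatic cell hybrid panel can be sketched in Python as a concordance check. The panel composition, the hybrid names, and the function name are hypothetical and serve only to show the logic.

```python
def map_to_chromosome(hybrid_panel, pcr_positive):
    """Return human chromosomes whose presence matches the PCR results exactly.

    hybrid_panel: dict mapping hybrid name -> set of human chromosomes retained.
    pcr_positive: set of hybrid names that yielded an amplified fragment.
    """
    all_chromosomes = set().union(*hybrid_panel.values())
    concordant = []
    for chrom in sorted(all_chromosomes):
        # A candidate chromosome must be present in every positive hybrid
        # and absent from every negative hybrid.
        if all((chrom in retained) == (name in pcr_positive)
               for name, retained in hybrid_panel.items()):
            concordant.append(chrom)
    return concordant


# Hypothetical four-hybrid panel (illustrative only).
panel = {
    "hybrid_A": {"1", "7", "X"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X"},
}
positives = {"hybrid_A", "hybrid_B"}
assignment = map_to_chromosome(panel, positives)  # ["7"]
```

A panel in which each hybrid retains a single human chromosome makes the assignment trivial; the concordance check matters when hybrids retain small sets of chromosomes, as in the hypothetical panel above.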

[0715] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 61833 to a chromosomal location.

[0716] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[0717] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[0718] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[0719] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 61833 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[0720] 61833 Tissue Typing

[0721] 61833 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
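
By way of non-limiting illustration, the origin of differing band patterns in RFLP analysis can be sketched in Python as an in silico digest. The two toy alleles are invented for the example and are not taken from any sequence disclosed herein; the EcoRI recognition site (GAATTC, cut after the first G) is used only as a familiar example enzyme.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at which
    the strand is cut (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy alleles (hypothetical): allele_2 carries a single-base change that
# destroys the second recognition site.
allele_1 = "ACGTACGTACGT" + "GAATTC" + "TTGCATTGCA" + "GAATTC" + "CCGG"
allele_2 = "ACGTACGTACGT" + "GAATTC" + "TTGCATTGCA" + "GACTTC" + "CCGG"

fragments_1 = digest(allele_1)  # [13, 16, 9]
fragments_2 = digest(allele_2)  # [13, 25]
```

The loss of one site merges two fragments into a single longer fragment, which is the kind of band-length difference that a probe would reveal on a Southern blot.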

[0722] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 61833 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[0723] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:10 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO: 12 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[0724] If a panel of reagents from 61833 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[0725] Use of Partial 61833 Sequences in Forensic Biology

[0726] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[0727] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:10 (e.g., fragments derived from the noncoding regions of SEQ ID NO: 10 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[0728] The 61833 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 61833 probes can be used to identify tissue by species and/or by organ type.

[0729] In a similar fashion, these reagents, e.g., 61833 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[0730] Predictive Medicine of 61833

[0731] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[0732] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 61833.

[0733] Such disorders include, e.g., a disorder associated with the misexpression of 61833 gene.

[0734] The method includes one or more of the following:

[0735] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 61833 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[0736] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 61833 gene;

[0737] detecting, in a tissue of the subject, the misexpression of the 61833 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[0738] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 61833 polypeptide.

[0739] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 61833 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[0740] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO: 10, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 61833 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[0741] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 61833 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 61833.

[0742] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[0743] In preferred embodiments the method includes determining the structure of a 61833 gene, an abnormal structure being indicative of risk for the disorder.

[0744] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 61833 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[0745] Diagnostic and Prognostic Assays of 61833

[0746] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 61833 molecules and for identifying variations and mutations in the sequence of 61833 molecules.

[0747] Expression Monitoring and Profiling:

[0748] The presence, level, or absence of 61833 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 61833 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 61833 protein such that the presence of 61833 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 61833 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 61833 genes; measuring the amount of protein encoded by the 61833 genes; or measuring the activity of the protein encoded by the 61833 genes.

[0749] The level of mRNA corresponding to the 61833 gene in a cell can be determined both by in situ and by in vitro formats.

[0750] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 61833 nucleic acid, such as the nucleic acid of SEQ ID NO: 10, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 61833 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[0751] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 61833 genes.

[0752] The level of mRNA in a sample that is encoded by one of 61833 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
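
By way of non-limiting illustration, the definition of amplification primers given above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be sketched in Python as an in silico check. The template and primer sequences are invented for the example, the function names are hypothetical, and exact-match annealing is a simplifying assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def flanked_region(template, forward, reverse):
    """Return the plus-strand region flanked by the primer pair, or None.

    Returns None if either primer is outside 10-30 nucleotides, if either
    primer fails to match the template exactly, or if the flanked region
    is outside 50-200 nucleotides.
    """
    if not (10 <= len(forward) <= 30 and 10 <= len(reverse) <= 30):
        return None
    fwd_start = template.find(forward)
    if fwd_start == -1:
        return None
    # The reverse primer anneals to the plus strand as its reverse complement.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    region = template[fwd_start + len(forward):rev_start]
    return region if 50 <= len(region) <= 200 else None


# Hypothetical sequences (illustrative only).
fwd_primer = "ATGGCCAAGCTTGGA"
rev_primer = reverse_complement("CCTGAATCGGATCCA")
inner = "ACGT" * 20
template = "TTT" + fwd_primer + inner + "CCTGAATCGGATCCA" + "AAA"
region = flanked_region(template, fwd_primer, rev_primer)
```

A practical primer design would also weigh melting temperature, self-complementarity, and specificity against the genome, none of which is modeled in this sketch.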

[0753] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 61833 gene being analyzed.

[0754] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 61833 mRNA, or genomic DNA, and comparing the presence of 61833 mRNA or genomic DNA in the control sample with the presence of 61833 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 61833 transcript levels.

[0755] A variety of methods can be used to determine the level of protein encoded by 61833. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[0756] The detection methods can be used to detect 61833 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 61833 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 61833 protein include introducing into a subject a labeled anti-61833 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-61833 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[0757] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 61833 protein, and comparing the presence of 61833 protein in the control sample with the presence of 61833 protein in the test sample.

[0758] The invention also includes kits for detecting the presence of 61833 in a biological sample. For example, the kit can include a compound or agent capable of detecting 61833 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 61833 protein or nucleic acid.

[0759] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[0760] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[0761] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 61833 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as atherosclerosis or deregulated cell proliferation.

[0762] In one embodiment, a disease or disorder associated with aberrant or unwanted 61833 expression or activity is identified. A test sample is obtained from a subject and 61833 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 61833 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 61833 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[0763] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 61833 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cellular proliferative and/or differentiative disorder, a cardiovascular disorder, or a disorder of the brain.

[0764] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 61833 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 61833 (e.g., other genes associated with a 61833-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
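
By way of non-limiting illustration only, data records of the kind described above might be laid out as a table in a relational database. The sketch below uses Python's built-in sqlite3 module in place of the Oracle or Sybase environments named in the text. The table name, column names, and values are hypothetical placeholders and are not data of the invention.

```python
import sqlite3

def build_records(rows):
    """Create an in-memory table of expression data records and load rows.

    Each row is (sample_id, subject, diagnosis, treatment, level_61833).
    The schema is an illustrative assumption, not part of the source text.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE expression_record ("
        " sample_id TEXT PRIMARY KEY,"   # identifier of the sample
        " subject TEXT,"                 # subject the sample was derived from
        " diagnosis TEXT,"               # descriptor: diagnosis
        " treatment TEXT,"               # descriptor: treatment
        " level_61833 REAL)"             # value for level of 61833 expression
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)", rows
    )
    return conn

# Placeholder records; the numbers are made up for illustration only.
conn = build_records([
    ("S1", "patient-A", "unknown", None, 1.8),
    ("S2", "patient-B", "unknown", None, 0.4),
])
levels = dict(conn.execute(
    "SELECT sample_id, level_61833 FROM expression_record"
))
```

Values for genes other than 61833 could be held in further columns or in a second table keyed on the sample identifier; either layout is a design choice rather than a requirement.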

[0765] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 61833 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a proliferative and/or differentiative cellular disorder, or a disorder of the brain, in a subject wherein an increase in 61833 expression is an indication that the subject has or is disposed to having a proliferative and/or differentiative cellular disorder or a disorder of the brain. Alternatively, the method can be used to diagnose a cardiovascular disorder, e.g., atherosclerosis, in a subject wherein a decrease in 61833 expression is an indication that the subject has or is disposed to having a cardiovascular disorder. The method can be used to monitor a treatment for a proliferative and/or differentiative cellular disorder, a cardiovascular disorder, or a disorder of the brain in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[0766] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 61833 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[0767] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 61833 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
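
As a minimal sketch of the metric mentioned above, the length of the distance vector between two profiles can be computed as a Euclidean norm. The function name and the example values are hypothetical; each profile is assumed to be a sequence of numbers with the genes in the same order.

```python
import math

def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two expression profiles.

    Each profile is treated as a multi-dimensional vector in which each
    dimension is one value of the profile (same gene order in both).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of values")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

# Illustrative values only: two two-gene profiles.
d = profile_distance([0.0, 0.0], [3.0, 4.0])
```

A smaller distance indicates more similar profiles. Other routine measures, such as a correlation coefficient, could be substituted without changing the comparison step.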

[0768] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[0769] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 61833 expression.
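
The steps recited above can be sketched, purely as an illustration, in a few lines of Python. The "database" here is an in-memory dictionary, the profile names and values are hypothetical, and Euclidean distance (via the standard library function math.dist) is assumed as the comparison score, although the text does not prescribe a particular score.

```python
import math

def select_matching_profile(subject_profile, reference_profiles):
    """Score a subject profile against each reference profile.

    reference_profiles maps a profile name to a list of expression values
    in the same gene order as subject_profile. Returns the name of the
    most similar reference profile and the comparison score for each.
    """
    scores = {
        name: math.dist(subject_profile, reference)
        for name, reference in reference_profiles.items()
    }
    best = min(scores, key=scores.get)  # lowest distance = most similar
    return best, scores

# Hypothetical reference database and subject profile.
references = {
    "ref_normal": [1.0, 2.0],
    "ref_disorder": [4.0, 6.0],
}
best, scores = select_matching_profile([1.0, 2.5], references)
```

Returning both the best match and the full score table covers alternatives i) and ii) of the recited steps in one call.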

[0770] 61833 Arrays and Uses Thereof

[0771] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 61833 molecule (e.g., a 61833 nucleic acid or a 61833 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0772] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 61833 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 61833. Each address of the subset can include a capture probe that hybridizes to a different region of a 61833 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 61833 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 61833 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 61833 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[0773] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[0774] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 61833 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 61833 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-61833 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[0775] In another aspect, the invention features a method of analyzing the expression of 61833. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 61833-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[0776] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 61833. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 61833. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
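
As a simplified, hedged illustration of identifying genes co-regulated with 61833, the sketch below ranks genes by the Pearson correlation of their expression with that of 61833 across samples. This is a plain correlation screen, not one of the clustering methods named above (hierarchical, k-means, or Bayesian clustering), which would normally be run with a dedicated statistics package. The gene names other than 61833, the threshold, and all values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-regulation signal
    return cov / (sd_x * sd_y)

def coregulated_genes(expression, target="61833", threshold=0.9):
    """Genes whose expression correlates with the target at or above threshold.

    expression maps a gene name to its values across the same ordered samples.
    """
    reference = expression[target]
    return sorted(
        gene for gene, values in expression.items()
        if gene != target and pearson(reference, values) >= threshold
    )

# Made-up expression values across four tissue samples.
expression = {
    "61833": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.0, 4.0, 6.0, 8.0],   # rises with 61833
    "geneB": [4.0, 3.0, 2.0, 1.0],   # falls as 61833 rises
    "geneC": [5.0, 5.0, 5.0, 5.0],   # constant
}
hits = coregulated_genes(expression)
```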

[0777] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 61833 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[0778] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[0779] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 61833-associated disease or disorder; and processes, such as a cellular transformation associated with a 61833-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 61833-associated disease or disorder.

[0780] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 61833) that could serve as a molecular target for diagnosis or therapeutic intervention.

[0781] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 61833 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 61833 polypeptide or fragment thereof. For example, multiple variants of a 61833 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[0782] The polypeptide array can be used to detect a 61833 binding compound, e.g., an antibody in a sample from a subject with specificity for a 61833 polypeptide or the presence of a 61833-binding protein or ligand.

[0783] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 61833 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[0784] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 61833 or from a cell or subject in which a 61833 mediated response has been elicited, e.g., by contact of the cell with 61833 nucleic acid or protein, or administration to the cell or subject of 61833 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 61833 (or does not express as highly as in the case of the 61833 positive plurality of capture probes) or from a cell or subject in which a 61833 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than a 61833 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[0785] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 61833 or from a cell or subject in which a 61833-mediated response has been elicited, e.g., by contact of the cell with 61833 nucleic acid or protein, or administration to the cell or subject of 61833 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 61833 (or does not express as highly as in the case of the 61833 positive plurality of capture probes) or from a cell or subject in which a 61833 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[0786] In another aspect, the invention features a method of analyzing 61833, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 61833 nucleic acid or amino acid sequence; and comparing the 61833 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 61833.

[0787] Detection of 61833 Variations or Mutations

[0788] The methods of the invention can also be used to detect genetic alterations in a 61833 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 61833 protein activity or nucleic acid expression, such as a cellular proliferative or differentiative disorder, a cardiovascular disorder, or a disorder of the brain. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 61833-protein, or the mis-expression of the 61833 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 61833 gene; 2) an addition of one or more nucleotides to a 61833 gene; 3) a substitution of one or more nucleotides of a 61833 gene, 4) a chromosomal rearrangement of a 61833 gene; 5) an alteration in the level of a messenger RNA transcript of a 61833 gene, 6) aberrant modification of a 61833 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 61833 gene, 8) a non-wild type level of a 61833-protein, 9) allelic loss of a 61833 gene, and 10) inappropriate post-translational modification of a 61833-protein.

[0789] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 61833 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 61833 gene under conditions such that hybridization and amplification of the 61833 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[0790] In another embodiment, mutations in a 61833 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
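
The comparison of restriction fragment lengths described above can be modeled in silico, as sketched below for illustration only. The sequences are short made-up strings and not 61833 sequence. The recognition site GAATTC with a cut after the first base corresponds to EcoRI; the function name and the cut_offset parameter are hypothetical, and the model considers a single strand and ignores partial digestion.

```python
def fragment_lengths(sequence, site, cut_offset=0):
    """Lengths of fragments after cutting at every occurrence of site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position (1 for a G^AATTC cut).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Drop zero-length pieces that arise when a cut falls at either end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Made-up control sequence with one site, and a sample in which a single
# base substitution (A to G) has destroyed that site.
control = "AAAGAATTCTTTT"
sample = "AAAGAGTTCTTTT"
control_fragments = fragment_lengths(control, "GAATTC", cut_offset=1)
sample_fragments = fragment_lengths(sample, "GAATTC", cut_offset=1)
differs = control_fragments != sample_fragments
```

In this toy case the control yields two fragments and the sample yields one uncut fragment, which is the kind of difference that gel electrophoresis would reveal.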

[0791] In other embodiments, genetic mutations in 61833 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 61833 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 61833 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 61833 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
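
The first scanning step, in which sequential overlapping probes are tiled along a sequence, can be sketched as follows. This is a simplified model offered for illustration: hybridization is approximated as an exact string match at the same position, probes are written on the same strand as the target rather than as its complement, and the sequences are made-up strings, not 61833 sequence. All function names are hypothetical.

```python
def tiling_probes(reference, length, step=1):
    """Sequential overlapping probes as (start position, probe sequence)."""
    return [
        (i, reference[i:i + length])
        for i in range(0, len(reference) - length + 1, step)
    ]

def mismatched_probe_positions(reference, sample, length):
    """Start positions of reference probes lacking a perfect match in sample.

    Assumes reference and sample are aligned and of equal length, so that
    only base substitutions (not insertions or deletions) are modeled.
    """
    return [
        start
        for start, probe in tiling_probes(reference, length)
        if sample[start:start + length] != probe
    ]

# Made-up sequences differing by one substitution at index 4 (A to T).
reference = "ACGTACGTAC"
sample = "ACGTTCGTAC"
flagged = mismatched_probe_positions(reference, sample, length=4)
```

Every probe that overlaps the changed base is flagged, so the run of consecutive flagged probes localizes the point mutation to the position they share.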

[0792] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 61833 gene and detect mutations by comparing the sequence of the sample 61833 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[0793] Other methods for detecting mutations in the 61833 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[0794] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 61833 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[0795] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 61833 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA: 86:2766, see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 61833 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[0796] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[0797] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[0798] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[0799] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 61833 nucleic acid.

[0800] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:10 or the complement of SEQ ID NO:10. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[0801] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 61833. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[0802] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
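
A set of four oligonucleotides that differ only at the interrogation position can be generated as in the following sketch, which is offered for illustration only. The base sequence is an arbitrary placeholder and not 61833 sequence, the function name is hypothetical, and positions are counted from zero.

```python
def interrogation_set(oligonucleotide, position):
    """Four oligonucleotides identical except at the interrogation position.

    Returns one sequence for each of A, C, G, and T at the given
    zero-based position; one member always equals the input sequence.
    """
    if not 0 <= position < len(oligonucleotide):
        raise ValueError("position lies outside the oligonucleotide")
    return [
        oligonucleotide[:position] + base + oligonucleotide[position + 1:]
        for base in "ACGT"
    ]

# Placeholder six-base sequence with the interrogation position at index 2.
oligo_set = interrogation_set("GGATCC", 2)
```

Each member of the set could then be given a distinguishable label or placed at its own array address, consistent with the embodiments described above.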

[0803] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 61833 nucleic acid.

[0804] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 61833 gene.

[0805] Use of 61833 Molecules as Surrogate Markers

[0806] The 61833 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 61833 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 61833 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[0807] The 61833 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 61833 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-61833 antibodies may be employed in an immune-based detection system for a 61833 protein marker, or 61833-specific radiolabeled probes may be used to detect a 61833 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[0808] The 61833 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 61833 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 61833 DNA may correlate with 61833 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[0809] Pharmaceutical Compositions of 61833

[0810] The nucleic acid and polypeptides, fragments thereof, as well as anti-61833 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[0811] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[0812] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[0813] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[0814] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[0815] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[0816] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[0817] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[0818] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[0819] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[0820] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
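
The therapeutic index defined above is a simple ratio, sketched here as illustrative arithmetic. The function name and the numeric values are hypothetical and do not describe any measured compound.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, expressed as the ratio LD50/ED50.

    Both doses must be in the same units (e.g., mg/kg). A larger ratio
    means a wider margin between the effective and the lethal dose.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values: LD50 of 500 mg/kg and ED50 of 25 mg/kg.
index = therapeutic_index(500.0, 25.0)
```

With these assumed values the index is 20; a compound with the same ED50 but an LD50 of 50 mg/kg would have an index of 2 and would be less preferred under the criterion stated above.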

[0821] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[0822] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
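
The per-kilogram ranges above convert to absolute amounts by multiplying by body weight. The following sketch shows only that arithmetic; the function name and the 70 kg body weight are assumptions made for illustration, and the result is not a dosing recommendation.

```python
def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a mg/kg dosage range into a total mg range for one subject."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low end of the range exceeds the high end")
    return (low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg)


# Assumed 70 kg subject and the "about 1 to 10 mg/kg" range from the text.
low_mg, high_mg = dose_range_mg(70.0, 1.0, 10.0)
```

For the assumed 70 kg subject, the 1 to 10 mg/kg range corresponds to 70 to 700 mg per administration.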

[0823] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[0824] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[0825] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[0826] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[0827] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[0828] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[0829] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[0830] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[0831] Methods of Treatment for 61833

[0832] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 61833 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[0833] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 61833 molecules of the present invention or 61833 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[0834] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 61833 expression or activity, by administering to the subject a 61833 or an agent which modulates 61833 expression or at least one 61833 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 61833 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 61833 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 61833 aberrance, for example, a 61833, 61833 agonist or 61833 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[0835] It is possible that some 61833 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[0836] The 61833 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, cardiovascular disorders, and disorders of the brain, examples of which have been described above, as well as disorders associated with bone metabolism, immune disorders, liver disorders, viral diseases, pain or metabolic disorders.

[0837] Aberrant expression and/or activity of 61833 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 61833 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 61833 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 61833 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[0838] The 61833 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[0839] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[0840] Additionally, 61833 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 61833 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 61833 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[0841] Additionally, 61833 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[0842] As discussed, successful treatment of 61833 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 61833 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[0843] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[0844] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[0845] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 61833 expression is through the use of aptamer molecules specific for 61833 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 61833 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[0846] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 61833 disorders. For a description of antibodies, see the Antibody section above.

[0847] In circumstances wherein injection of an animal or a human subject with a 61833 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 61833 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 61833 protein. Vaccines directed to a disease characterized by 61833 expression may also be generated in this fashion.

[0848] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[0849] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 61833 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[0850] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.

[0851] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 61833 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 61833 can be readily monitored and used in calculations of IC50.

[0852] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[0853] Another aspect of the invention pertains to methods of modulating 61833 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 61833 molecule or an agent that modulates one or more of the activities of the 61833 protein associated with the cell. An agent that modulates 61833 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 61833 protein (e.g., a 61833 substrate or receptor), a 61833 antibody, a 61833 agonist or antagonist, a peptidomimetic of a 61833 agonist or antagonist, or other small molecule.

[0854] In one embodiment, the agent stimulates one or more 61833 activities. Examples of such stimulatory agents include active 61833 protein and a nucleic acid molecule encoding 61833. In another embodiment, the agent inhibits one or more 61833 activities. Examples of such inhibitory agents include antisense 61833 nucleic acid molecules, anti-61833 antibodies, and 61833 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 61833 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 61833 expression or activity. In another embodiment, the method involves administering a 61833 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 61833 expression or activity.

[0855] Stimulation of 61833 activity is desirable in situations in which 61833 is abnormally downregulated and/or in which increased 61833 activity is likely to have a beneficial effect. Likewise, inhibition of 61833 activity is desirable in situations in which 61833 is abnormally upregulated and/or in which decreased 61833 activity is likely to have a beneficial effect.

[0856] 61833 Pharmacogenomics

[0857] The 61833 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 61833 activity (e.g., 61833 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 61833-associated disorders (e.g., cellular proliferative or differentiative disorders, cardiovascular disorders, or disorders of the brain) associated with aberrant or unwanted 61833 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 61833 molecule or 61833 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 61833 molecule or 61833 modulator.

[0858] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[0859] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
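The grouping of individuals by SNP pattern described above can be illustrated with a minimal sketch. The subject names, marker identifiers, and the function `group_by_snp_pattern` are hypothetical and chosen only for illustration; the sketch assumes one observed allele per marker per individual, which simplifies real diploid genotype data.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals that share the same alleles at the given markers.

    genotypes: dict mapping an individual's name to a dict {marker_id: allele}.
    marker_ids: ordered list of markers that define the pattern.
    Returns a dict mapping each allele pattern (a tuple) to a list of individuals.
    """
    groups = defaultdict(list)
    for individual, alleles in genotypes.items():
        # The pattern is the tuple of alleles, in marker order.
        pattern = tuple(alleles[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical data: three subjects typed at two hypothetical SNP markers.
genotypes = {
    "subject_1": {"snp_a": "A", "snp_b": "G"},
    "subject_2": {"snp_a": "A", "snp_b": "G"},
    "subject_3": {"snp_a": "T", "snp_b": "G"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

In this sketch each resulting group corresponds to one genetic category; a drug response observed in a trial could then be tabulated per group.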

[0860] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 61833 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[0861] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 61833 molecule or 61833 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[0862] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 61833 molecule or 61833 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[0863] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 61833 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 61833 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[0864] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 61833 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 61833 gene expression, protein levels, or upregulate 61833 activity, can be monitored in clinical trials of subjects exhibiting decreased 61833 gene expression, protein levels, or downregulated 61833 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 61833 gene expression, protein levels, or downregulate 61833 activity, can be monitored in clinical trials of subjects exhibiting increased 61833 gene expression, protein levels, or upregulated 61833 activity. In such clinical trials, the expression or activity of a 61833 gene, and preferably, other genes that have been implicated in, for example, a 61833-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[0865] 61833 Informatics

[0866] The sequence of a 61833 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 61833 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 61833 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[0867] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[0868] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[0869] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
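The two-table layout described above can be sketched as follows. This is an illustrative sketch only: it uses the SQLite engine bundled with Python rather than Sybase or Oracle, and the table names, column names, and the short demonstration sequence are hypothetical and are not the 61833 sequence.

```python
import sqlite3


def build_sequence_db(conn):
    """Create a sequence table and an annotation table keyed by sequence id."""
    conn.execute(
        "CREATE TABLE sequence (seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE annotation (
               seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
               descriptor TEXT NOT NULL,
               start_pos  INTEGER NOT NULL,  -- initial position, 1-based
               end_pos    INTEGER NOT NULL   -- ultimate position, inclusive
           )"""
    )


def annotated_regions(conn, seq_id):
    """Return (descriptor, subsequence) pairs for every annotation of seq_id."""
    rows = conn.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()


# Hypothetical demonstration data held in an in-memory database.
conn = sqlite3.connect(":memory:")
build_sequence_db(conn)
conn.execute("INSERT INTO sequence VALUES (?, ?)", ("demo_seq", "ATGGCCAAGTAA"))
conn.execute(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)", ("demo_seq", "start codon", 1, 3)
)
conn.execute(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)", ("demo_seq", "stop codon", 10, 12)
)
regions = annotated_regions(conn, "demo_seq")
```

Keeping annotations in their own table lets any number of annotations refer to the same stored sequence through the shared identifier field.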

[0870] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
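A search for regions of a stored sequence that match a target motif can be sketched with a regular expression. This is a simplified stand-in for a routine comparison such as BLAST, not an implementation of it; the function name `find_motif` and the example sequence are hypothetical, and the sketch assumes the motif expression contains no capturing groups of its own.

```python
import re


def find_motif(sequence, motif_regex):
    """Return (1-based start position, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile(f"(?=({motif_regex}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical stored sequence searched for the six-base target GAATTC.
hits = find_motif("GAATTCGGAATTC", "GAATTC")
```

Because the motif is a regular expression, degenerate targets such as `GA[AT]TTC` can be searched in the same way.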

[0871] Thus, in one aspect, the invention features a method of analyzing 61833, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 61833 nucleic acid or amino acid sequence; and comparing the 61833 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 61833. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[0872] The method can include evaluating the sequence identity between a 61833 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
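One simple way to evaluate sequence identity between two sequences that have already been aligned is sketched below. The convention used here (identical non-gap positions divided by the alignment length) is an assumption made for illustration; other conventions for the denominator exist, and the alignment step itself is outside the sketch. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. A column counts as a match only when both
    sequences carry the same residue and that residue is not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical six-column alignment with four identical columns.
identity = percent_identity("ACGT-A", "ACCTGA")
```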

[0873] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
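The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes a single database sequence whose residues are independent and equally frequent, which real sequences only approximate; the function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance occurrences of one specific target.

    Under the uniform, independent-residue assumption, each of the
    (database_length - target_length + 1) start positions matches with
    probability 1 / alphabet_size ** target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length


# A six-nucleotide target in a 4,101-base database: about one chance match.
six_mer = expected_random_hits(6, 4101)
# A thirty-nucleotide target in the same database: vanishingly few.
thirty_mer = expected_random_hits(30, 4101)
```

For amino acid targets the same function can be called with `alphabet_size=20`.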

[0874] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[0875] Thus, the invention features a method of making a computer readable record of a 61833 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
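A record that includes identification of an ORF can be sketched as follows. The sketch scans only the three forward reading frames, treats ATG as the only start codon, and reports 1-based inclusive coordinates that include the stop codon; these choices, the function name `find_orfs`, the record layout, and the short demonstration sequence are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) pairs for ATG...stop ORFs on the forward strand.

    Coordinates are 1-based and inclusive, and the end includes the stop codon.
    min_codons counts every codon from the ATG through the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3))
                start = None  # look for the next ORF after this stop
    return orfs


# Hypothetical record combining a sequence with its identified ORFs.
demo_sequence = "CCATGGCCAAGTAAGG"
record = {
    "id": "demo_seq",
    "sequence": demo_sequence,
    "orfs": find_orfs(demo_sequence),
}
```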

[0876] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 61833 sequence, or record, in machine-readable form; and comparing a second sequence to the 61833 sequence, thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 61833 sequence includes a sequence being compared. In a preferred embodiment the 61833 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 61833 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[0877] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, wherein the method comprises the steps of determining 61833 sequence information associated with the subject and based on the 61833 sequence information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[0878] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a disease associated with a 61833 wherein the method comprises the steps of determining 61833 sequence information associated with the subject, and based on the 61833 sequence information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, and comparing the 61833 sequence of the subject to the 61833 sequences in the database to thereby determine whether the subject has a 61833-associated disease or disorder, or a pre-disposition for such.

[0879] The present invention also provides in a network, a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, said method comprising the steps of receiving 61833 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 61833 and/or corresponding to a 61833-associated disease or disorder (e.g., a cellular proliferative or differentiative disorder, cardiovascular disorder, or a disorder of the brain), and based on one or more of the phenotypic information, the 61833 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0880] The present invention also provides a method for determining whether a subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder, said method comprising the steps of receiving information related to 61833 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 61833 and/or related to a 61833-associated disease or disorder, and based on one or more of the phenotypic information, the 61833 information, and the acquired information, determining whether the subject has a 61833-associated disease or disorder or a pre-disposition to a 61833-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[0881] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 26493 Invention

[0882] The avoidance of mutation by living cells is achieved through several processes operating at different stages of DNA replication. Some of these processes include: selection of the correct deoxyribonucleoside triphosphate from the nucleotide pool and removal of an incorrect nucleotide (editing) by DNA polymerases at the 3′-end of the nascent DNA strand (Brutlag and Kornberg (1972) J. Biol. Chem. 247: 241-248); specific post- (or pre-) replication repair processes such as the removal of potentially mutagenic bases from the template (Lindahl (1974) Proc. Natl. Acad. Sci. 71: 3649-3653); and finally, post-replicative removal of incorrectly inserted deoxynucleotides from within newly replicated material by mismatch repair activities, which recognize the newly synthesized DNA by a strand targeting signal that is distinct from, and distal to, the mismatch (Claverys and Lacks (1986) Microbiol. Rev. 50: 133-165; Modrich (1991) Annu Rev Genetics 25: 229-253).

[0883] One example of an error avoidance pathway is the GO system. The GO system is a pathway devoted to enhancing the fidelity of DNA replication and is responsible for protecting DNA from one form of oxidative damage (Michaels M. L., et al. (1992) Proc. Natl. Acad. Sci. USA 89:7022-7025). Although the system has been most fully characterized in E. coli, evidence exists for similar repair proteins in other prokaryotes and in higher eukaryotes. The GO system in E. coli is composed of at least three proteins, MutM, MutT and MutY (Michaels M. L., et al. (1992) J. Bacteriol. 174:6321-6325). These three proteins are responsible for removing an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine, also known as a GO lesion) from DNA and the nucleotide pool and for the correction of error-prone translesion synthesis. 8-oxo-dGTP is inserted opposite to dA and dC residues of template DNA with almost equal efficiency, thus leading to AT to CG transversions. MutT, which is a small protein of about 12 to 15 Kd, specifically degrades 8-oxo-dGTP to the monophosphate with the concomitant release of pyrophosphate. The MutT protein has a weak GTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054), resulting in the formation of dGMP and pyrophosphate (Maki, H. et al. (1992) Nature 355: 273-275).

[0884] A region of about 40 amino acid residues, which is found in the N-terminal portion of mutT, can also be found in a variety of other prokaryotic, viral, and eukaryotic proteins (Koonin E. V. (1993) Nucleic Acids Res. 21:4847-4847; Mejean V. et al. (1994) Mol. Microbiol. 11:323-330). The conserved 40 amino acid domain has been proposed to be involved in the active center of a family of pyrophosphate-releasing NTPases (Koonin E. V. (1993) supra). This domain contains four conserved glutamate residues.

[0885] Oxidative stress has been implicated as an important causative agent of mutagenesis, carcinogenesis, aging, and a number of diseases (Pacifici, R. E. et al. (1991) Gerontology 37:166-180). The MutM, MutT and MutY proteins have been implicated in the GO system, which is responsible for mediating one form of oxidative damage. Therefore, there exists a need for identifying novel genes and gene products that are involved in error avoidance pathways such as the GO system.

Summary of the 26493 Invention

[0886] The present invention is based, in part, on the discovery of a novel mutT family member, referred to herein as “26493”. The nucleotide sequence of a cDNA encoding 26493 is shown in SEQ ID NO: 16, and the amino acid sequence of a 26493 polypeptide is shown in SEQ ID NO: 17. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:18.

[0887] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 26493 protein or polypeptide, e.g., a biologically active portion of the 26493 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO: 17. In other embodiments, the invention provides isolated 26493 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:16, SEQ ID NO:18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 26493 protein or an active fragment thereof.

[0888] In a related aspect, the invention further provides nucleic acid constructs that include a 26493 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 26493 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 26493 nucleic acid molecules and polypeptides.

[0889] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 26493-encoding nucleic acids.

[0890] In still another related aspect, isolated nucleic acid molecules that are antisense to a 26493 encoding nucleic acid molecule are provided.

[0891] In another aspect, the invention features 26493 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 26493-mediated or -related disorders. In another embodiment, the invention provides 26493 polypeptides having a 26493 activity. Preferred polypeptides are 26493 proteins including at least one mutT domain and, preferably, having a 26493 activity, e.g., a 26493 activity as described herein.

[0892] In other embodiments, the invention provides 26493 polypeptides, e.g., a 26493 polypeptide having the amino acid sequence shown in SEQ ID NO:17 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO: 17 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 16, SEQ ID NO: 18, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 26493 protein or an active fragment thereof.

[0893] In a related aspect, the invention further provides nucleic acid constructs which include a 26493 nucleic acid molecule described herein.

[0894] In a related aspect, the invention provides 26493 polypeptides or fragments operatively linked to non-26493 polypeptides to form fusion proteins. In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 26493 polypeptides or fragments thereof, e.g., a mutT domain of a 26493 polypeptide. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 26493 polypeptide or fragment thereof, e.g., a mutT domain of a 26493 polypeptide.

[0895] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 26493 polypeptides or nucleic acids.

[0896] In still another aspect, the invention provides a process for modulating 26493 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 26493 polypeptides or nucleic acids, such as conditions involving oxidative stress damage, deficient DNA repair, aging, mutagenesis, or conditions involving cells expressing the 26493 polypeptides, e.g., neural cells.

[0897] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing or differentiation, of a 26493-expressing cell, e.g., a hyperproliferative 26493-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 26493 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[0898] In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[0899] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 26493 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 26493 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[0900] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[0901] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity of a 26493-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 26493 polypeptide or nucleic acid.

[0902] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[0903] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 26493 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 26493 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 26493 nucleic acid or polypeptide expression can be detected by any method described herein.

[0904] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 26493 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[0905] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent); and evaluating the expression of 26493 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 26493 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 26493 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or pre-cancerous tissue. In other embodiments, the sample is from brain tissue.

[0906] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 26493 polypeptide or nucleic acid molecule, including for disease diagnosis.

[0907] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 26493 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 26493 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 26493 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[0908] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 26493

[0909] The human 26493 sequence (see SEQ ID NO:16, as recited in Example 10), which is approximately 1902 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1212 nucleotides, including the termination codon. The coding sequence encodes a 404 amino acid protein (see SEQ ID NO:17, as recited in Example 10).
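The length figures in the preceding paragraph can be cross-checked with simple codon arithmetic. The following is a minimal sketch, not part of the original disclosure; the function and variable names are illustrative assumptions, and the input values are the approximate figures stated in the text. It shows that 404 residues correspond to 1212 nucleotides without a termination codon and to 1215 nucleotides with one.

```python
CODON_LENGTH = 3  # nucleotides per codon


def coding_length(n_residues: int, include_stop: bool) -> int:
    """Nucleotides needed to encode n_residues, optionally plus one stop codon."""
    codons = n_residues + (1 if include_stop else 0)
    return codons * CODON_LENGTH


# Approximate values taken from the text (not re-derived from SEQ ID NO:16).
cdna_length = 1902
protein_length = 404

without_stop = coding_length(protein_length, include_stop=False)
with_stop = coding_length(protein_length, include_stop=True)

# Remaining nucleotides attributable to the 5' and 3' untranslated regions,
# assuming the coding sequence is counted with its termination codon.
untranslated = cdna_length - with_stop

print(without_stop, with_stop, untranslated)
```

Because the text gives these lengths as approximate, the sketch makes no claim about the exact boundaries of the coding region in SEQ ID NO:16.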

[0910] Human 26493 contains the following regions or other structural features:

[0911] a predicted mutT domain (PFAM Accession Number PF00293) located at about amino acid residues 122 to 251 of SEQ ID NO:17; which includes a mutT domain signature located at about amino acids 156 to 175;

[0912] two predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 36 to 38 of SEQ ID NO: 17;

[0913] five predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 67 to 70, 132 to 135, 163 to 166, 187 to 190, and 211 to 214 of SEQ ID NO:17;

[0914] one predicted N-glycosylation site (PS00001) located from about amino acids 374 to 377 of SEQ ID NO:17;

[0915] seven predicted N-myristylation sites (PS00008) located at about amino acids 43 to 48, 63 to 68, 85 to 90, 96 to 101, 207 to 212, 309 to 314, and 347 to 352 of SEQ ID NO:17.
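The features listed above can be represented as residue intervals to see which predicted sites fall inside the predicted mutT domain. The following is a minimal sketch; the dictionary keys, type alias and helper names are illustrative assumptions, while the coordinates are copied from the list above (1-based, inclusive).

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # 1-based, inclusive residue coordinates

PROTEIN_LENGTH = 404           # length of SEQ ID NO:17 per the text
MUTT_DOMAIN: Span = (122, 251)

# Coordinates as stated in the text. Only one coordinate range is given
# there for the protein kinase C sites, so only one is recorded here.
SITES: Dict[str, List[Span]] = {
    "mutT_signature": [(156, 175)],
    "protein_kinase_C": [(36, 38)],
    "casein_kinase_II": [(67, 70), (132, 135), (163, 166), (187, 190), (211, 214)],
    "N_glycosylation": [(374, 377)],
    "N_myristoylation": [(43, 48), (63, 68), (85, 90), (96, 101),
                         (207, 212), (309, 314), (347, 352)],
}


def within(inner: Span, outer: Span) -> bool:
    """True if the inner span lies entirely inside the outer span."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def sites_in_domain(sites: Dict[str, List[Span]], domain: Span) -> Dict[str, List[Span]]:
    """Keep, for each feature type, only the spans contained in the domain."""
    return {name: [s for s in spans if within(s, domain)]
            for name, spans in sites.items()}


in_domain = sites_in_domain(SITES, MUTT_DOMAIN)
for name, spans in in_domain.items():
    print(name, spans)
```

Under these coordinates, the mutT signature, four of the five casein kinase II sites and one N-myristoylation site lie within residues 122 to 251.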

[0916] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[0917] A plasmid containing the nucleotide sequence encoding human 26493 (clone “Fbh26493FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[0918] The 26493 protein contains a significant number of structural characteristics in common with members of the mutT subfamily. MutT proteins have been shown to have GTPase activity, e.g., an oxoguanine dGTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054). The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[0919] MutT family members share a highly conserved region that includes four conserved glutamate residues, which are believed to be involved in the active center of a family of pyrophosphate-releasing NTPases (Koonin E. V. (1993) supra). MutT family members are characterized by six positions containing conserved amino acid residues having the consensus sequence: GX5EX5[UA]XRE&XEEX9& (SEQ ID NO:20), wherein U is a bulky aliphatic residue, i.e. I, L, V, or M; & is a bulky hydrophobic residue, i.e. I, L, V, M, F, Y or W; and X can be any amino acid (Koonin E. V. (1993) supra). MutT family members are components of the GO system. The GO system is a pathway devoted to enhancing the fidelity of DNA replication and is responsible for protecting DNA from one form of oxidative damage (Michaels M. L., et al. (1992) Proc. Natl. Acad. Sci. USA 89:7022-7025). MutT family members are known to mediate removal of an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine, also known as a GO lesion) from DNA and the nucleotide pool, and the correction of error-prone translesion synthesis. MutT family members degrade 8-oxo-dGTP to the monophosphate with the concomitant release of pyrophosphate. The MutT proteins have also been shown to have GTPase activity, e.g., an oxoguanine dGTPase activity (Bhatnagar, S. K. et al. (1991) J. Biol. Chem. 266: 9050-9054), resulting in the formation of dGMP and pyrophosphate (Maki, H. et al. (1992) Nature 355: 273-275).

[0920] A 26493 polypeptide can include a “mutT domain” or regions homologous with a “mutT domain”. A 26493 polypeptide can optionally further include one or more of: at least one N-glycosylation site; at least one, or preferably two, protein kinase C sites; at least one, two, three, four, or preferably five, casein kinase II sites; or at least one, two, three, four, five, six, or preferably seven, N-myristoylation sites.

[0921] As used herein, the term “mutT domain” refers to a protein domain which includes one, two, three, and preferably four glutamate residues, clustered in a stretch of about forty amino acids. The glutamate residues are preferably arranged in the following sequence: EX6-9RELXEE (SEQ ID NO:21), wherein X can be any amino acid. For example, the mutT domain of 26493 includes a cluster of four glutamate residues located at amino acids 162, 171, 174 and 175 of SEQ ID NO:17 (FIG. 12). Preferably, the mutT domain has an amino acid sequence of about 50 to about 300 amino acid residues and has a bit score for the alignment of the sequence to the dGTPase domain (HMM) of at least 20. Preferably, a mutT domain includes at least about 80 to about 200 amino acids, more preferably about 100 to about 150 amino acid residues, about 120 to 130, or about 127, 128 or 129 amino acids and has a bit score for the alignment of the sequence to the mutT domain (HMM) of at least 40, preferably 50, more preferably 80 or greater. The mutT domain (HMM) has been assigned the PFAM Accession (PF00293) (http://genome.wustl.edu/Pfam/html). An alignment of the mutT domain (from about amino acids 122 to about 251 of SEQ ID NO: 17) of human 26493 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 12.
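The glutamate arrangement EX6-9RELXEE described above can be expressed as a regular expression and scanned against a protein sequence. The following is a minimal sketch under stated assumptions: the function name and the toy sequence are hypothetical, the toy sequence is not SEQ ID NO:17, and the scan reports non-overlapping matches only. With seven intervening residues the four glutamates sit at offsets 0, 9, 12 and 13 from the first glutamate, which is consistent with the spacing of residues 162, 171, 174 and 175 given in the text.

```python
import re
from typing import List, Tuple

# EX6-9RELXEE written as a regular expression: E, six to nine arbitrary
# residues, then R-E-L, one arbitrary residue, then E-E.
MUTT_GLU_CLUSTER = re.compile(r"E.{6,9}REL.EE")


def find_glutamate_clusters(protein: str) -> List[Tuple[int, int, int, int]]:
    """Return the 1-based positions of the four motif glutamates per match."""
    hits = []
    for m in MUTT_GLU_CLUSTER.finditer(protein.upper()):
        first = m.start() + 1   # leading E
        second = m.end() - 4    # E of "RE"
        third = m.end() - 1     # first E of the trailing "EE"
        fourth = m.end()        # last E of the trailing "EE"
        hits.append((first, second, third, fourth))
    return hits


# Hypothetical toy sequence built only to exercise the pattern.
toy = "MAG" + "E" + "A" * 7 + "RELAEE" + "KK"
print(find_glutamate_clusters(toy))
```

A pattern match of this kind is only a quick screen; the text itself relies on HMM-based scoring against the Pfam model to identify the domain.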

[0922] In a preferred embodiment, a 26493 polypeptide or protein has a “mutT domain” or a region which includes at least about 80 to about 200 amino acids, more preferably about 100 to about 150 amino acid residues, about 120 to 130, or about 127, 128 or 129 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “mutT domain,” e.g., the mutT domain of human 26493 (e.g., residues 122 to 251 of SEQ ID NO:17).

[0923] To identify the presence of a “mutT” domain in a 26493 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “mutT” domain in the amino acid sequence of human 26493 at about residues 122 to 251 of SEQ ID NO:17 (see Example 10).
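The effect of the default and lowered threshold scores described above can be illustrated with a simple filter over search results. The following is a minimal sketch that does not call HMMER itself; the record type, function name and all example hits and scores are hypothetical values invented for illustration, and treating a score equal to the threshold as a hit is an assumption.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One reported domain match (hypothetical record layout)."""
    model: str
    start: int
    end: int
    bit_score: float


DEFAULT_THRESHOLD = 15.0  # default threshold score stated in the text
LOWERED_THRESHOLD = 8.0   # example lowered threshold stated in the text


def filter_hits(hits: List[DomainHit], threshold: float = DEFAULT_THRESHOLD) -> List[DomainHit]:
    """Keep hits whose bit score meets or exceeds the threshold (assumed rule)."""
    return [h for h in hits if h.bit_score >= threshold]


# Made-up hits and scores, used only to show how the cutoff changes the result.
example_hits = [
    DomainHit("model_A", 10, 120, 42.0),
    DomainHit("model_B", 200, 260, 11.5),
    DomainHit("model_C", 300, 330, 5.0),
]

print([h.model for h in filter_hits(example_hits)])
print([h.model for h in filter_hits(example_hits, LOWERED_THRESHOLD)])
```

Lowering the threshold admits weaker matches, so hits that pass only at the lowered cutoff would normally warrant closer inspection of the alignment.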

[0924] A 26493 family member can include at least one mutT domain. Furthermore, a 26493 family member can include at least one, preferably two protein kinase C phosphorylation sites (PS00005); at least 1 predicted N-glycosylation site (PS00001); at least one, two, three, four, and preferably five, predicted casein kinase II phosphorylation sites (PS00006); and at least one, two, three, four, five, six and preferably seven predicted N-myristylation sites (PS00008).

[0925] As the 26493 polypeptides of the invention may modulate 26493-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 26493-mediated or related disorders, as described below.

[0926] As used herein, a “26493 activity”, “biological activity of 26493” or “functional activity of 26493”, refers to an activity exerted by a 26493 protein, polypeptide or nucleic acid molecule. For example, a 26493 activity can be an activity exerted by 26493 in a physiological milieu on, e.g., a 26493-responsive cell or on a 26493 substrate, e.g., a protein substrate. A 26493 activity can be determined in vivo or in vitro. In one embodiment, a 26493 activity is a direct activity, such as an association with a 26493 target molecule. A “target molecule” or “binding partner” is a molecule with which a 26493 protein binds or interacts in nature (e.g., an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine)). In another embodiment, a 26493 activity is an indirect activity, such as a cellular signaling activity mediated by interaction of the 26493 protein with a second protein.

[0927] The features of the 26493 molecules of the present invention can provide similar biological activities as mutT/dGTPase family members. For example, the 26493 proteins of the present invention can have one or more of the following activities: (1) ability to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool; (2) ability to repair DNA, e.g., correct error-prone translesion synthesis; (3) ability to degrade an oxidatively damaged form of guanine (e.g., 8-oxo-dGTP) to the monophosphate with the concomitant release of pyrophosphate; (4) ability to have NTPase (e.g., GTPase) activity; (5) ability to interact with (e.g., bind to) an NTPase, e.g., GTPase; (6) ability to modulate (e.g., to protect from) oxidative stress damage; (7) ability to modulate aging; (8) ability to modulate mutagenesis, carcinogenesis, and/or aberrant or deficient cellular proliferation or differentiation; or (9) ability to modulate the activity of cells expressing the 26493 polypeptides, e.g., neural cells. Thus, the 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling cancers, aging, disorders involving prostate or neural cells, cellular differentiation disorders, and the like.

[0928] The 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders (e.g., inflammatory disorders), cardiovascular disorders, neural disorders, prostate disorders, among others.

[0929] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[0930] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[0931] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[0932] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[0933] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[0934] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[0935] 26493 mRNA was found to be expressed in erythrocytes, neutrophils, megakaryocytes, lymph nodes, and normal tonsil. Thus, the 26493 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders or disorders of the red blood cell. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy. 
Disorders involving red cells include, but are not limited to, anemias, such as hemolytic anemias, including hereditary spherocytosis, hemolytic disease due to erythrocyte enzyme defects: glucose-6-phosphate dehydrogenase deficiency, sickle cell disease, thalassemia syndromes, paroxysmal nocturnal hemoglobinuria, immunohemolytic anemia, and hemolytic anemia resulting from trauma to red cells; and anemias of diminished erythropoiesis, including megaloblastic anemias, such as anemias of vitamin B12 deficiency: pernicious anemia, and anemia of folate deficiency, iron deficiency anemia, anemia of chronic disease, aplastic anemia, pure red cell aplasia, and other forms of marrow failure.

[0936] Since 26493 mRNA was found to be expressed in various tissues of the heart, the 26493 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of disorders involving the heart. Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[0937] 26493 mRNA was found to be expressed in brain tissue, including the cortex and hypothalamus, with lower levels expressed in the spinal cord and nerves. Accordingly, the molecules of the invention may mediate disorders involving aberrant activities of these cells, for example neurodegenerative or brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), 
cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon 
monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[0938] 26493 mRNA was found to be expressed in kidney tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of kidney cells. Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic 
glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

[0939] 26493 mRNA was found to be expressed in normal pancreas tissues, and thus the molecules of the invention may also mediate disorders involving aberrant activities of pancreas cells. Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

[0940] 26493 mRNA was also found to be expressed in normal and cancerous lung tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of lung cells. Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[0941] 26493 mRNA was also found to be expressed in normal and cancerous prostate tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of prostate cells. Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma. As used herein, “a prostate disorder” refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[0942] 26493 has also been found in normal and tumor breast tissue. Accordingly, the molecules of the invention may also mediate disorders involving aberrant activities of breast cells. Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[0943] As the 26493 mRNA is expressed in endothelial cells, the molecules of the invention can be used as diagnostic/therapeutic targets for disorders involving the blood vessels. Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease—the vasculitides, such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph-node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[0944] The 26493 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 17 are collectively referred to as “polypeptides or proteins of the invention” or “26493 polypeptides or proteins.” Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “26493 nucleic acids.” 26493 molecules refer to 26493 nucleic acids, polypeptides, and antibodies.

[0945] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[0946] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0947] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
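
The four numbered conditions above can be collected into a small lookup table. The following is a minimal sketch only; the names `Stringency`, `CONDITIONS`, and `condition` are hypothetical and are introduced here for illustration, and the values are transcribed from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    hybridization: str  # hybridization buffer and temperature
    wash_buffer: str    # wash buffer composition
    wash_temp_c: int    # wash temperature in degrees Celsius


# Values transcribed from the text; the dictionary keys are hypothetical labels.
CONDITIONS = {
    # Low stringency: washes at least at 50 C (the text allows raising to 55 C).
    "low": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "0.2x SSC, 1% SDS", 65),
}


def condition(name=None):
    """Return the named condition; very high stringency is the stated default."""
    return CONDITIONS[name or "very_high"]
```

Calling `condition()` with no argument returns the very high stringency entry, which mirrors the statement that condition (4) is used unless otherwise specified.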

[0948] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO: 16 or SEQ ID NO: 18, corresponds to a naturally-occurring nucleic acid molecule.

[0949] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein.

[0950] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 26493 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 26493 protein or derivative thereof.

[0951] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 26493 protein is at least 10% pure. In a preferred embodiment, the preparation of 26493 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-26493 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-26493 chemicals. When the 26493 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[0952] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 26493 without abolishing or substantially altering a 26493 activity. Preferably the alteration does not substantially alter the 26493 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 26493, results in abolishing a 26493 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 26493 are predicted to be particularly unamenable to alteration.

[0953] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 26493 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 26493 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 26493 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO: 16 or SEQ ID NO: 18, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
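
The side-chain families listed above lend themselves to a simple membership test. The sketch below is illustrative only: the names `FAMILIES` and `is_conservative` are hypothetical, the families are written with standard one-letter amino acid codes, and a substitution is treated as conservative when both residues appear together in at least one listed family.

```python
# Side-chain families as listed in the text, in one-letter amino acid codes.
FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"), # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),       # alanine ... tryptophan
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # tyrosine ... histidine
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues share at least one of the listed families."""
    a, b = original.upper(), replacement.upper()
    return any(a in members and b in members for members in FAMILIES.values())
```

Because a residue can belong to more than one family (for example, threonine is both uncharged polar and beta-branched), the test accepts a pair that shares any one family.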

[0954] As used herein, a “biologically active portion” of a 26493 protein includes a fragment of a 26493 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 26493 molecule and a non-26493 molecule or between a first 26493 molecule and a second 26493 molecule (e.g., a dimerization interaction). Biologically active portions of a 26493 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 26493 protein, e.g., the amino acid sequence shown in SEQ ID NO: 17, which include fewer amino acids than the full length 26493 proteins, and exhibit at least one activity of a 26493 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 26493 protein, e.g., the ability to catalyze the degradation of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) to the monophosphate with the concomitant release of pyrophosphate. A biologically active portion of a 26493 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 26493 protein can be used as targets for developing agents which modulate a 26493 mediated activity, e.g., an oxoguanine dGTPase activity.

[0955] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[0956] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[0957] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
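
As a rough illustration of the calculation described above, the sketch below computes percent identity from two sequences that have already been aligned. It assumes one common convention, which the text does not fix: identical non-gap columns divided by the total alignment length, so that every gap position lowers the result. The function name `percent_identity` and the `-` gap character are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count columns where both sequences carry the same residue (gaps never count).
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Gap columns stay in the denominator, so longer gaps reduce the identity.
    return 100.0 * identical / len(aligned_a)
```

For the aligned pair `ACGT-A` and `ACCTTA`, four of the six columns are identical, so the function returns approximately 66.7.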

[0958] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a Blosum 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a Blosum 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
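
The global alignment idea behind the Needleman and Wunsch algorithm can be sketched in a few lines. This is not the GAP program and does not use the scoring matrices or gap weights named above: for brevity it assumes a simple match/mismatch score and a single linear gap penalty, and the function name and default values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with linear gap penalty; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for a[i-1] paired with b[j-1] (assumed simple scheme).
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

A production implementation would replace the match/mismatch scheme with a substitution matrix and separate gap-opening and gap-extension penalties, as the preferred parameters in the text call for.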

[0959] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0960] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 26493 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 26493 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0961] Particular 26493 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 17. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 17 are termed substantially identical.

[0962] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 16 or 18 are termed substantially identical.

[0963] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[0964] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[0965] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[0966] Various aspects of the invention are described in further detail below.

[0967] Isolated Nucleic Acid Molecules of 26493

[0968] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 26493 polypeptide described herein, e.g., a full-length 26493 protein or a fragment thereof, e.g., a biologically active portion of 26493 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 26493 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[0969] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO: 16, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 26493 protein (i.e., “the coding region” of SEQ ID NO: 16, as shown in SEQ ID NO: 18), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO: 16 (e.g., SEQ ID NO: 18) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 1 to 1298, or 1838 to 1902 of SEQ ID NO: 17.

[0970] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:16 or 18, thereby forming a stable duplex.

[0971] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 18, or a portion, preferably of the same length, of any of these nucleotide sequences.

[0972] 26493 Nucleic Acid Fragments

[0973] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO: 16 or 18. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 26493 protein, e.g., an immunogenic or biologically active portion of a 26493 protein. A fragment can comprise those nucleotides of SEQ ID NO:16, which encode a mutT domain of human 26493. The nucleotide sequence determined from the cloning of the 26493 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 26493 family members, or fragments thereof, as well as 26493 homologues, or fragments thereof, from other species.

[0974] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50, 100, 150, 200, 250, 300, 350, 400, 404, 450 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[0975] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 26493 nucleic acid fragment can include a sequence corresponding to a mutT domain at locations in the translated 26493 polypeptide described herein.

[0976] 26493 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 16 or SEQ ID NO: 18, or of a naturally occurring allelic variant or mutant of SEQ ID NO: 16 or SEQ ID NO:18.

[0977] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by less than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0978] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes, e.g., a mutT domain from about amino acid 122 to 251 of SEQ ID NO: 17.

[0979] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 26493 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a mutT domain from about amino acid 122 to 251 of SEQ ID NO: 17.
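
When selecting a region to amplify, the amino acid coordinates of a domain can be converted into coordinates within the coding region. The following Python sketch shows that arithmetic; it assumes, hypothetically, that nucleotide 1 of the coding region is the first base of the start codon, which should be checked against the actual sequence listing before use. The function name is illustrative only.

```python
from typing import Tuple


def aa_range_to_cds_range(first_aa: int, last_aa: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive amino acid range to the 1-based, inclusive
    nucleotide range of the codons that encode it.

    Assumes nucleotide 1 is the first base of the start codon (hypothetical
    numbering convention, not stated in the source text).
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    first_nt = 3 * (first_aa - 1) + 1  # first base of the first codon
    last_nt = 3 * last_aa              # third base of the last codon
    return first_nt, last_nt


# mutT domain, "about amino acid 122 to 251" per the text above.
mutt_start_nt, mutt_end_nt = aa_range_to_cds_range(122, 251)
mutt_length_nt = mutt_end_nt - mutt_start_nt + 1
```

Under the stated assumption, residues 122 to 251 map to coding nucleotides 364 to 753, a span of 390 nucleotides. Because the domain boundaries are given as approximate, primers would in practice be placed with some margin on either side.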

[0980] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[0981] A nucleic acid fragment encoding a “biologically active portion of a 26493 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:16 or 18, which encodes a polypeptide having a 26493 biological activity (e.g., the biological activities of the 26493 proteins are described herein), expressing the encoded portion of the 26493 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 26493 protein. For example, a nucleic acid fragment encoding a biologically active portion of 26493 includes a mutT domain, e.g., amino acid residues about 122 to 251 of SEQ ID NO: 17. A nucleic acid fragment encoding a biologically active portion of a 26493 polypeptide may comprise a nucleotide sequence which is greater than 300 nucleotides in length.

[0982] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:16, or SEQ ID NO:18.

[0983] 26493 Nucleic Acid Variants

[0984] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:16 or SEQ ID NO:18. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 26493 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:17. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[0985] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
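
The codon alteration described above can be sketched in Python as a synonymous recoding step. The preference table `PREFERRED` below is hypothetical and chosen only for illustration; real preferences would come from codon usage data for the chosen expression host. The toy coding sequence is likewise not a 26493 sequence. The sketch reports the fraction of codons altered, which can be compared against the 10% or 20% figures mentioned in the text.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

# HYPOTHETICAL preferred codons, for illustration only.
PREFERRED = {"L": "CTG", "K": "AAA", "G": "GGC"}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> tuple:
    """Swap each codon for the preferred synonymous codon, if one is listed.

    Returns the recoded sequence and the fraction of codons that changed.
    """
    cds = cds.upper()
    if not cds or len(cds) % 3:
        raise ValueError("coding sequence must be a non-empty multiple of 3")
    out, changed = [], 0
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError("preferred codon is not synonymous")
        changed += new != codon
        out.append(new)
    return "".join(out), changed / len(out)


# Toy coding sequence (Met-Leu-Lys-Gly); not a 26493 sequence.
toy_cds = "ATGTTAAAGGGA"
recoded, fraction_changed = recode(toy_cds, PREFERRED)
```

The synonymy check inside `recode` guards against a preference table that would silently change the encoded protein, which is the property the paragraph above relies on.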

[0986] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[0987] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO: 16 or 18, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
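
The counting rule stated above, in which mismatches and “looped” out positions both count as differences, can be sketched in Python. The sketch assumes the two sequences have already been aligned for maximum homology by some upstream method and are supplied as equal-length strings with `-` marking gap positions; it does not perform the alignment itself. Function names and toy sequences are hypothetical.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count aligned columns where the two sequences differ.

    A gap ('-') opposite a base counts as one difference, as does a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(aligned_a.upper(), aligned_b.upper()))


def percent_different(aligned_subject: str, aligned_reference: str) -> float:
    """Differences as a percentage of the nucleotides in the subject."""
    subject_len = len(aligned_subject.replace("-", ""))
    if subject_len == 0:
        raise ValueError("subject sequence is empty")
    diffs = count_differences(aligned_subject, aligned_reference)
    return 100.0 * diffs / subject_len


# Toy pre-aligned pair: one looped-out base and one mismatch.
subject = "ATGGC-TAGCA"
reference = "ATGGCATAGTA"
n_diff = count_differences(subject, reference)
pct_diff = percent_different(subject, reference)
```

Here the percentage is taken over the ungapped length of the subject nucleic acid, following the wording "of the nucleotides in the subject nucleic acid"; other denominators (for example, alignment length) are possible and would give slightly different values.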

[0988] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is at least about 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO: 17 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize, under a stringency condition described herein, to a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO:17 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 26493 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 26493 gene.

[0989] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cellular proliferation, differentiation, or tumorigenesis; modulating aging; or modulating neural activity.

[0990] Allelic variants of 26493, e.g., human 26493, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 26493 protein within a population that maintain the ability to bind oxidatively damaged guanine substrates, and to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:17, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 26493, e.g., human 26493, protein within a population that do not have the ability to bind oxidatively damaged guanine substrates, and to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 17, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[0991] Moreover, nucleic acid molecules encoding other 26493 family members and, thus, which have a nucleotide sequence which differs from the 26493 sequences of SEQ ID NO:16 or SEQ ID NO: 18 are intended to be within the scope of the invention.

[0992] Antisense Nucleic Acid Molecules, Ribozymes and Modified 26493 Nucleic Acid Molecules

[0993] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 26493. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 26493 coding strand, or to only a portion thereof (e.g., the coding region of human 26493 corresponding to SEQ ID NO: 18). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 26493 (e.g., the 5′ and 3′ untranslated regions).

[0994] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 26493 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 26493 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 26493 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.

[0995] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[0996] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 26493 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[0997] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[0998] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 26493-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 26493 cDNA disclosed herein (i.e., SEQ ID NO:16 or SEQ ID NO:18), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 26493-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 26493 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[0999] 26493 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 26493 gene (e.g., the 26493 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 26493 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1000] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1001] A 26493 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidate oligonucleotides can be effective antisense agents.

[1002] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[1003] PNAs of 26493 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 26493 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1004] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1005] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 26493 nucleic acid of the invention, and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 26493 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[1006] Isolated 26493 Polypeptides

[1007] In another aspect, the invention features an isolated 26493 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-26493 antibodies. 26493 protein can be isolated from cells or tissue sources using standard protein purification techniques. 26493 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1008] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1009] In a preferred embodiment, a 26493 polypeptide has one or more of the following characteristics:

[1010] (i) it catalyzes the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool;

[1011] (ii) it repairs DNA, e.g., corrects error-prone translesion synthesis;

[1012] (iii) it degrades an oxidatively damaged form of guanine (e.g., 8-oxo-dGTP) to the monophosphate with the concomitant release of pyrophosphate;

[1013] (iv) it has GTPase activity, e.g., an oxoguanine dGTPase activity;

[1014] (v) it has an amino acid composition of a 26493 polypeptide, e.g., a polypeptide of SEQ ID NO: 17;

[1015] (vi) it has an overall sequence similarity of at least 60%, preferably at least 70, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO: 17;

[1016] (vii) it can be found in a human tissue, e.g., neural tissue;

[1017] (viii) it can be found in a subcellular organelle, e.g., mitochondria;

[1018] (ix) it has a mutT domain with a sequence similarity which is preferably about 70%, 80%, 90% or 95%, with amino acid residues about 122 to about 251 of SEQ ID NO: 17;

[1019] (x) it has at least three, preferably at least 4 glutamate residues found in the amino acid sequence of the protein of SEQ ID NO: 17; or

[1020] (xi) it has at least 5, preferably at least 8, and most preferably at least 9 of the 10 cysteines found in the amino acid sequence of the native protein.

[1021] In a preferred embodiment the 26493 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:17. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO: 17 by at least one residue, but less than 20%, 15%, 10% or 5% of its residues differ from the corresponding sequence in SEQ ID NO:17. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the mutT domain.

[1022] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 26493 proteins differ in amino acid sequence from SEQ ID NO: 17, yet retain biological activity.

[1023] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO: 17.

[1024] A 26493 protein or fragment is provided which varies from the sequence of SEQ ID NO:17 in regions defined by amino acids about 1 to 216, and from about amino acid 444 to about 455 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 17 in regions defined by amino acids about 217 to about 443. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[1025] In one embodiment, a biologically active portion of a 26493 protein includes a mutT domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 26493 protein.

[1026] In a preferred embodiment, the 26493 protein has an amino acid sequence shown in SEQ ID NO:17. In other embodiments, the 26493 protein is substantially identical to SEQ ID NO: 17. In yet another embodiment, the 26493 protein is substantially identical to SEQ ID NO: 17 and retains the functional activity of the protein of SEQ ID NO: 17, as described in detail in the subsections above.

[1027] 26493 Chimeric or Fusion Proteins

[1028] In another aspect, the invention provides 26493 chimeric or fusion proteins. As used herein, a 26493 “chimeric protein” or “fusion protein” includes a 26493 polypeptide linked to a non-26493 polypeptide. A “non-26493 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 26493 protein, e.g., a protein which is different from the 26493 protein and which is derived from the same or a different organism. The 26493 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 26493 amino acid sequence. In a preferred embodiment, a 26493 fusion protein includes at least one (or two) biologically active portions of a 26493 protein. The non-26493 polypeptide can be fused to the N-terminus or C-terminus of the 26493 polypeptide.

[1029] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-26493 fusion protein in which the 26493 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 26493. Alternatively, the fusion protein can be a 26493 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 26493 can be increased through use of a heterologous signal sequence.

[1030] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1031] The 26493 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 26493 fusion proteins can be used to affect the bioavailability of a 26493 substrate. 26493 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 26493 protein; (ii) mis-regulation of the 26493 gene; and (iii) aberrant post-translational modification of a 26493 protein.

[1032] Moreover, the 26493-fusion proteins of the invention can be used as immunogens to produce anti-26493 antibodies in a subject, to purify 26493 ligands and in screening assays to identify molecules which inhibit the interaction of 26493 with a 26493 substrate.

[1033] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 26493-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 26493 protein.

Variants of 26493 Proteins

[1034] In another aspect, the invention also features a variant of a 26493 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 26493 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 26493 protein. An agonist of the 26493 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 26493 protein. An antagonist of a 26493 protein can inhibit one or more of the activities of the naturally occurring form of the 26493 protein by, for example, competitively modulating a 26493-mediated activity of a 26493 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 26493 protein.

[1035] Variants of a 26493 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 26493 protein for agonist or antagonist activity.

[1036] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 26493 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 26493 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
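
The three fragment classes named above can be enumerated in silico with a short Python sketch. This is only an illustration of what such a population contains at the protein level, not a description of how a physical library is constructed. It assumes that an N-terminal fragment retains the native N-terminus, a C-terminal fragment retains the native C-terminus, and an internal fragment retains neither; the function name, minimum length, and toy peptide are hypothetical.

```python
def enumerate_fragments(protein: str, min_len: int) -> dict:
    """List proper fragments of a protein, grouped by fragment class.

    N-terminal fragments keep residue 1, C-terminal fragments keep the last
    residue, and internal fragments keep neither terminus (assumed usage).
    """
    n = len(protein)
    if min_len < 1 or min_len >= n:
        raise ValueError("expected 1 <= min_len < len(protein)")
    lengths = range(n - 1, min_len - 1, -1)  # longest fragments first
    return {
        "n_terminal": [protein[:k] for k in lengths],
        "c_terminal": [protein[-k:] for k in lengths],
        "internal": [
            protein[i:j]
            for i in range(1, n - 1)
            for j in range(i + min_len, n)
        ],
    }


# Toy peptide for illustration only; not the 26493 sequence.
toy_fragments = enumerate_fragments("MKTAYIA", min_len=5)
```

For a full-length protein the internal class grows roughly with the square of the sequence length, which is one reason screening and selection steps are applied to such populations rather than testing every member individually.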

[1037] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property, are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 26493 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 26493 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[1038] Cell-based assays can be exploited to analyze a variegated 26493 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 26493 in a substrate-dependent manner. The transfected cells are then contacted with 26493 and the effect of the expression of the mutant on signaling by the 26493 substrate can be detected. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 26493 substrate, and the individual clones further characterized.

[1039] In another aspect, the invention features a method of making a 26493 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 26493 polypeptide. The method includes: altering the sequence of a 26493 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1040] In another aspect, the invention features a method of making a fragment or analog of a 26493 polypeptide having a biological activity of a naturally occurring 26493 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 26493 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1041] Anti-26493 Antibodies

[1042] In another aspect, the invention provides an anti-26493 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1043] The anti-26493 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1044] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1045] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 26493 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-26493 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1046] The anti-26493 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1047] Phage display and combinatorial methods for generating anti-26493 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1048] In one embodiment, the anti-26493 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1049] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[1050] An anti-26493 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1051] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1052] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 26493 polypeptide or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.
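The identity thresholds recited for the acceptor framework (about 85%, 90%, 95%, 99%) can be illustrated with a minimal sketch. It assumes the two framework sequences have already been aligned to equal length without gaps; the function names and the example sequences are hypothetical and are not taken from the specification, and no particular alignment program is implied.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percent of aligned positions at which both sequences carry the same residue.

    Assumption: the inputs are pre-aligned, gap-free, and of equal length.
    """
    if not reference or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(candidate, reference) if a == b)
    return 100.0 * matches / len(reference)


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 85.0) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical ten-residue stretches differing at one position.
identity = percent_identity("EVQLVESGGG", "EVQLVESGGA")
```

With these inputs nine of ten positions match, so the candidate would pass the 85% and 90% thresholds but not the 95% threshold.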

[1053] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
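The definition of a consensus sequence given above can be sketched as follows. This is a minimal illustration that assumes the family members are already aligned to equal length; the function name and the toy sequences are hypothetical. Where two residues occur equally often, the sketch keeps the one seen first in the input order, which is one of the choices the definition permits.

```python
from collections import Counter


def consensus_sequence(aligned: list[str]) -> str:
    """Return the most frequent residue at each position of pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    # zip(*aligned) yields one column of residues per position;
    # most_common(1) picks the most frequent residue in that column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*aligned))


# Toy family of three aligned four-residue sequences.
family = ["ACDE", "ACDF", "GCDE"]
consensus = consensus_sequence(family)
```

For the toy family, alanine wins the first position two to one and glutamate wins the last position two to one, so the consensus is ACDE.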

[1054] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 26493 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1055] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1056] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1057] In preferred embodiments an antibody can be made by immunizing with purified 26493 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[1058] A full-length 26493 protein or an antigenic peptide fragment of 26493 can be used as an immunogen or can be used to identify anti-26493 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 26493 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:17 and encompass an epitope of 26493. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
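The minimum peptide lengths recited above (8, 10, 15, 20 or 30 residues) lend themselves to a simple enumeration of candidate peptides. The sketch below is illustrative only: the function name is hypothetical, and the example string is a placeholder rather than SEQ ID NO:17. Whether any given window actually encompasses an epitope must still be determined experimentally.

```python
def peptide_windows(sequence: str, length: int = 8) -> list[str]:
    """List every contiguous peptide of the given length within a protein sequence."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Placeholder ten-residue sequence, not the 26493 sequence.
candidates = peptide_windows("MKTAYIAKQR", length=8)
```

A ten-residue sequence contains three overlapping eight-residue windows; a sequence of N residues contains N - 7 of them.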

[1059] Fragments of 26493 which include residues about 1 to 281, 122 to 251, or 302 to 404 of SEQ ID NO: 17 can be used to make antibodies, e.g., used as immunogens, or used to characterize the specificity of an antibody. A fragment which includes amino acid residues about 122 to 251 can be used to make antibodies against the mutT domain of the 26493 protein. Antibodies can be made against the hydrophilic regions of the 26493 protein, e.g., the sequence from about amino acid residue 360 to 370 of SEQ ID NO:17. Similarly, a fragment of 26493 which includes from about amino acids 85 to 101, about 121 to 125 and from about 350 to 360 of SEQ ID NO: 17 can be used to make an antibody against a hydrophobic region of the 26493 protein. Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1060] Antibodies which bind only native 26493 protein, only denatured or otherwise non-native 26493 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 26493 protein.

[1061] Preferred epitopes encompassed by the antigenic peptide are regions of 26493 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 26493 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 26493 protein and are thus likely to constitute surface residues useful for targeting antibody production.
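As a rough illustration of how hydrophilic stretches might be located computationally, the sketch below averages Kyte-Doolittle hydropathy values over a sliding window and reports the most hydrophilic window. This is a different and simpler technique than the Emini surface probability analysis named above and is offered only as a stand-in; the window size, function names, and toy sequence are assumptions, and the output is no substitute for structural or experimental data.

```python
# Kyte-Doolittle hydropathy scale (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each window; raises KeyError on non-standard residues."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple[int, int]:
    """Return the 1-based, inclusive residue range of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start + 1, start + window


# Toy sequence: a lysine-rich stretch flanked by leucines.
region = most_hydrophilic_window("LLLLLKKKKKLLLLL", window=5)
```

In the toy sequence the lysine run at residues 6 to 10 is the only window free of leucine, so it is reported as the most hydrophilic stretch.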

[1062] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., membrane fractions.

[1063] The anti-26493 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 26493 protein.

[1064] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[1065] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1066] In a preferred embodiment, an anti-26493 antibody alters (e.g., increases or decreases) the 26493 activity (for example, the ability to catalyze the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and/or the nucleotide pool) of a 26493 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 122 to 251 of SEQ ID NO: 17.

[1067] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or active fragment thereof, or a radioactive nucleus, or imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1068] An anti-26493 antibody (e.g., monoclonal antibody) can be used to isolate 26493 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-26493 antibody can be used to detect 26493 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-26493 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1069] The invention also includes a nucleic acid which encodes an anti-26493 antibody, e.g., an anti-26493 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[1070] The invention also includes cell lines, e.g., hybridomas, which make an anti-26493 antibody, e.g., an antibody described herein, and methods of using said cells to make a 26493 antibody.

[1071] 26493 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1072] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1073] A vector can include a 26493 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 26493 proteins, mutant forms of 26493 proteins, fusion proteins, and the like).

[1074] The recombinant expression vectors of the invention can be designed for expression of 26493 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1075] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[1076] Purified fusion proteins can be used in 26493 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 26493 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[1077] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
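The codon-replacement strategy described above can be sketched in a few lines. The sketch builds the standard genetic code, then swaps each codon of a coding sequence for a designated synonymous codon while leaving the encoded protein unchanged. The `PREFERRED` table is a hypothetical three-entry placeholder chosen for illustration; a real design would draw on a full codon-usage table such as that of Wada et al. The function names and the example sequence are likewise assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical placeholder: one designated codon for a few amino acids only.
PREFERRED = {"L": "CTG", "K": "AAA", "R": "CGT"}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Replace each codon with the designated synonymous codon, if one is given."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    for amino_acid, codon in preferred.items():
        if CODON_TABLE[codon] != amino_acid:
            raise ValueError(f"{codon} does not encode {amino_acid}")
    return "".join(
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, len(dna), 3)
    )


original = "ATGTTAAAGAGATAA"          # encodes M-L-K-R-stop
recoded = recode(original, PREFERRED)
assert translate(recoded) == translate(original)
```

Codons for amino acids absent from the table, including the start and stop codons in the example, pass through unchanged, so a partial table never alters the encoded protein.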

[1078] The 26493 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1079] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1080] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[1081] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[1082] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[1083] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 26493 nucleic acid molecule within a recombinant expression vector or a 26493 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1084] A host cell can be any prokaryotic or eukaryotic cell. For example, a 26493 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[1085] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[1086] A host cell of the invention can be used to produce (i.e., express) a 26493 protein. Accordingly, the invention further provides methods for producing a 26493 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 26493 protein has been introduced) in a suitable medium such that a 26493 protein is produced. In another embodiment, the method further includes isolating a 26493 protein from the medium or the host cell.

[1087] In another aspect, the invention features a cell or purified preparation of cells which include a 26493 transgene, or which otherwise misexpress 26493. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 26493 transgene, e.g., a heterologous form of a 26493, e.g., a gene derived from humans (in the case of a non-human cell). The 26493 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 26493, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 26493 alleles or for use in drug screening.

[1088] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 26493 polypeptide.

[1089] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 26493 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 26493 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 26493 gene. For example, an endogenous 26493 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[1090] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 26493 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 26493 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 26493 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[1091] 26493 Transgenic Animals

[1092] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 26493 protein and for identifying and/or evaluating modulators of 26493 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 26493 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[1093] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 26493 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 26493 transgene in its genome and/or expression of 26493 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 26493 protein can further be bred to other transgenic animals carrying other transgenes.

[1094] 26493 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[1095] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[1096] Uses of 26493

[1097] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[1098] The proteins of the invention can be used in vitro to synthesize products, wherein the synthesis of such products requires activities such as catalyzing the removal of an oxidatively damaged form of guanine (e.g., 8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from a DNA molecule and the nucleotide pool, or degrading oxidatively damaged forms of guanine, e.g., an oxoguanine dGTPase activity.

[1099] The isolated nucleic acid molecules of the invention can be used, for example, to express a 26493 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 26493 mRNA (e.g., in a biological sample) or a genetic alteration in a 26493 gene, and to modulate 26493 activity, as described further below. The 26493 proteins can be used to treat disorders characterized by insufficient or excessive production of a 26493 substrate or production of 26493 inhibitors. In addition, the 26493 proteins can be used to screen for naturally occurring 26493 substrates, to screen for drugs or compounds which modulate 26493 activity, as well as to treat disorders characterized by insufficient or excessive production of 26493 protein or production of 26493 protein forms which have decreased, aberrant or unwanted activity compared to 26493 wild type protein. Moreover, the anti-26493 antibodies of the invention can be used to detect and isolate 26493 proteins, regulate the bioavailability of 26493 proteins, and modulate 26493 activity.

[1100] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 26493 polypeptide is provided. The method includes: contacting the compound with the subject 26493 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 26493 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 26493 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 26493 polypeptide. Screening methods are discussed in more detail below.

[1101] 26493 Screening Assays

[1102] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 26493 proteins, have a stimulatory or inhibitory effect on, for example, 26493 expression or 26493 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 26493 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 26493 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[1103] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 26493 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 26493 protein or polypeptide or a biologically active portion thereof.

[1104] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[1105] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[1106] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[1107] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 26493 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 26493 activity is determined. Determining the ability of the test compound to modulate 26493 activity can be accomplished by monitoring, for example, an oxoguanine dGTPase activity. The cell, for example, can be of mammalian origin, e.g., human.

[1108] The ability of the test compound to modulate 26493 binding to a compound, e.g., a 26493 substrate, or to bind to 26493 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 26493 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 26493 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 26493 binding to a 26493 substrate in a complex. For example, compounds (e.g., 26493 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[1109] The ability of a compound (e.g., a 26493 substrate) to interact with 26493 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 26493 without the labeling of either the compound or the 26493. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 26493.

[1110] In yet another embodiment, a cell-free assay is provided in which a 26493 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 26493 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 26493 proteins to be used in assays of the present invention include fragments which participate in interactions with non-26493 molecules, e.g., fragments with high surface probability scores.

[1111] Soluble and/or membrane-bound forms of isolated proteins (e.g., 26493 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[1112] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[1113] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
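The distance dependence noted above is commonly modeled with the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the Förster radius of the label pair. The specification does not recite this formula or any numerical values; the following Python sketch assumes it, and the 5.0 nm radius is an illustrative value only.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency under the Förster model (an assumption, not from the source)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Illustrative Förster radius; a real donor/acceptor pair must be characterized empirically.
R0_NM = 5.0
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

Under this model the efficiency is one half at r = R0 and falls steeply beyond it, which is why acceptor emission serves as a proxy for binding.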

[1114] In another embodiment, determining the ability of the 26493 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[1115] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[1116] It may be desirable to immobilize either 26493, an anti-26493 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 26493 protein, or interaction of a 26493 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/26493 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 26493 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 26493 binding or activity determined using standard techniques.

[1117] Other techniques for immobilizing either a 26493 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 26493 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[1118] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[1119] In one embodiment, this assay is performed utilizing antibodies reactive with 26493 protein or target molecules but which do not interfere with binding of the 26493 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 26493 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 26493 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 26493 protein or target molecule.

[1120] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[1121] In a preferred embodiment, the assay includes contacting the 26493 protein or biologically active portion thereof with a known compound which binds 26493 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 26493 protein, wherein determining the ability of the test compound to interact with a 26493 protein includes determining the ability of the test compound to preferentially bind to 26493 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[1122] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 26493 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 26493 protein through modulation of the activity of a downstream effector of a 26493 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[1123] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
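The control-versus-test comparison described above can be reduced to a simple calculation. The following Python sketch is illustrative only: the function names, the background-subtraction step, and the 50% threshold are assumptions introduced here and are not recited in the specification.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the no-compound control."""
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


def is_selective_for_mutant(inhib_normal: float, inhib_mutant: float,
                            threshold: float = 50.0) -> bool:
    """True when the compound disrupts the mutant complex but not the normal one.

    The 50% threshold is a hypothetical cutoff chosen for illustration.
    """
    return inhib_mutant >= threshold and inhib_normal < threshold


# Hypothetical readings (arbitrary units): control, with compound, background.
normal = percent_inhibition(1100.0, 1000.0, background=100.0)
mutant = percent_inhibition(1100.0, 350.0, background=100.0)
print(normal, mutant, is_selective_for_mutant(normal, mutant))
```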

[1124] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[1125] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[1126] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[1127] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[1128] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene product or its binding partner is labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[1129] In yet another aspect, the 26493 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 26493 (“26493-binding proteins” or “26493-bp”) and are involved in 26493 activity. Such 26493-bps can be activators or inhibitors of signals by the 26493 proteins or 26493 targets as, for example, downstream elements of a 26493-mediated signaling pathway.

[1130] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 26493 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 26493 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 26493-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 26493 protein.

[1131] In another embodiment, modulators of 26493 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 26493 mRNA or protein evaluated relative to the level of expression of 26493 mRNA or protein in the absence of the candidate compound. When expression of 26493 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 26493 mRNA or protein expression. Alternatively, when expression of 26493 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 26493 mRNA or protein expression. The level of 26493 mRNA or protein expression can be determined by methods described herein for detecting 26493 mRNA or protein.
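The specification does not name a statistical test for deciding whether expression differs significantly. As one possible approach, the following Python sketch applies an exact two-sided permutation test to replicate expression measurements; the function names, the 0.05 significance level, and the example values are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound as stimulator, inhibitor, or no significant effect."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative mRNA levels with and without the candidate compound.
with_compound = [0.40, 0.50, 0.45, 0.42]
without_compound = [1.00, 1.10, 0.95, 1.05]
print(classify_modulator(with_compound, without_compound))
```

An exact permutation test enumerates every regrouping of the replicates, so it is practical only for small replicate counts; larger experiments would call for a different test.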

[1132] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 26493 protein can be confirmed in vivo, e.g., in an animal such as an animal model for cancer.

[1133] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 26493 modulating agent, an antisense 26493 nucleic acid molecule, a 26493-specific antibody, or a 26493-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[1134] 26493 Detection Assays

[1135] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 26493 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[1136] 26493 Chromosome Mapping

[1137] The 26493 nucleotide sequences or portions thereof can be used to map the location of the 26493 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 26493 sequences with genes associated with disease.

[1138] Briefly, 26493 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 26493 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 26493 sequences will yield an amplified fragment.
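As a minimal illustration of preparing a primer pair within the preferred 15-25 base range, the following Python sketch takes a forward primer from the 5′ end of a template and a reverse primer as the reverse complement of its 3′ end. The function names are hypothetical, the template shown is a placeholder rather than a 26493 sequence, and practical primer design would also weigh melting temperature, GC content, and specificity, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20):
    """Forward primer from the 5' end; reverse primer from the 3' end (reverse-complemented)."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bases")
    template = template.upper()
    if set(template) - set("ACGT"):
        raise ValueError("template must contain only A, C, G, T")
    if len(template) < 2 * length:
        raise ValueError("template too short for non-overlapping primers")
    return template[:length], reverse_complement(template[-length:])


# Placeholder template for illustration only.
forward, reverse = primer_pair("ATGC" * 15, length=20)
print(forward, reverse)
```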

[1139] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
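The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check: the gene maps to the chromosome whose presence or absence across the hybrids matches the PCR results in every cell line. The Python below is an illustrative sketch; the function name, data layout, and the three-hybrid panel are hypothetical.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence across hybrids matches the PCR results exactly.

    panel: dict mapping hybrid name -> set of human chromosomes retained
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment was seen
    """
    all_chromosomes = set().union(*panel.values())
    return sorted(
        chrom for chrom in all_chromosomes
        if all((chrom in retained) == pcr_positive[hybrid]
               for hybrid, retained in panel.items())
    )


# Hypothetical panel: each hybrid retains a small number of human chromosomes.
panel = {"H1": {"1", "7"}, "H2": {"7", "X"}, "H3": {"1", "X"}}
results = {"H1": True, "H2": True, "H3": False}
print(concordant_chromosomes(panel, results))
```

With this hypothetical panel only chromosome 7 is present in both PCR-positive hybrids and absent from the PCR-negative one.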

[1140] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 26493 to a chromosomal location.

[1141] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[1142] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[1143] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[1144] Moreover, differences in the DNA sequences between individuals affected by and unaffected by a disease associated with the 26493 gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[1145] 26493 Tissue Typing

[1146] 26493 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[1147] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 26493 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[1148] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:16 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO: 18 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[1149] If a panel of reagents from 26493 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[1150] Use of Partial 26493 Sequences in Forensic Biology

[1151] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[1152] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO: 16 (e.g., fragments derived from the noncoding regions of SEQ ID NO: 16 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[1153] The 26493 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 26493 probes can be used to identify tissue by species and/or by organ type.

[1154] In a similar fashion, these reagents, e.g., 26493 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[1155] Predictive Medicine of 26493

[1156] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[1157] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 26493.

[1158] Such disorders include, e.g., a disorder associated with the misexpression of 26493 gene; a disorder of the immune system.

[1159] The method includes one or more of the following:

[1160] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 26493 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[1161] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 26493 gene;

[1162] detecting, in a tissue of the subject, the misexpression of the 26493 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[1163] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 26493 polypeptide.

[1164] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 26493 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[1165] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO: 16, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 26493 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[1166] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 26493 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 26493.

[1167] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[1168] In preferred embodiments the method includes determining the structure of a 26493 gene, an abnormal structure being indicative of risk for the disorder.

[1169] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 26493 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[1170] Diagnostic and Prognostic Assays of 26493

[1171] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 26493 molecules and for identifying variations and mutations in the sequence of 26493 molecules.

[1172] Expression Monitoring and Profiling:

[1173] The presence, level, or absence of 26493 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 26493 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 26493 protein such that the presence of 26493 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 26493 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 26493 genes; measuring the amount of protein encoded by the 26493 genes; or measuring the activity of the protein encoded by the 26493 genes.

[1174] The level of mRNA corresponding to the 26493 gene in a cell can be determined both by in situ and by in vitro formats.

[1175] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 26493 nucleic acid, such as the nucleic acid of SEQ ID NO: 16, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 26493 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[1176] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 26493 genes.

[1177] The level of mRNA in a sample that is encoded by 26493 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
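
The primer geometry described above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be illustrated with a minimal Python sketch. The function names, the exact-match search, and the treatment of the forward primer as identical to the plus strand are assumptions made for illustration only; they are not part of the disclosure, and real primer design would also consider melting temperature, mismatches, and specificity.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def flanked_region_length(template: str, forward: str, reverse: str):
    """Length of the region between the two primer sites, or None if either site is absent.

    Assumption: `template` is the plus strand. The forward primer is searched
    for verbatim; the reverse primer anneals to the plus strand, so its
    reverse complement is what appears in the template.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end == -1:
        return None
    return end - inner_start


def is_typical_primer_pair(template: str, forward: str, reverse: str) -> bool:
    """Check the approximate ranges given in the text: 10-30 nt primers, 50-200 nt flanked region."""
    lengths_ok = all(10 <= len(primer) <= 30 for primer in (forward, reverse))
    region = flanked_region_length(template, forward, reverse)
    return lengths_ok and region is not None and 50 <= region <= 200
```

A pair that passes this check merely satisfies the length ranges stated above; it says nothing about whether amplification would succeed under particular reaction conditions.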

[1178] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 26493 gene being analyzed.

[1179] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 26493 mRNA, or genomic DNA, and comparing the presence of 26493 mRNA or genomic DNA in the control sample with the presence of 26493 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 26493 transcript levels.

[1180] A variety of methods can be used to determine the level of protein encoded by 26493. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[1181] The detection methods can be used to detect 26493 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 26493 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 26493 protein include introducing into a subject a labeled anti-26493 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-26493 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[1182] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 26493 protein, and comparing the presence of 26493 protein in the control sample with the presence of 26493 protein in the test sample.

[1183] The invention also includes kits for detecting the presence of 26493 in a biological sample. For example, the kit can include a compound or agent capable of detecting 26493 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 26493 protein or nucleic acid.

[1184] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[1185] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[1186] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 26493 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[1187] In one embodiment, a disease or disorder associated with aberrant or unwanted 26493 expression or activity is identified. A test sample is obtained from a subject and 26493 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 26493 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 26493 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[1188] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 26493 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferative or differentiative disorder.

[1189] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 26493 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 26493 (e.g., other genes associated with a 26493-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
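
One way such data records might be laid out as a relational table is sketched below. The sketch uses Python's built-in sqlite3 module as a stand-in for the Oracle or Sybase environments mentioned above; the table name, column names, and helper functions are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_expression_store() -> sqlite3.Connection:
    """Create an in-memory table with one row per (sample, gene) expression value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject_id       TEXT,           -- subject the sample was derived from
            diagnosis        TEXT,           -- optional descriptor
            treatment        TEXT,           -- optional descriptor
            gene             TEXT NOT NULL,  -- e.g., '26493' or another gene on an array
            expression_level REAL NOT NULL,  -- value representing the level of expression
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one data record; descriptor fields are optional."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def profile_for_sample(conn, sample_id) -> dict:
    """Return the expression profile of a sample as a gene -> level mapping."""
    rows = conn.execute(
        "SELECT gene, expression_level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    )
    return {gene: level for gene, level in rows}
```

Storing one row per sample and gene, rather than one column per gene, lets the same table hold the value for 26493 alongside values for any number of other genes without changing the schema.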

[1190] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 26493 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used, for example, to diagnose a disorder, e.g., a disorder as described herein, in a subject. The method can be used to monitor a treatment for a disorder, e.g., a disorder as described herein, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[1191] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 26493 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[1192] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 26493 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
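
The distance metric mentioned above, i.e., the length of the vector obtained by subtracting one profile from the other, can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes both profiles list their values in the same gene order.

```python
import math
from typing import Sequence


def profile_distance(subject: Sequence[float], reference: Sequence[float]) -> float:
    """Euclidean length of the difference vector between two expression profiles.

    Assumption: position i in both sequences holds the value for the same gene.
    """
    if len(subject) != len(reference):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(subject, reference)))
```

A smaller distance indicates more similar profiles; identical profiles have a distance of zero. Other routine measures (e.g., correlation-based ones) could be substituted without changing the overall method.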

[1193] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[1194] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 26493 expression.
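
The steps recited above (receive a subject expression profile, access reference profiles, then either select the most similar reference or compute comparison scores) might be sketched as below. This is an illustrative sketch only: the in-memory dictionary standing in for the database, the function names, and the choice of Euclidean distance as the comparison score are all assumptions.

```python
import math
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # gene identifier -> expression value


def comparison_scores(subject: Profile,
                      references: Dict[str, Profile]) -> List[Tuple[str, float]]:
    """Return (reference name, distance) pairs, most similar first.

    Assumption: a reference lacking any gene present in the subject profile
    cannot be compared fairly and is skipped.
    """
    scores = []
    for name, reference in references.items():
        if not all(gene in reference for gene in subject):
            continue
        distance = math.sqrt(
            sum((subject[gene] - reference[gene]) ** 2 for gene in subject)
        )
        scores.append((name, distance))
    return sorted(scores, key=lambda pair: pair[1])


def best_match(subject: Profile, references: Dict[str, Profile]) -> str:
    """Select the name of the reference profile most similar to the subject profile."""
    scores = comparison_scores(subject, references)
    if not scores:
        raise ValueError("no comparable reference profile found")
    return scores[0][0]
```

Here `comparison_scores` corresponds to option ii) and `best_match` to option i); in practice the reference profiles would be read from a database rather than passed in as a dictionary.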

[1195] 26493 Arrays and Uses Thereof

[1196] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 26493 molecule (e.g., a 26493 nucleic acid or a 26493 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1197] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 26493 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 26493. Each address of the subset can include a capture probe that hybridizes to a different region of a 26493 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 26493 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 26493 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 26493 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[1198] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[1199] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 26493 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 26493 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-26493 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[1200] In another aspect, the invention features a method of analyzing the expression of 26493. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 26493-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[1201] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 26493. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 26493. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
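
As a simplified illustration of grouping genes by expression pattern, the sketch below flags genes whose expression across samples tracks that of 26493. It uses a Pearson correlation threshold rather than the hierarchical, k-means, or Bayesian clustering named above; correlation is a common similarity measure underlying such clustering, but the substitution, the function names, and the 0.9 threshold are assumptions made for brevity.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def co_regulated(expression: Dict[str, Sequence[float]],
                 target: str = "26493",
                 threshold: float = 0.9) -> List[str]:
    """Genes whose expression across samples correlates with the target at or above the threshold.

    `expression` maps each gene to its values over the same ordered set of samples.
    """
    target_values = expression[target]
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(target_values, values) >= threshold
    )
```

As the text notes, such an analysis is only meaningful when a sufficient number of diverse samples is available; with few samples, high correlations arise by chance.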

[1202] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 26493 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[1203] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[1204] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 26493-associated disease or disorder; and processes, such as a cellular transformation associated with a 26493-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 26493-associated disease or disorder.

[1205] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 26493) that could serve as a molecular target for diagnosis or therapeutic intervention.

[1206] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 26493 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 26493 polypeptide or fragment thereof. For example, multiple variants of a 26493 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1207] The polypeptide array can be used to detect a 26493 binding compound, e.g., an antibody in a sample from a subject with specificity for a 26493 polypeptide or the presence of a 26493-binding protein or ligand.

[1208] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 26493 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[1209] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 26493 or from a cell or subject in which a 26493 mediated response has been elicited, e.g., by contact of the cell with 26493 nucleic acid or protein, or administration to the cell or subject of 26493 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 26493 (or does not express as highly as in the case of the 26493 positive plurality of capture probes) or from a cell or subject in which a 26493 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than a 26493 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[1210] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which expresses or mis-expresses 26493 or from a cell or subject in which a 26493-mediated response has been elicited, e.g., by contact of the cell with 26493 nucleic acid or protein, or administration to the cell or subject of 26493 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 26493 (or does not express as highly as in the case of the 26493 positive plurality of capture probes) or from a cell or subject in which a 26493-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.
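The comparison of binding between the first and second samples can be illustrated with a minimal sketch. It assumes that the signal detected at each address has already been read into a mapping from address to intensity; the function name `compare_binding`, the (row, column) address representation, and the two-fold threshold are hypothetical choices made for illustration and are not specified by the method.

```python
from typing import Dict, Tuple

Address = Tuple[int, int]  # hypothetical (row, column) position on the array


def compare_binding(first: Dict[Address, float],
                    second: Dict[Address, float],
                    min_ratio: float = 2.0) -> Dict[Address, float]:
    """Return first/second signal ratios for addresses that differ by min_ratio or more.

    Only addresses present in both readings are compared, reflecting the
    requirement that the addresses with capture probes be present on both arrays.
    """
    changed = {}
    for address in first.keys() & second.keys():
        a, b = first[address], second[address]
        if a <= 0 or b <= 0:
            continue  # no usable signal at this address in one of the samples
        ratio = a / b
        # Keep addresses whose signal rose or fell by at least the assumed threshold.
        if ratio >= min_ratio or ratio <= 1 / min_ratio:
            changed[address] = ratio
    return changed
```

In such a sketch, addresses with a ratio above 1 would correspond to stronger binding in the first sample, and addresses with a ratio below 1 to stronger binding in the second sample.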

[1211] In another aspect, the invention features a method of analyzing 26493, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 26493 nucleic acid or amino acid sequence; and comparing the 26493 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26493.
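As a simplified illustration of comparing a query sequence with a collection of sequences, the following sketch ranks the entries of a small in-memory collection by ungapped percent identity. This is an assumption made for brevity: database searches are ordinarily performed with alignment programs, and the function names and the scoring rule here are hypothetical.

```python
from typing import Dict, List, Tuple


def percent_identity(query: str, subject: str) -> float:
    """Ungapped identity, as a percentage of the shorter sequence's length."""
    n = min(len(query), len(subject))
    if n == 0:
        return 0.0
    # zip stops at the shorter sequence, so positions are compared pairwise from the start.
    matches = sum(1 for q, s in zip(query.upper(), subject.upper()) if q == s)
    return 100.0 * matches / n


def rank_by_identity(query: str, collection: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score every sequence in the collection and return them best match first."""
    scores = [(name, percent_identity(query, seq)) for name, seq in collection.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```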

[1212] Detection of 26493 Variations or Mutations

[1213] The methods of the invention can also be used to detect genetic alterations in a 26493 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 26493 protein activity or nucleic acid expression. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 26493 protein, or the mis-expression of the 26493 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of: 1) a deletion of one or more nucleotides from a 26493 gene; 2) an addition of one or more nucleotides to a 26493 gene; 3) a substitution of one or more nucleotides of a 26493 gene; 4) a chromosomal rearrangement of a 26493 gene; 5) an alteration in the level of a messenger RNA transcript of a 26493 gene; 6) aberrant modification of a 26493 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 26493 gene; 8) a non-wild type level of a 26493 protein; 9) allelic loss of a 26493 gene; and 10) inappropriate post-translational modification of a 26493 protein.

[1214] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 26493 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 26493 gene under conditions such that hybridization and amplification of the 26493 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[1215] In another embodiment, mutations in a 26493 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
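The fragment-length comparison can be sketched in silico as follows. The sketch assumes a linear DNA molecule, a single enzyme given by its recognition site and cut offset within that site, and exact matching of the site; the function name `digest` and the short example sequences used with it are hypothetical and do not correspond to any 26493 sequence.

```python
from typing import List


def digest(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Return sorted fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at which the
    strand is cut (e.g., 1 for an enzyme cutting G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Fragment lengths are the distances between consecutive cut positions.
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def patterns_differ(sample: str, control: str, site: str, cut_offset: int) -> bool:
    """True when sample and control DNA yield different fragment length patterns."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)
```

Under these assumptions, a substitution that destroys a recognition site in the sample yields one longer fragment where the control yields two shorter ones.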

[1216] In other embodiments, genetic mutations in 26493 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the others. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 26493 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 26493 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 26493 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

[1217] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 26493 gene and detect mutations by comparing the sequence of the sample 26493 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[1218] Other methods for detecting mutations in the 26493 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[1219] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 26493 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[1220] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 26493 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 26493 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[1221] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[1222] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[1223] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[1224] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 26493 nucleic acid.
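The degree of complementarity of an oligonucleotide to a target region can be estimated along the following lines. This is a minimal sketch assuming DNA sequences of equal length written 5′ to 3′, antiparallel pairing, and Watson-Crick base pairs only; the function name `percent_complementary` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo: str, target: str) -> float:
    """Percentage of oligo positions that pair with the target region.

    Both sequences are written 5' to 3'; the oligo is assumed to pair
    antiparallel to the target, so it is compared against the reversed target.
    """
    if len(oligo) != len(target) or not oligo:
        raise ValueError("oligo and target must be non-empty and of equal length")
    paired = sum(
        1
        for o, t in zip(oligo.upper(), reversed(target.upper()))
        if COMPLEMENT.get(o) == t
    )
    return 100.0 * paired / len(oligo)
```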

[1225] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO: 16 or the complement of SEQ ID NO: 16. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[1226] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 26493. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[1227] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
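A set of four oligonucleotides differing only at the interrogation position can be generated as in the following sketch. It assumes a reference oligonucleotide sequence and a zero-based interrogation position; the function name `interrogation_set` and the example sequence used with it are hypothetical.

```python
from typing import Dict


def interrogation_set(reference: str, position: int) -> Dict[str, str]:
    """Return four oligonucleotides, one per base, differing only at position.

    position is a zero-based index into the reference oligonucleotide.
    """
    reference = reference.upper()
    if not 0 <= position < len(reference):
        raise ValueError("interrogation position lies outside the oligonucleotide")
    return {
        base: reference[:position] + base + reference[position + 1:]
        for base in "ACGT"
    }
```

Each member of the resulting set could then carry a distinguishable label or occupy a distinct address of an array, so that the member giving the strongest signal indicates the nucleotide present at the interrogation position.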

[1228] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 26493 nucleic acid.

[1229] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 26493 gene.

[1230] Use of 26493 Molecules as Surrogate Markers

[1231] The 26493 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 26493 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 26493 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[1232] The 26493 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 26493 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-26493 antibodies may be employed in an immune-based detection system for a 26493 protein marker, or 26493-specific radiolabeled probes may be used to detect a 26493 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[1233] The 26493 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 26493 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 26493 DNA may correlate with 26493 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[1234] Pharmaceutical Compositions of 26493

[1235] The nucleic acid and polypeptides, fragments thereof, as well as anti-26493 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[1236] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[1237] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[1238] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[1239] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[1240] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[1241] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[1242] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[1243] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[1244] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[1245] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
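The ratio described above can be written out as a short worked example. The function name `therapeutic_index` and the numeric values in the usage note are hypothetical and serve only to show the arithmetic; they are not measurements for any compound of the invention.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e., the ratio LD50/ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50
```

For instance, a hypothetical compound with an LD50 of 500 mg/kg and an ED50 of 25 mg/kg would have a therapeutic index of 20; a larger ratio corresponds to a wider margin between the therapeutically effective dose and the lethal dose.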

[1246] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[1247] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
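The arithmetic implied by a weight-based dosage and a weekly schedule can be sketched as follows. The function names, the example body weight, and the check against the broadest stated range (about 0.001 to 30 mg/kg) are illustrative assumptions; the sketch shows only the calculation and is not a dosing recommendation.

```python
from typing import List

# Broadest range stated for protein or polypeptide, in mg per kg body weight.
MIN_MG_PER_KG = 0.001
MAX_MG_PER_KG = 30.0


def dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-based dosage (mg/kg) into an absolute amount (mg)."""
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dosage lies outside the stated range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def weekly_schedule(dose_mg_per_kg: float, body_weight_kg: float, weeks: int) -> List[float]:
    """Amount (mg) given once per week for the requested number of weeks."""
    if weeks < 1:
        raise ValueError("at least one week of treatment is required")
    return [dose_mg(dose_mg_per_kg, body_weight_kg)] * weeks
```

For a hypothetical 70 kg subject, a dosage of 5 mg/kg corresponds to 350 mg per administration.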

[1248] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[1249] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[1250] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[1251] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[1252] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[1253] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[1254] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[1255] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[1256] Methods of Treatment for 26493

[1257] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 26493 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[1258] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 26493 molecules of the present invention or 26493 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[1259] In one aspect, the invention provides a method for preventing, in a subject, a disease or condition associated with an aberrant or unwanted 26493 expression or activity, by administering to the subject a 26493 or an agent which modulates 26493 expression or at least one 26493 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 26493 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 26493 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 26493 aberrance, for example, a 26493, 26493 agonist or 26493 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[1260] It is possible that some 26493 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[1261] The 26493 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, immune disorders, cardiovascular disorders and brain disorders as described above, as well as disorders associated with bone metabolism, liver disorders, viral diseases, pain or metabolic disorders.

[1262] Aberrant expression and/or activity of 26493 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 26493 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 26493 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 26493 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[1263] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[1264] Additionally, 26493 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 26493 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 26493 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[1265] Additionally, 26493 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[1266] As discussed, successful treatment of 26493 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 26493 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[1267] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[1268] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[1269] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 26493 expression is through the use of aptamer molecules specific for 26493 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 26493 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[1270] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 26493 disorders. For a description of antibodies, see the Antibody section above.

[1271] In circumstances wherein injection of an animal or a human subject with a 26493 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 26493 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 26493 protein. Vaccines directed to a disease characterized by 26493 expression may also be generated in this fashion.

[1272] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[1273] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 26493 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[1274] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 26493 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 26493 can be readily monitored and used in calculations of IC50.
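
As a hedged illustration of how an IC50 might be calculated from measured concentrations, the sketch below interpolates linearly on a log-concentration scale between the two data points that bracket 50% inhibition. The interpolation method, the function name, and the example values are assumptions chosen for simplicity; the description above does not prescribe a particular calculation, and a fitted sigmoidal curve could be substituted.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Assumes inhibition rises with concentration and that the data
    bracket 50% inhibition; both are assumptions of this sketch.
    """
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            # Fraction of the way from the lower point to the upper point.
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical readings: concentration (arbitrary units) vs. % inhibition.
example_ic50 = estimate_ic50([1.0, 100.0], [25.0, 75.0])
```

Interpolating on the logarithm of concentration reflects the common practice of plotting concentration-response data on a log axis; it is a design choice of this example only.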

[1275] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[1276] Another aspect of the invention pertains to methods of modulating 26493 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 26493 or agent that modulates one or more of the activities of 26493 protein associated with the cell. An agent that modulates 26493 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 26493 protein (e.g., a 26493 substrate or receptor), a 26493 antibody, a 26493 agonist or antagonist, a peptidomimetic of a 26493 agonist or antagonist, or other small molecule.

[1277] In one embodiment, the agent stimulates one or more 26493 activities. Examples of such stimulatory agents include active 26493 protein and a nucleic acid molecule encoding 26493. In another embodiment, the agent inhibits one or more 26493 activities. Examples of such inhibitory agents include antisense 26493 nucleic acid molecules, anti-26493 antibodies, and 26493 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 26493 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 26493 expression or activity. In another embodiment, the method involves administering a 26493 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 26493 expression or activity.

[1278] Stimulation of 26493 activity is desirable in situations in which 26493 is abnormally downregulated and/or in which increased 26493 activity is likely to have a beneficial effect. Likewise, inhibition of 26493 activity is desirable in situations in which 26493 is abnormally upregulated and/or in which decreased 26493 activity is likely to have a beneficial effect.

[1279] 26493 Pharmacogenomics

[1280] The 26493 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 26493 activity (e.g., 26493 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders associated with aberrant or unwanted 26493 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 26493 molecule or 26493 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 26493 molecule or 26493 modulator.

[1281] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[1282] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
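
For illustration, the grouping of individuals into genetic categories according to their pattern of SNPs can be sketched as follows. The marker identifiers, subject labels, and genotype calls are hypothetical placeholders, and exact matching of the full pattern is only one possible grouping rule.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose genotype calls match at every listed marker.

    genotypes maps an individual's label to a dict of marker id -> call.
    Returns a dict mapping each observed pattern (a tuple of calls, in
    marker_ids order) to the list of individuals sharing that pattern.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical genotype calls at two bi-allelic markers.
example_genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_3": {"snp_a": "A/G", "snp_b": "C/C"},
}
example_groups = group_by_snp_pattern(example_genotypes, ["snp_a", "snp_b"])
```

Each resulting group could then be examined for a shared drug response before a treatment regimen is tailored to it.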

[1283] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 26493 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
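
One way to determine whether one version of a gene is associated with a particular drug response is a test on a 2x2 table of gene version against response. The sketch below uses a two-sided Fisher exact test; this statistical method, the function name, and the table layout are assumptions introduced for illustration and are not specified in the description above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are gene version 1 and version 2; columns are
    responders and non-responders.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x version-1 responders.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts: 3 responders with version 1, 3 non-responders with version 2.
example_p = fisher_exact_two_sided(3, 0, 0, 3)
```

A small p-value would suggest an association between gene version and response; with counts as small as those in the example, the test has little power.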

[1284] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 26493 molecule or 26493 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[1285] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 26493 molecule or 26493 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[1286] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 26493 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 26493 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[1287] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 26493 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 26493 gene expression, protein levels, or upregulate 26493 activity, can be monitored in clinical trials of subjects exhibiting decreased 26493 gene expression, protein levels, or downregulated 26493 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 26493 gene expression, protein levels, or downregulate 26493 activity, can be monitored in clinical trials of subjects exhibiting increased 26493 gene expression, protein levels, or upregulated 26493 activity. In such clinical trials, the expression or activity of a 26493 gene, and preferably, other genes that have been implicated in, for example, a 26493-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[1288] 26493 Informatics

[1289] The sequence of a 26493 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 26493. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 26493 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[1290] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[1291] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on a computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, presented in the form of an ASCII file, or stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain a computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[1292] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
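The two-table layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (sequences, annotations), the column names, the identifier seq_demo, and the toy sequence and annotations are all hypothetical, and SQLite stands in for the Sybase or Oracle systems named in the text.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a sequence table and an annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequences (
            sequence TEXT NOT NULL,       -- first field: the sequence itself
            seq_id   TEXT PRIMARY KEY     -- second field: identifier for the sequence
        );
        CREATE TABLE annotations (
            seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
            descriptor TEXT NOT NULL,     -- descriptor or annotation text
            start_pos  INTEGER NOT NULL,  -- initial position (1-based)
            end_pos    INTEGER NOT NULL   -- ultimate position (1-based, inclusive)
        );
        """
    )
    return conn


def annotated_regions(conn, seq_id):
    """Return (descriptor, annotated subsequence) pairs for one stored sequence."""
    return conn.execute(
        """
        SELECT a.descriptor,
               substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1)
        FROM annotations AS a
        JOIN sequences AS s ON s.seq_id = a.seq_id
        WHERE a.seq_id = ?
        ORDER BY a.start_pos
        """,
        (seq_id,),
    ).fetchall()


# Hypothetical toy data, not taken from any sequence of the invention.
conn = build_demo_db()
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("ATGGCCAATGTTTCA", "seq_demo"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)", ("seq_demo", "start codon", 1, 3))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)", ("seq_demo", "hypothetical SNP site", 7, 7))
```

Keeping annotations in their own table, keyed by the sequence identifier, lets any number of annotations refer to one stored sequence without duplicating the sequence text.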

[1293] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
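A search of stored sequences for regions that match a target can be illustrated with a simple exact-match scan. This sketch is not a BLAST implementation; it only shows the idea of locating a target within stored sequence information. The function names, identifiers, and toy sequences are hypothetical.

```python
def find_matches(sequence, target):
    """Return the 1-based start position of every exact occurrence of target."""
    positions = []
    start = sequence.find(target)
    while start != -1:
        positions.append(start + 1)
        # Advance by one so that overlapping occurrences are also reported.
        start = sequence.find(target, start + 1)
    return positions


def search_store(store, target):
    """Map each stored identifier to its match positions, omitting sequences with no hit."""
    hits = {}
    for seq_id, sequence in store.items():
        positions = find_matches(sequence.upper(), target.upper())
        if positions:
            hits[seq_id] = positions
    return hits


# Hypothetical toy store of two short nucleotide sequences.
store = {"seq_a": "ATGGCCAATGTTTCA", "seq_b": "CCCCGGGG"}
```

A practical system would typically use an alignment-based tool that tolerates mismatches and gaps; the exact scan above is the simplest case of the same comparison.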

[1294] Thus, in one aspect, the invention features a method of analyzing 26493, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 26493 nucleic acid or amino acid sequence; comparing the 26493 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 26493. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[1295] The method can include evaluating the sequence identity between a 26493 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
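As an illustration of evaluating sequence identity, the sketch below computes percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. The convention used here (identical non-gap positions divided by alignment length) is one common choice and is an assumption of this example; the function name and inputs are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only if the residues match and neither is a gap.
    identical = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * identical / len(aligned_a)
```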

[1296] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
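The relationship between target length and chance occurrence can be made concrete with a simple model. Assuming, purely for illustration, that residues are independent and equally likely, a target of length L matches a given window with probability (1/k)^L, where k is the alphabet size (4 for nucleotides, 20 for amino acids). The function name and the uniform-composition model are assumptions of this sketch, not part of the disclosure.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance exact matches under a uniform, independent residue model."""
    # Number of windows of the target's length that fit in the database.
    windows = max(database_length - target_length + 1, 0)
    return windows * (1.0 / alphabet_size) ** target_length
```

Under this model each additional nucleotide in the target reduces the expected number of chance matches roughly fourfold, which is why longer targets are less likely to occur at random.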

[1297] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[1298] Thus, the invention features a method of making a computer readable record of a 26493 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1299] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 26493 sequence, or record, in machine-readable form; comparing a second sequence to the 26493 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 26493 sequence includes a sequence being compared. In a preferred embodiment the 26493 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 26493 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1300] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, wherein the method comprises the steps of determining 26493 sequence information associated with the subject and based on the 26493 sequence information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[1301] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a disease associated with a 26493 wherein the method comprises the steps of determining 26493 sequence information associated with the subject, and based on the 26493 sequence information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 26493 sequence of the subject to the 26493 sequences in the database to thereby determine whether the subject has a 26493-associated disease or disorder, or a pre-disposition for such.

[1302] The present invention also provides in a network, a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, said method comprising the steps of receiving 26493 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 26493 and/or corresponding to a 26493-associated disease or disorder, and based on one or more of the phenotypic information, the 26493 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1303] The present invention also provides a method for determining whether a subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder, said method comprising the steps of receiving information related to 26493 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 26493 and/or related to a 26493-associated disease or disorder, and based on one or more of the phenotypic information, the 26493 information, and the acquired information, determining whether the subject has a 26493-associated disease or disorder or a pre-disposition to a 26493-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1304] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 58224 Invention

[1305] Helicases are mechanochemical enzymes that couple the energy of nucleoside triphosphate hydrolysis to the dehybridization or unwinding of duplex nucleic acid molecules. Nucleic acid unwinding is of central importance in a variety of nucleic acid processes that include the transcription, translation, recombination, and replication of genetic material. The importance of helicases is further underscored by the large number of DNA or RNA helicases identified in prokaryotic and eukaryotic organisms.

[1306] Helicase domains, for example, “DEAD/H” helicase domains, are found in a number of proteins, including initiation factor eIF-4A, splicing proteins PRP5 and PRP28, and SNF2-domain containing proteins. SNF2-domain containing proteins have been reported to be involved in a variety of processes, including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), maintenance of chromosome stability during mitosis (e.g., lodestar), and various aspects of processing DNA damage, including DNA excision repair (e.g., RAD16 and ERCC6), recombinational pathways (e.g., RAD54) and post-replication daughter strand gap repair (e.g., RAD5) (Eisen, J. A. et al. (1995) Nucleic Acids Res. 23(14):2715-23).

Summary of the 58224 Invention

[1307] The present invention is based, in part, on the discovery of a novel helicase family member, referred to herein as “58224”. The nucleotide sequence of a cDNA encoding 58224 is shown in SEQ ID NO:22, and the amino acid sequence of a 58224 polypeptide is shown in SEQ ID NO:23. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:24 (See Example 15).

[1308] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 58224 protein or polypeptide, e.g., a biologically active portion of the 58224 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:23. In other embodiments, the invention provides isolated 58224 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:22 or SEQ ID NO:24 or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 58224 protein or an active fragment thereof.

[1309] In a related aspect, the invention further provides nucleic acid constructs that include a 58224 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 58224 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 58224 nucleic acid molecules and polypeptides.

[1310] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 58224-encoding nucleic acids.

[1311] In still another related aspect, isolated nucleic acid molecules that are antisense to a 58224 encoding nucleic acid molecule are provided.

[1312] In another aspect, the invention features 58224 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 58224-mediated or -related disorders. In another embodiment, the invention provides 58224 polypeptides having a 58224 activity. Preferred polypeptides are 58224 proteins including at least one helicase domain, e.g., a C-terminal helicase domain, or an SNF2 domain, e.g., an N-terminal SNF2 domain, and, preferably, having a 58224 activity, e.g., a 58224 activity as described herein.

[1313] In other embodiments, the invention provides 58224 polypeptides, e.g., a 58224 polypeptide having the amino acid sequence shown in SEQ ID NO:23, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:23, or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:22, SEQ ID NO:24, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 58224 protein or an active fragment thereof.

[1314] In a related aspect, the invention further provides nucleic acid constructs which include a 58224 nucleic acid molecule described herein.

[1315] In a related aspect, the invention provides 58224 polypeptides or fragments operatively linked to non-58224 polypeptides to form fusion proteins.

[1316] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind to, 58224 polypeptides. In other embodiments, the antibody or antigen-binding fragment thereof reacts with, or more preferably binds specifically to a 58224 polypeptide or a fragment thereof, e.g., a helicase domain, or an SNF2 domain of a 58224 polypeptide. In one embodiment, the antibody or antigen-binding fragment thereof competitively inhibits the binding of a second antibody to its target epitope.

[1317] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 58224 polypeptides or nucleic acids.

[1318] In still another aspect, the invention provides a process for modulating 58224 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions, e.g., disorders or diseases, related to aberrant activity or expression of the 58224 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation, or tumor invasion or metastasis.

[1319] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing or differentiation, of a 58224-expressing cell (e.g., a 58224-expressing hyperproliferative cell), comprising contacting the cell with an agent (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 58224 polypeptide or nucleic acid, thereby inhibiting the proliferation or inducing the killing or differentiation of the 58224-expressing cell.

[1320] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[1321] In a preferred embodiment, the cell, e.g., the hyperproliferative cell is found in a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the tumor is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cell, e.g., the hyperproliferative cell is found in a cancerous or pre-cancerous tissue, e.g., a cancerous or pre-cancerous tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, lung, colon, or brain cancer, or a liver metastasis. Most preferably, the cell, e.g., the hyperproliferative cell is found in a tumor from the breast, ovary, colon, liver or lung.

[1322] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[1323] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[1324] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[1325] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity, e.g., aberrant cellular proliferation or differentiation, of a 58224-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of an agent, e.g., a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 58224 polypeptide or nucleic acid.

[1326] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition. Most preferably, the disorder is a cancer, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is found in a tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, lung, colon, or brain cancer, or a liver metastasis. Most preferably, the cancer is found in the breast, ovary, colon, liver or lung.

[1327] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[1328] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 58224 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[1329] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[1330] The invention also provides assays for determining the activity of or the presence or absence of 58224 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancerous tissue is from the breast, ovary, colon, lung, liver, or brain.

[1331] In a further aspect the invention provides assays for determining the presence or absence of a genetic alteration in a 58224 polypeptide or nucleic acid molecule in a sample, for, e.g., disease diagnosis. Preferably, the sample includes a cancer cell or tissue. For example, the cancer can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is a breast, ovarian, colon, lung, liver, or brain cancer.

[1332] In a still further aspect, the invention provides methods for staging a disorder, or evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder, e.g., a cancer (e.g., a breast, ovarian, colon, liver or lung cancer). The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 58224 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 58224 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder.

[1333] In a preferred embodiment, the disorder is a cancer of the breast, ovary, colon, lung, or liver. The level of 58224 nucleic acid or polypeptide expression can be detected by any method described herein.

[1334] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 58224 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[1335] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 58224 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 58224 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 58224 nucleic acid or polypeptide expression can be detected by any method described herein.

[1336] In a preferred embodiment, the sample includes cells obtained from a cancerous tissue where a 58224 polypeptide or nucleic acid is expressed, e.g., a cancer of the breast, ovary, colon, lung, or liver.

[1337] In a preferred embodiment, the sample is a tissue sample (e.g., a biopsy), a bodily fluid, or cultured cells (e.g., a tumor cell line).

[1338] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 58224 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1339] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 58224 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 58224 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 58224 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[1340] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 58224

[1341] The human 58224 sequence (SEQ ID NO:22), which is approximately 2798 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2514 nucleotides (nucleotides 106-2621 of SEQ ID NO:22; SEQ ID NO:24). The coding sequence encodes an 838 amino acid protein (SEQ ID NO:23).

[1342] Human 58224 contains the following regions or other structural features: nine N-glycosylation sites (PFAM Accession PS00001) located at about amino acid residues 137 to 140, 162 to 165, 389 to 392, 485 to 488, 517 to 520, 641 to 644, 682 to 685, 760 to 763, and 811 to 814 of SEQ ID NO:23; three predicted cAMP- and cGMP-dependent protein kinase phosphorylation sites (PS00004) at about amino acids 103 to 106, 291 to 294, and 509 to 512 of SEQ ID NO:23; ten protein kinase C phosphorylation sites (PS00005) at about amino acids 106 to 108, 190 to 192, 473 to 475, 495 to 497, 505 to 507, 512 to 514, 598 to 600, 652 to 654, 673 to 675, and 793 to 795; 16 predicted casein kinase II sites (PS00006) located at about amino acids 84 to 87, 143 to 146, 305 to 308, 331 to 334, 413 to 416, 419 to 422, 423 to 426, 495 to 498, 519 to 522, 626 to 629, 643 to 646, 650 to 653, 663 to 666, 684 to 687, 793 to 796, and 832 to 835 of SEQ ID NO:23; three tyrosine kinase phosphorylation sites (PS00007) at about amino acids 58 to 65, 128 to 136, and 474 to 480 of SEQ ID NO:23; four predicted N-myristoylation sites (PS00008) from about amino acids 8 to 13, 251 to 256, 678 to 683, and 754 to 759 of SEQ ID NO:23; and one predicted leucine zipper pattern (PS00029) at about amino acids 379 to 400 of SEQ ID NO:23.
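Site predictions of this kind come from scanning a protein sequence for short consensus patterns. As a sketch, the N-glycosylation pattern PS00001 is N-{P}-[ST]-{P}: asparagine, any residue except proline, serine or threonine, then any residue except proline. The code below scans a sequence for that pattern and reports 1-based inclusive spans of four residues, the same form as the positions listed above. The function name and the toy sequence are hypothetical; the toy sequence is not SEQ ID NO:23.

```python
import re

# Lookahead wrapper so that overlapping occurrences of the pattern are all reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def n_glycosylation_sites(protein):
    """Return 1-based inclusive (start, end) spans matching N-{P}-[ST]-{P}."""
    return [
        (match.start() + 1, match.start() + 4)
        for match in N_GLYCOSYLATION.finditer(protein.upper())
    ]


# Hypothetical toy peptide used only to exercise the pattern.
toy_peptide = "MNGSAANPTQNKTV"
```

A pattern match of this kind indicates only a candidate site; whether a site is actually modified depends on the protein's context in the cell.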

[1343] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[1344] A plasmid containing the nucleotide sequence encoding human 58224 (clone Fbh58224FL) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[1345] The 58224 protein contains a significant number of structural characteristics in common with members of the helicase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[1346] The helicase family of proteins comprises a number of related enzymes that share structural homology and a common catalytic mechanism whereby the enzyme converts the energy from NTP, e.g., ATP, hydrolysis into the mechanical energy required for unwinding of nucleic acid duplexes. For example, DEAD-type helicases catalyze the unwinding of ribonucleic acids during RNA splicing. Other RNA helicases include members of the DEAH or DEXH helicase subfamilies. The SNF2 subfamily of helicases or helicase-like proteins comprises proteins that can act as transcriptional regulators for multiple genes. Thus, this family includes enzymes critical for the proper function of many physiological systems, including replication, transcription, and cellular proliferation and differentiation.

[1347] A 58224 polypeptide can include an “SNF2 N-terminal domain” or regions homologous with an “SNF2 N-terminal domain.” As used herein, the term “SNF2 N-terminal domain” refers to a protein domain having an amino acid sequence of about 200 to 500 amino acids and having a bit score for the alignment of the sequence to the SNF2 N-terminal domain (HMM) of at least 150. Preferably, an SNF2 N-terminal domain includes at least about 250 to 450 amino acids, preferably about 300 to 400 amino acid residues, or more preferably about 350 amino acid residues and has a bit score for the alignment of the sequence to the SNF2 N-terminal domain (HMM) of at least about 160, 170, 180, 190, 200, 225, 250, 275, 300, 350 or greater. An alignment of the SNF2 N-terminal domain (amino acids 226 to 577 of SEQ ID NO:23) of human 58224 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 14A.

[1348] In a preferred embodiment, a 58224 polypeptide or protein has an “SNF2 N-terminal domain” or a region which includes at least about 200 to 500 amino acids, preferably about 300 to 400 amino acid residues, or more preferably about 350 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “SNF2 N-terminal domain,” e.g., the SNF2 N-terminal domain of human 58224 (e.g., residues 226-577 of SEQ ID NO:23).

[1349] A 58224 polypeptide can also include a “helicase conserved C-terminal domain” or regions homologous with a “helicase conserved C-terminal domain”. As used herein, the term “helicase conserved C-terminal domain” refers to a protein domain having an amino acid sequence of about 50-200 amino acids and having a bit score for the alignment of the sequence to the helicase conserved C-terminal domain (HMM) of at least 50. Preferably, a helicase conserved C-terminal domain includes at least about 60-150 amino acids, preferably about 70-100 amino acid residues, or more preferably at least about 80 amino acid residues and has a bit score for the alignment of the sequence to the helicase conserved C-terminal domain (HMM) of at least about 60, 70, 80, 90, 95, or greater. An alignment of the helicase conserved C-terminal domain (amino acids 629 to 712 of SEQ ID NO:23) of human 58224 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 14B. Typically, a helicase conserved C-terminal domain can catalyze the separation of two complementary strands of a duplex nucleic acid and/or can catalyze hydrolysis of an NTP (e.g., ATP). For example, a helicase domain can show DNA-dependent ATPase activity.

[1350] In a preferred embodiment, a 58224 polypeptide or protein has a “helicase conserved C-terminal domain” or a region which includes at least about 50-200 amino acids, preferably about 50-100 amino acid residues, or more preferably at least about 80 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “helicase conserved C-terminal domain”, e.g., the helicase conserved C-terminal domain of human 58224 (e.g., residues 629-712 of SEQ ID NO:23).

[1351] To identify the presence of an “SNF2 N-terminal domain” or a “helicase conserved C-terminal domain” in a 58224 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “SNF2 N-terminal domain” in the amino acid sequence of human 58224 at about residues 226-577 of SEQ ID NO:23 (see FIG. 14A) and a “helicase conserved C-terminal domain” in the amino acid sequence of human 58224 at about residues 629-712 of SEQ ID NO:23 (FIG. 14B).
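The domain definitions above combine a length window with a minimum HMM bit score. The following Python sketch illustrates one way such a check could be expressed. The `DomainHit` record, the function name, and the bit scores used in the example are hypothetical (the text does not report the scores obtained for 58224); only the residue ranges, length windows, and minimum scores are taken from the definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One HMM match against a protein sequence (hypothetical record layout)."""
    name: str
    start: int        # first matched residue, 1-based
    end: int          # last matched residue, inclusive
    bit_score: float  # bit score reported by the HMM search

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# (minimum length, maximum length, minimum bit score) per domain definition.
CRITERIA = {
    "SNF2 N-terminal domain": (200, 500, 150.0),
    "helicase conserved C-terminal domain": (50, 200, 50.0),
}


def meets_definition(hit: DomainHit) -> bool:
    """Return True if the hit satisfies both the length and bit-score criteria."""
    min_len, max_len, min_score = CRITERIA[hit.name]
    return min_len <= hit.length <= max_len and hit.bit_score >= min_score


# Residue ranges are from the text; the bit scores are placeholders.
snf2 = DomainHit("SNF2 N-terminal domain", 226, 577, 160.0)
helicase_c = DomainHit("helicase conserved C-terminal domain", 629, 712, 60.0)
print(snf2.length, meets_definition(snf2))
print(helicase_c.length, meets_definition(helicase_c))
```

Because the definitions use the qualifier “about,” the hard cut-offs in this sketch are a simplification.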

[1352] A 58224 family member can include: a SNF2 N-terminal domain, a helicase conserved C-terminal domain, a lymphoid specific lymphocyte specific helicase domain 1 or a lymphoid specific lymphocyte specific helicase domain 2.

[1353] As the 58224 polypeptides of the invention may modulate 58224-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 58224-mediated or related disorders, as described below.

[1354] As used herein, a “58224 activity”, “biological activity of 58224” or “functional activity of 58224”, refers to an activity exerted by a 58224 protein, polypeptide or nucleic acid molecule on e.g., a 58224-responsive cell or on a 58224 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 58224 activity is a direct activity, such as an association with a 58224 target molecule. A “target molecule” or “binding partner” is a molecule with which a 58224 protein binds or interacts in nature. In an exemplary embodiment, the target molecule is a 58224 substrate. A 58224 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 58224 protein with a 58224 substrate. For example, the 58224 proteins of the present invention can have one or more of the following activities: (1) catalyze the separation of two complementary strands of a duplex nucleic acid; (2) bind DNA; (3) bind nucleoside 5′-triphosphate (NTP); (4) modulate replication; (5) modulate recombination; (6) modulate DNA repair; (7) modulate transcription; (8) modulate translation; (9) modulate RNA splicing; (10) modulate nucleic acid metabolism; (11) act as a transcriptional regulator; (12) disrupt mononucleosomal structure; (13) modulate the alteration of chromatin structure; (14) modulate rearrangement of receptor genes encoding lymphocyte antigen receptors; (15) modulate T- and B-cell developmental events; or (16) act as an agonist or antagonist of any of (1)-(15).

[1355] Based on the above-described sequence similarities, the 58224 molecules of the present invention are predicted to have biological activities similar to those of helicase family members, e.g., modulating nearly all metabolic transactions of nucleic acids, e.g., replication, repair, recombination, and transcription of DNA, and splicing of RNA. Thus, the 58224 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with aberrant helicase activity, e.g., aberrant proliferation or differentiation, e.g., Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, cancer and α-thalassemia X-linked mental retardation.

[1356] 58224 mRNA expression is upregulated in some tumor tissues, e.g., in human breast, ovary, lung, and colon tumors, in liver metastases, and in fetal liver (see Example 16). Accordingly, 58224 molecules may act as novel therapeutic and prophylactic agents for controlling disorders or diseases involving aberrant activities of the cells in which these molecules are expressed, and as diagnostic markers useful for indicating the presence or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder. Examples of such disorders include cellular proliferative and/or differentiative disorders, e.g., breast, ovary, lung or colon cancers.

[1357] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of breast, ovary, colon, lung, and liver origin.

[1358] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[1359] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[1360] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[1361] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[1362] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[1363] Examples of cellular proliferative and/or differentiative disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[1364] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[1365] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[1366] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[1367] The 58224 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:23 are collectively referred to as “polypeptides or proteins of the invention” or “58224 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “58224 nucleic acids.” 58224 molecules refer to 58224 nucleic acids, polypeptides, and antibodies.

[1368] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[1369] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with respect to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[1370] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
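The four sets of hybridization and wash conditions listed above can be collected in a simple lookup table. The Python sketch below is illustrative only: the dictionary layout, key names, and the `conditions` helper are assumptions introduced here, while the reagent concentrations and temperatures are transcribed from the preceding paragraph.

```python
# Hybridization and wash conditions transcribed from the text.
# The dictionary layout and key names are illustrative assumptions.
STRINGENCY = {
    "low": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 50,  # can be increased to 55 C for low stringency
        "washes": "two",
    },
    "medium": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 60,
        "washes": "one or more",
    },
    "high": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 65,
        "washes": "one or more",
    },
    "very_high": {
        "hybridization": "0.5 M sodium phosphate, 7% SDS at 65 C",
        "wash": "0.2x SSC, 1% SDS",
        "wash_temp_c": 65,
        "washes": "one or more",
    },
}

# The text names very high stringency as the condition to use unless
# otherwise specified.
DEFAULT_STRINGENCY = "very_high"


def conditions(level=None):
    """Return the conditions for a stringency level, or the default level."""
    return STRINGENCY[level or DEFAULT_STRINGENCY]


print(conditions()["hybridization"])
print(conditions("medium")["wash_temp_c"])
```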

[1371] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[1372] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules that include an open reading frame encoding a 58224 protein, preferably a mammalian 58224 protein, and further can include non-coding regulatory sequences and introns.

[1373] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means preparation of 58224 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-58224 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-58224 chemicals. When the 58224 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[1374] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 58224 (e.g., the sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity of the 58224 protein, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the helicase domain or the SNF2 domain, are predicted to be particularly unamenable to alteration.

[1375] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 58224 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 58224 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 58224 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
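The side chain families listed above can be used to test whether a given substitution is conservative in the sense of this paragraph. The following Python sketch assumes standard one-letter amino acid codes; the function names are hypothetical, and treating an unchanged residue as “not a substitution” is a choice made here rather than something stated in the text.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
FAMILIES = {
    "basic": set("KRH"),              # lysine, arginine, histidine
    "acidic": set("DE"),              # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),      # threonine, valine, isoleucine
    "aromatic": set("YFWH"),          # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(a: str, b: str) -> list:
    """Return the names of the families that contain both residues."""
    return [name for name, members in FAMILIES.items()
            if a in members and b in members]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ but share at least one family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))   # lysine to arginine
print(is_conservative("D", "K"))   # aspartic acid to lysine
print(shared_families("Y", "F"))   # tyrosine and phenylalanine
```

Note that the families overlap (for example, threonine is both uncharged polar and beta-branched), so a pair of residues can share more than one family.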

[1376] As used herein, a “biologically active portion” of a 58224 protein includes a fragment of a 58224 protein that participates in an interaction between a 58224 molecule and a non-58224 molecule. Biologically active portions of a 58224 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 58224 protein, e.g., the amino acid sequence shown in SEQ ID NO:23, which include fewer amino acids than the full length 58224 protein and exhibit at least one activity of a 58224 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 58224 protein, e.g., helicase activity. A biologically active portion of a 58224 protein can be a polypeptide that is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 58224 protein can be used as targets for developing agents that modulate a 58224 mediated activity, e.g., helicase activity.

[1377] Particularly preferred 58224 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:23. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:23 are termed substantially identical.

[1378] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:22 or 24, are termed substantially identical.

[1379] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[1380] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, or 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 58224 amino acid sequence of SEQ ID NO:23 having 838 amino acid residues, at least 251, preferably at least 335, more preferably at least 419, even more preferably at least 503, and even more preferably at least 587, 670, 754, or 838 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
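The two calculations described above can be sketched in Python as follows. Both functions are illustrative assumptions rather than the procedure of any particular program: `percent_identity` divides the identical positions by the number of alignment columns (so each gap position lowers the result), and `min_aligned_residues` rounds to the nearest whole residue, which reproduces the 251, 335, 419, and 503 residue figures given for the 838-residue reference sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (same length, gaps as `gap`).
    Columns containing a gap count toward the denominator but never
    as identical positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment has no columns to compare")
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)


def min_aligned_residues(reference_length: int, percent: int) -> int:
    """Residues corresponding to a percentage of the reference length,
    rounded to the nearest whole residue using integer arithmetic."""
    return (reference_length * percent + 50) // 100


# Toy alignment: three identical columns out of four.
print(percent_identity("MK-L", "MKTL"))

# Aligned-length thresholds for a reference of 838 residues.
for pct in (30, 40, 50, 60):
    print(pct, min_aligned_residues(838, pct))
```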

[1381] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within the invention) is using a BLOSUM 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
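For orientation, the dynamic programming recurrence underlying the Needleman and Wunsch global alignment can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix such as BLOSUM 62 or PAM250, and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Optimal global alignment score with a linear gap penalty.

    Only two rows of the score matrix are kept, so memory use is
    proportional to len(b).
    """
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j gaps.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of `b` costs i gaps.
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = previous[j] + gap        # a[i-1] aligned against a gap
            left = current[j - 1] + gap   # b[j-1] aligned against a gap
            current.append(max(diagonal, up, left))
        previous = current
    return previous[len(b)]


print(needleman_wunsch_score("ACGT", "ACGT"))
print(needleman_wunsch_score("ACGT", "AGT"))
```

A full implementation would also record which of the three moves produced each cell so that the alignment itself, and from it the percent identity, can be recovered by traceback.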

[1382] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Myers and Miller (CABIOS, 4:11-17 (1988)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[1383] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 58224 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 58224 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[1384] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[1385] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[1386] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[1387] Various aspects of the invention are described in further detail below.

[1388] Isolated Nucleic Acid Molecules of 58224

[1389] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 58224 polypeptide described herein, e.g., a full-length 58224 protein or a fragment thereof, e.g., a biologically active portion of a 58224 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 58224 mRNA, or fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[1390] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmids deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the 58224 protein (i.e., “the coding region”) as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:22 (e.g., the sequences corresponding to SEQ ID NO:24) and, e.g., no flanking sequences that normally accompany the subject sequence.

[1391] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, thereby forming a stable duplex.

[1392] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence that is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:22, 24, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. In the case of an isolated nucleic acid molecule which is longer than or equivalent in length to the reference sequence, e.g., SEQ ID NO:22 or 24, the comparison is made with the full length of the reference sequence. Where the isolated nucleic acid molecule is shorter than the reference sequence, e.g., shorter than SEQ ID NO:22 or 24, the comparison is made to a segment of the reference sequence of the same length (excluding any loop required by the homology calculation).

[1393] 58224 Nucleic Acid Fragments

[1394] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 58224 protein, e.g., an immunogenic or biologically active portion of a 58224 protein. A fragment can comprise nucleotides encoding amino acids 226 to 577 of SEQ ID NO:23, which form the SNF2 N-terminal domain, or amino acids 629 to 712 of SEQ ID NO:23, which form the helicase conserved C-terminal domain of human 58224. The nucleotide sequence determined from the cloning of the 58224 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 58224 family members, or fragments thereof, as well as 58224 homologues or fragments thereof, from other species.
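The nucleotide span that encodes a given amino acid range follows from simple codon arithmetic. The Python sketch below assumes coordinates within a coding sequence whose first nucleotide is the first base of codon 1 (as would be the case for a coding region beginning at the start codon); positions within a sequence that also contains 5′ untranslated sequence would be shifted by the length of that untranslated region, which is not stated here. The function name is hypothetical.

```python
from typing import Tuple


def codon_span(first_residue: int, last_residue: int) -> Tuple[int, int]:
    """Return the 1-based, inclusive nucleotide span within a coding
    sequence that encodes residues first_residue..last_residue."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must satisfy 1 <= first <= last")
    start = 3 * (first_residue - 1) + 1  # first base of the first codon
    end = 3 * last_residue               # third base of the last codon
    return start, end


# Residue ranges of the two domains as given in the text.
for label, (first, last) in {
    "SNF2 N-terminal domain": (226, 577),
    "helicase conserved C-terminal domain": (629, 712),
}.items():
    start, end = codon_span(first, last)
    print(label, start, end, end - start + 1)
```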

[1395] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 212 or 273 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[1396] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment also can include one or more domains, regions, or functional sites described herein.

[1397] In a preferred embodiment, the fragment is at least 50, 100, 143, 150, 200, 250, 294, 300, 350, 398, 400, 450, 500, 550, 600, 650, 700, 750, 800, 820, 850, 900, 950, 1000, 1500, 2000, or 2253 nucleotides in length, and hybridizes under a stringent hybridization condition as described herein to a nucleic acid molecule of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[1398] 58224 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringent hybridization condition as described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:22, 24, the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ or a naturally occurring allelic variant or mutant of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[1399] In a preferred embodiment the nucleic acid is a probe that is at least 5 or 10 and less than 500, 300, or 200 base pairs in length, and more preferably is less than 100 or less than 50 base pairs in length. It should be identical, or differ by 1, or less than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences in the alignment from deletions, insertions, or mismatches, are considered differences.

[1400] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid that encodes a helicase domain: amino acids 226 to 577 of SEQ ID NO:23 or 629 to 712 of SEQ ID NO:23.
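The relationship between the amino acid ranges recited above and the nucleotides that encode them is simple arithmetic, sketched below. The sketch assumes that numbering starts at the first nucleotide of the start codon, as would be the case for a sequence consisting only of the coding region; positions within a cDNA that carries a 5′ noncoding region would be shifted by the length of that region, which is not stated in this passage. The function name is hypothetical.

```python
def codon_span(first_residue: int, last_residue: int) -> tuple[int, int]:
    """Return 1-based, inclusive nucleotide coordinates of the codons that
    encode residues first_residue..last_residue.

    Assumes nucleotide 1 is the first base of the start codon.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and non-empty")
    start = 3 * (first_residue - 1) + 1
    end = 3 * last_residue
    return start, end


# The two domain ranges recited for SEQ ID NO:23.
snf2_span = codon_span(226, 577)
c_terminal_span = codon_span(629, 712)
```

Under the stated assumption, residues 226 to 577 correspond to nucleotides 676 to 1731 (1056 nucleotides) and residues 629 to 712 to nucleotides 1885 to 2136 (252 nucleotides) of the coding region.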

[1401] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 58224 sequence, e.g., a region, domain, or site described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100 or 200 base pairs in length. The primers should be identical, or differ by one base from a sequence disclosed herein or from a naturally occurring variant. For example, the primers can be suitable for amplifying all or a portion of a helicase domain: amino acids 226 to 577 of SEQ ID NO:23 or 629 to 712 of SEQ ID NO:23.

[1402] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[1403] A nucleic acid fragment encoding a “biologically active portion of a 58224 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, which encodes a polypeptide having a 58224 biological activity (e.g., the biological activities of the 58224 proteins described herein), expressing the encoded portion of the 58224 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 58224 protein. For example, a nucleic acid fragment encoding a biologically active portion of 58224 includes a helicase domain, e.g., amino acid residues 226 to 577 of SEQ ID NO:23 or residues 629 to 712 of SEQ ID NO:23. A nucleic acid fragment encoding a biologically active portion of a 58224 polypeptide may comprise a nucleotide sequence that is greater than about 80, 100, 200, 300 or more nucleotides in length (e.g., greater than about 350 nucleotides in length).

[1404] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:22 or 24.

[1405] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is at least about 300, 350, 398, 400, 450, 500, 550, 600, 650, 700, 750, 800, 820, 850, 900, 950, 1000, 1500, 2000, 2253, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:22 or 24.

[1406] In a preferred embodiment, a nucleic acid fragment has a nucleotide sequence other than (e.g., differs by one or more nucleotides from) Genbank accession number AK001201.

[1407] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 1794 or nucleotide 2614 to 2798 of SEQ ID NO:22.

[1408] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 2258 or nucleotide 2253 to 2798 of SEQ ID NO:22.

[1409] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 25 or nucleotide 424 to 2798 of SEQ ID NO:22.

[1410] In a preferred embodiment, a nucleic acid fragment includes at least one, preferably more, nucleotides from the sequence of nucleotide 1 to 387 or nucleotide 531 to 2798 of SEQ ID NO:22.

[1411] 58224 Nucleic Acid Variants

[1412] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 58224 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence that differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:23. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.

[1413] Nucleic acids of the invention can be chosen for having codons, which are preferred, or non-preferred, for a particular expression system (e.g., the nucleic acid can be one in which at least one codon, preferably at least 10%, or 20% of the codons has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or Chinese hamster ovary (CHO) cells).
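As a minimal sketch of how the proportion of altered codons might be measured, the code below compares an original coding sequence with a codon-altered variant, confirms that both encode the same amino acids under the standard genetic code, and reports the fraction of codons that differ. The sequences, function names, and the particular synonymous substitutions are hypothetical and are not taken from SEQ ID NO:22 or 24; no claim is made here about which codons a given expression system prefers.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def fraction_codons_altered(original: str, variant: str) -> float:
    """Fraction of codon positions at which the variant differs."""
    a, b = split_codons(original), split_codons(variant)
    if len(a) != len(b):
        raise ValueError("sequences must contain the same number of codons")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


# Hypothetical four-codon example (Met-Leu-Arg-Lys) with two silent changes.
original_cds = "ATGCTTAGAAAA"
variant_cds = "ATGCTGCGTAAA"
same_protein = translate(original_cds) == translate(variant_cds)
altered = fraction_codons_altered(original_cds, variant_cds)
```

Here two of four codons are changed without altering the encoded peptide, so the variant would fall within both the "at least 10%" and "at least 20%" embodiments.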

[1414] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions, and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared with the encoded product).

[1415] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:22 or 24, or the sequence in ATCC Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.
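The counting rule in this paragraph, under which looped-out positions count as differences, can be sketched as follows. The sketch assumes the sequences have already been aligned for maximum homology, with "-" marking a looped-out position; the alignment step itself is not shown. The function name and the example sequences are hypothetical.

```python
def difference_summary(aligned_subject: str, aligned_reference: str) -> tuple[int, float]:
    """Count differences between two pre-aligned sequences.

    A column counts as a difference when the two characters are not
    identical, so mismatches and gap ("-") columns are treated alike.
    Returns the count and that count as a percentage of the number of
    nucleotides in the subject sequence.
    """
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    subject_length = sum(1 for s in aligned_subject if s != "-")
    if subject_length == 0:
        raise ValueError("subject contains no nucleotides")
    differences = sum(
        1 for s, r in zip(aligned_subject, aligned_reference) if s != r
    )
    return differences, 100.0 * differences / subject_length


# Hypothetical example: one mismatch and one looped-out nucleotide.
count, percent = difference_summary("ACGTTACGGA", "ACGATAC-GA")
within_count_limit = 1 <= count < 10
within_percent_limit = percent < 20.0
```

In this example the subject differs at 2 of its 10 nucleotides, which satisfies the "at least one but less than 10" embodiment but not the "less than 20%" embodiment.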

[1416] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:23 or SEQ ID NO:26 or a fragment of this sequence. Such nucleic acid molecules can be obtained as being able to hybridize under a stringent hybridization condition as described herein, to the nucleotide sequence shown in SEQ ID NO:22 or 24 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 58224 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 58224 gene. Preferred variants include those that are correlated with helicase activity, e.g., nucleic acid binding, e.g., RNA binding, activity, or nucleic acid dependent NTPase activity, e.g., RNA dependent ATPase activity.

[1417] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO:23 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:22 or 24 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 58224 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 58224 gene. Preferred variants include those that are correlated with helicase activity, e.g., nucleic acid binding, e.g., RNA binding, activity, or nucleic acid dependent NTPase activity, e.g., RNA dependent ATPase activity.

[1418] Allelic variants of 58224, e.g., human 58224, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 58224 protein within a population that maintain the ability to perform a helicase activity. Functional allelic variants typically will contain only conservative substitution of one or more amino acids of SEQ ID NO:23, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 58224, e.g., human 58224, protein within a population that do not have helicase activity. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:23, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[1419] Moreover, nucleic acid molecules that encode other 58224 family members, and thus have a nucleotide sequence that differs from the 58224 sequences of SEQ ID NO:22, 24, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, are intended to be within the scope of the invention.

[1420] Antisense Nucleic Acid Molecules, Ribozymes and Modified 58224 Nucleic Acid Molecules

[1421] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 58224. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 58224 coding strand, or to only a portion thereof (e.g., the coding region of 58224 corresponding to SEQ ID NO:24). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 58224 (e.g., the 5′ and 3′ untranslated regions).

[1422] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 58224 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 58224 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 58224 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
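A minimal sketch of deriving an antisense oligonucleotide complementary to the region surrounding a translation start site is given below. It assumes the −10 to +10 window is read as the ten nucleotides upstream of the start codon plus the first ten nucleotides beginning at the A of ATG, and it writes the oligonucleotide as DNA. The mRNA sequence shown is invented for illustration and is not the 58224 sequence; the function names are hypothetical.

```python
# Complement table; U in an RNA input is paired with A in the DNA oligo.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a sequence, written as DNA."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def antisense_to_start_site(mrna: str, start_index: int,
                            upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide covering the window around a start codon.

    start_index is the 0-based position of the A of the start codon.
    """
    window_start = start_index - upstream
    window_end = start_index + downstream
    if window_start < 0 or window_end > len(mrna):
        raise ValueError("window extends beyond the sequence")
    return reverse_complement(mrna[window_start:window_end])


# Invented 24-nucleotide sequence with its first ATG at 0-based position 10.
example_mrna = "GGCCGCCACCATGGCGTCTAGCTA"
oligo = antisense_to_start_site(example_mrna, example_mrna.find("ATG"))
```

The resulting oligonucleotide is 20 nucleotides long, within the range of lengths recited above, and is read 5′ to 3′ as the reverse complement of the targeted window.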

[1423] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions with procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[1424] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 58224 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong polymerase II or polymerase III promoter are preferred.

[1425] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[1426] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 58224-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 58224 cDNA disclosed herein (i.e., SEQ ID NO:22 or 24), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 58224-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 58224 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel and Szostak (1993) Science 261:1411-1418.

[1427] 58224 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 58224 gene (e.g., the 58224 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 58224 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14(12):807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1428] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1429] A 58224 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” and “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[1430] PNAs of 58224 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 58224 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1431] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1432] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region that is complementary to a 58224 nucleic acid of the invention. The molecular beacon primer and probe molecules also have two complementary regions, one having a fluorophore and one having a quencher, such that the molecular beacon is useful for quantitating the presence of a 58224 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[1433] Isolated 58224 Polypeptides

[1434] In another aspect, the invention features an isolated 58224 protein or fragment thereof, e.g., a biologically active portion for use as immunogens or antigens to raise or test (or more generally to bind) anti-58224 antibodies. 58224 protein can be isolated from cells or tissue sources using standard protein purification techniques. 58224 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1435] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and posttranslational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same posttranslational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of posttranslational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1436] In a preferred embodiment, a 58224 polypeptide has one or more of the following characteristics:

[1437] (i) it has the ability to promote the separation of two complementary strands of a duplex nucleic acid;

[1438] (ii) it has a molecular weight (e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications), amino acid composition or other physical characteristic of a 58224 polypeptide, e.g., a polypeptide of SEQ ID NO:23;

[1439] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:23;

[1440] (iv) it has a SNF2 N-terminal domain, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 226-577 of SEQ ID NO:23;

[1441] (v) it has a helicase conserved C-terminal domain, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 629-712 of SEQ ID NO:23;

[1442] (vi) it has a lymphoid specific lymphocyte specific helicase domain 1, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 492-590 of SEQ ID NO:23;

[1443] (vii) it has a lymphoid specific lymphocyte specific helicase domain 2, which has sequence similarity of preferably about 70%, 80%, 90% or 95% with amino acid residues 775-838 of SEQ ID NO:23;

[1444] (viii) it has at least 2, preferably 4, and most preferably 9 of the cysteines found in the amino acid sequence of the native protein;

[1445] (ix) it has the ability to bind a nucleic acid, e.g., RNA or DNA; or

[1446] (x) it has an NTPase activity, e.g., a nucleic acid dependent ATPase activity.

[1447] In a preferred embodiment, the 58224 protein or fragment thereof differs from the corresponding sequence in SEQ ID NO:23. In one embodiment, it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another embodiment, it differs from the corresponding sequence in SEQ ID NO:23 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:23. (If this comparison requires alignment, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment, the differences are not in a helicase domain. In another preferred embodiment one or more differences are at non-active site residues, e.g., amino acids 1-226, 578 to 628, or 713 to 838 of SEQ ID NO:23.

[1448] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue that is not essential for activity. Such 58224 proteins differ in amino acid sequence from SEQ ID NO:23, yet retain biological activity.

[1449] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more homologous to SEQ ID NO:23.

[1450] In another embodiment, the protein includes an amino acid sequence at least 106 amino acids in length, and about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% homologous to SEQ ID NO:23.

[1451] In a preferred embodiment, the protein includes an amino acid sequence at least 273 amino acids in length, and about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% homologous to SEQ ID NO:23.

[1452] In another embodiment, a 58224 protein or fragment has an amino acid sequence which differs from the amino acid sequence encoded by the nucleotide sequence of Genbank Accession Number AK001201 or from the amino acid sequence of Genbank Accession Number BAA91550 by at least one, two, three, five or more amino acids. The variations may include the addition, replacement, and/or deletion of amino acid residues.

[1453] In another embodiment, a 58224 protein fragment has an amino acid sequence which contains one, preferably more, residues from the sequence of residues 1-563 or 837-838 of SEQ ID NO:23.

[1454] In another embodiment, a 58224 protein fragment has an amino acid sequence which contains one, preferably more, residues from the sequence of residues 106-838 of SEQ ID NO:23.

[1455] A 58224 protein or fragment is provided which varies from the sequence of SEQ ID NO:23 in non-active site residues by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:23 or SEQ ID NO:26 in regions having a helicase activity. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions, insertions, or mismatches, are considered differences.) In some embodiments, the difference is at a non-essential residue or is a conservative substitution, while in others, the difference is at an essential residue or is a non conservative substitution.

[1456] In one embodiment, a biologically active portion of a 58224 protein includes a helicase domain, e.g., a SNF2 domain, or a conserved C-terminal helicase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 58224 protein.

[1457] In a preferred embodiment, the 58224 protein has an amino acid sequence shown in SEQ ID NO:23. In other embodiments, the 58224 protein is substantially identical to SEQ ID NO:23. In yet another embodiment, the 58224 protein is substantially identical to SEQ ID NO:23 and retains the functional activity of the protein of SEQ ID NO:23, as described in detail in subsection I above. Accordingly, in another embodiment, the 58224 protein is a protein which includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO:23.

[1458] 58224 Chimeric or Fusion Proteins

[1459] In another aspect, the invention provides 58224 chimeric or fusion proteins. As used herein, a 58224 “chimeric protein” or “fusion protein” includes a 58224 polypeptide linked to a non-58224 polypeptide. A “non-58224 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially homologous to the 58224 protein, e.g., a protein that is different from the 58224 protein and that is derived from the same or a different organism. The 58224 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 58224 amino acid sequence. In a preferred embodiment, a 58224 fusion protein includes at least one (e.g., two) biologically active portion of a 58224 protein. The non-58224 polypeptide can be fused to the N-terminus or C-terminus of a 58224 polypeptide.

[1460] The fusion protein can include a moiety that has high affinity for a ligand, e.g., a helicase substrate. For example, the fusion protein can be a GST-58224 fusion protein in which the 58224 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 58224. Alternatively, the fusion protein can be a 58224 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 58224 can be increased through use of a heterologous signal sequence.

[1461] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1462] The 58224 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 58224 fusion proteins can be used to affect the bioavailability of a 58224 substrate. 58224 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example: (i) aberrant modification or mutation of a gene encoding a 58224 protein; (ii) misregulation of the 58224 gene; and (iii) aberrant post-translational modification of a 58224 protein.

[1463] Moreover, 58224-fusion proteins of the invention can be used as immunogens to produce anti-58224 antibodies in a subject, to purify 58224 ligands, and in screening assays to identify molecules that inhibit the interaction of 58224 with a 58224 substrate.

[1464] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 58224-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 58224 protein.

[1465] Variants of 58224 Proteins

[1466] In another aspect, the invention features a variant of a 58224 polypeptide, e.g., a polypeptide that functions as an agonist (mimetic) or as an antagonist of 58224 activities. Variants of the 58224 proteins can be generated by mutagenesis, e.g., discrete point mutations, the insertion or deletion of sequences or the truncation of a 58224 protein. An agonist of the 58224 protein retains substantially the same, or a subset, of the biological activities of the naturally occurring form of a 58224 protein. An antagonist of a 58224 protein can inhibit one or more of the activities of the naturally occurring form of the 58224 protein by, for example, competitively modulating a 58224-mediated activity of a 58224 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 58224 protein.

[1467] Variants of a 58224 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 58224 protein for agonist or antagonist activity.

[1468] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 58224 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 58224 protein.
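The three classes of fragment named above can be enumerated in silico as a minimal sketch of what such a library would contain. The sketch works on an amino acid sequence for brevity and assumes that an N-terminal fragment retains the first residue, a C-terminal fragment retains the last residue, and an internal fragment retains neither. The six-residue sequence, the minimum length, and the function name are hypothetical.

```python
def enumerate_fragments(protein: str, min_length: int) -> dict[str, list[str]]:
    """List proper fragments of a sequence, grouped by class.

    Only fragments shorter than the full sequence and at least
    min_length residues long are returned.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(protein)
    # Fragments that keep the N-terminus (C-terminal truncations).
    n_terminal = [protein[:k] for k in range(min_length, n)]
    # Fragments that keep the C-terminus (N-terminal truncations).
    c_terminal = [protein[n - k:] for k in range(min_length, n)]
    # Fragments that lack both the first and the last residue.
    internal = [
        protein[i:j]
        for i in range(1, n - 1)
        for j in range(i + min_length, n)
    ]
    return {
        "n_terminal": n_terminal,
        "c_terminal": c_terminal,
        "internal": internal,
    }


# Hypothetical six-residue sequence, fragments of at least four residues.
library = enumerate_fragments("MKTAYI", 4)
```

For a full-length protein the number of internal fragments grows roughly with the square of the sequence length, which is one reason experimental libraries of this kind are screened rather than examined clone by clone.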

[1469] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[1470] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property, are known. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with screening assays to identify 58224 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[1471] Cell-based assays can be exploited to analyze a variegated 58224 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 58224 in a substrate-dependent manner. The transfected cells are then contacted with 58224, and the effect of the expression of the mutant on signaling by a 58224 substrate can be detected, e.g., by measuring helicase activity, e.g., a helicase activity described herein. Plasmid DNA can then be recovered from the cells that score for inhibition, or alternatively, potentiation of signaling by the 58224 substrate, and the individual clones further characterized.

[1472] In another aspect, the invention features a method of making a 58224 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 58224 polypeptide. The method includes: altering the sequence of a 58224 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain, or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1473] In another aspect, the invention features a method of making a fragment or analog of a 58224 polypeptide that retains at least one biological activity of a naturally occurring 58224 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 58224 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1474] Anti-58224 Antibodies

[1475] In another aspect, the invention provides an anti-58224 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1476] The anti-58224 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1477] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 Kd or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 Kd or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1478] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 58224 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-58224 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1479] The anti-58224 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1480] Phage display and combinatorial methods for generating anti-58224 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1481] In one embodiment, the anti-58224 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1482] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[1483] An anti-58224 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1484] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1485] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 58224 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[1486] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
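
As an illustrative sketch only, the consensus rule described above can be written in Python. The function name `consensus_sequence`, the toy input sequences, and the alphabetical tie-break are assumptions made for this example; the definition above permits either residue when two occur equally frequently, and it presupposes sequences that have already been aligned to equal length.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Return the most frequent residue at each position of pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break (an assumption): take the alphabetically first residue
        # among those that occur equally often, so the result is deterministic.
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)


# Hypothetical toy family, not drawn from any sequence disclosed herein.
family = ["EVQLVESG", "QVQLVQSG", "EVQLLESG"]
print(consensus_sequence(family))  # prints EVQLVESG
```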

[1487] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 58224 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1488] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See, e.g., Winter U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1489] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1490] In preferred embodiments an antibody can be made by immunizing with purified 58224 antigen, or a fragment thereof, e.g., a fragment described herein.

[1491] A full-length 58224 protein or an antigenic peptide fragment of 58224 can be used as an immunogen or can be used to identify anti-58224 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 58224 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:23 or SEQ ID NO:26 and encompass an epitope of 58224. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[1492] Fragments of 58224 which include residues about 150-160 can be used to make antibodies against hydrophilic regions of the 58224 protein, e.g., they can be used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 58224 which include residues about 255-265 can be used to make an antibody against a hydrophobic region of the 58224 protein; a fragment of 58224 which includes residues about 226 to 577 can be used to make an antibody against the SNF2 region of the 58224 protein; and a fragment of 58224 which includes residues about 629 to 712 can be used to make an antibody against the C-terminal conserved helicase domain of the 58224 protein.

[1493] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1494] Antibodies which bind only native 58224 protein, only denatured or otherwise non-native 58224 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 58224 protein.

[1495] Preferred epitopes encompassed by the antigenic peptide are regions of 58224 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 58224 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 58224 protein and are thus likely to constitute surface residues useful for targeting antibody production.
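
The following Python sketch illustrates, under stated assumptions, how hydrophilic stretches of a protein sequence might be flagged as candidate surface regions. It does not implement the Emini surface probability analysis named above; it substitutes a simpler sliding-window average over the Kyte-Doolittle hydropathy scale. The window length of 9, the threshold of -1.0, the function names, and the toy sequence are hypothetical choices for this example only.

```python
# Kyte-Doolittle hydropathy values: positive = hydrophobic, negative = hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophilic_windows(seq, window=9, threshold=-1.0):
    """1-based inclusive (start, end) spans whose mean hydropathy is below threshold."""
    profile = hydropathy_profile(seq, window)
    return [(i + 1, i + window) for i, mean in enumerate(profile) if mean < threshold]


# Toy sequence, not taken from SEQ ID NO:23 or SEQ ID NO:26.
print(hydrophilic_windows("IIIIKKKK", window=4))  # prints [(4, 7), (5, 8)]
```

Spans reported this way are only a first filter; whether a region is in fact exposed or antigenic would still need to be assessed experimentally.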

[1496] In preferred embodiments antibodies can bind one or more of purified antigen; tissue, e.g., tissue sections; whole cells, preferably living cells; lysed cells; cell fractions.

[1497] The anti-58224 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 58224 protein.

[1498] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[1499] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1500] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radionuclide or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1501] An anti-58224 antibody (e.g., monoclonal antibody) can be used to isolate 58224 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-58224 antibody can be used to detect 58224 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-58224 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1502] The invention also includes a nucleic acid that encodes an anti-58224 antibody, e.g., an anti-58224 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[1503] The invention also includes cell lines, e.g., hybridomas, which make an anti-58224 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-58224 antibody.

[1504] 58224 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1505] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1506] A vector can include a 58224 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 58224 proteins, mutant forms of 58224 proteins, fusion proteins, and the like).

[1507] The recombinant expression vectors of the invention can be designed for expression of 58224 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1508] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[1509] Purified fusion proteins can be used in 58224 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 58224 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six (6) weeks).

[1510] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
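
The codon-alteration strategy described above can be sketched in Python as a synonymous codon substitution. This is a minimal illustration, not a disclosed procedure: the function name `recode`, the toy coding sequence, and the small substitution table are hypothetical. The table lists a few codons generally regarded as rare in E. coli together with a synonymous replacement; a real design would draw on a complete codon usage table for the chosen host strain.

```python
# Hypothetical, partial map from a codon to a synonymous codon assumed to be
# preferred in E. coli. Every pair encodes the same amino acid, so the
# translated protein is unchanged.
PREFERRED_SYNONYM = {
    "CTA": "CTG",  # Leu
    "AGA": "CGT",  # Arg
    "AGG": "CGT",  # Arg
    "ATA": "ATT",  # Ile
    "GGA": "GGC",  # Gly
    "CCC": "CCG",  # Pro
}


def recode(cds, preferred=PREFERRED_SYNONYM):
    """Replace each codon that has an entry in `preferred`; leave the others as-is."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)


# Toy coding sequence (Met-Leu-Arg-Ile-Gly-stop), not derived from 58224.
print(recode("ATGCTAAGAATAGGATAA"))  # prints ATGCTGCGTATTGGCTAA
```

Because only synonymous codons are exchanged, the recoded sequence has the same length and encodes the same polypeptide as the input.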

[1511] The 58224 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1512] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1513] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[1514] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986.

[1515] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 58224 nucleic acid molecule within a recombinant expression vector or a 58224 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1516] A host cell can be any prokaryotic or eukaryotic cell. For example, a 58224 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[1517] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[1518] A host cell of the invention can be used to produce (i.e., express) a 58224 protein. Accordingly, the invention further provides methods for producing a 58224 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 58224 protein has been introduced) in a suitable medium such that a 58224 protein is produced. In another embodiment, the method further includes isolating a 58224 protein from the medium or the host cell.

[1519] In another aspect, the invention features a cell or purified preparation of cells which include a 58224 transgene, or which otherwise misexpress 58224. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 58224 transgene, e.g., a heterologous form of a 58224, e.g., a gene derived from humans (in the case of a non-human cell). The 58224 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene which misexpresses an endogenous 58224, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders which are related to mutated or mis-expressed 58224 alleles or for use in drug screening.

[1520] In another aspect, the invention features a human cell, e.g., a lymphoid cell, transformed with nucleic acid which encodes a subject 58224 polypeptide.

[1521] Also provided are cells, preferably human cells, e.g., human lymphoid or fibroblast cells, in which an endogenous 58224 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 58224 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 58224 gene. For example, an endogenous 58224 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[1522] 58224 Transgenic Animals

[1523] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 58224 protein and for identifying and/or evaluating modulators of 58224 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 58224 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[1524] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 58224 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 58224 transgene in its genome and/or expression of 58224 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 58224 protein can further be bred to other transgenic animals carrying other transgenes.

[1525] 58224 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue-specific promoter, e.g., a milk- or egg-specific promoter, and the protein is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[1526] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[1527] Uses of 58224

[1528] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: (a) screening assays; (b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and (c) methods of treatment (e.g., therapeutic and prophylactic). The isolated nucleic acid molecules of the invention can be used, for example, to express a 58224 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 58224 mRNA (e.g., in a biological sample) or a genetic alteration in a 58224 gene, and to modulate 58224 activity, as described further below. The 58224 proteins can be used to treat disorders characterized by insufficient or excessive production of a 58224 substrate or production of 58224 inhibitors. In addition, the 58224 proteins can be used to screen for naturally occurring 58224 substrates, to screen for drugs or compounds that modulate 58224 activity, as well as to treat disorders characterized by insufficient or excessive production of 58224 protein or production of 58224 protein forms which have decreased, aberrant or unwanted activity compared to 58224 wild type protein (e.g., imbalance of helicase activity, leading to an increase or decrease in cell proliferation, differentiation, or neoplastic transformation). Moreover, the anti-58224 antibodies of the invention can be used to detect and isolate 58224 proteins, regulate the bioavailability of 58224 proteins, and modulate 58224 activity.

[1529] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 58224 polypeptide is provided. The method includes: contacting the compound with the subject 58224 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind, to form a complex with, or to enzymatically act upon, the subject 58224 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with a subject 58224 polypeptide. It can also be used to find natural or synthetic inhibitors of a subject 58224 polypeptide. Screening methods are discussed in more detail below.

[1530] 58224 Screening Assays

[1531] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) that bind to 58224 proteins, have a stimulatory or inhibitory effect on, for example, 58224 expression or 58224 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 58224 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 58224 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[1532] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 58224 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 58224 protein or polypeptide or a biologically active portion thereof.

[1533] In any screening assay, a 58224 polypeptide that may have, e.g., a helicase domain, can be used.

[1534] Assays for helicase activity can include nucleic acid binding activity, e.g., RNA or DNA binding activity; NTPase activity, e.g., RNA or DNA dependent ATPase activity; nucleic acid unwinding activity, e.g., RNA or DNA unwinding activity. Such assays are described in, e.g., Luking et al. (1998) Crit. Rev. Biochem. and Mol. Biol. 33:259-296, and/or in the references cited therein.

[1535] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries [libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive] (see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[1536] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[1537] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria or spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[1538] In one embodiment, an assay is a cell-based assay in which a cell that expresses a 58224 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 58224 activity is determined. Determining the ability of the test compound to modulate 58224 activity can be accomplished by monitoring, for example, helicase activity, e.g., a helicase activity described herein. The cell, for example, can be of mammalian origin, e.g., human.

[1539] The ability of the test compound to modulate 58224 binding to a compound, e.g., a 58224 substrate, or to bind to 58224 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 58224 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 58224 can be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 58224 binding to a 58224 substrate in a complex. For example, compounds (e.g., 58224 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[1540] The ability of a compound (e.g., a 58224 substrate or modulator) to interact with 58224 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 58224 without the labeling of either the compound or 58224. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 58224.

[1541] In yet another embodiment, a cell-free assay is provided in which a 58224 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 58224 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 58224 proteins to be used in assays of the present invention include fragments that participate in interactions with non-58224 molecules, e.g., fragments with high surface probability scores.

[1542] Soluble and/or membrane-bound forms of isolated proteins (e.g., 58224 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[1543] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[1544] Assays in which the ability of an agent to block helicase activity within a cell is evaluated can also be used.

[1545] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
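By way of illustration only, the distance dependence noted above is commonly modeled by the Förster relation, in which transfer efficiency falls with the sixth power of the donor-acceptor separation. The relation itself, the function name, and the parameter names below are assumptions introduced for this sketch and are not taken from the cited patents.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency under the Forster relation.

    E = 1 / (1 + (r / R0)**6), where r is the donor-acceptor distance and
    R0 (the Forster radius) is the distance at which E equals 0.5.
    Both parameters are illustrative; R0 depends on the chosen label pair.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

Under this model, a bound pair held closer than the Förster radius gives an efficiency above 0.5, while labels separated by several radii give an efficiency near zero, which is why acceptor emission is expected to be maximal when binding occurs.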

[1546] In another embodiment, determining the ability of the 58224 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[1547] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[1548] It may be desirable to immobilize either 58224, an anti-58224 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 58224 protein, or interaction of a 58224 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/58224 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 58224 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 58224 binding or activity determined using standard techniques.

[1549] Other techniques for immobilizing either a 58224 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 58224 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[1550] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[1551] In one embodiment, this assay is performed utilizing antibodies reactive with 58224 protein or target molecules but which do not interfere with binding of the 58224 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 58224 protein is trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 58224 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 58224 protein or target molecule.

[1552] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci Aug; 18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit Winter; 11 (1-6):141-8; Hage, D. S., and Tweed, S. A. (1997) J. Chromatogr B. Biomed Sci Appl Oct 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[1553] In a preferred embodiment, the assay includes contacting the 58224 protein or biologically active portion thereof with a known compound which binds 58224 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 58224 protein, wherein determining the ability of the test compound to interact with a 58224 protein includes determining the ability of the test compound to preferentially bind to 58224 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[1554] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 58224 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 58224 protein through modulation of the activity of a downstream effector of a 58224 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[1555] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), e.g., a substrate, a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
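By way of illustration only, the comparison between the control reaction and the reaction containing the test compound can be expressed as a percent inhibition of complex formation. The function name, the background-subtraction step, and the example signal values are assumptions made for this sketch; the disclosure does not prescribe a particular readout or formula.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal: complex signal without the test compound (or with placebo).
    test_signal:    complex signal in the presence of the test compound.
    background:     hypothetical blank reading subtracted from both.
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = test_signal - background
    return 100.0 * (1.0 - test / control)
```

Running the same calculation on reactions prepared with normal and with mutant target gene product gives two values whose difference indicates whether a compound preferentially disrupts the mutant interaction.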

[1556] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[1557] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[1558] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes that have formed remain immobilized on the solid surface. In assays where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. In assays where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[1559] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound. Reaction products are separated from unreacted components and complexes detected using, for example, an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[1560] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[1561] In yet another aspect, the 58224 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 58224 (“58224-binding proteins” or “58224-bp”) and are involved in 58224 activity. Such 58224-bps can be activators or inhibitors of signals by the 58224 proteins or 58224 targets as, for example, downstream elements of a 58224-mediated signaling pathway.

[1562] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 58224 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence from a library of DNA sequences that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively the 58224 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact in vivo and form a 58224-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 58224 protein.

[1563] In another embodiment, modulators of 58224 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 58224 mRNA or protein evaluated relative to the level of expression of 58224 mRNA or protein in the absence of the candidate compound. When expression of 58224 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 58224 mRNA or protein expression. Alternatively, when expression of 58224 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 58224 mRNA or protein expression. The level of 58224 mRNA or protein expression can be determined by methods described herein for detecting 58224 mRNA or protein.
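By way of illustration only, the classification of a candidate compound as a stimulator or inhibitor of expression can be sketched with replicate expression measurements and an exact permutation test. The choice of test, the 0.05 threshold, and all names below are assumptions made for this sketch; the disclosure requires only that the difference be statistically significant.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    observed = abs(mean(treated) - mean(untreated))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, untreated, alpha=0.05):
    """Label a compound from expression levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so at least four replicates per condition are needed for this particular test to reach the 0.05 threshold.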

[1564] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 58224 protein can be confirmed in vivo, e.g., in an animal model.

[1565] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 58224 modulating agent, an antisense 58224 nucleic acid molecule, a 58224-specific antibody, or a 58224-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[1566] 58224 Detection Assays

[1567] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 58224 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[1568] 58224 Chromosome Mapping

[1569] The 58224 nucleotide sequences or portions thereof can be used to map the location of the 58224 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 58224 sequences with genes associated with disease.

[1570] Briefly, 58224 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 58224 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 58224 sequences will yield an amplified fragment.
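By way of illustration only, candidate primers in the preferred 15-25 bp range can be enumerated from a nucleotide sequence by sliding a fixed-length window along it. The GC-content limits, the function names, and the toy sequence in the usage note are assumptions made for this sketch and are not part of the disclosure.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_primers(seq: str, length: int = 20,
                      gc_min: float = 0.4, gc_max: float = 0.6):
    """Return (start, primer) pairs for windows passing a GC-content filter.

    length is restricted to the 15-25 bp range preferred in the text;
    gc_min and gc_max are hypothetical screening thresholds.
    """
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bp")
    seq = seq.upper()
    primers = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        gc_fraction = (window.count("G") + window.count("C")) / length
        if gc_min <= gc_fraction <= gc_max:
            primers.append((start, window))
    return primers
```

For example, `candidate_primers("ATGC" * 6)` scans a 24-base toy sequence; a reverse primer would be taken from `reverse_complement` of a downstream window.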

[1571] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes and a full set of mouse chromosomes, allows easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
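By way of illustration only, assignment of a gene to a chromosome from a hybrid panel can be sketched as a concordance check: the gene is assigned to the human chromosome whose presence or absence in each hybrid line matches the PCR result for that line. The panel contents, line names, and function name below are hypothetical.

```python
def concordant_chromosomes(hybrid_panel, pcr_positive):
    """Chromosomes whose retention pattern matches the PCR results.

    hybrid_panel: dict mapping hybrid cell line name to the set of human
                  chromosomes that line retains.
    pcr_positive: dict mapping the same names to True when an amplified
                  fragment was obtained from that line.
    """
    all_chromosomes = set().union(*hybrid_panel.values())
    matches = []
    for chromosome in sorted(all_chromosomes):
        if all((chromosome in retained) == pcr_positive[line]
               for line, retained in hybrid_panel.items()):
            matches.append(chromosome)
    return matches


# Hypothetical four-line panel and PCR results.
panel = {
    "hybrid_A": {"1", "7"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X"},
}
results = {"hybrid_A": True, "hybrid_B": True,
           "hybrid_C": False, "hybrid_D": False}
print(concordant_chromosomes(panel, results))  # ['7']
```

A panel resolves a gene to a single chromosome only if no two chromosomes share the same retention pattern across the lines, which is why lines carrying a single or a small number of human chromosomes are convenient.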

[1572] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 58224 to a chromosomal location.

[1573] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[1574] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[1575] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[1576] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 58224 gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[1577] 58224 Tissue Typing

[1578] 58224 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., by electrophoresis and Southern blotted, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[1579] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 58224 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[1580] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:22 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers, which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:24 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[1581] If a panel of reagents from 58224 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[1582] Use of Partial 58224 Sequences in Forensic Biology

[1583] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen, found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[1584] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:22 (e.g., fragments derived from the noncoding regions of SEQ ID NO:22 and having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[1585] The 58224 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue, e.g., a tissue containing 58224 helicase activity. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 58224 probes can be used to identify tissue by species and/or by organ type.

[1586] In a similar fashion, these reagents, e.g., 58224 primers or probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[1587] Predictive Medicine of 58224

[1588] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[1589] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene that encodes 58224. Such disorders include, e.g., a disorder associated with the misexpression of 58224.

[1590] The method includes one or more of the following:

[1591] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 58224 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[1592] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 58224 gene;

[1593] detecting, in a tissue of the subject, the misexpression of the 58224 gene at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[1594] detecting, in a tissue of the subject, the misexpression of the gene at the protein level, e.g., detecting a non-wild type level of a 58224 polypeptide.

[1595] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 58224 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[1596] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence that hybridizes to a sense or antisense sequence from SEQ ID NO:22, 24, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 58224 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[1597] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 58224 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 58224.

[1598] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[1599] In preferred embodiments the method includes determining the structure of a 58224 gene, an abnormal structure being indicative of risk for the disorder.

[1600] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 58224 protein or a nucleic acid that hybridizes specifically with the gene. This and other embodiments are discussed below.

[1601] Diagnostic and Prognostic Assays of 58224

[1602] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 58224 molecules and for identifying variations and mutations in the sequence of 58224 molecules.

[1603] Expression Monitoring and Profiling:

[1604] The presence, level, or absence of 58224 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 58224 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 58224 protein such that the presence of 58224 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 58224 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 58224 genes; measuring the amount of protein encoded by the 58224 genes; or measuring the activity of the protein encoded by the 58224 genes.

[1605] The level of mRNA corresponding to the 58224 gene in a cell can be determined both by in situ and by in vitro formats.

[1606] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 58224 nucleic acid, such as the nucleic acid of SEQ ID NO:22 or 24, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 58224 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[1607] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 58224 genes.

[1608] The level of mRNA in a sample that is encoded by 58224 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
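
The primer geometry described above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be illustrated with a minimal Python sketch. The function names, the exact-match search, and the hard range limits are assumptions made for illustration only; they are not part of the described methods, and real primer design would also consider melting temperature, specificity, and secondary structure.

```python
# Hypothetical helper names; exact string matching stands in for primer annealing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def flanked_region_length(template, forward, reverse):
    """Length of the plus-strand region lying between the two primer sites.

    The forward primer is matched directly on the plus strand; the reverse
    primer anneals to the plus strand, so its reverse complement is searched
    for downstream of the forward primer. Returns None if either site is absent.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    forward_end = start + len(forward)
    reverse_site = template.find(reverse_complement(reverse), forward_end)
    if reverse_site < 0:
        return None
    return reverse_site - forward_end


def primer_pair_in_range(template, forward, reverse,
                         primer_len=(10, 30), region_len=(50, 200)):
    """Check primer lengths and flanked-region length against the stated ranges."""
    for primer in (forward, reverse):
        if not primer_len[0] <= len(primer) <= primer_len[1]:
            return False
    length = flanked_region_length(template, forward, reverse)
    return length is not None and region_len[0] <= length <= region_len[1]
```

The ranges are passed as parameters because the text gives them only as approximate ("about") values.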

[1609] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 58224 gene being analyzed.

[1610] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 58224 mRNA, or genomic DNA, and comparing the presence of 58224 mRNA or genomic DNA in the control sample with the presence of 58224 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 58224 transcript levels.

[1611] A variety of methods can be used to determine the level of protein encoded by 58224. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[1612] The detection methods can be used to detect 58224 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 58224 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 58224 protein include introducing into a subject a labeled anti-58224 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-58224 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[1613] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 58224 protein, and comparing the presence of 58224 protein in the control sample with the presence of 58224 protein in the test sample.

[1614] The invention also includes kits for detecting the presence of 58224 in a biological sample. For example, the kit can include a compound or agent capable of detecting 58224 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 58224 protein or nucleic acid.

[1615] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[1616] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[1617] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 58224 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[1618] In one embodiment, a disease or disorder associated with aberrant or unwanted 58224 expression or activity is identified. A test sample is obtained from a subject and 58224 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 58224 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 58224 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[1619] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 58224 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein.

[1620] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 58224 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 58224 (e.g., other genes associated with a 58224-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
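
A table of such data records might be laid out as in the following minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the relational database environments named above; the table name, column names, and helper functions are hypothetical and are chosen only to mirror the record contents described in the preceding paragraph (an expression value plus sample descriptors, with room for genes other than 58224).

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database with one table of expression data records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            record_id        INTEGER PRIMARY KEY,
            sample_id        TEXT NOT NULL,   -- identifier of the sample
            subject_id       TEXT,            -- subject the sample came from
            diagnosis        TEXT,            -- optional descriptor
            treatment        TEXT,            -- optional descriptor
            gene             TEXT NOT NULL,   -- e.g. '58224' or another gene
            expression_level REAL NOT NULL    -- value representing expression
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one data record; descriptor fields other than sample_id are optional."""
    conn.execute(
        "INSERT INTO expression_record "
        "(sample_id, subject_id, diagnosis, treatment, gene, expression_level) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def levels_for_gene(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene, ordered by sample."""
    cursor = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cursor.fetchall()
```

Storing one row per sample and gene, rather than one column per gene, is a design choice of this sketch; it lets records for additional genes be added without altering the table.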

[1621] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 58224 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cell proliferation or differentiation disorder, e.g., Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, cancer or α-thalassemia X-linked mental retardation, in a subject wherein altered 58224 expression is an indication that the subject has or is disposed to having a cell proliferation or differentiation disorder as described herein. The method can be used to monitor a treatment for a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[1622] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 58224 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
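
The favorable-evaluation criterion in the preceding paragraph can be sketched as follows. The paragraph does not fix a similarity measure, so Euclidean distance between profiles is an assumption of this sketch, as are the function name and the representation of each profile as a list of expression values in a shared gene order.

```python
import math


def compound_evaluated_favorably(contacted, uncontacted, target):
    """Return True if the contacted-cell profile is closer to the target profile
    than the uncontacted-cell profile is.

    Each argument is a sequence of expression values in the same gene order.
    Euclidean distance is an assumed similarity measure, not one named in the text.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```

For example, with a target profile of `[1.0, 2.0]`, a contacted-cell profile of `[1.0, 1.0]` lies closer to the target than an uncontacted-cell profile of `[4.0, 5.0]`, so the compound would be evaluated favorably under this assumed measure.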

[1623] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 58224 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
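
The distance-vector metric mentioned above can be written out as follows. The symbols are introduced here for illustration and are not used elsewhere in this document: s denotes the subject expression profile, r_k the k-th reference profile, and n the number of values in each profile. Reading "length" as the Euclidean norm is an assumption.

```latex
d(\mathbf{s}, \mathbf{r}_k)
  = \lVert \mathbf{s} - \mathbf{r}_k \rVert
  = \sqrt{\sum_{i=1}^{n} \left( s_i - r_{k,i} \right)^2},
\qquad
k^{*} = \arg\min_{k} \, d(\mathbf{s}, \mathbf{r}_k)
```

Under this reading, step d) corresponds to selecting the reference profile with index k*, i.e., the one whose distance to the subject profile is smallest.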

[1624] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[1625] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 58224 expression.
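
The steps recited in the preceding paragraph might be sketched in Python as follows. The in-memory dictionary standing in for the database of reference profiles, the function names, and the use of Euclidean distance as the comparison score are all assumptions of this sketch; the paragraph itself does not prescribe a data format or a scoring function.

```python
import math


def comparison_scores(subject, references):
    """Score the subject profile against every reference profile.

    `subject` maps gene name -> expression value; `references` maps a
    reference identifier -> a profile of the same form. Only genes present
    in both profiles are compared. Lower scores mean greater similarity.
    """
    scores = {}
    for ref_id, reference in references.items():
        shared = sorted(set(subject) & set(reference))
        if "58224" not in shared:
            raise ValueError("profiles must include a value for 58224")
        subject_vec = [subject[gene] for gene in shared]
        reference_vec = [reference[gene] for gene in shared]
        scores[ref_id] = math.dist(subject_vec, reference_vec)
    return scores


def select_matching_reference(subject, references):
    """Return the identifier of the reference profile most similar to the subject."""
    scores = comparison_scores(subject, references)
    return min(scores, key=scores.get)
```

`comparison_scores` corresponds to option ii) of the paragraph and `select_matching_reference` to option i); the latter is built on the former so that both use the same assumed measure.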

[1626] 58224 Arrays and Uses Thereof

[1627] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 58224 molecule (e.g., a 58224 nucleic acid or a 58224 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[1628] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 58224 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 58224. Each address of the subset can include a capture probe that hybridizes to a different region of a 58224 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 58224 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 58224 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 58224 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[1629] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[1630] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 58224 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 58224 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-58224 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[1631] In another aspect, the invention features a method of analyzing the expression of 58224. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 58224-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[1632] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 58224. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 58224. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
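
As one illustration of the clustering step, the following minimal sketch groups genes by single-linkage hierarchical clustering of their expression values across samples. The correlation-based distance (1 minus the Pearson coefficient), the merge threshold, and all function names are assumptions of this sketch; the text names clustering families but does not specify a distance measure or stopping rule.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def cluster_genes(profiles, max_distance=0.2):
    """Single-linkage agglomerative clustering of genes.

    `profiles` maps gene name -> list of expression values, one per sample.
    Distance between genes is 1 - Pearson r; the two closest clusters are
    merged repeatedly until the closest pair is farther than `max_distance`.
    """
    clusters = [[gene] for gene in sorted(profiles)]

    def linkage(a, b):
        return min(1 - pearson(profiles[g], profiles[h]) for g in a for h in b)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


def coregulated_with(profiles, gene, max_distance=0.2):
    """Genes that fall in the same cluster as `gene` (candidate co-regulated genes)."""
    for cluster in cluster_genes(profiles, max_distance):
        if gene in cluster:
            return [g for g in cluster if g != gene]
    return []
```

Genes returned by `coregulated_with` are only candidates: sharing a cluster reflects similar expression across the samples analyzed, which is why the text conditions this use on a sufficient number of diverse samples.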

[1633] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 58224 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[1634] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
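
The comparison of agent-contacted cells with like cells not contacted with the agent could be sketched as below. The log2 fold-change measure, the default threshold, and the function name are assumptions introduced for illustration; the text does not specify how a change in expression is to be quantified.

```python
import math


def off_target_changes(treated, untreated, target_gene, min_abs_log2_fc=1.0):
    """Report genes other than the target whose expression differs between
    agent-contacted and uncontacted cells.

    `treated` and `untreated` map gene name -> expression level. A gene is
    reported when the absolute log2 ratio of treated to untreated level is
    at least `min_abs_log2_fc` (an assumed, adjustable threshold).
    """
    changes = {}
    for gene, treated_level in treated.items():
        if gene == target_gene or gene not in untreated:
            continue
        untreated_level = untreated[gene]
        if treated_level <= 0 or untreated_level <= 0:
            continue  # log ratio is undefined for non-positive levels
        log2_fc = math.log2(treated_level / untreated_level)
        if abs(log2_fc) >= min_abs_log2_fc:
            changes[gene] = log2_fc
    return changes
```

Genes reported by this function would be starting points for investigating the molecular basis of an undesirable effect, not evidence of one on their own.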

[1635] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 58224-associated disease or disorder; and processes, such as a cellular transformation associated with a 58224-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 58224-associated disease or disorder.

[1636] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 58224) that could serve as a molecular target for diagnosis or therapeutic intervention.

[1637] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 58224 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 58224 polypeptide or fragment thereof. For example, multiple variants of a 58224 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the address of the plurality can be disposed on the array.

[1638] The polypeptide array can be used to detect a 58224 binding compound, e.g., an antibody in a sample from a subject with specificity for a 58224 polypeptide or the presence of a 58224-binding protein or ligand.

[1639] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 58224 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[1640] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 58224 or from a cell or subject in which a 58224-mediated response has been elicited, e.g., by contact of the cell with 58224 nucleic acid or protein, or administration to the cell or subject of 58224 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 58224 (or does not express as highly as in the case of the 58224 positive plurality of capture probes) or from a cell or subject in which a 58224-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 58224 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[1641] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 58224 or from a cell or subject in which a 58224-mediated response has been elicited, e.g., by contact of the cell with 58224 nucleic acid or protein, or administration to the cell or subject of 58224 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 58224 (or does not express as highly as in the case of the 58224 positive plurality of capture probes) or from a cell or subject in which a 58224-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[1642] In another aspect, the invention features a method of analyzing 58224, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 58224 nucleic acid or amino acid sequence; and comparing the 58224 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 58224.
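
A comparison of a query sequence against a collection can be illustrated by the following minimal sketch. It assumes, for simplicity, that the sequences have already been aligned to equal length without gaps; a practical comparison would instead use an alignment program. The sequences shown are short placeholders, not 58224 sequences, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def rank_by_identity(query, collection):
    """Rank the named sequences of a collection by identity to the query, best first."""
    scores = [(name, percent_identity(query, seq)) for name, seq in collection.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Placeholder query and collection (assumptions for illustration only).
query = "ACGTACGTAC"
collection = {
    "seqC": "TTTTTTTTTT",
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTTCGTAC",
}
ranking = rank_by_identity(query, collection)
```

The ranking places the identical placeholder sequence first, followed by the sequence with a single mismatch.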

[1643] Detection of 58224 Variations or Mutations

[1644] The methods of the invention can also be used to detect genetic alterations in a 58224 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 58224 protein activity or nucleic acid expression, such as a cell proliferation or differentiation disorder, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 58224 protein, or the mis-expression of the 58224 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 58224 gene; 2) an addition of one or more nucleotides to a 58224 gene; 3) a substitution of one or more nucleotides of a 58224 gene; 4) a chromosomal rearrangement of a 58224 gene; 5) an alteration in the level of a messenger RNA transcript of a 58224 gene; 6) aberrant modification of a 58224 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 58224 gene; 8) a non-wild type level of a 58224 protein; 9) allelic loss of a 58224 gene; and 10) inappropriate post-translational modification of a 58224 protein.

[1645] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 58224 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 58224 gene under conditions such that hybridization and amplification of the 58224 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[1646] In another embodiment, mutations in a 58224 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence-specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
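
The fragment length comparison can be sketched in silico as follows. This is an illustration under stated assumptions: the DNA is treated as linear, the two sequences are short placeholders rather than 58224 sequences, and the function name `digest` is hypothetical. The recognition site GAATTC with a cut after the first base corresponds to EcoRI.

```python
def digest(seq, site, cut_offset=0):
    """Return the fragment lengths (longest first) from cutting linear DNA.

    site: recognition sequence; cut_offset: cut position within the site.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)

# Placeholder control, and a sample in which a point mutation (A to G)
# destroys the second recognition site.
control = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
sample = "AAAA" + "GAATTC" + "CCCC" + "GAGTTC" + "TT"

control_fragments = digest(control, "GAATTC", cut_offset=1)
sample_fragments = digest(sample, "GAATTC", cut_offset=1)
pattern_differs = control_fragments != sample_fragments
```

The loss of one site in the placeholder sample merges two fragments into one longer fragment, which is the kind of difference that gel electrophoresis would reveal.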

[1647] In other embodiments, genetic mutations in 58224 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 58224 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 58224 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 58224 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
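
A linear array of sequential overlapping probes can be illustrated with the following minimal sketch. The reference sequence is a short placeholder, not a 58224 sequence, and the probe length, step, and function names are assumptions chosen only to keep the example small.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def tiled_probes(reference, probe_len=5, step=1):
    """Return (address, probe) pairs that tile the reference with overlapping probes.

    Each probe is complementary to the window of the reference starting at
    its address, so consecutive probes overlap by probe_len - step bases.
    """
    last_start = len(reference) - probe_len
    return [
        (start, reverse_complement(reference[start:start + probe_len]))
        for start in range(0, last_start + 1, step)
    ]

# Placeholder reference (assumption for illustration only).
reference = "ATGCGTAC"
probes = tiled_probes(reference)
```

With an eight-base placeholder and five-base probes, four addresses are produced, each shifted by one base from the previous one.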

[1648] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 58224 gene and detect mutations by comparing the sequence of the sample 58224 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[1649] Other methods for detecting mutations in the 58224 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[1650] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so-called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 58224 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[1651] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 58224 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA: 86:2766, see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 58224 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[1652] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[1653] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[1654] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[1655] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 58224 nucleic acid.

[1656] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:22 or 24, or the complement of SEQ ID NO:22 or 24. Different locations can be different but overlapping or nonoverlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[1657] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 58224. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[1658] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
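
A set of four oligonucleotides differing only at the interrogation position can be generated as in the following minimal sketch. The template sequence and the chosen position are placeholders, not taken from SEQ ID NO:22 or 24, and the function name is hypothetical.

```python
def interrogation_set(oligo, position):
    """Return four oligonucleotides, one per base, differing only at `position`.

    position is a zero-based index into the oligonucleotide.
    """
    if not 0 <= position < len(oligo):
        raise ValueError("interrogation position lies outside the oligonucleotide")
    return {
        base: oligo[:position] + base + oligo[position + 1:]
        for base in "ACGT"
    }

# Placeholder template with the interrogation position at index 3.
oligo_set = interrogation_set("GATTACA", 3)
```

Each member of the set could then be given a distinguishable label or placed at a different address, so that the member producing a signal indicates the nucleotide present at the interrogated site.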

[1659] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 58224 nucleic acid.

[1660] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 58224 gene.

[1661] Use of 58224 Molecules as Surrogate Markers

[1662] The 58224 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 58224 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 58224 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[1663] The 58224 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 58224 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-58224 antibodies may be employed in an immune-based detection system for a 58224 protein marker, or 58224-specific radiolabeled probes may be used to detect a 58224 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[1664] The 58224 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 58224 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 58224 DNA may correlate with drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[1665] Pharmaceutical Compositions of 58224

[1666] The nucleic acid and polypeptides, fragments thereof, as well as anti-58224 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[1667] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[1668] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[1669] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[1670] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[1671] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[1672] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[1673] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[1674] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[1675] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[1676] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
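
The ratio defined above can be worked through with a short sketch. The LD50 and ED50 values below are hypothetical numbers chosen only to show the arithmetic; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, LD50 / ED50, for doses in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Hypothetical doses in mg/kg: lethal to 50% at 500, effective in 50% at 25.
index = therapeutic_index(ld50=500.0, ed50=25.0)
```

With these hypothetical values the index is 20, meaning the median lethal dose is twenty times the median effective dose; a compound with a larger index would be preferred under the criterion stated above.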

[1677] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[1678] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
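For illustration, a weight-based dose and a once-weekly schedule of the kind described above can be worked out as in the following sketch. The function names and the example subject are hypothetical; the sketch performs arithmetic only and does not take into account the clinical factors noted above.

```python
def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-based dose (mg/kg) into an absolute dose (mg)."""
    if dose_mg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def weekly_schedule(dose_mg_per_kg: float, body_weight_kg: float, weeks: int):
    """Return (week number, dose in mg) pairs for once-weekly dosing."""
    if weeks < 1:
        raise ValueError("weeks must be at least 1")
    per_dose = total_dose_mg(dose_mg_per_kg, body_weight_kg)
    return [(week, per_dose) for week in range(1, weeks + 1)]


# Hypothetical subject: 70 kg, 5 mg/kg once per week for 4 weeks.
schedule = weekly_schedule(5.0, 70.0, 4)
```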

[1679] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[1680] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[1681] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 μg/kg to about 500 mg/kg, about 100 μg/kg to about 5 mg/kg, or about 1 μg/kg to about 50 μg/kg). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[1682] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP; cisplatin)), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[1683] The conjugates of the invention can be used for modifying a given biological response. The drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[1684] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[1685] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[1686] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[1687] Methods of Treatment for 58224

[1688] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 58224 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[1689] It is possible that some 58224 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms. Relevant disorders can include cell proliferation or differentiation disorders, e.g., cancer (e.g., breast, ovary, lung, or colon cancer, or liver metastasis), Werner syndrome, Bloom syndrome, Cockayne's syndrome, xeroderma pigmentosum, lymphoid proliferative diseases, α-thalassemia X-linked mental retardation, or another cell proliferation or differentiation disorder as described herein.

[1690] As the 58224 molecules are expressed in breast, ovary, colon, and lung tumors, in liver metastases, and in fetal liver, these molecules can be used diagnostically and therapeutically to treat/diagnose proliferative diseases of the breast, ovary, colon, lung, and liver, as described herein above.

[1691] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics as described below.

[1692] The 58224 molecules can act as novel diagnostic targets and therapeutic agents for controlling cellular proliferative and/or differentiative disorders, which have been described above, as well as disorders associated with bone metabolism, brain disorders, the immune system, the cardiovascular system, the liver, viral diseases, and pain.

[1693] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and focal cerebral ischemia (infarction from obstruction of local blood supply), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[1694] Aberrant expression and/or activity of 58224 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 58224 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 58224 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 58224 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[1695] The 58224 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[1696] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[1697] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, α1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[1698] Additionally, 58224 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 58224 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of virally infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 58224 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[1699] Additionally, 58224 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[1700] In one aspect, the invention provides a method for preventing, in a subject, a disease or condition associated with an aberrant or unwanted 58224 expression or activity, by administering to the subject 58224 or an agent that modulates 58224 expression or at least one 58224 activity. Subjects at risk for a disease that is caused or contributed to by aberrant or unwanted 58224 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 58224 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 58224 aberrance, for example, a 58224 agonist or 58224 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[1701] It is possible that some 58224 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[1702] As discussed above, successful treatment of 58224 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using assays described above, that exhibit negative modulatory activities can be used in accordance with the invention to prevent and/or ameliorate symptoms of 58224 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[1703] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[1704] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[1705] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 58224 expression is through the use of aptamer molecules specific for 58224 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. 1997 Curr. Opin. Chem Biol. 1(1): 5-9; and Patel, D. J. 1997 Curr Opin Chem Biol Jun; 1(1):32-46). Since nucleic acid molecules may, in many cases, be more conveniently introduced into target cells than therapeutic protein molecules, aptamers offer a method by which 58224 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[1706] Antibodies can be generated that are both specific for target gene products and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 58224 disorders. For a description of antibodies, see the Antibody section above.

[1707] In circumstances wherein injection of an animal or a human subject with a 58224 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 58224 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. 1999 Ann Med 31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. 1998 Cancer Treat Res 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 58224 protein. Vaccines directed to a disease characterized by 58224 expression may also be generated in this fashion.

[1708] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[1709] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 58224 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[1710] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 and the ED50 as described above in the Pharmaceutical Composition section.

[1711] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. A compound that is able to modulate 58224 activity is used as a template or “imprinting molecule,” to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 58224 can be readily monitored and used in calculations of IC50.

[1712] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[1713] Another aspect of the invention pertains to methods of modulating 58224 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 58224 or an agent that modulates one or more of the activities of 58224 protein associated with the cell. An agent that modulates 58224 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 58224 protein (e.g., a 58224 substrate or receptor), a 58224 antibody, a 58224 agonist or antagonist, a peptidomimetic of a 58224 agonist or antagonist, or other small molecule.

[1714] In one embodiment, the agent stimulates one or more 58224 activities. Examples of such stimulatory agents include active 58224 protein and a nucleic acid molecule encoding 58224. In another embodiment, the agent inhibits one or more 58224 activities. Examples of such inhibitory agents include antisense 58224 nucleic acid molecules, anti-58224 antibodies, and 58224 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 58224 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up-regulates or down-regulates) 58224 expression or activity. In another embodiment, the method involves administering a 58224 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 58224 expression or activity.

[1715] Stimulation of 58224 activity is desirable in situations in which 58224 is abnormally down-regulated and/or in which increased 58224 activity is likely to have a beneficial effect. Likewise, inhibition of 58224 activity is desirable in situations in which 58224 is abnormally up-regulated and/or in which decreased 58224 activity is likely to have a beneficial effect.

[1716] 58224 Pharmacogenomics

[1717] The 58224 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 58224 activity (e.g., 58224 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders associated with aberrant or unwanted 58224 activity (e.g., hyperproliferative disorders, e.g., cancer). In conjunction with such treatment, pharmacogenomics may be considered. “Pharmacogenomics,” as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 58224 molecules of the present invention or 58224 modulators according to that individual's drug response genotype.

[1718] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally occurring polymorphisms.

[1719] Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 58224 molecule or 58224 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 58224 molecule or 58224 modulator.

[1720] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
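
The grouping step described above can be sketched as follows. This is a minimal illustration only: the function name, marker names, subject identifiers, and alleles are hypothetical, and a real study would use far more markers and a statistical test of association rather than exact pattern matching.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group subject IDs by their alleles at the chosen SNP markers."""
    groups = defaultdict(list)
    for subject_id, alleles in genotypes.items():
        # The "pattern" is the tuple of alleles observed at the selected markers.
        pattern = tuple(alleles[marker] for marker in markers)
        groups[pattern].append(subject_id)
    return dict(groups)


# Hypothetical bi-allelic genotypes: marker name -> observed allele.
genotypes = {
    "subject_1": {"snp_a": "A", "snp_b": "C"},
    "subject_2": {"snp_a": "G", "snp_b": "C"},
    "subject_3": {"snp_a": "A", "snp_b": "C"},
}

groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, subjects in groups.items():
    print(pattern, subjects)
```

Each resulting group collects subjects sharing one SNP pattern, which is the unit to which a treatment regimen could then be tailored.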

[1721] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 58224 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[1722] Alternatively, a method termed “gene expression profiling,” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 58224 molecule or 58224 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[1723] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 58224 molecule or 58224 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[1724] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 58224 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 58224 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., cancer cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[1725] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 58224 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 58224 gene expression, protein levels, or up-regulate 58224 activity, can be monitored in clinical trials of subjects exhibiting decreased 58224 gene expression, protein levels, or down-regulated 58224 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 58224 gene expression, protein levels, or down-regulate 58224 activity, can be monitored in clinical trials of subjects exhibiting increased 58224 gene expression, protein levels, or up-regulated 58224 activity. In such clinical trials, the expression or activity of a 58224 gene, and preferably, other genes that have been implicated in, for example, a 58224-associated disorder can be used as a “read out” or marker of the phenotype of a particular cell.

[1726] 58224 Informatics

[1727] The sequence of a 58224 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 58224 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 58224 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[1728] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[1729] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[1730] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
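
As a minimal sketch of the two-table layout described above, the following uses Python's built-in sqlite3 module in place of Sybase or Oracle. The table names, column names, identifier, descriptor text, and the short example sequence are all illustrative assumptions and are not taken from the specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequences (
        residues TEXT NOT NULL,          -- first column: the sequence itself
        seq_id   TEXT PRIMARY KEY        -- second column: its identifier
    );
    CREATE TABLE annotations (
        seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
        descriptor TEXT NOT NULL,        -- annotation text
        start_pos  INTEGER NOT NULL,     -- 1-based initial position
        end_pos    INTEGER NOT NULL      -- 1-based ultimate position
    );
    """
)

# Hypothetical placeholder data, not a real 58224 sequence.
conn.execute(
    "INSERT INTO sequences (residues, seq_id) VALUES (?, ?)",
    ("ATGGCCAAGTGA", "example_seq"),
)
conn.execute(
    "INSERT INTO annotations (seq_id, descriptor, start_pos, end_pos) "
    "VALUES (?, ?, ?, ?)",
    ("example_seq", "placeholder region", 4, 9),
)

# Join the tables and extract the annotated span (SQLite substr is 1-based).
rows = conn.execute(
    """
    SELECT a.descriptor,
           substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
    FROM annotations AS a
    JOIN sequences AS s ON s.seq_id = a.seq_id
    """
).fetchall()
print(rows)
```

Keeping annotations in their own table keyed by the sequence identifier lets one sequence carry any number of annotated regions without duplicating the sequence text.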

[1731] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
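
A search of stored sequence information for a target motif can be sketched as below. This is an exact or regular-expression match, which is a deliberately simpler stand-in for a BLAST search; the function name and the example sequence and motif are hypothetical.

```python
import re


def find_motif(sequence, motif_regex):
    """Return 1-based start positions of every match, including overlaps."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile(f"(?=({motif_regex}))")
    return [match.start() + 1 for match in pattern.finditer(sequence)]


# Hypothetical stored sequence and target motif.
positions = find_motif("GATATATC", "ATA")
print(positions)
```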

[1732] Thus, in one aspect, the invention features a method of analyzing 58224, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 58224 nucleic acid or amino acid sequence; and comparing the 58224 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 58224. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[1733] The method can include evaluating the sequence identity between a 58224 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
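
One simple way to evaluate sequence identity is sketched below. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical positions over all compared columns; this is only one of several conventions, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```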

[1734] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
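
The relationship between target length and chance occurrence can be illustrated with a rough estimate. The sketch assumes that residues in the database are independent and equally frequent, which real sequences are not, so the numbers are indicative only; the function name and parameters are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance exact matches of a target in a random database."""
    # Number of windows of the target's length that fit in the database.
    positions = max(database_length - target_length + 1, 0)
    # Each window matches with probability alphabet_size ** -target_length.
    return positions * alphabet_size ** (-target_length)


# Nucleotide alphabet (4 letters) against a hypothetical 10**9-base database.
short_target = expected_random_hits(6, 10**9, 4)
long_target = expected_random_hits(30, 10**9, 4)
print(short_target, long_target)
```

Under these assumptions a six-nucleotide target is expected to occur by chance many thousands of times in a large database, while a thirty-nucleotide target is expected to occur far less than once.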

[1735] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[1736] Thus, the invention features a method of making a computer readable record of a 58224 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1737] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 58224 sequence, or record, in machine-readable form; and comparing a second sequence to the 58224 sequence, thereby analyzing a sequence. Comparison can include comparing two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 58224 sequence includes a sequence being compared. In a preferred embodiment the 58224 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 58224 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[1738] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, wherein the method comprises the steps of determining 58224 sequence information associated with the subject and based on the 58224 sequence information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[1739] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a disease associated with a 58224, wherein the method comprises the steps of determining 58224 sequence information associated with the subject, and based on the 58224 sequence information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, and comparing the 58224 sequence of the subject to the 58224 sequences in the database to thereby determine whether the subject has a 58224-associated disease or disorder, or a pre-disposition for such.

[1740] The present invention also provides, in a network, a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, said method comprising the steps of receiving 58224 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 58224 and/or corresponding to a 58224-associated disease or disorder (e.g., a cell proliferation or differentiation disorder, e.g., cancer, e.g., breast, ovary, lung, or colon cancer, or liver metastasis; Werner syndrome; Bloom syndrome; Cockayne's syndrome; xeroderma pigmentosum; lymphoid proliferative diseases; α-thalassemia X-linked mental retardation; or another cell proliferation or differentiation disorder as described herein), and based on one or more of the phenotypic information, the 58224 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1741] The present invention also provides a method for determining whether a subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder, said method comprising the steps of receiving information related to 58224 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 58224 and/or related to a 58224-associated disease or disorder, and based on one or more of the phenotypic information, the 58224 information, and the acquired information, determining whether the subject has a 58224-associated disease or disorder or a pre-disposition to a 58224-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[1742] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 46980 Invention

[1743] Carboxylesterases are a structurally related family of proteins. Carboxylesterases have been classified into three categories (A, B and C) on the basis of differential patterns of inhibition by organophosphates (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). Members of the type B carboxylesterase subfamily include, for example, mammalian cholinesterases and mammalian bile salt activated lipases. Generally, enzymatically active carboxylesterases have a conserved catalytic triad of active site residues. However, some carboxylesterases lack at least one member of the triad and may not retain catalytic function. The neuroligins are among such carboxylesterases.

[1744] Neuroligins are cell surface molecules composed of approximately five domains: an N-terminal signal sequence, which can be cleaved, a large extracellular domain homologous to carboxylesterases, a linker domain between the transmembrane region and the carboxylesterase homology domain, a transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Sequence comparisons place the neuroligins in the large family of esterase homology domain proteins that includes thyroglobulin, acetylcholinesterase, and gliotactin. However, neuroligins are only distantly related to these proteins, and thus appear to form a unique subset of the esterase family. At least three neuroligins have been cloned from rat, namely Neuroligins 1, 2 and 3 (Ichtchenko, K. et al. (1995) Cell 81:435-443; Ichtchenko, K. et al. (1996) supra). These three neuroligins are expressed at high levels in the brain, primarily in neurons. Neuroligins have been shown to mediate cell adhesion events associated with neuronal development and/or maintenance. For example, Neuroligin 1 has been found to be enriched in postsynaptic densities where it may recruit receptors, channels, and signal transduction molecules at synaptic sites of cell adhesion (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5).

[1745] Functionally, neuroligins bind tightly, in a calcium-dependent manner, to the extracellular domains of the polymorphic cell surface proteins known as β-neurexins. Neurexins are neuronal cell surface proteins that exhibit a high degree of diversity (Ushkaryov et al. (1994) J. Biol. Chem. 269:11987-11992). Neuroligin-β-neurexin interactions have been implicated in mediating recognition processes between neurons that give rise to neuronal developmental events such as synaptogenesis (e.g., specification of excitatory synapses) (Brose, N. (1999) Naturwissenschaften 86(11):516-24).

Summary of the 46980 Invention

[1746] The present invention is based, in part, on the discovery of a novel neuroligin family member, referred to herein as “46980”. The nucleotide sequence of a cDNA encoding 46980 is shown in SEQ ID NO:27, and the amino acid sequence of a 46980 polypeptide is shown in SEQ ID NO:28. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:29. See, e.g., Example 19, below.

[1747] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 46980 protein or polypeptide, e.g., a biologically active portion of the 46980 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:28. In other embodiments, the invention provides isolated 46980 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 46980 protein or an active fragment thereof.

[1748] In a related aspect, the invention further provides nucleic acid constructs that include a 46980 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 46980 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 46980 nucleic acid molecules and polypeptides.

[1749] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 46980-encoding nucleic acids.

[1750] In still another related aspect, isolated nucleic acid molecules that are antisense to a 46980 encoding nucleic acid molecule are provided.

[1751] In another aspect, the invention features 46980 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 46980-mediated or -related disorders. In another embodiment, the invention provides 46980 polypeptides having a 46980 activity. Preferred polypeptides are 46980 proteins including at least one carboxylesterase domain, and, preferably, having a 46980 activity, e.g., a neurexin-binding or other 46980 activity described herein.

[1752] In other embodiments, the invention provides 46980 polypeptides, e.g., a 46980 polypeptide having the amino acid sequence shown in SEQ ID NO:28 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:28 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:27, SEQ ID NO:29, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 46980 protein or an active fragment thereof.

[1753] In a related aspect, the invention further provides nucleic acid constructs which include a 46980 nucleic acid molecule described herein.

[1754] In a related aspect, the invention provides 46980 polypeptides or fragments operatively linked to non-46980 polypeptides to form fusion proteins.

[1755] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 46980 polypeptides or fragments thereof, e.g., an extracellular domain of a 46980 polypeptide. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 46980 polypeptide or a fragment thereof, e.g., an extracellular domain of a 46980 polypeptide such as a neurexin binding domain or a carboxylesterase domain.

[1756] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46980 polypeptides or nucleic acids.

[1757] In still another aspect, the invention provides a process for modulating 46980 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 46980 polypeptides or nucleic acids, such as conditions involving a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[1758] The invention also provides assays for determining the activity of or the presence or absence of 46980 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[1759] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46980 polypeptides or nucleic acids, e.g., agents (e.g., compounds) that modulate the normal pain response, aberrant or altered pain response, inflammatory response, or fertility (e.g., spermatid activity).

[1760] In a preferred embodiment, the effect of an agent, e.g., compound, on the pain response is evaluated by an analgesic test, e.g., the hot plate test, tail flick test, writhing test, paw pressure test, an electric stimulation test, tail withdrawal test, or formalin test.

[1761] In a preferred embodiment, the compound inhibits a 46980 activity.

[1762] In another preferred embodiment, the agent, e.g., compound, modulates endogenous levels of a 46980 ligand, e.g., a neurexin, e.g., a β-neurexin.

[1763] In still another aspect, the invention provides a process for modulating 46980 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant, e.g., decreased or increased expression of the 46980 polypeptides or nucleic acids, such as conditions involving pain response, aberrant or altered pain response, pain related disorders, or fertility.

[1764] In still another aspect, the invention features a method of modulating (e.g., enhancing or inhibiting) the pain response. The method includes contacting a cell with an agent that modulates the activity or expression of a 46980 polypeptide or nucleic acid, in an amount effective to modulate the pain response.

[1765] In a preferred embodiment, the agent modulates (e.g., increases or decreases) a 46980 activity, e.g., a protein-interaction function of a 46980 polypeptide, e.g., neurexin binding, e.g., β-neurexin binding.

[1766] In a preferred embodiment, the agent, e.g., the compound, modulates (e.g., increases or decreases) expression of the 46980 nucleic acid by, e.g., modulating transcription, mRNA stability, mRNA nuclear export, splicing, and so forth.

[1767] In preferred embodiments, the agent, e.g., the compound, is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1768] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1769] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent.

[1770] In a preferred embodiment, the cell, e.g., the 46980-expressing cell, is a central or peripheral nervous system cell, e.g., a cell in an area involved in pain control, e.g., a cell in the substantia gelatinosa of the spinal cord, or a cell in the periaqueductal gray matter.

[1771] In a preferred embodiment, the agent and the 46980-polypeptide or nucleic acid are contacted in vitro or ex vivo.

[1772] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. The contacting step(s) can be repeated.

[1773] Preferably, the subject is a human, e.g., a patient with pain or a pain-associated disorder disclosed herein. For example, the subject can be a patient with pain elicited from tissue injury, e.g., inflammation, infection, ischemia; pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; or chest pain. The subject can be a patient with complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia. The subject can be a cancer patient, e.g., a patient with brain cancer, bone cancer, or prostate cancer. In other embodiments, the subject is a non-human animal, e.g., an experimental animal, e.g., an arthritic rat model of chronic pain, a chronic constriction injury (CCI) rat model of neuropathic pain, or a rat model of unilateral inflammatory pain by intraplantar injection of Freund's complete adjuvant (FCA).

[1774] In preferred embodiments, the agent, e.g., the compound, is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1775] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1776] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent. The administration of the agent and/or protein can be repeated.

[1777] In still another aspect, the invention features a method of modulating (e.g., enhancing or inhibiting) spermatid activity or fertility. The method includes contacting a cell with an agent that modulates the activity or expression of a 46980 polypeptide or nucleic acid, in an amount effective to modulate fertility or spermatid activity.

[1778] In a preferred embodiment, the agent modulates (e.g., increases or decreases) a 46980 ligand binding activity, e.g., neurexin binding activity. In another preferred embodiment, the agent modulates (e.g., increases or decreases) expression of the 46980 nucleic acid by, e.g., modulating transcription, mRNA stability, mRNA nuclear export, splicing, and so forth.

[1779] In preferred embodiments, the agent is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1780] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1781] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent.

[1782] In a preferred embodiment, the cell, e.g., the 46980-expressing cell, is a cell of the male reproductive system, e.g., a spermatid cell.

[1783] In a preferred embodiment, the agent and the 46980-polypeptide or nucleic acid are contacted in vitro or ex vivo.

[1784] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. Preferably, the subject is a human, e.g., a patient with infertility. The subject can be a cancer patient, e.g., a patient with prostate cancer. In other embodiments, the subject is a non-human animal, e.g., an experimental animal, e.g., a rodent model for infertility. In still other embodiments, the subject is a human or non-human animal for which contraception is intended. The contacting step(s) can be repeated.

[1785] In preferred embodiments, the agent is a peptide, a phosphopeptide, a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody can be conjugated to a therapeutic moiety selected from the group consisting of a cytotoxin, a cytotoxic agent and a radioactive metal ion.

[1786] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1787] In a preferred embodiment, the agent is administered in combination with a cytotoxic agent.

[1788] The administration of the agent and/or protein can be repeated.

[1789] In still another aspect, the invention features a method for evaluating the efficacy of a treatment of a disorder, e.g., a disorder disclosed herein, in a subject. The method includes treating a subject with a protocol under evaluation; assessing the expression of a 46980 nucleic acid or 46980 polypeptide, such that a change in the level of 46980 nucleic acid or 46980 polypeptide after treatment, relative to the level before treatment, is indicative of the efficacy of the treatment of the disorder.
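The before-and-after comparison described above can be illustrated with a minimal Python sketch. The function name, the use of fractional change as the metric, and the example values are illustrative assumptions only; the method itself does not prescribe a particular metric or threshold.

```python
def relative_change(level_before: float, level_after: float) -> float:
    """Return the fractional change in a measured level after treatment.

    Both arguments are hypothetical expression measurements in the same
    arbitrary units (e.g., normalized mRNA abundance).
    """
    if level_before <= 0:
        raise ValueError("level_before must be positive")
    return (level_after - level_before) / level_before


# Hypothetical values: the measured level falls from 8.0 to 2.0 units.
change = relative_change(8.0, 2.0)
print(f"relative change after treatment: {change:+.2f}")
```

A non-zero result indicates a change in level relative to the pre-treatment baseline; how large a change is meaningful would depend on the assay and the disorder.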

[1790] In a preferred embodiment, the disorder is a neuronal disorder. In a preferred embodiment, the disorder is a pain or a pain related disorder. In another preferred embodiment, the disorder is a neurodegenerative disorder. In still another preferred embodiment, the disorder is a neuronal connectivity-related disorder or a developmental disorder of the nervous system.

[1791] In a preferred embodiment, the disorder is infertility or aberrant spermatid cell activity.

[1792] In a preferred embodiment, the disorder is a cancer, e.g., prostate cancer, brain cancer, neurofibromatosis, or testicular cancer.

[1793] In a preferred embodiment, the subject is a human. In a preferred embodiment, the subject is an experimental animal, e.g., an animal model for a pain or pain-related disorder.

[1794] The invention also features a method of diagnosing a disorder, e.g., a disorder disclosed herein, in a subject. The method includes evaluating the expression or activity of a 46980 nucleic acid or a 46980 polypeptide, such that a difference in the level of 46980 nucleic acid or 46980 polypeptide relative to a normal subject or a cohort of normal subjects is indicative of the disorder.

[1795] In a preferred embodiment, the disorder is a neuronal disorder. In a preferred embodiment, the disorder is a pain or a pain related disorder. In another preferred embodiment, the disorder is a neurodegenerative disorder. In still another preferred embodiment, the disorder is a neuronal connectivity-related disorder or a developmental disorder of the nervous system.

[1796] In a preferred embodiment, the disorder is infertility or aberrant spermatid cell activity.

[1797] In a preferred embodiment, the disorder is a cancer, e.g., prostate cancer, brain cancer, neurofibromatosis, or testicular cancer.

[1798] In a preferred embodiment, the subject is a human.

[1799] In a preferred embodiment, the evaluating step occurs in vitro or ex vivo. For example, a sample, e.g., a blood sample, is obtained from the subject.

[1800] In a preferred embodiment, the evaluating step occurs in vivo. For example, the evaluating step can include administering to the subject a detectably labeled agent that interacts with the 46980 nucleic acid or polypeptide, such that a signal is generated relative to the level of activity or expression of the 46980 nucleic acid or polypeptide.

[1801] The invention also provides assays for determining the activity of or the presence or absence of 46980 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[1802] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46980 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1803] In yet another aspect, the invention features a method for identifying an agent, e.g., a compound, which modulates the activity of a 46980 polypeptide, e.g., a 46980 polypeptide as described herein, or the expression of a 46980 nucleic acid, e.g., a 46980 nucleic acid as described herein, including contacting the 46980 polypeptide or nucleic acid with a test agent (e.g., a test compound); and determining the effect of the test compound on the activity of the 46980 polypeptide or nucleic acid to thereby identify a compound which modulates the activity of the 46980 polypeptide or nucleic acid.

[1804] In a preferred embodiment, the activity of the 46980 polypeptide is a protein-interaction activity, e.g., a neurexin-binding activity, e.g., a β-neurexin binding activity.

[1805] In a preferred embodiment, the activity of the 46980 polypeptide is modulation of pain response.

[1806] In preferred embodiments, the agent is a peptide (or polypeptide), a small molecule, e.g., a member of a combinatorial library, or an antibody, or any combination thereof. The antibody, peptide, or small molecule can bind to a 46980 surface region that interfaces with a 46980 ligand, e.g., a neurexin, e.g., a β-neurexin.

[1807] In additional preferred embodiments, the agent is an antisense molecule, a ribozyme, a triple helix molecule, or a 46980 nucleic acid, or any combination thereof.

[1808] In another aspect, the invention features a method of inhibiting the function or inducing the killing of a 46980-expressing cell, e.g., a brain cell, a neuron, a spinal cord cell, a prostate cell, a testes cell, or a spermatid. The method includes contacting the 46980-expressing cell with a compound that binds to a 46980 polypeptide in an amount effective to inhibit the function of the cell or induce killing of the cell. In a preferred embodiment, the compound is a small molecule. In another preferred embodiment, the compound is a polypeptide, e.g., a polypeptide that binds the 46980 extracellular domain, e.g., an antibody or a soluble fragment of a neurexin. The polypeptide can be coupled to a cytotoxin, a cytotoxic agent, or a radioactive metal ion.

[1809] In another embodiment, the compound disrupts a 46980 interaction with a 46980 ligand.

[1810] In still another aspect, the invention features a method of labelling a 46980-expressing cell, e.g., a brain cell, a neuron, a spinal cord cell, a prostate cell, a testes cell, or a spermatid. The method includes contacting the 46980-expressing cell with a compound that binds to a 46980 polypeptide. In a preferred embodiment, the compound is a small molecule that includes a label or a moiety recognizable by a labeling entity. In another preferred embodiment, the compound is a polypeptide, e.g., a polypeptide that binds the 46980 extracellular domain, e.g., an antibody or a soluble fragment of a neurexin. The polypeptide includes a label or a moiety recognizable by a labeling entity.

[1811] In a preferred embodiment, the 46980-expressing cell is in a subject mammal. The compound is labeled, e.g., with a label that is detectable by a scan of the subject, e.g., by magnetic resonance imaging. The compound is administered to the subject, e.g., by parenteral injection or by an epidural injection.

[1812] In another preferred embodiment, the 46980-expressing cell is in a sample, e.g., a biopsy or a tissue section. The sample can be fixed or live.

[1813] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46980 polypeptide or nucleic acid molecule, including for disease diagnosis.

[1814] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 46980 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 46980 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 46980 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
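The array described above can be modeled, purely for illustration, as a mapping from positional addresses to capture probes. In the Python sketch below, the class name, field names, probe labels, and the `validate_array` helper are all hypothetical; the sketch only checks the two stated properties, namely that each address carries a unique capture probe and that at least one probe recognizes a 46980 molecule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    name: str    # hypothetical probe label
    kind: str    # "nucleic acid" or "polypeptide"
    target: str  # molecule the probe is intended to recognize


def validate_array(addresses: dict, target: str = "46980") -> bool:
    """Check an array given as {(row, column): CaptureProbe}.

    Dictionary keys are distinct by construction, so each address is
    positionally distinguishable from every other address.
    """
    probes = list(addresses.values())
    probes_unique = len(set(probes)) == len(probes)
    has_target_probe = any(p.target == target for p in probes)
    return probes_unique and has_target_probe


# Hypothetical two-address array.
array = {
    (0, 0): CaptureProbe("probe_A", "nucleic acid", "46980"),
    (0, 1): CaptureProbe("probe_B", "polypeptide", "control"),
}
print(validate_array(array))
```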

[1815] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 46980

[1816] The human 46980 sequence (see SEQ ID NO:27, as recited in Example 19), which is approximately 3502 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2451 nucleotides, including the termination codon. The coding sequence encodes an 816 amino acid protein (see SEQ ID NO:28, as recited in Example 19). The human 46980 protein of SEQ ID NO:28 and FIG. 16 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 43 amino acids (from amino acid 1 to about amino acid 43 of SEQ ID NO:28), which upon cleavage results in the production of a mature protein form. This mature protein form is approximately 773 amino acid residues in length (from about amino acid 44 to amino acid 816 of SEQ ID NO:28).
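The lengths recited above are mutually consistent, as the following short Python check illustrates. The constant names are illustrative; the numbers are taken from the paragraph above, and the only outside fact used is that each amino acid (and the termination signal) corresponds to one codon of three nucleotides.

```python
PROTEIN_LENGTH = 816       # amino acids in SEQ ID NO:28
SIGNAL_LENGTH = 43         # predicted signal sequence, residues 1 to 43
CODING_NUCLEOTIDES = 2451  # coding sequence, including the termination codon

# One codon (three nucleotides) per amino acid, plus one termination codon.
expected_coding = 3 * (PROTEIN_LENGTH + 1)

# Residues remaining after removal of the signal sequence (residues 44 to 816).
mature_length = PROTEIN_LENGTH - SIGNAL_LENGTH

print(expected_coding == CODING_NUCLEOTIDES, mature_length)
```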

[1817] Human 46980 contains the following regions or other structural features:

[1818] a predicted carboxylesterase domain (PFAM Accession PF00135) located at about amino acid 25 to 590 of SEQ ID NO:28 (which includes a carboxylesterase type B signature 2 domain located at about amino acids 144 to 154 of SEQ ID NO:28);

[1819] a predicted transmembrane region located at about amino acids 675 to 696 of SEQ ID NO:28;

[1820] a predicted N-terminal extracellular domain located at about amino acids 1 to 674 of SEQ ID NO:28;

[1821] a predicted C-terminal intracellular domain located at about amino acids 697 to 816 of SEQ ID NO:28;

[1822] two predicted N-glycosylation sites (PS00001) located from about amino acids 102 to 105, and 511 to 514 of SEQ ID NO:28;

[1823] a glycosaminoglycan attachment site (PS00002) located at about amino acids 253 to 256 of SEQ ID NO:28;

[1824] five predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 707 to 709, 751 to 753, 763 to 765, 794 to 796, and 813 to 815, of SEQ ID NO:28;

[1825] two predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 717 to 720 and 767 to 770, of SEQ ID NO:28; and

[1826] ten predicted N-myristylation sites (PS00008) located at about amino acids 75 to 80, 99 to 104, 187 to 192, 239 to 244, 252 to 257, 281 to 286, 370 to 375, 382 to 387, 389 to 394, and 800 to 805, of SEQ ID NO:28.
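For illustration, the predicted topology listed above can be written as a small coordinate table and checked for consistency. In this Python sketch the names `TOPOLOGY`, `tiles_protein`, and `region_of` are hypothetical; the coordinates are those recited above for SEQ ID NO:28.

```python
# (region name, first residue, last residue), 1-based and inclusive.
TOPOLOGY = [
    ("extracellular", 1, 674),
    ("transmembrane", 675, 696),
    ("intracellular", 697, 816),
]


def tiles_protein(regions, length):
    """Return True if the regions cover residues 1..length with no gap or overlap."""
    expected_start = 1
    for _, start, end in regions:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


def region_of(start, end, regions=TOPOLOGY):
    """Return the name of the region fully containing residues start..end, or None."""
    for name, r_start, r_end in regions:
        if r_start <= start and end <= r_end:
            return name
    return None


print(tiles_protein(TOPOLOGY, 816))
print(region_of(102, 105))  # first predicted N-glycosylation site
print(region_of(707, 709))  # first predicted protein kinase C site
```

Under these coordinates, the predicted N-glycosylation sites fall in the extracellular domain and the predicted protein kinase C and casein kinase II sites fall in the intracellular domain.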

[1827] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[1828] A plasmid containing the nucleotide sequence encoding human 46980 (clone “Fbh46980FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112.

[1829] The 46980 protein contains a significant number of structural characteristics in common with members of the carboxylesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics. Carboxylesterase family members are known to act on carboxylic esters. Based on the differential patterns of inhibition by organophosphates, carboxylesterases have been classified into three categories (A, B and C) (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). 46980 proteins of the invention include a carboxylesterase type B signature 2 domain located at about amino acids 144 to 154 of SEQ ID NO:28, which suggests that the 46980 proteins belong to the carboxylesterase type B family.

[1830] 46980 proteins of the invention are homologous to the rat neuroligin protein, in particular, the rat neuroligin 3 protein (FIG. 17). Thus, the proteins of the invention are members of the neuroligin subfamily of carboxylesterase type B proteins. As used herein, the term “neuroligin” refers to cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to esterases, a linker domain between the transmembrane region and the esterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Neuroligins can lack a catalytic serine amino acid at a conserved position in the carboxylesterase active site.

[1831] Members of the neuroligins are typically expressed at high levels in the nervous system, e.g., the spinal cord and the brain, e.g., primarily in neurons. Preferably, neuroligins are capable of mediating cell adhesion events associated with development and/or maintenance, e.g., neural events such as synaptogenesis, recruitment of receptors, channels, and signal transduction molecules at synaptic sites (e.g., at excitatory synapses) (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5). Preferably, neuroligins are capable of interacting with a cell surface protein, e.g., a neurexin (e.g., a β-neurexin). Preferably, neuroligin-β-neurexin interactions mediate cell adhesion events, e.g., neuron-neuron, or neuron-glia cell adhesion events.

[1832] A 46980 polypeptide can include at least one “carboxylesterase domain” or at least one region homologous with a “carboxylesterase domain”. A 46980 can optionally further include at least one transmembrane domain; at least one extracellular domain; and at least one intracellular domain. A 46980 can optionally further include at least one, two, three, preferably four, N-glycosylation sites; at least one, preferably two, cAMP/cGMP phosphorylation sites; at least one, two, three, four, or five protein kinase C sites; at least one or two casein kinase II sites; at least one, two, three, four, five, six, seven, eight, preferably ten N-myristylation sites; and at least one glycosaminoglycan attachment site.

[1833] As used herein, the term “carboxylesterase domain” refers to a protein domain which includes a carboxylesterase type B signature 2 domain, is at least 300 amino acids in length, and has a bit score for alignment to the HMM profile for the Pfam carboxylesterase domain (PF00135) at the time of filing of at least 100.

[1834] Preferably, the carboxylesterase type B signature 2 domain is about 5 to 20 amino acids, more preferably 8-15, most preferably 11 amino acids and includes the sequence: E-D-X(0,1)-C-L-Y (SEQ ID NO:32). Most preferably, the carboxylesterase type B signature 2 domain has the amino acid sequence: EDCLYNIYVP located at about amino acids 144 to 154 of SEQ ID NO:28.
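The pattern E-D-X(0,1)-C-L-Y recited above can be read as a regular expression in which X(0,1) denotes zero or one residue of any type. The following Python sketch is one possible rendering; the pattern translation, the function name, and the second and third test strings are illustrative assumptions, while the first test string is the signature sequence as recited above.

```python
import re

# E-D-X(0,1)-C-L-Y: Glu, Asp, zero or one arbitrary residue, then Cys-Leu-Tyr.
SIGNATURE_CORE = re.compile(r"ED.?CLY")


def find_signature_core(sequence: str) -> list:
    """Return 1-based start positions of non-overlapping pattern matches."""
    return [m.start() + 1 for m in SIGNATURE_CORE.finditer(sequence.upper())]


print(find_signature_core("EDCLYNIYVP"))  # sequence as recited above
print(find_signature_core("AAEDGCLYAA"))  # hypothetical: one residue at X
print(find_signature_core("EDGGCLY"))     # hypothetical: two residues at X
```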

[1835] Preferably, the carboxylesterase domain has an amino acid sequence of about 400 to about 650 amino acid residues and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 100. Preferably, a carboxylesterase domain includes at least about 450 to about 600 amino acids, more preferably about 500 to about 575 amino acid residues, about 550 to 570, or about 565 amino acids and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 200, preferably 300, more preferably 400 or greater. The carboxylesterase domain (HMM) has been assigned the PFAM Accession (PF00135) (http://genome.wustl.edu/Pfam/html). An alignment of the carboxylesterase domain (from about amino acids 25 to about 590 of SEQ ID NO:28) of human 46980 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) at the time of filing is depicted in FIG. 16.

[1836] In a preferred embodiment, a 46980 polypeptide or protein has a “carboxylesterase domain” or a region which includes at least about 400 to about 650 amino acids, preferably 450 to about 600 amino acids, more preferably about 500 to about 570 amino acid residues, about 550 to 570, or about 565 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “carboxylesterase domain,” e.g., the carboxylesterase domain of human 46980 (e.g., residues 25 to 590 of SEQ ID NO:28).

[1837] To identify the presence of a “carboxylesterase” domain in a 46980 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the PFAM HMM database resulting in the identification of a “carboxylesterase domain” in the amino acid sequence of human 46980 at about residues 25 to about 590 of SEQ ID NO:28 (see FIGS. 15 and 17).
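The effect of the threshold score on hit calling can be illustrated with a minimal Python sketch. It does not invoke the HMMER package; the function name, the profile names, and the bit scores are hypothetical, and treating a score equal to the threshold as a hit is an assumption made for the example.

```python
def call_hits(bit_scores: dict, threshold: float = 15.0) -> list:
    """Return the names of profiles whose bit score meets the threshold.

    bit_scores maps a profile name to the bit score of the best alignment
    of the query sequence against that profile (hypothetical values).
    """
    return sorted(name for name, bits in bit_scores.items() if bits >= threshold)


# Hypothetical search results for a single query sequence.
scores = {"profile_a": 412.3, "profile_b": 9.5, "profile_c": 3.0}

print(call_hits(scores))                 # default threshold of 15 bits
print(call_hits(scores, threshold=8.0))  # lowered threshold of 8 bits
```

Lowering the threshold admits weaker alignments as hits, which increases sensitivity at the cost of more marginal matches.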

[1838] In one embodiment, a 46980 protein includes at least one, preferably two, transmembrane domains. As used herein, the term “transmembrane domain” includes an amino acid sequence of about 15 amino acid residues in length that spans a phospholipid membrane. More preferably, a transmembrane domain includes at least about 16, 18, 20, 21, 22, 25, 30, 35 or 40 amino acid residues and spans a phospholipid membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, methionines, phenylalanines, or tryptophans. Transmembrane domains are described in, for example, Zagotta W. N. et al. (1996) Annual Rev. Neurosci. 19: 235-63, the contents of which are incorporated herein by reference.
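The length and hydrophobicity criteria above can be expressed as a simple check on a candidate segment. In the Python sketch below, the function names and the example segments are hypothetical; the residue set is limited to the six hydrophobic residues named above, and the default thresholds (15 residues, 50% hydrophobic) are the least stringent values recited.

```python
# One-letter codes for leucine, isoleucine, tyrosine, methionine,
# phenylalanine, and tryptophan, the residues named in the text.
HYDROPHOBIC = set("LIYMFW")


def hydrophobic_fraction(segment: str) -> float:
    """Return the fraction of residues in the segment that are hydrophobic."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(res in HYDROPHOBIC for res in segment.upper()) / len(segment)


def looks_transmembrane(segment: str, min_length: int = 15,
                        min_fraction: float = 0.5) -> bool:
    """Apply the length and hydrophobic-content criteria to a segment."""
    return (len(segment) >= min_length
            and hydrophobic_fraction(segment) >= min_fraction)


# Hypothetical segments, not taken from SEQ ID NO:28.
print(looks_transmembrane("LLLLIIIIFFFFWWWW"))    # long and hydrophobic
print(looks_transmembrane("GGGGSSSSGGGGSSSSLL"))  # long but mostly polar
```

This check does not predict membrane insertion or helical structure; it only screens a segment against the stated composition and length criteria.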

[1839] In a preferred embodiment, a 46980 polypeptide or protein has at least one transmembrane domain or a region which includes at least 16, 18, 20, 21, 22, 25, 30, 35 or 40 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., at least one transmembrane domain of human 46980 (e.g., from about amino acid residues 675 to about 696 of SEQ ID NO:28).

[1840] A 46980 protein further includes a predicted N-terminal extracellular domain located at about amino acids 1-674 (or 44-674 of the mature protein) of SEQ ID NO:28. As used herein, an “N-terminal extracellular domain” includes an amino acid sequence about 1-800, preferably about 200-700, and even more preferably about 300-680 or 674, amino acid residues in length and is located outside of a cell, extracellularly, or internally in a membrane compartment that is topologically equivalent to the extracellular milieu. The C-terminal amino acid residue of an “N-terminal extracellular domain” is adjacent to an N-terminal amino acid residue of a transmembrane domain in a naturally-occurring 46980 or 46980-like protein. For example, an N-terminal extracellular domain is located at about amino acid residues 1 to 674 of SEQ ID NO:28.

[1841] In a preferred embodiment, a 46980 polypeptide or protein has an “N-terminal extracellular domain” or a region which includes at least about 1-800, preferably about 200-700, and even more preferably about 300-680 or 674 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “N-terminal extracellular domain,” e.g., the N-terminal extracellular domain of human 46980 (e.g., residues 1-674 of SEQ ID NO:28). Preferably, the N-terminal extracellular domain is capable of interacting with (e.g., binding to) an extracellular signal (e.g., a neurexin) and/or modulating cell adhesion.

[1842] In another embodiment, a 46980 protein includes a “C-terminal cytoplasmic domain”, also referred to herein as a C-terminal cytoplasmic tail, in the sequence of the protein. As used herein, a “C-terminal cytoplasmic domain” includes an amino acid sequence having a length of at least about 50 to 200, preferably 100 to 150, and more preferably 109 amino acid residues and is located within a cell or within the cytoplasm of a cell. Accordingly, the N-terminal amino acid residue of a “C-terminal cytoplasmic domain” is adjacent to a C-terminal amino acid residue of a transmembrane domain in a naturally-occurring 46980 or 46980-like protein. For example, a C-terminal cytoplasmic domain is found at about amino acid residues 697-816 of SEQ ID NO:28.

[1843] In a preferred embodiment, a 46980 polypeptide or protein has a C-terminal cytoplasmic domain or a region which includes at least about 50 to 200, preferably 100 to 150, and more preferably 109 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “C-terminal cytoplasmic domain,” e.g., the C-terminal cytoplasmic domain of human 46980 (e.g., residues 697-816 of SEQ ID NO:28).

[1844] A 46980 molecule can further include a signal sequence. As used herein, a “signal sequence” refers to a peptide of about 20-50 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 30-48 amino acid residues, preferably about 40-45 amino acid residues, more preferably about 43 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 46980 protein contains a signal sequence of about amino acids 1-43 of SEQ ID NO:28. The “signal sequence” can be cleaved during processing of the mature protein. The mature 46980 protein can correspond to about amino acids 44 to 816 of SEQ ID NO:28.

[1845] As the 46980 polypeptides of the invention may modulate 46980-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46980-mediated or -related disorders, as described below.

[1846] As used herein, a “46980 activity”, “biological activity of 46980” or “functional activity of 46980”, refers to an activity exerted by a 46980 protein, polypeptide or nucleic acid molecule. For example, a 46980 activity can be an activity exerted by 46980 in a physiological milieu on, e.g., a 46980-responsive cell or on a 46980 substrate, e.g., a protein substrate. A 46980 activity can be determined in vivo or in vitro. In one embodiment, a 46980 activity is a direct activity, such as an association with a 46980 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46980 protein binds or interacts in nature. In an exemplary embodiment, a 46980 binding partner is a neurexin, e.g., a β-neurexin.

[1847] A 46980 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 46980 protein with a 46980 receptor. The features of the 46980 molecules of the present invention can provide similar biological activities as neuroligin family members. For example, the 46980 proteins of the present invention are predicted to have one or more of the following activities: (1) ability to catalyze the hydrolysis of carboxylic esters; (2) ability to mediate cell-cell (e.g., neuron-neuron, or neuron-glia) recognition events, adhesion or attachment; (3) ability to interact with a cell surface protein (e.g., neurexin) or an extracellular component; (4) ability to modulate cell migration; (5) ability to modulate patterning; (6) ability to modulate proliferation and/or differentiation of a cell (e.g., a neural cell); (7) ability to modulate embryonic development and differentiation; (8) ability to modulate morphogenesis; (9) ability to modulate tissue maintenance; or (10) ability to modulate neural development, e.g., axonal growth, synaptogenesis, neurite outgrowth, membrane excitability and/or guidance.

[1848] Based on their localization and their structural features, the 46980 molecules of the present invention can have similar biological activities as carboxylesterase family members, in particular neuroligin proteins. Thus, the 46980 molecules can act as novel diagnostic targets and therapeutic agents for controlling a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neurodegenerative disorder, as well as male-fertility disorders, disorders of the brain, and cell proliferative disorders.

[1849] 46980 is highly expressed in the central and peripheral nervous system (see Example 20, below). More specifically, high levels of 46980 mRNA expression were found in human brain, spinal cord and dorsal root ganglia (see Table 5, below) as demonstrated by TaqMan analysis. 46980 molecules can function in a neuronal cell involved in pain response or in a neuronal developmental pathway.

[1850] Animal models of pain response include, but are not limited to, axotomy, the cutting or severing of an axon; chronic constriction injury (CCI), a model of neuropathic pain which involves ligation of the sciatic nerve in rodents, e.g., rats; or intraplantar Freund's adjuvant injection as a model of arthritic pain. Other animal models of pain response are described in, e.g., ILAR Journal (1999) Volume 40, Number 3 (entire issue). Conditions in which the 46980 molecules of the invention may be directly or indirectly involved include pain, pain syndromes, and inflammatory disorders, including inflammatory pain. Modulators of 46980 molecules may have analgesic effects, e.g., as evaluated by analgesic tests on animals, e.g., the hot plate test, tail flick test, writhing test, paw pressure test, electric stimulation test, tail withdrawal test, or formalin test (Roques et al. (1995) Methods in Enzymology 248:263-283).

[1851] As the 46980 molecules of the invention may modulate 46980-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46980-mediated or related disorders. For example, the 46980 molecules can act as novel diagnostic targets and therapeutic agents controlling pain, pain disorders, and other neuronal disorders.

[1852] Examples of pain conditions include, but are not limited to, pain elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia; pain associated with musculoskeletal disorders, e.g., joint pain, or arthritis; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; chest pain; or hyperalgesia, e.g., excessive sensitivity to pain (described in, for example, Fields (1987) Pain, New York: McGraw-Hill). Other examples of pain disorders or pain syndromes include, but are not limited to, complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, nonulcer dyspepsia, interstitial cystitis, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia (the inability to feel pain). Other examples of pain conditions include pain induced by parturition, or post partum pain.

[1853] Agents that modulate 46980 polypeptide or nucleic acid activity or expression can be used to treat pain elicited by any medical condition. A subject receiving the treatment can be additionally treated with a second agent, e.g., an anti-inflammatory agent, an antibiotic, or a chemotherapeutic agent, to further ameliorate the condition.

[1854] The 46980 molecules can also act as novel diagnostic targets and therapeutic agents controlling pain caused by other disorders, e.g., cancer, e.g., prostate cancer.

[1855] As the 46980 molecules are highly expressed in brain, the 46980 molecules can act as diagnostic markers and therapeutic agents or targets in disorders of the brain. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states, global cerebral ischemia and focal cerebral ischemia, infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[1856] As 46980 is highly expressed in human testis, 46980 molecules can have a role in, e.g., fertility or spermatid development, and disorders of the testes (see Table 5, below). Human 46980 molecules can also act as novel diagnostic targets and therapeutic agents controlling sperm formation or other processes related to fertility, e.g., spermatogenesis or fertilization. Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

[1857] In addition, 46980 molecules may be useful for disorders of the prostate. A prostate disorder refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[1858] The 46980 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:28 are collectively referred to as “polypeptides or proteins of the invention” or “46980 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “46980 nucleic acids.” 46980 molecules refer to 46980 nucleic acids, polypeptides, and antibodies.

[1859] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[1860] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[1861] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
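The four sets of conditions listed above can be tabulated for quick reference. The following is a minimal Python sketch only; the names `StringencyCondition`, `CONDITIONS`, and `default_condition` are illustrative assumptions and do not appear in the disclosure, and the values are transcribed from the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyCondition:
    """One hybridization/wash regime as listed in the paragraph above."""
    name: str
    hybridization: str
    wash: str
    wash_temp_c: int  # stated wash temperature in degrees Celsius


# Hypothetical lookup table; keys are illustrative labels.
CONDITIONS = {
    "low": StringencyCondition(
        "low", "6x SSC at about 45 C",
        "two washes in 0.2x SSC, 0.1% SDS", 50),  # may be raised to 55 C
    "medium": StringencyCondition(
        "medium", "6x SSC at about 45 C",
        "one or more washes in 0.2x SSC, 0.1% SDS", 60),
    "high": StringencyCondition(
        "high", "6x SSC at about 45 C",
        "one or more washes in 0.2x SSC, 0.1% SDS", 65),
    "very high": StringencyCondition(
        "very high", "0.5 M sodium phosphate, 7% SDS at 65 C",
        "one or more washes in 0.2x SSC, 1% SDS", 65),
}


def default_condition() -> StringencyCondition:
    """Return the condition used 'unless otherwise specified' (condition 4)."""
    return CONDITIONS["very high"]
```

Keeping the default in one function mirrors the statement that very high stringency applies unless another condition is named.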

[1862] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:27 or SEQ ID NO:29, corresponds to a naturally-occurring nucleic acid molecule.

[1863] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 46980 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 46980 protein or derivative thereof.

[1864] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 46980 protein is at least 10% pure. In a preferred embodiment, the preparation of 46980 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-46980 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-46980 chemicals. When the 46980 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[1865] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 46980 without abolishing or substantially altering a 46980 activity. Preferably the alteration does not substantially alter the 46980 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 46980, results in abolishing a 46980 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 46980 are predicted to be particularly unamenable to alteration.

[1866] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 46980 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 46980 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 46980 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:27 or SEQ ID NO:29, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
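The side chain families listed above lend themselves to a simple lookup. The sketch below is illustrative only: the names `SIDE_CHAIN_FAMILIES` and `is_conservative` are assumptions, residues are given in one-letter code, and a substitution is treated as conservative when the two residues differ and share at least one of the listed families.

```python
# Families transcribed from the paragraph above, in one-letter amino acid code.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ and share a side chain family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

For example, `is_conservative("K", "R")` is true because both residues are basic, while `is_conservative("K", "D")` is false because no listed family contains both.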

[1867] As used herein, a “biologically active portion” of a 46980 protein includes a fragment of a 46980 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 46980 molecule and a non-46980 molecule or between a first 46980 molecule and a second 46980 molecule (e.g., a dimerization interaction). Biologically active portions of a 46980 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 46980 protein, e.g., the amino acid sequence shown in SEQ ID NO:28, which include fewer amino acids than the full length 46980 proteins, and exhibit at least one activity of a 46980 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 46980 protein, e.g., a protein binding or recognition function, e.g., neurexin binding or recognition. A biologically active portion of a 46980 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 46980 protein can be used as targets for developing agents which modulate a 46980-mediated activity, e.g., a protein binding or recognition function, e.g., neurexin binding or recognition.

[1868] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[1869] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[1870] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
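As an illustration of the calculation described above, the following Python sketch computes percent identity from two sequences that have already been aligned. It assumes one common convention, namely identical positions divided by the total number of alignment columns (so that every gap position lowers the result); the function name `percent_identity` and the gap character `-` are assumptions, and other conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all columns of a pairwise alignment.

    Both inputs must be the same length, with gaps written as `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)
```

With the arbitrary example alignment `ACGT-A` against `ACCTTA`, four of six columns are identical, giving roughly 66.7%.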

[1871] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
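For readers unfamiliar with global alignment, the sketch below shows the dynamic programming idea behind the Needleman and Wunsch algorithm. It is a simplified illustration, not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned here can then be compared column by column to obtain a percent identity. A production comparison would use the scoring matrix and gap parameters recited above rather than these toy values.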

[1872] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[1873] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 46980 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 46980 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[1874] Particularly preferred 46980 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:28. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:28 are termed substantially identical.

[1875] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:27 or 29 are termed substantially identical.

[1876] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[1877] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[1878] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[1879] Various aspects of the invention are described in further detail below.

[1880] Isolated Nucleic Acid Molecules of 46980

[1881] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 46980 polypeptide described herein, e.g., a full-length 46980 protein or a fragment thereof, e.g., a biologically active portion of 46980 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 46980 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[1882] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:27, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46980 protein (i.e., “the coding region” of SEQ ID NO:27, as shown in SEQ ID NO:29), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:27 (e.g., SEQ ID NO:29) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 25 to 590.

[1883] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:27 or 29, thereby forming a stable duplex.
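To make the notion of a complement concrete, the following Python sketch derives the reverse complement of a DNA strand, which is the sequence that would pair with it to form a duplex. The function name `reverse_complement` is an assumption, the input is assumed to contain only the unambiguous bases A, C, G, and T, and the example sequence is arbitrary rather than taken from SEQ ID NO:27 or SEQ ID NO:29.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand: str, other: str) -> bool:
    """True if `other`, read 5' to 3', pairs with every base of `strand`."""
    return reverse_complement(strand).upper() == other.upper()
```

A strand that is only "sufficiently complementary" would tolerate some mismatches; this sketch checks only the perfect case.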

[1884] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29, or a portion, preferably of the same length, of any of these nucleotide sequences.

[1885] 46980 Nucleic Acid Fragments

[1886] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:27 or 29. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 46980 protein, e.g., an immunogenic or biologically active portion of a 46980 protein. A fragment can comprise those nucleotides of SEQ ID NO:27, which encode a carboxylesterase domain of human 46980. The nucleotide sequence determined from the cloning of the 46980 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46980 family members, or fragments thereof, as well as 46980 homologues, or fragments thereof, from other species.

[1887] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, 400, 450, 500, 550, 600, 720, 760, 800 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[1888] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 46980 nucleic acid fragment can include a sequence corresponding to a carboxylesterase domain, an extracellular domain, or an intracellular domain.

[1889] 46980 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:27 or SEQ ID NO:29, or of a naturally occurring allelic variant or mutant of SEQ ID NO:27 or SEQ ID NO:29.

[1890] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by fewer than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[1891] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.

[1892] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46980 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a carboxylesterase domain from about amino acid 25 to 590 of SEQ ID NO:28.

[1893] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[1894] A nucleic acid fragment encoding a “biologically active portion of a 46980 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:27 or 29, which encodes a polypeptide having a 46980 biological activity (e.g., the biological activities of the 46980 proteins are described herein), expressing the encoded portion of the 46980 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46980 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46980 includes a carboxylesterase domain, e.g., amino acid residues about 25 to 590 of SEQ ID NO:28. A nucleic acid fragment encoding a biologically active portion of a 46980 polypeptide may comprise a nucleotide sequence which is greater than 300, 400, 500 or more nucleotides in length.

[1895] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 800, 1000, 1200, 1500, 2000, 2200, 2400, 2800, 3000 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:27, or SEQ ID NO:29. A nucleic acid fragment can include at least a contiguous nucleic acid sequence from a region selected from the group consisting of about nucleotides 1-243, 1-368, 244-368, 1822-2100, 465-517, 1405-1422, 2544-2700, 780-900, or 493-600 of SEQ ID NO:27. A nucleic acid fragment can include at least a contiguous nucleic acid sequence that encodes a peptide having an amino acid sequence of a region selected from the group consisting of about amino acids 1-150, 139-156, 450-460, 300-456, 139-200, 1-175, 614-700, 300-375, 430-480, or 520-600 of SEQ ID NO:28.

[1896] In another preferred embodiment, the nucleic acid fragment encodes an amino acid sequence that is at least 650 amino acids in length, e.g., at least 675, 700, 725, 750, or 800 amino acids in length.

[1897] In still another embodiment, the nucleic acid fragment is other than the nucleic acid sequence of BAA86574 or BAA76795.

[1898] 46980 Nucleic Acid Variants

[1899] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:27 or SEQ ID NO:29. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 46980 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but fewer than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:28. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[1900] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. For example, the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
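The codon alteration described above can be illustrated with a minimal sketch. The preference table below is hypothetical and deliberately tiny; it only lists synonymous swaps (Leu to Leu, Arg to Arg, Gly to Gly) and is not a measured codon-usage table for any host. The function name `recode` and the example sequence are likewise assumptions introduced for illustration.

```python
# Hypothetical, illustrative preference table: each key is replaced by a
# synonymous codon assumed to be preferred in the chosen expression system.
PREFERRED = {
    "CTA": "CTG",  # Leu -> Leu
    "AGA": "CGT",  # Arg -> Arg
    "AGG": "CGT",  # Arg -> Arg
    "GGA": "GGC",  # Gly -> Gly
}


def recode(coding_seq, table=PREFERRED):
    """Swap codons according to `table`.

    Returns the recoded sequence and the fraction of codons that were altered.
    """
    if not coding_seq or len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3 nt")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    new_codons = [table.get(codon, codon) for codon in codons]
    altered = sum(1 for old, new in zip(codons, new_codons) if old != new)
    return "".join(new_codons), altered / len(codons)


# Toy open reading frame (not a 46980 sequence): ATG CTA AGA GGA TAA
optimized, fraction_altered = recode("ATGCTAAGAGGATAA")
```

Because only synonymous codons are exchanged, the encoded protein is unchanged; the returned fraction can be compared against a threshold such as 10% or 20% of the codons.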

[1901] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[1902] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:27 or 29, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
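The counting rule stated above, in which mismatches and “looped” out positions both count as differences, can be sketched as follows. This is an illustrative sketch only: it assumes the two sequences have already been aligned for maximum homology by some other tool, that gaps are written as `-`, and the function names are hypothetical.

```python
def count_differences(aligned_a, aligned_b, gap="-"):
    """Count differing columns in two pre-aligned sequences.

    A column counts as a difference if the residues mismatch or if one
    sequence has a residue looped out against a gap in the other.
    Returns (number of differences, number of columns compared).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue from either sequence
        compared += 1
        if a != b:
            differences += 1
    return differences, compared


def within_limit(differences, max_count):
    """True if there is at least one difference but fewer than max_count."""
    return 1 <= differences < max_count


# Toy alignment: one looped-out position and one mismatch.
diffs, columns = count_differences("ACGTACGT", "ACG-ACTT")
```

The same counts can be turned into a percentage by dividing by the length of the subject nucleic acid, which is left to the caller here because the denominator depends on which sequence is treated as the subject.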

[1903] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:28 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:28 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 46980 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 46980 gene.

[1904] Preferred variants include those that are correlated with a protein binding or recognition function, e.g., neurexin binding or recognition.

[1905] Allelic variants of 46980, e.g., human 46980, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 46980 protein within a population that maintain the ability to bind neurexin. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:28, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally occurring amino acid sequence variants of the 46980, e.g., human 46980, protein within a population that do not have the ability to bind neurexin. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:28, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[1906] Moreover, nucleic acid molecules encoding other 46980 family members and, thus, which have a nucleotide sequence which differs from the 46980 sequences of SEQ ID NO:27 or SEQ ID NO:29 are intended to be within the scope of the invention.

[1907] Antisense Nucleic Acid Molecules, Ribozymes and Modified 46980 Nucleic Acid Molecules

[1908] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 46980. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 46980 coding strand, or to only a portion thereof (e.g., the coding region of human 46980 corresponding to SEQ ID NO:29). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 46980 (e.g., the 5′ and 3′ untranslated regions).

[1909] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 46980 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 46980 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 46980 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
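As a rough illustration of the start-site example above, the following sketch derives an antisense oligonucleotide complementary to the −10 to +10 region of a coding-strand sequence. It assumes the common numbering convention in which +1 is the A of the ATG start codon and there is no position 0, so the window spans 20 nucleotides; the function names and the toy sequence are hypothetical and are not taken from SEQ ID NO:27 or SEQ ID NO:29.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def start_site_antisense(cdna, atg_index, upstream=10, downstream=10):
    """Antisense oligo covering -upstream..+downstream around the start codon.

    `atg_index` is the 0-based index of the A of the ATG in `cdna`.
    Assumed convention: position +1 is that A, and there is no position 0.
    """
    if cdna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    if atg_index < upstream or atg_index + downstream > len(cdna):
        raise ValueError("sequence too short for the requested window")
    target = cdna[atg_index - upstream:atg_index + downstream]
    return reverse_complement(target)


# Toy coding-strand sequence with the ATG at index 10.
toy_cdna = "GGCCGCCACCATGGCTAGCAAGTAA"
oligo = start_site_antisense(toy_cdna, atg_index=10)
```

The returned oligonucleotide is written 5′ to 3′ and base-pairs with the sense strand over the chosen window; other lengths from the list above can be obtained by changing `upstream` and `downstream`.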

[1910] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[1911] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 46980 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[1912] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[1913] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 46980-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 46980 cDNA disclosed herein (i.e., SEQ ID NO:27 or SEQ ID NO:29), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 46980-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 46980 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[1914] 46980 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 46980 gene (e.g., the 46980 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 46980 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[1915] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[1916] A 46980 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidate oligonucleotides can be effective antisense agents.

[1917] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[1918] PNAs of 46980 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 46980 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[1919] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[1920] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 46980 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 46980 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[1921] Isolated 46980 Polypeptides

[1922] In another aspect, the invention features an isolated 46980 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-46980 antibodies. 46980 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46980 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[1923] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[1924] In a preferred embodiment, a 46980 polypeptide has one or more of the following characteristics:

[1925] (i) it has the ability to recognize and/or bind a ligand, e.g., a neurexin, e.g., a β-neurexin;

[1926] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 46980 polypeptide, e.g., the polypeptide of SEQ ID NO:28;

[1927] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:28;

[1928] (iv) it can be found in the central and/or peripheral nervous system;

[1929] (v) it has a carboxylesterase domain which is preferably about 70%, 80%, 90% or 95% homologous with amino acid residues about 25 to 590 of SEQ ID NO:28;

[1930] (vi) it can include conserved cysteines that form disulfide bonds, notably cysteines at about amino acid 110, 146, 306, and 317 of SEQ ID NO:28; and

[1931] (vii) it can include an altered carboxylesterase active site that lacks a conserved serine at about amino acid 254, but retains a conserved acidic residue at about amino acid 375 of SEQ ID NO:28, and a conserved histidine at about amino acid 489 of SEQ ID NO:28.

[1932] In a preferred embodiment the 46980 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:28. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:28 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:28. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the carboxylesterase domain. In another preferred embodiment one or more differences are in the carboxylesterase domain.

[1933] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46980 proteins differ in amino acid sequence from SEQ ID NO:28, yet retain biological activity.

[1934] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:28.

[1935] A 46980 protein or fragment is provided which varies from the sequence of SEQ ID NO:28 in regions defined by amino acids about 696 to 891 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:28 in regions defined by amino acids about 41 to 674. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[1936] In one embodiment, a biologically active portion of a 46980 protein includes a carboxylesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46980 protein.

[1937] In a preferred embodiment, the 46980 protein has an amino acid sequence shown in SEQ ID NO:28. In other embodiments, the 46980 protein is substantially identical to SEQ ID NO:28. In yet another embodiment, the 46980 protein is substantially identical to SEQ ID NO:28 and retains the functional activity of the protein of SEQ ID NO:28, as described in detail in the subsections above.

[1938] In another preferred embodiment, the 46980 polypeptide can have an amino acid sequence other than that encoded by BAA86574 (KIAA1260) and/or other than that encoded by BAA76795 (KIAA0951).

[1939] 46980 Chimeric or Fusion Proteins

[1940] In another aspect, the invention provides 46980 chimeric or fusion proteins. As used herein, a 46980 “chimeric protein” or “fusion protein” includes a 46980 polypeptide linked to a non-46980 polypeptide. A “non-46980 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 46980 protein, e.g., a protein which is different from the 46980 protein and which is derived from the same or a different organism. The 46980 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 46980 amino acid sequence. In a preferred embodiment, a 46980 fusion protein includes at least one (or two) biologically active portion of a 46980 protein. The non-46980 polypeptide can be fused to the N-terminus or C-terminus of the 46980 polypeptide.

[1941] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-46980 fusion protein in which the 46980 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 46980. Alternatively, the fusion protein can be a 46980 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 46980 can be increased through use of a heterologous signal sequence.

[1942] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[1943] The 46980 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 46980 fusion proteins can be used to affect the bioavailability of a 46980 substrate. 46980 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 46980 protein; (ii) mis-regulation of the 46980 gene; and (iii) aberrant post-translational modification of a 46980 protein.

[1944] Moreover, the 46980 fusion proteins of the invention can be used as immunogens to produce anti-46980 antibodies in a subject, to purify 46980 ligands, and in screening assays to identify molecules which inhibit the interaction of 46980 with a 46980 substrate.

[1945] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 46980-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 46980 protein.

[1946] Variants of 46980 Proteins

[1947] In another aspect, the invention also features a variant of a 46980 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 46980 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 46980 protein. An agonist of the 46980 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 46980 protein. An antagonist of a 46980 protein can inhibit one or more of the activities of the naturally occurring form of the 46980 protein by, for example, competitively modulating a 46980-mediated activity of a 46980 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 46980 protein.

[1948] Variants of a 46980 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 46980 protein for agonist or antagonist activity.

[1949] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 46980 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 46980 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[1950] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 46980 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 46980 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[1951] Cell-based assays can be exploited to analyze a variegated 46980 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to a 46980 polypeptide in a substrate-dependent manner. The transfected cells are then contacted with a 46980 polypeptide (e.g., using a soluble form of the extracellular domain of 46980 or a cell expressing a transmembrane form of 46980) and the effect of the expression of the mutant on signaling by the 46980 substrate can be detected, e.g., by measuring a protein binding or recognition function, e.g., neurexin binding or recognition. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 46980 substrate, and the individual clones further characterized.

[1952] In another aspect, the invention features a method of making a 46980 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 46980 polypeptide. The method includes: altering the sequence of a 46980 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[1953] In another aspect, the invention features a method of making a fragment or analog of a 46980 polypeptide having a biological activity of a naturally occurring 46980 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 46980 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[1954] Anti-46980 Antibodies

[1955] In another aspect, the invention provides an anti-46980 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[1956] The anti-46980 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[1957] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[1958] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 46980 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-46980 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[1959] The anti-46980 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[1960] Phage display and combinatorial methods for generating anti-46980 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[1961] In one embodiment, the anti-46980 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[1962] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[1963] An anti-46980 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[1964] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[1965] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. A recipient CDR may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to 46980 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[1966] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
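The definition of a consensus sequence given above can be illustrated with a minimal Python sketch. It assumes the family has already been aligned to a common length; the function name `consensus_sequence` and the four short example sequences are hypothetical and are not taken from any sequence of the invention. Where two residues occur equally frequently, the definition permits either; the sketch simply picks the alphabetically first so that the result is reproducible.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Return the consensus of equal-length, pre-aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        top = max(counts.values())
        # Tie: either residue is acceptable; choose the alphabetically first
        # one so the output is deterministic.
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)


# Hypothetical aligned family (illustrative only).
family = ["EVQLVESG", "EVQLLESG", "QVQLVESG", "EVKLVESG"]
print(consensus_sequence(family))
```

A consensus framework could be obtained in the same way by restricting the input to the aligned framework positions of the immunoglobulin sequences.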

[1967] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 46980 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[1968] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[1969] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[1970] In preferred embodiments an antibody can be made by immunizing with purified 46980 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[1971] A full-length 46980 protein or an antigenic peptide fragment of 46980 can be used as an immunogen or can be used to identify anti-46980 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 46980 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:28 and encompasses an epitope of 46980. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[1972] Fragments of 46980 which include residues about 162 to 174, about 422 to 454, or about 696 to 726 can be used to make antibodies against hydrophilic regions of the 46980 protein, e.g., as immunogens or to characterize the specificity of an antibody. Similarly, fragments of 46980 which include residues about 1 to 41, about 675 to 696, or about 521 to 535 can be used to make an antibody against a hydrophobic region of the 46980 protein; fragments of 46980 which include residues about 1 to 674, about 44 to 674, or about 25 to 590 of SEQ ID NO:28 can be used to make an antibody against an extracellular region of the 46980 protein; a fragment of 46980 which includes residues about 697 to 816 can be used to make an antibody against an intracellular region of the 46980 protein; a fragment of 46980 which includes residues about 25 to 590 of SEQ ID NO:28 can be used to make an antibody against the carboxylesterase region of the 46980 protein.
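The residue ranges recited above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The following Python sketch shows one way such fragments might be extracted. The helper `fragment` and the 816-residue `PLACEHOLDER` string are hypothetical; the placeholder is an arbitrary repeating string and is not SEQ ID NO:28, and its length of 816 is only an assumption drawn from the highest residue number recited above.

```python
def fragment(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(f"range {start}-{end} is outside 1-{len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Arbitrary placeholder sequence (NOT SEQ ID NO:28); length assumed to be 816.
PLACEHOLDER = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(816))

# Hydrophilic ranges as recited in the text.
HYDROPHILIC_RANGES = [(162, 174), (422, 454), (696, 726)]

fragments = {r: fragment(PLACEHOLDER, *r) for r in HYDROPHILIC_RANGES}
for (start, end), peptide in fragments.items():
    print(f"residues {start}-{end}: {len(peptide)} amino acids")
```

Each of the three ranges yields a peptide longer than the 8-residue minimum stated for an antigenic peptide.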

[1973] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[1974] Antibodies which bind only native 46980 protein, only denatured or otherwise non-native 46980 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 46980 protein.

[1975] Preferred epitopes encompassed by the antigenic peptide are regions of 46980 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 46980 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 46980 protein and are thus likely to constitute surface residues useful for targeting antibody production.
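As a rough illustration of how hydrophilic candidate regions might be located computationally, the Python sketch below averages Kyte-Doolittle hydropathy values over a sliding window and reports the most hydrophilic window. This is not the Emini surface probability analysis mentioned above; the Kyte-Doolittle scale is substituted here because its values are widely tabulated. The function names, the window size of 7, and the toy sequence are assumptions made for illustration only.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Mean hydropathy of every full-length window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start, end, mean) for the lowest-scoring window, 1-based."""
    means = window_hydropathy(sequence, window)
    start = min(range(len(means)), key=means.__getitem__)
    return start + 1, start + window, means[start]


# Toy sequence: a lysine stretch flanked by isoleucines (illustrative only).
toy = "IIIIIIIKKKKKKKIIIIIII"
print(most_hydrophilic_window(toy))
```

A low mean hydropathy only suggests surface exposure; candidate regions found this way would still need to be confirmed, e.g., by testing antibody binding to native protein.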

[1976] In a preferred embodiment the antibody can bind to the extracellular portion of the 46980 protein, e.g., it can bind to a whole cell which expresses the 46980 protein. In another embodiment, the antibody binds an intracellular portion of the 46980 protein. In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., membrane fractions.

[1977] The anti-46980 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 46980 protein.

[1978] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[1979] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[1980] In a preferred embodiment, an anti-46980 antibody alters (e.g., increases or decreases) a protein binding or recognition activity of a 46980 polypeptide, e.g., neurexin binding or recognition. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 245 to 262 of SEQ ID NO:28.

[1981] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus, or to an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[1982] An anti-46980 antibody (e.g., monoclonal antibody) can be used to isolate 46980 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-46980 antibody can be used to detect 46980 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-46980 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[1983] The invention also includes a nucleic acid which encodes an anti-46980 antibody, e.g., an anti-46980 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[1984] The invention also includes cell lines, e.g., hybridomas, which make an anti-46980 antibody, e.g., an antibody described herein, and methods of using said cells to make a 46980 antibody.

[1985] The invention also includes other 46980-ligands, e.g., a neurexin, preferably a β-neurexin. The neurexin can be modified by routine protein engineering techniques to obtain a soluble neurexin domain which binds to a 46980 polypeptide.

[1986] 46980 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[1987] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[1988] A vector can include a 46980 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 46980 proteins, mutant forms of 46980 proteins, fusion proteins, and the like).

[1989] The recombinant expression vectors of the invention can be designed for expression of 46980 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[1990] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[1991] Purified fusion proteins can be used in 46980 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 46980 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[1992] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
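The codon-alteration strategy described above can be sketched in Python as a simple back-translation that assigns one preferred codon to each amino acid. The table `ECOLI_PREFERRED` below is an illustrative assumption listing codons commonly reported as frequent in E. coli; it is not taken from the cited reference, and an actual design would rely on a measured codon usage table for the chosen host strain. The function name `back_translate` and the example peptide are likewise hypothetical.

```python
# One commonly used E. coli codon per amino acid (illustrative assumption;
# consult a measured codon usage table before designing a real construct).
ECOLI_PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop codon
}


def back_translate(protein, table=ECOLI_PREFERRED, add_stop=True):
    """Encode a protein sequence using one preferred codon per residue."""
    try:
        codons = [table[residue] for residue in protein.upper()]
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    if add_stop:
        codons.append(table["*"])
    return "".join(codons)


# Hypothetical three-residue peptide, Met-Lys-Thr.
print(back_translate("MKT"))  # ATGAAAACCTAA
```

Using a single codon per amino acid is the simplest possible scheme; it ignores considerations such as mRNA secondary structure and unwanted restriction sites, which would need to be handled separately.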

[1993] The 46980 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[1994] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[1995] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[1996] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[1997] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[1998] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 46980 nucleic acid molecule within a recombinant expression vector or a 46980 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[1999] A host cell can be any prokaryotic or eukaryotic cell. For example, a 46980 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2000] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2001] A host cell of the invention can be used to produce (i.e., express) a 46980 protein. Accordingly, the invention further provides methods for producing a 46980 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 46980 protein has been introduced) in a suitable medium such that a 46980 protein is produced. In another embodiment, the method further includes isolating a 46980 protein from the medium or the host cell.

[2002] In another aspect, the invention features a cell or purified preparation of cells which include a 46980 transgene, or which otherwise misexpress 46980. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 46980 transgene, e.g., a heterologous form of a 46980, e.g., a gene derived from humans (in the case of a non-human cell). The 46980 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 46980, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 46980 alleles or for use in drug screening.

[2003] In another aspect, the invention features a human cell, e.g., a neuronal stem cell, transformed with nucleic acid which encodes a subject 46980 polypeptide.

[2004] Also provided are cells, preferably human cells, e.g., human neuronal cells or fibroblast cells, in which an endogenous 46980 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 46980 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 46980 gene. For example, an endogenous 46980 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2005] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 46980 polypeptide, e.g., a soluble extracellular domain thereof, operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 46980 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 46980 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2006] 46980 Transgenic Animals

[2007] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 46980 protein and for identifying and/or evaluating modulators of 46980 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 46980 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2008] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 46980 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 46980 transgene in its genome and/or expression of 46980 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 46980 protein can further be bred to other transgenic animals carrying other transgenes.

[2009] 46980 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2010] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2011] Uses of 46980

[2012] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); c) isolation of neuronal cells; d) detection of 46980-expressing cells, e.g., neural cells, testicular cells, and cancerous derivatives of such cells; and e) methods of treatment (e.g., therapeutic and prophylactic).

[2013] The isolated nucleic acid molecules of the invention can be used, for example, to express a 46980 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 46980 mRNA (e.g., in a biological sample) or a genetic alteration in a 46980 gene, and to modulate 46980 activity, as described further below. The 46980 proteins can be used to treat disorders characterized by insufficient or excessive production of a 46980 substrate or production of 46980 inhibitors. In addition, the 46980 proteins can be used to screen for naturally occurring 46980 substrates, to screen for drugs or compounds which modulate 46980 activity, as well as to treat disorders characterized by insufficient or excessive production of 46980 protein or production of 46980 protein forms which have decreased, aberrant or unwanted activity compared to 46980 wild type protein (e.g., a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder). Moreover, the anti-46980 antibodies of the invention can be used to detect and isolate 46980 proteins, regulate the bioavailability of 46980 proteins, and modulate 46980 activity. Further, anti-46980 antibodies can be used to purify 46980-expressing cells, e.g., for diagnostics. In another application, solid supports having stably attached soluble 46980 extracellular domains can be used to purify a neurexin or neurexin-expressing cell.

[2014] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 46980 polypeptide is provided. The method includes: contacting the compound with the subject 46980 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 46980 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 46980 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 46980 polypeptide. Screening methods are discussed in more detail below.

[2015] 46980 Screening Assays

[2016] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 46980 proteins, have a stimulatory or inhibitory effect on, for example, 46980 expression or 46980 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 46980 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 46980 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2017] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 46980 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 46980 protein or polypeptide or a biologically active portion thereof.

[2018] In one embodiment, an activity of a 46980 protein can be assayed by measuring binding to neurexin. For example, Ichtchenko et al., supra, provide a biochemical assay for binding of a neuroligin to an immobilized neurexin-IgG fusion protein. The biochemical interaction can be calcium dependent and can require lack of an insert in splice site four of the neurexin.

[2019] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial-library methods known in the art including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2020] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2021] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner, U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner, supra).

[2022] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 46980 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 46980 activity is determined. Determining the ability of the test compound to modulate 46980 activity can be accomplished by monitoring, for example, a protein binding or recognition function, e.g., neurexin binding or recognition. The cell, for example, can be of mammalian origin, e.g., human.

[2023] The ability of the test compound to modulate 46980 binding to a compound, e.g., a 46980 substrate, or to bind to 46980 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 46980 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 46980 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 46980 binding to a 46980 substrate in a complex. For example, compounds (e.g., 46980 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2024] The ability of a compound (e.g., a 46980 substrate) to interact with 46980 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 46980 without the labeling of either the compound or the 46980. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 46980.

[2025] In yet another embodiment, a cell-free assay is provided in which a 46980 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 46980 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 46980 proteins to be used in assays of the present invention include fragments which participate in interactions with non-46980 molecules, e.g., fragments with high surface probability scores.

[2026] Soluble and/or membrane-bound forms of isolated proteins (e.g., 46980 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2027] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2028] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
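
The distance dependence described above is commonly modeled with the Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The following Python sketch is illustrative only: the function name and the example R0 of 5 nm are assumptions, not values taken from this disclosure.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency under the Forster relation.

    distance_nm        donor-acceptor separation r (assumed known)
    forster_radius_nm  R0, the separation giving 50% transfer; it depends
                       on the label pair and must be supplied by the user
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

With a hypothetical R0 of 5 nm, labels held 2 nm apart in a bound complex would transfer almost all donor energy, while labels 10 nm apart would transfer under 2%, which is why acceptor emission is expected to rise sharply on binding.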

[2029] In another embodiment, determining the ability of the 46980 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2030] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2031] It may be desirable to immobilize either 46980, an anti-46980 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 46980 protein, or interaction of a 46980 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/46980 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 46980 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 46980 binding or activity determined using standard techniques.

[2032] Other techniques for immobilizing either a 46980 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 46980 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[2033] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2034] In one embodiment, this assay is performed utilizing antibodies reactive with 46980 protein or target molecules but which do not interfere with binding of the 46980 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 46980 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 46980 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 46980 protein or target molecule.

[2035] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2036] In a preferred embodiment, the assay includes contacting the 46980 protein or biologically active portion thereof with a known compound which binds 46980 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 46980 protein, wherein determining the ability of the test compound to interact with a 46980 protein includes determining the ability of the test compound to preferentially bind to 46980 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2037] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 46980 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 46980 protein through modulation of the activity of a downstream effector of a 46980 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2038] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
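
The comparison of control and test reaction mixtures can be expressed numerically. The Python sketch below assumes that complex formation is read out as a single signal per reaction (e.g., counts or absorbance) and that a background reading is available. The function names and the 50% cutoff are hypothetical choices made for illustration and are not specified by this disclosure.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by the test compound."""
    control = control_signal - background
    test = test_signal - background
    if control <= 0:
        raise ValueError("control reaction shows no complex above background")
    return 100.0 * (1.0 - test / control)


def interferes(control_signal: float, test_signal: float,
               background: float = 0.0, threshold_pct: float = 50.0) -> bool:
    """Flag a compound as interfering when inhibition meets the cutoff.

    threshold_pct is an arbitrary illustrative cutoff, not a disclosed value.
    """
    return percent_inhibition(control_signal, test_signal,
                              background) >= threshold_pct
```

The same two functions can be applied separately to reactions run with normal and with mutant target gene product, to look for compounds that score as interfering only against the mutant form.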

[2039] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2040] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2041] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2042] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2043] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2044] In yet another aspect, the 46980 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 46980 (“46980-binding proteins” or “46980-bp”) and are involved in 46980 activity. Such 46980-bps can be activators or inhibitors of signals by the 46980 proteins or 46980 targets as, for example, downstream elements of a 46980-mediated signaling pathway.

[2045] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 46980 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 46980 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 46980-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 46980 protein.

[2046] In another embodiment, modulators of 46980 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 46980 mRNA or protein evaluated relative to the level of expression of 46980 mRNA or protein in the absence of the candidate compound. When expression of 46980 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 46980 mRNA or protein expression. Alternatively, when expression of 46980 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 46980 mRNA or protein expression. The level of 46980 mRNA or protein expression can be determined by methods described herein for detecting 46980 mRNA or protein.
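
The decision rule in the preceding paragraph can be sketched in Python. The disclosure does not name a statistical test, so the example assumes replicate expression measurements for each condition and uses an exact two-sided permutation test on the difference of means. The function names and the 0.05 significance level are illustrative assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference of means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression levels.

    treated / untreated: 46980 mRNA or protein levels measured with and
    without the candidate compound (any consistent unit).
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

Because the test enumerates every relabeling, it is only practical for small replicate counts. With fewer than four replicates per condition the smallest attainable p-value exceeds 0.05, so such a design could never report a significant effect at that level.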

[2047] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 46980 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[2048] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 46980 modulating agent, an antisense 46980 nucleic acid molecule, a 46980-specific antibody, or a 46980-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2049] 46980 Detection Assays

[2050] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 46980 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2051] 46980 Chromosome Mapping

[2052] The 46980 nucleotide sequences or portions thereof can be used to map the location of the 46980 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 46980 sequences with genes associated with disease.

[2053] Briefly, 46980 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 46980 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 46980 sequences will yield an amplified fragment.
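
As a minimal illustration of deriving a primer pair from a known sequence, the Python sketch below takes the forward primer from the 5′ end of a template and the reverse primer as the reverse complement of its 3′ end, restricted to the 15-25 base range mentioned above. It is a naive sketch: the function names are hypothetical, and real primer design would also weigh melting temperature, GC content, and specificity, none of which is modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def end_primers(template: str, length: int = 20):
    """Pick a forward and a reverse primer from the two ends of a template."""
    if not 15 <= length <= 25:
        raise ValueError("primer length should be 15-25 bases")
    template = template.upper()
    if set(template) - set("ACGT"):
        raise ValueError("template may contain only A, C, G and T")
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse
```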

[2054] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
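
The logic of assigning a gene to a chromosome from such a panel is a concordance check: the gene maps to the chromosome whose presence or absence across the hybrid lines matches the pattern of positive and negative PCR results. The Python sketch below shows this under the assumption that the chromosome content of each hybrid line is known. The panel, line names, and results used to exercise it are invented for illustration.

```python
def map_to_chromosome(panel, pcr_positive):
    """Return human chromosomes fully concordant with the PCR results.

    panel         dict: hybrid line name -> set of human chromosomes it retains
    pcr_positive  dict: hybrid line name -> True if an amplified fragment was seen
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # A chromosome is concordant if the gene amplifies in exactly
        # those hybrid lines that carry that chromosome.
        if all((chrom in panel[line]) == pcr_positive[line] for line in panel):
            concordant.append(chrom)
    return concordant
```

For a hypothetical panel in which lines H1 and H2 both retain chromosome 7 and both give a product, while lines lacking chromosome 7 give none, the function returns only "7". More than one returned chromosome indicates that the panel cannot resolve the assignment and that additional hybrid lines are needed.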

[2055] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 46980 to a chromosomal location.

[2056] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2057] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2058] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2059] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 46980 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.
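
The filtering step described above, keeping sequence differences seen in affected individuals but in no unaffected individual, can be sketched in Python as follows. Each individual is represented as a set of variant labels. The function name and the labels used in testing are hypothetical, and a variant passing this filter is only a candidate that would still need confirmation, e.g., by sequencing further individuals.

```python
def candidate_mutations(affected, unaffected):
    """Count, per variant, the affected individuals that carry it,
    keeping only variants absent from every unaffected individual.

    affected / unaffected: lists of sets of variant labels, one set
    per individual.
    """
    seen_in_unaffected = set().union(*unaffected)
    seen_in_affected = set().union(*affected)
    counts = {}
    for variant in seen_in_affected - seen_in_unaffected:
        counts[variant] = sum(variant in person for person in affected)
    return counts
```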

[2060] 46980 Tissue Typing

[2061] 46980 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
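
The principle behind RFLP can be illustrated with an in silico digest: a polymorphism that creates or destroys a restriction site changes the set of fragment lengths. The Python sketch below assumes a linear molecule and uses the EcoRI recognition site GAATTC (cut after the first G) as its default. The function names and the short sequences used to exercise it are invented for illustration and are not 46980 sequences.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Return fragment lengths, in 5' to 3' order, after cutting a linear
    sequence at every occurrence of a restriction site.

    cut_offset is the position of the cut within the site on the top
    strand (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


def band_pattern(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Fragment lengths sorted largest first, as bands would resolve on a gel."""
    return sorted(digest(sequence, site, cut_offset), reverse=True)
```

Two alleles that differ by a single base inside one recognition site give different band patterns, which is the difference a Southern blot probe would reveal.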

[2062] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 46980 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[2063] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:27 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:29 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2064] If a panel of reagents from 46980 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[2065] Use of Partial 46980 Sequences in Forensic Biology

[2066] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2067] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:27 (e.g., fragments derived from the noncoding regions of SEQ ID NO:27 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2068] The 46980 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 46980 probes can be used to identify tissue by species and/or by organ type.

[2069] In a similar fashion, these reagents, e.g., 46980 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[2070] Predictive Medicine of 46980

[2071] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2072] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 46980.

[2073] Such disorders include, e.g., a disorder associated with the misexpression of 46980 gene; a disorder of the neurological system.

[2074] The method includes one or more of the following:

[2075] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 46980 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2076] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 46980 gene;

[2077] detecting, in a tissue of the subject, the misexpression of the 46980 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2078] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 46980 polypeptide.

[2079] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 46980 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2080] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:27, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 46980 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2081] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 46980 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 46980.

[2082] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2083] In preferred embodiments the method includes determining the structure of a 46980 gene, an abnormal structure being indicative of risk for the disorder.

[2084] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 46980 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2085] Diagnostic and Prognostic Assays of 46980

[2086] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 46980 molecules and for identifying variations and mutations in the sequence of 46980 molecules.

[2087] Expression Monitoring and Profiling:

[2088] The presence, level, or absence of 46980 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 46980 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 46980 protein such that the presence of 46980 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 46980 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 46980 genes; measuring the amount of protein encoded by the 46980 genes; or measuring the activity of the protein encoded by the 46980 genes.

[2089] The level of mRNA corresponding to the 46980 gene in a cell can be determined both by in situ and by in vitro formats.

[2090] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 46980 nucleic acid, such as the nucleic acid of SEQ ID NO:27, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 46980 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2091] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 46980 genes.

[2092] The level of mRNA in a sample that is encoded by 46980 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
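The primer geometry described above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be illustrated with a minimal sketch. The function names and any template sequence passed in are hypothetical and are not taken from SEQ ID NO:27; exact string matching is used as a crude stand-in for annealing.

```python
# Illustrative sketch only: exact matching stands in for primer annealing,
# and all names here are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def flanked_region(template: str, forward: str, reverse: str):
    """Return the plus-strand region lying between the two primers, or None.

    The forward primer matches the plus strand directly. The reverse primer
    anneals to the plus strand, so its reverse complement is what appears
    in the plus-strand text.
    """
    start = template.find(forward)
    if start < 0:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end < 0:
        return None
    return template[inner_start:end]


def primers_in_typical_range(template: str, forward: str, reverse: str) -> bool:
    """Check the approximate ranges stated in the text (10-30 nt, 50-200 nt)."""
    region = flanked_region(template, forward, reverse)
    if region is None:
        return False
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and 50 <= len(region) <= 200
```

A real primer design would also weigh melting temperature, specificity, and secondary structure, none of which this sketch attempts.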

[2093] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 46980 gene being analyzed.

[2094] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 46980 mRNA, or genomic DNA, and comparing the presence of 46980 mRNA or genomic DNA in the control sample with the presence of 46980 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 46980 transcript levels.

[2095] A variety of methods can be used to determine the level of protein encoded by 46980. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2096] The detection methods can be used to detect 46980 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 46980 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 46980 protein include introducing into a subject a labeled anti-46980 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-46980 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2097] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 46980 protein, and comparing the presence of 46980 protein in the control sample with the presence of 46980 protein in the test sample.

[2098] The invention also includes kits for detecting the presence of 46980 in a biological sample. For example, the kit can include a compound or agent capable of detecting 46980 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 46980 protein or nucleic acid.

[2099] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2100] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2101] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 46980 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, or deregulated cell proliferation.

[2102] In one embodiment, a disease or disorder associated with aberrant or unwanted 46980 expression or activity is identified. A test sample is obtained from a subject and 46980 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 46980 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 46980 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2103] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 46980 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder.

[2104] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 46980 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 46980 (e.g., other genes associated with a 46980-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
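As a minimal sketch of such a data record held in a relational table, the following uses an in-memory SQLite database in place of the Oracle or Sybase environments mentioned above. The table name, column names, and helper functions are hypothetical choices made for illustration only.

```python
import sqlite3


def build_record_table(records):
    """Create an in-memory table of expression data records.

    Each record is a tuple of
    (sample_id, subject, diagnosis, treatment, level_46980),
    i.e. a sample descriptor plus a value for the level of 46980 expression.
    The schema is a hypothetical illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE expression_record ("
        " sample_id TEXT PRIMARY KEY,"
        " subject TEXT,"
        " diagnosis TEXT,"
        " treatment TEXT,"
        " level_46980 REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


def level_for_sample(conn, sample_id):
    """Return the stored 46980 expression value for a sample, or None."""
    row = conn.execute(
        "SELECT level_46980 FROM expression_record WHERE sample_id = ?",
        (sample_id,),
    ).fetchone()
    return None if row is None else row[0]
```

Values for genes other than 46980 could be held in additional columns or in a separate table keyed by sample and gene; either layout is an assumption rather than something the text prescribes.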

[2105] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 46980 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, in a subject wherein an alteration in 46980 expression is an indication that the subject has or is disposed to having a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder. The method can be used to monitor a treatment for a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2106] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 46980 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[2107] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 46980 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
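The distance metric mentioned above can be written out directly. This is a minimal sketch assuming each profile is a list of expression values in the same gene order; the function names are hypothetical.

```python
import math


def difference_vector(profile_a, profile_b):
    """Element-wise difference between two profiles of equal dimension."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    return [a - b for a, b in zip(profile_a, profile_b)]


def profile_distance(profile_a, profile_b):
    """Length (Euclidean norm) of the difference vector between two profiles."""
    diff = difference_vector(profile_a, profile_b)
    return math.sqrt(sum(d * d for d in diff))
```

A smaller distance indicates more similar profiles. Other routine measures, such as a correlation coefficient, could be substituted without changing the overall method.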

[2108] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2109] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 46980 expression.
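The steps recited above might be sketched as follows. This is an illustration under stated assumptions, not the claimed executable code: a plain dictionary stands in for the database of reference expression profiles, Euclidean distance (via `math.dist`, Python 3.8 or later) is assumed as the comparison score, and all names are hypothetical.

```python
import math


def score_against_references(subject_profile, reference_profiles):
    """Return a comparison score (distance) for each reference profile.

    reference_profiles maps a reference name to a list of expression values
    in the same gene order as subject_profile. Lower scores mean more similar.
    """
    return {
        name: math.dist(subject_profile, profile)
        for name, profile in reference_profiles.items()
    }


def select_matching_reference(subject_profile, reference_profiles):
    """Return the name of the reference profile most similar to the subject."""
    scores = score_against_references(subject_profile, reference_profiles)
    return min(scores, key=scores.get)
```

The two functions correspond to alternatives ii) and i) respectively. In practice each profile would include, among its values, one representing the level of 46980 expression.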

[2110] 46980 Arrays and Uses Thereof

[2111] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 46980 molecule (e.g., a 46980 nucleic acid or a 46980 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass-spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2112] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 46980 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 46980. Each address of the subset can include a capture probe that hybridizes to a different region of a 46980 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 46980 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 46980 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 46980 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2113] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2114] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 46980 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 46980 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-46980 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2115] In another aspect, the invention features a method of analyzing the expression of 46980. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 46980-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2116] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 46980. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 46980. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
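As a simplified illustration of how quantitative array data could point to genes co-regulated with 46980, the sketch below ranks genes by the Pearson correlation of their expression levels with those of 46980 across samples. This is a stand-in for, not an implementation of, the hierarchical, k-means, or Bayesian clustering named above. The gene names and data layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of expression levels."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def rank_coregulated(expression, target="46980"):
    """Rank genes by correlation with the target gene, highest first.

    expression maps a gene name to its levels across the same ordered
    set of samples (e.g., tissues).
    """
    target_levels = expression[target]
    scores = {
        gene: pearson(target_levels, levels)
        for gene, levels in expression.items()
        if gene != target
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Genes at the top of the ranking would be candidates for co-regulation. A correlation matrix of this kind is also a common input to the clustering methods mentioned in the text.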

[2117] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 46980 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2118] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[2119] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 46980-associated disease or disorder; and processes, such as a cellular transformation associated with a 46980-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 46980-associated disease or disorder.

[2120] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 46980) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2121] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 46980 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 46980 polypeptide or fragment thereof. For example, multiple variants of a 46980 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2122] The polypeptide array can be used to detect a 46980 binding compound, e.g., an antibody in a sample from a subject with specificity for a 46980 polypeptide or the presence of a 46980-binding protein or ligand.

[2123] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 46980 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2124] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 46980 or from a cell or subject in which a 46980 mediated response has been elicited, e.g., by contact of the cell with 46980 nucleic acid or protein, or administration to the cell or subject of 46980 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 46980 (or does not express as highly as in the case of the 46980 positive plurality of capture probes) or from a cell or subject in which a 46980 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 46980 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2125] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 46980 or from a cell or subject in which a 46980-mediated response has been elicited, e.g., by contact of the cell with 46980 nucleic acid or protein, or administration to the cell or subject of 46980 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 46980 (or does not express as highly as in the case of the 46980 positive plurality of capture probes) or from a cell or subject in which a 46980 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used the plurality of addresses with capture probes should be present on both arrays.
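The final comparing step could, for example, be carried out address by address on the detected signal intensities. The sketch below assumes each sample's binding is summarized as a mapping from address to signal, and uses a log2 ratio with a pseudocount; the pseudocount, the threshold, and all names are illustrative assumptions rather than part of the described method.

```python
import math


def log2_ratios(first_signal, second_signal, pseudocount=1.0):
    """Log2 ratio of first-sample to second-sample signal at each shared address.

    The pseudocount (an assumption) avoids division by zero at addresses
    with no detected signal.
    """
    return {
        address: math.log2(
            (first_signal[address] + pseudocount)
            / (second_signal[address] + pseudocount)
        )
        for address in first_signal
        if address in second_signal
    }


def differential_addresses(first_signal, second_signal, threshold=1.0):
    """Addresses whose binding differs by at least `threshold` log2 units."""
    ratios = log2_ratios(first_signal, second_signal)
    return sorted(a for a, r in ratios.items() if abs(r) >= threshold)
```

Only addresses present in both mappings are compared, which mirrors the requirement that the plurality of addresses be present on both arrays when different arrays are used.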

[2126] In another aspect, the invention features a method of analyzing 46980, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 46980 nucleic acid or amino acid sequence; comparing the 46980 sequence with one or more preferably a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 46980.

[2127] Detection of 46980 Variations or Mutations

[2128] The methods of the invention can also be used to detect genetic alterations in a 46980 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 46980 protein activity or nucleic acid expression, such as a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 46980-protein, or the mis-expression of the 46980 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 46980 gene; 2) an addition of one or more nucleotides to a 46980 gene; 3) a substitution of one or more nucleotides of a 46980 gene, 4) a chromosomal rearrangement of a 46980 gene; 5) an alteration in the level of a messenger RNA transcript of a 46980 gene, 6) aberrant modification of a 46980 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 46980 gene, 8) a non-wild type level of a 46980-protein, 9) allelic loss of a 46980 gene, and 10) inappropriate post-translational modification of a 46980-protein.

[2129] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 46980-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 46980 gene under conditions such that hybridization and amplification of the 46980-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2130] In another embodiment, mutations in a 46980 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
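
The fragment length comparison can be sketched in silico as follows. This is a simplified model, assuming a linear DNA molecule, a single enzyme, and one strand only; the function name and the short example sequences are hypothetical. The example uses the EcoRI recognition site GAATTC, which is cut after the first G.

```python
def digest(seq, site, cut_offset):
    """Return sorted fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the position of the cut within the recognition site,
    e.g., 1 for EcoRI (G^AATTC). Linear DNA, top strand only.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]
    return sorted(length for length in lengths if length > 0)

# Made-up control and sample sequences; the sample carries an A-to-G change
# that destroys the recognition site.
control = "AAAA" + "GAATTC" + "CCCCCC"
sample = "AAAA" + "GAGTTC" + "CCCCCC"
control_fragments = digest(control, "GAATTC", 1)
sample_fragments = digest(sample, "GAATTC", 1)
mutation_indicated = control_fragments != sample_fragments
```

A difference between the two fragment length lists corresponds to the altered cleavage pattern described above.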

[2131] In other embodiments, genetic mutations in 46980 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 46980 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 46980 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 46980 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
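
A linear array of sequential overlapping probes can be sketched as below. The probe length, the step between probes, the function names, and the short target sequence are assumptions for illustration; the cited references should be consulted for the actual array designs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def tile_probes(target, probe_len=8, step=1):
    """Return (offset, probe) pairs covering target with overlapping probes.

    Each probe is the reverse complement of a window of the target, so that
    it would hybridize to the target strand at that offset.
    """
    target = target.upper()
    return [
        (i, reverse_complement(target[i:i + probe_len]))
        for i in range(0, len(target) - probe_len + 1, step)
    ]

# Made-up 10-nucleotide target tiled with 8-mers, one-base step.
probes = tile_probes("ATGCATGCAT", probe_len=8, step=1)
```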

[2132] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 46980 gene and detect mutations by comparing the sequence of the sample 46980 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[2133] Other methods for detecting mutations in the 46980 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[2134] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 46980 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[2135] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 46980 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 46980 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[2136] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[2137] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[2138] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
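
The logic of placing the mutation at the extreme 3′ end of a primer can be sketched with a deliberately simplified model, in which extension is assumed to occur only when the 3′-terminal base of the primer matches the target. Real polymerase behavior depends on reaction conditions and the particular mismatch; the function names and sequences below are hypothetical, and primer and target are both written 5′ to 3′ in the same strand sense.

```python
def three_prime_match(primer, target):
    """True if the primer's 3'-terminal base equals the target's base there.

    Simplified model: primer and target are equal-length strings written
    5'->3' in the same sense; only the last position is examined.
    """
    primer, target = primer.upper(), target.upper()
    if len(primer) != len(target):
        raise ValueError("primer and target region must be the same length")
    return primer[-1] == target[-1]

def call_allele(target, wild_type_primer, mutant_primer):
    """Call the allele from which of two allele-specific primers would extend."""
    wt = three_prime_match(wild_type_primer, target)
    mut = three_prime_match(mutant_primer, target)
    if wt and not mut:
        return "wild-type"
    if mut and not wt:
        return "mutant"
    return "undetermined"

# Made-up primers differing only at the 3'-terminal base.
WT_PRIMER = "ACGTTGCA"
MUT_PRIMER = "ACGTTGCT"
```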

[2139] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 46980 nucleic acid.

[2140] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:27 or the complement of SEQ ID NO:27. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[2141] The set can be useful, e.g., for identifying SNP's, or identifying specific alleles of 46980. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[2142] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
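
A set of four oligonucleotides that differ only at the interrogation position can be generated as in the following sketch. The function name, the zero-based position convention, and the example sequence are assumptions for illustration.

```python
def interrogation_set(oligo, position):
    """Return four oligonucleotides, one per base at the interrogation position.

    position is a zero-based index into oligo (an assumed convention).
    The result maps each base to the oligonucleotide carrying that base.
    """
    oligo = oligo.upper()
    if not 0 <= position < len(oligo):
        raise ValueError("interrogation position lies outside the oligonucleotide")
    return {
        base: oligo[:position] + base + oligo[position + 1:]
        for base in "ACGT"
    }

# Made-up 5-mer with the interrogation position in the center.
oligo_set = interrogation_set("ACGTA", 2)
```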

[2143] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 46980 nucleic acid.

[2144] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 46980 gene.

[2145] Use of 46980 Molecules as Surrogate Markers

[2146] The 46980 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 46980 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 46980 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[2147] The 46980 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 46980 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-46980 antibodies may be employed in an immune-based detection system for a 46980 protein marker, or 46980-specific radiolabeled probes may be used to detect a 46980 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[2148] The 46980 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 46980 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 46980 DNA may correlate with 46980 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[2149] Pharmaceutical Compositions of 46980

[2150] The nucleic acid and polypeptides, fragments thereof, as well as anti-46980 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[2151] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[2152] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[2153] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[2154] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[2155] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[2156] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[2157] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[2158] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[2159] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[2160] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
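
The therapeutic index defined above is a simple ratio, shown here as a short sketch. The function name and the numeric values are hypothetical and serve only to illustrate the arithmetic; both doses must be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, the ratio LD50/ED50.

    ld50: dose lethal to 50% of the population.
    ed50: dose therapeutically effective in 50% of the population.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50

# Made-up values: an LD50 of 200 mg/kg and an ED50 of 10 mg/kg.
example_index = therapeutic_index(200.0, 10.0)
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.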

[2161] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography. As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. 
Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
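
The weight-based dosage ranges and weekly schedule stated above reduce to simple arithmetic, sketched below. The function names and the 70 kg body weight are hypothetical; the default bounds are the broadest range recited above (about 0.001 to 30 mg/kg). The sketch illustrates the calculation only and is not a dosing recommendation.

```python
def weekly_dose_mg(dose_mg_per_kg, body_weight_kg, low=0.001, high=30.0):
    """Return the amount of protein, in mg, for one weekly administration.

    Raises ValueError if the per-kilogram dose falls outside [low, high],
    which default to the broadest range recited in the text.
    """
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose per kg is outside the recited range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg

def course_total_mg(dose_mg_per_kg, body_weight_kg, weeks):
    """Return the total amount, in mg, over a course of weekly administrations."""
    return weekly_dose_mg(dose_mg_per_kg, body_weight_kg) * weeks

# Made-up example: 5 mg/kg for a 70 kg subject, once weekly for 4 weeks.
single_dose = weekly_dose_mg(5.0, 70.0)
total_dose = course_total_mg(5.0, 70.0, 4)
```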

[2162] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[2163] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e. including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[2164] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[2165] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[2166] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors. Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[2167] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[2168] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[2169] Methods of Treatment for 46980

[2170] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 46980 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[2171] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 46980 molecules of the present invention or 46980 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[2172] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 46980 expression or activity, by administering to the subject a 46980 or an agent which modulates 46980 expression or at least one 46980 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 46980 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 46980 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 46980 aberrance, for example, a 46980, 46980 agonist or 46980 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[2173] It is possible that some 46980 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[2174] In addition to the disorders mentioned above, the 46980 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with the thymus, disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[2175] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2176] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2177] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease. Disorders involving the thymus include developmental disorders, such as DiGeorge syndrome with thymic hypoplasia or aplasia; thymic cysts; thymic hyperplasia, which involves the appearance of lymphoid follicles within the thymus, creating thymic follicular hyperplasia; and thymomas, including germ cell tumors, lymphomas, Hodgkin disease, and carcinoids. Thymomas can include benign or encapsulated thymoma, and malignant thymoma Type I (invasive thymoma) or Type II, designated thymic carcinoma.

[2178] Aberrant expression and/or activity of 46980 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by 46980 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 46980 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 46980 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[2179] The 46980 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[2180] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[2181] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[2182] Additionally, 46980 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 46980 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 46980 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[2183] Additionally, 46980 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[2184] As discussed, successful treatment of 46980 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 46980 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[2185] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[2186] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[2187] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 46980 expression is through the use of aptamer molecules specific for 46980 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 46980 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[2188] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 46980 disorders. For a description of antibodies, see the Antibody section above.

[2189] In circumstances wherein injection of an animal or a human subject with a 46980 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 46980 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 46980 protein. Vaccines directed to a disease characterized by 46980 expression may also be generated in this fashion.

[2190] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[2191] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 46980 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[2192] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
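As a minimal sketch of how an IC50 might be estimated from cell culture data, the following Python example interpolates on a log-concentration scale between the two measured points that bracket 50% inhibition. The function name, the concentration values and the inhibition percentages are hypothetical and are not taken from the specification; inhibition is assumed to be expressed as a percentage of the maximal effect, and a fitted dose-response model may be preferable in practice.

```python
import math

def estimate_ic50(concentrations, inhibition_percent):
    """Estimate IC50 by log-linear interpolation.

    Finds the two adjacent measurements that bracket 50% inhibition and
    interpolates linearly in log10(concentration) between them.
    """
    points = sorted(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("50% inhibition is not bracketed by the measurements")

# Hypothetical dose-response data (concentration in micromolar, % inhibition).
concentrations_um = [0.01, 0.1, 1.0, 10.0, 100.0]
inhibition = [5.0, 25.0, 45.0, 75.0, 95.0]

ic50 = estimate_ic50(concentrations_um, inhibition)
print(f"Estimated IC50: {ic50:.2f} micromolar")
```

The estimate falls between the two bracketing concentrations (here 1.0 and 10.0 micromolar) and could serve as the lower bound of a target circulating plasma concentration range of the kind described above.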

[2193] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 46980 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 46980 can be readily monitored and used in calculations of IC50.

[2194] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[2195] Another aspect of the invention pertains to methods of modulating 46980 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 46980 or agent that modulates one or more of the activities of 46980 protein associated with the cell. An agent that modulates 46980 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 46980 protein (e.g., a 46980 substrate or receptor), a 46980 antibody, a 46980 agonist or antagonist, a peptidomimetic of a 46980 agonist or antagonist, or other small molecule.

[2196] In one embodiment, the agent stimulates one or more 46980 activities. Examples of such stimulatory agents include active 46980 protein and a nucleic acid molecule encoding 46980. In another embodiment, the agent inhibits one or more 46980 activities. Examples of such inhibitory agents include antisense 46980 nucleic acid molecules, anti-46980 antibodies, and 46980 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 46980 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 46980 expression or activity. In another embodiment, the method involves administering a 46980 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 46980 expression or activity.

[2197] Stimulation of 46980 activity is desirable in situations in which 46980 is abnormally downregulated and/or in which increased 46980 activity is likely to have a beneficial effect. Likewise, inhibition of 46980 activity is desirable in situations in which 46980 is abnormally upregulated and/or in which decreased 46980 activity is likely to have a beneficial effect.

[2198] 46980 Pharmacogenomics

[2199] The 46980 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 46980 activity (e.g., 46980 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 46980-associated disorders (e.g., a neuronal disorder, such as a pain-related, neuronal connectivity-related, or neural degenerative disorder) associated with aberrant or unwanted 46980 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 46980 molecule or 46980 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 46980 molecule or 46980 modulator.

[2200] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[2201] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
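The grouping of individuals into genetic categories by SNP pattern can be illustrated with a short Python sketch. The marker names, subject identifiers and genotype calls below are hypothetical placeholders, not data from the specification, and the exact-match grouping shown is only one simple way such categories might be formed.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose genotype calls match at every listed marker.

    genotypes maps an individual's identifier to a dict of marker -> call.
    Returns a dict mapping each observed pattern (a tuple of calls, in
    marker_ids order) to a sorted list of individuals sharing it.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return {pattern: sorted(members) for pattern, members in groups.items()}

# Hypothetical bi-allelic marker calls for three trial subjects.
markers = ["marker_1", "marker_2"]
genotypes = {
    "subject_A": {"marker_1": "A/A", "marker_2": "C/T"},
    "subject_B": {"marker_1": "A/G", "marker_2": "C/C"},
    "subject_C": {"marker_1": "A/A", "marker_2": "C/T"},
}

categories = group_by_snp_pattern(genotypes, markers)
for pattern, members in categories.items():
    print(pattern, members)
```

In this sketch, subjects A and C fall into the same category because their calls agree at both markers, so a treatment regimen could be considered for them as a group.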

[2202] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 46980 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
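One way to examine whether one version of a candidate gene is associated with a drug response is a test on a 2x2 contingency table. The Python sketch below computes the Pearson chi-square statistic (without continuity correction) and the odds ratio; the counts are hypothetical, and the choice of this particular test is an assumption made for illustration rather than a method stated in the specification.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator

def odds_ratio(a, b, c, d):
    """Odds of response with variant 1 relative to variant 2."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical counts: rows are gene variants, columns are drug response.
#                responders  non-responders
# variant 1          30            10
# variant 2          15            25
a, b, c, d = 30, 10, 15, 25

chi2 = chi_square_2x2(a, b, c, d)
ratio = odds_ratio(a, b, c, d)

# 3.841 is the chi-square critical value for 1 degree of freedom at alpha = 0.05.
associated = chi2 > 3.841
print(f"chi-square = {chi2:.2f}, odds ratio = {ratio:.1f}, associated = {associated}")
```

With these illustrative counts the statistic exceeds the critical value, which would suggest an association between the variant and the response; real studies would also need to account for sample size, multiple testing and population structure.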

[2203] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 46980 molecule or 46980 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[2204] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 46980 molecule or 46980 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[2205] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 46980 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 46980 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[2206] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 46980 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 46980 gene expression, protein levels, or upregulate 46980 activity, can be monitored in clinical trials of subjects exhibiting decreased 46980 gene expression, protein levels, or downregulated 46980 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 46980 gene expression, protein levels, or downregulate 46980 activity, can be monitored in clinical trials of subjects exhibiting increased 46980 gene expression, protein levels, or upregulated 46980 activity. In such clinical trials, the expression or activity of a 46980 gene, and preferably, other genes that have been implicated in, for example, a 46980-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[2207] 46980 Informatics

[2208] The sequence of a 46980 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 46980. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 46980 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[2209] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[2210] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[2211] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
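The two-table layout described above can be sketched as follows. This is only an illustration, assuming an in-memory SQLite database in place of Sybase or Oracle; the table names, column names, and the placeholder sequence are hypothetical and are not taken from the sequence listing.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    residues    TEXT NOT NULL,      -- first column: the sequence itself
    sequence_id TEXT PRIMARY KEY    -- second column: identifier for the sequence
);
CREATE TABLE annotations (
    sequence_id TEXT NOT NULL REFERENCES sequences(sequence_id),
    descriptor  TEXT NOT NULL,      -- annotation text
    start_pos   INTEGER NOT NULL,   -- initial position (1-based)
    end_pos     INTEGER NOT NULL    -- ultimate position (1-based, inclusive)
);
""")

# Placeholder data, not a real 46980 sequence.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("MKTAYIAKQR", "example_seq"))
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    ("example_seq", "example domain", 2, 6),
)


def annotated_regions(connection, sequence_id):
    """Return (descriptor, start, end, residues) for each annotation of a sequence."""
    return connection.execute(
        "SELECT a.descriptor, a.start_pos, a.end_pos, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.sequence_id = a.sequence_id "
        "WHERE a.sequence_id = ?",
        (sequence_id,),
    ).fetchall()
```

Keeping annotations in their own table lets one sequence carry any number of annotated regions without changing the sequence table.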

[2212] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[2213] Thus, in one aspect, the invention features a method of analyzing 46980, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 46980 nucleic acid or amino acid sequence; and comparing the 46980 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46980. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[2214] The method can include evaluating the sequence identity between a 46980 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
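As a rough illustration of the comparisons described above, the sketch below computes percent identity between two sequences and locates exact occurrences of a target sequence in a small collection. It is a naive stand-in and not BLAST: it assumes the two sequences are already aligned to equal length and it finds only exact matches. The function names and the dictionary used as a "database" are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def find_target(target, database):
    """Return {entry name: [1-based start positions]} for exact occurrences of target."""
    hits = {}
    for name, seq in database.items():
        positions = []
        start = seq.find(target)
        while start != -1:
            positions.append(start + 1)          # report 1-based positions
            start = seq.find(target, start + 1)  # allow overlapping matches
        if positions:
            hits[name] = positions
    return hits
```

For example, `find_target("GAT", {"s1": "AGATGAT"})` reports positions 2 and 5 in entry `s1`. A production comparison would instead use a routine alignment tool such as those named in this section.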

[2215] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
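The relationship between target length and chance occurrence can be made concrete with a simple estimate. The sketch below assumes a deliberately simplified model in which every residue is independent and equally likely (4 letters for nucleotides, 20 for amino acids); real databases are not uniform, so the figures are order-of-magnitude only. The function name and the database sizes a caller might pass are hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size):
    """Expected number of chance exact matches of one target in a random database.

    Under a uniform, independent-residue model, a given target of length n
    matches at any one position with probability alphabet_size ** -n, and a
    database of db_len residues offers (db_len - n + 1) start positions.
    """
    if target_len > db_len:
        return 0.0
    return (db_len - target_len + 1) * alphabet_size ** (-target_len)
```

Under this model, each additional nucleotide in the target reduces the expected number of chance matches roughly fourfold, and each additional amino acid roughly twentyfold, which is why longer targets are less likely to be present as a random occurrence.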

[2216] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[2217] Thus, the invention features a method of making a computer readable record of a 46980 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
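One possible shape for such a record is sketched below, assuming JSON text as the recorded form. The class name, field names, and the choice of JSON are illustrative assumptions; the text above specifies only which items the record may include.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SequenceRecord:
    """Hypothetical record layout; every field other than the sequence is optional."""
    sequence_id: str
    sequence: str
    orf: Optional[Tuple[int, int]] = None             # identification of an ORF (start, end)
    features: List[dict] = field(default_factory=list)  # domains, regions, or sites
    transcription_start: Optional[int] = None
    transcription_terminator: Optional[int] = None
    protein_sequence: Optional[str] = None             # full length or mature form
    translated_region_5prime_end: Optional[int] = None


def record_to_json(record):
    """Serialize a record to ASCII-safe JSON text suitable for writing to any medium."""
    return json.dumps(asdict(record), indent=2)


def record_from_json(text):
    """Rebuild a record from JSON text produced by record_to_json."""
    data = json.loads(text)
    if data.get("orf") is not None:
        data["orf"] = tuple(data["orf"])  # JSON has no tuple type
    return SequenceRecord(**data)
```

The resulting text can be written to a file, a database field, or any other machine-readable medium of the kinds listed earlier in this section.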

[2218] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 46980 sequence, or record, in machine-readable form; comparing a second sequence to the 46980 sequence; thereby analyzing a sequence. Comparison can include comparing two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 46980 sequence includes a sequence being compared. In a preferred embodiment the 46980 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 46980 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[2219] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, wherein the method comprises the steps of determining 46980 sequence information associated with the subject and based on the 46980 sequence information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[2220] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a disease associated with a 46980 wherein the method comprises the steps of determining 46980 sequence information associated with the subject, and based on the 46980 sequence information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 46980 sequence of the subject to the 46980 sequences in the database to thereby determine whether the subject has a 46980-associated disease or disorder, or a pre-disposition for such.

[2221] The present invention also provides in a network, a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, said method comprising the steps of receiving 46980 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 46980 and/or corresponding to a 46980-associated disease or disorder (e.g., a neuronal disorder, e.g., a pain-related, neuronal connectivity-related, or neural degenerative disorder), and based on one or more of the phenotypic information, the 46980 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2222] The present invention also provides a method for determining whether a subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder, said method comprising the steps of receiving information related to 46980 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 46980 and/or related to a 46980-associated disease or disorder, and based on one or more of the phenotypic information, the 46980 information, and the acquired information, determining whether the subject has a 46980-associated disease or disorder or a pre-disposition to a 46980-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2223] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 32225 Invention

[2224] The hydrolysis of chemical bonds is critical for most metabolic (e.g., catabolic and anabolic) pathways in cells. Hydrolases are a large class of enzymes which catalyze the cleavage of a bond with the addition of water. In particular, the α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that have a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Struct Biol 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Members of the hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorous-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorous bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, ©1992 Academic Press, Inc. San Diego, Calif.).

[2225] α/β hydrolases vary widely in primary sequence, substrate specificity, and physical properties. However, despite the lack of sequence homology, hydrolase family members display structural similarities, e.g., conservation of a catalytic site framework. For example, the α/β hydrolase fold is a structural motif that is common to a variety of hydrolytic enzymes including lipases, e.g., fungal, bacterial and pancreatic lipases, acetylcholinesterases, serine carboxypeptidases, prolyl aminopeptidases, haloalkane dehalogenases, dienelactone hydrolases, A2 bromoperoxidases, and thioesterases (Schrag, J. et al. (1997) Meth. Enzymol. 284:85-107). Of particular medical significance, the α/β hydrolase family includes acetylcholinesterases, epoxide hydrolases, cholesterol esterases, and lipases. Inhibitors of acetylcholinesterase are useful therapeutic agents for the treatment of Alzheimer's disease, myasthenia gravis, and glaucoma. Epoxide hydrolases detoxify harmful aromatic compounds in mammals. The human hormone sensitive lipase catalyzes the rate-limiting reaction of hydrolyzing fat in adipocytes.

[2226] Enzymes possessing the α/β hydrolase fold have diverged from a common ancestor so as to preserve the arrangement of the catalytic residues (Ollis, D. et al. (1992) Protein Eng. 5:197-211). In particular, one conserved feature of the α/β hydrolase fold is a nucleophile-histidine-acid catalytic triad. The identities of the triad residues in α/β hydrolase fold enzymes are quite variable in that serine, aspartate, and cysteine have all been identified as catalytic nucleophiles (Schrag, J. et al. supra).

[2227] Hydrolases play important roles in the synthesis and breakdown of nearly all major metabolic intermediates, including polypeptides, nucleic acids, and lipids. As such, their activity contributes to the ability of the cell to grow and differentiate, to proliferate, to adhere and move, and to interact and communicate with other cells. Hydrolases also are important in the conversion of pro-proteins and pro-hormones to their active forms, the inactivation of peptides, the biotransformation of compounds (e.g., a toxin or carcinogen), antigen presentation, and the regulation of synaptic transmission.

Summary of the 32225 Invention

[2228] The present invention is based, in part, on the discovery of a novel α/β hydrolase family member, referred to herein as “32225”. The nucleotide sequence of a cDNA encoding 32225 is shown in SEQ ID NO:33, and the amino acid sequence of a 32225 polypeptide is shown in SEQ ID NO:34. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:35.

[2229] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 32225 protein or polypeptide, e.g., a biologically active portion of the 32225 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:34. In other embodiments, the invention provides isolated 32225 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 32225 protein or an active fragment thereof.

[2230] In a related aspect, the invention further provides nucleic acid constructs that include a 32225 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 32225 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 32225 nucleic acid molecules and polypeptides.

[2231] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 32225-encoding nucleic acids.

[2232] In still another related aspect, isolated nucleic acid molecules that are antisense to a 32225 encoding nucleic acid molecule are provided.

[2233] In another aspect, the invention features 32225 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 32225-mediated or -related disorders. In another embodiment, the invention provides 32225 polypeptides having a 32225 activity. Preferred polypeptides are 32225 proteins including at least one α/β hydrolase domain, and, preferably, having a 32225 activity, e.g., a 32225 activity as described herein.

[2234] In other embodiments, the invention provides 32225 polypeptides, e.g., a 32225 polypeptide having the amino acid sequence shown in SEQ ID NO:34 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:34 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:33, SEQ ID NO:35, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 32225 protein or an active fragment thereof.

[2235] In a related aspect, the invention further provides nucleic acid constructs which include a 32225 nucleic acid molecule described herein.

[2236] In a related aspect, the invention provides 32225 polypeptides or fragments operatively linked to non-32225 polypeptides to form fusion proteins.

[2237] In another aspect, the invention features antibodies and antigen-binding fragments thereof that react with, or more preferably specifically bind, 32225 polypeptides or fragments thereof, e.g., a peptide loop that is involved in hydrolysis. In one embodiment, the antibodies or antigen-binding fragments thereof competitively inhibit the binding of a second antibody to a 32225 polypeptide or a fragment thereof, e.g., a peptide loop that is involved in hydrolysis.

[2238] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 32225 polypeptides or nucleic acids.

[2239] In still another aspect, the invention provides a process for modulating 32225 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 32225 polypeptides or nucleic acids, such as conditions involving aberrant or deficient breakdown of toxic molecules, neuronal function, or cellular proliferation and/or differentiation.

[2240] The invention also provides assays for determining the activity of or the presence or absence of 32225 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[2241] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 32225-expressing cell, e.g., a hyper-proliferative 32225-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 32225 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or lesion located in a lung, breast, colon, ovary, brain, or liver tissue.

[2242] In a preferred embodiment, the compound is an inhibitor of a 32225 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 32225 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[2243] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[2244] In another embodiment, the compound is an activator of a 32225 polypeptide. Preferably, the activator is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody.

[2245] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 32225-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 32225 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[2246] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder, a liver disorder, or a neural disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 32225 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 32225 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 32225 nucleic acid or polypeptide expression can be detected by any method described herein.

[2247] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 32225 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[2248] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 32225 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 32225 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 32225 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue, e.g., a cancerous lung, breast, colon, ovary, brain, or liver tissue.

[2249] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 32225 polypeptide or nucleic acid molecule, including for disease diagnosis.

[2250] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 32225 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 32225 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 32225 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[2251] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 32225

[2252] The human 32225 sequence (see SEQ ID NO:33, as recited in Example 24), which is approximately 2305 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1029 nucleotides, including the termination codon. The coding sequence encodes a 342 amino acid protein (see SEQ ID NO:34, as recited in Example 24).
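The stated lengths are mutually consistent: a 1029-nucleotide coding sequence comprises 343 codons, and removing the termination codon leaves 342 amino acids. The short sketch below checks this arithmetic; the function name is hypothetical and the check uses only the lengths recited above, not the sequences themselves.

```python
def protein_length_from_cds(cds_nucleotides, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of the given length.

    The termination codon, when counted in the length, encodes no amino acid.
    """
    if cds_nucleotides % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_nucleotides // 3
    return codons - 1 if includes_stop else codons
```

Calling `protein_length_from_cds(1029)` reproduces the 342-residue length given for SEQ ID NO:34.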

[2253] Human 32225 contains the following regions or other structural features:

[2254] an α/β hydrolase domain (PFAM Accession Number PF00561) located at about amino acid residues 95 to 338 of SEQ ID NO:34;

[2255] an α/β hydrolase domain nucleophile motif, located at about amino acid residues 144 to 149 of SEQ ID NO:34;

[2256] an α/β hydrolase domain catalytic histidine residue, located at about amino acid 320 of SEQ ID NO:34;

[2257] six predicted protein kinase C phosphorylation sites (PS00005) located at about amino acids 19 to 21, 92 to 94, 108 to 110, 130 to 132, 156 to 158, and 297 to 299 of SEQ ID NO:34;

[2258] four predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 126 to 129, 130 to 133, 237 to 240, and 291 to 294 of SEQ ID NO:34;

[2259] two predicted N-myristylation sites (PS00008) located at about amino acids 78 to 83, and 148 to 153 of SEQ ID NO:34; and

[2260] one predicted amidation site (PS00009) located at about amino acids 297 to 300 of SEQ ID NO:34.

[2261] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[2262] A plasmid containing the nucleotide sequence encoding human 32225 (clone “Fbh32225FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[2263] The 32225 protein contains a significant number of structural characteristics in common with members of the α/β hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[2264] A hydrolase family of proteins includes a diverse array of enzymes and is characterized by a common fold. The domain fold typically contains an eight-stranded β-sheet surrounded by α-helices, and frequently contains conserved catalytic residues. This enzyme family includes lipases, esterases, and proteases. Despite the variety of reactions this family is capable of, the chemistry of these reactions is generally similar. Three positions in particular form a catalytic triad contributing nucleophilic, acidic, and histidine residues. The side chains of these residues are critical for nucleophilic attack. Another conserved feature, an oxyanion hole near strand three, is believed to stabilize potential covalent intermediates. This intermediate then proceeds to product by general base catalysis (Ollis, D. et al. (1992) Protein Eng 5:197-211). The α/β hydrolase family includes polypeptides with acetylcholinesterase, epoxide hydrolase, cholesterol esterase, and lipase activity. Acetylcholinesterases are important for neurotransmission; epoxide hydrolases for detoxifying harmful aromatic chemicals; and lipases for cholesterol and fat metabolism. Thus, the family of α/β hydrolases includes enzymes critical for the proper function of many physiological systems.

[2265] A 32225 polypeptide can include an “α/β hydrolase domain” or regions homologous with an “α/β hydrolase domain”.

[2266] As used herein, the term “α/β hydrolase domain” includes an amino acid sequence of about 200 to 400 amino acid residues in length and having a bit score for the alignment of the sequence to the α/β hydrolase domain profile (Pfam HMM) of at least 40. Preferably, an α/β hydrolase domain includes at least about 220 to 380 amino acids, more preferably about 230 to 365 amino acid residues, or about 240 to 350 amino acids and has a bit score for the alignment of the sequence to the hydrolase domain (HMM) of at least 50, 60, 70, 75 or greater. The hydrolase domain (HMM) has been assigned the PFAM Accession Number PF00561 (http://genome.wustl.edu/Pfam/.html). An alignment of the hydrolase domain (amino acids 95 to 338 of SEQ ID NO:34) of human 32225 with a consensus amino acid sequence (SEQ ID NO:36) derived from a hidden Markov model is depicted in FIG. 19.
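The length and bit score criteria in this definition can be expressed as a simple check, sketched below. The function names are hypothetical, the "about" ranges are treated as strict bounds purely for illustration, and the bit score must be supplied by the caller because no score for the 32225 alignment is stated here.

```python
def domain_length(start, end):
    """Length in residues of a domain spanning start to end, inclusive."""
    return end - start + 1


def meets_domain_definition(start, end, bit_score,
                            min_len=200, max_len=400, min_score=40.0):
    """True if a candidate region satisfies the length and bit score criteria.

    Defaults follow the broadest definition above (about 200 to 400 residues,
    bit score of at least 40); narrower preferred ranges can be passed in.
    """
    length = domain_length(start, end)
    return min_len <= length <= max_len and bit_score >= min_score
```

For the region at residues 95 to 338 of SEQ ID NO:34, `domain_length(95, 338)` gives 244 residues, which falls within the broadest range of 200 to 400 and also within the narrowest preferred range of 240 to 350.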

[2267] In a preferred embodiment, a 32225 polypeptide or protein has an “α/β hydrolase domain” or a region which includes at least about 220 to 380, more preferably about 230 to 365, or 240 to 350 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “α/β hydrolase domain,” e.g., the hydrolase domain of human 32225 (e.g., residues 95 to 338 of SEQ ID NO:34).

[2268] To identify the presence of a “hydrolase” domain in a 32225 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “α/β hydrolase” domain in the amino acid sequence of human 32225 at about residues 95 to 338 of SEQ ID NO:34 (see FIG. 19).

[2269] Alternatively, to identify the presence of an “α/β hydrolase” domain in a 32225 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a different database of domains, e.g., the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of several fragments of an “α/β hydrolase” domain in the amino acid sequence of human 32225, located at about amino acid residues 8 to 70, 187 to 277, and 280 to 342 of SEQ ID NO:34 (see FIGS. 20A-20C).

[2270] In one embodiment, a 32225 protein includes at least one α/β hydrolase domain nucleophile motif. As used herein, an “α/β hydrolase domain nucleophile motif” includes a sequence of at least five amino acid residues defined by the sequence: G-X-S-X-G-G (SEQ ID NO:40). An α/β hydrolase domain nucleophile motif, as defined, can be involved in the enzymatic hydrolysis of a chemical bond, e.g., the bond between a nucleophilic carbon atom and a relatively electrophilic atom to which the carbon is bound, e.g., an oxygen, nitrogen, or chlorine atom. More preferably, an α/β hydrolase domain nucleophile motif includes six amino acid residues. α/β hydrolase domain nucleophile motifs have been described in, e.g., Ollis et al. (1992), supra, and Nardini and Dijkstra (1999), supra, the contents of which are incorporated herein by reference.
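A pattern such as G-X-S-X-G-G can be located in a one-letter amino acid sequence with a regular expression. The sketch below is illustrative only: it assumes that X stands for any of the twenty standard amino acids, the function name `find_nucleophile_motifs` is hypothetical, and the toy sequence is invented for the example and is not taken from SEQ ID NO:34.

```python
import re

# Assumption: "X" means any one of the twenty standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A lookahead is used so that overlapping occurrences are also reported.
NUCLEOPHILE_MOTIF = re.compile(
    f"(?=(G[{AMINO_ACIDS}]S[{AMINO_ACIDS}]GG))"
)


def find_nucleophile_motifs(sequence):
    """Return (start, end, matched_text) tuples using 1-based inclusive positions."""
    hits = []
    for match in NUCLEOPHILE_MOTIF.finditer(sequence.upper()):
        start = match.start() + 1          # convert 0-based index to 1-based
        hits.append((start, start + 5, match.group(1)))
    return hits


# Invented toy sequence, for illustration only.
toy_sequence = "MKTAGHSWGGLVA"
print(find_nucleophile_motifs(toy_sequence))  # [(5, 10, 'GHSWGG')]
```

Applied to a full-length protein sequence, the same function would report the residue range of each occurrence, which can then be compared with the positions recited for the motif.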

[2271] In a preferred embodiment, a 32225 polypeptide or protein has at least one α/β hydrolase domain nucleophile motif, or a region which includes at least five, or even six, amino acid residues and has at least 70%, 80%, 90%, or 100% homology with an “α/β hydrolase domain nucleophile motif”, e.g., at least one α/β hydrolase domain nucleophile motif of human 32225, e.g., amino acid residues 144 to 149 of SEQ ID NO:34.

[2272] In another preferred embodiment, a 32225 protein includes at least one α/β hydrolase domain catalytic histidine residue. As used herein, an “α/β hydrolase domain catalytic histidine residue” is a conserved histidine residue located toward the C-terminal end of an α/β hydrolase domain in a peptide loop that connects two elements of secondary structure, e.g., β-sheets and α-helices. An α/β hydrolase domain catalytic histidine residue, as defined, can be involved in the enzymatic hydrolysis of a chemical bond, e.g., the bond between a nucleophilic carbon atom and a relatively electrophilic atom to which the carbon is bound, e.g., an oxygen, nitrogen, or chlorine atom. α/β hydrolase domain catalytic histidine residues have been described in, e.g., Ollis et al. (1992), supra, and Nardini and Dijkstra (1999), supra, the contents of which are incorporated herein by reference.

[2273] A 32225 family member can include at least one α/β hydrolase domain. Furthermore, a 32225 family member can include at least one α/β hydrolase domain nucleophile motif; at least one α/β hydrolase domain catalytic histidine residue; at least one, two, three, four, five, preferably six predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006); at least one, preferably two predicted N-myristylation sites (PS00008); and at least one predicted amidation site (PS00009).

[2274] As the 32225 polypeptides of the invention may modulate 32225-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 32225-mediated or related disorders, as described below.

[2275] As used herein, a “32225 activity”, “biological activity of 32225” or “functional activity of 32225”, refers to an activity exerted by a 32225 protein, polypeptide or nucleic acid molecule. For example, a 32225 activity can be an activity exerted by 32225 in a physiological milieu on, e.g., a 32225-responsive cell or on a 32225 substrate, e.g., a small molecule substrate. A 32225 activity can be determined in vivo or in vitro. In one embodiment, a 32225 activity is a direct activity, such as an association with a 32225 target molecule. A “target molecule” or “binding partner” is a molecule with which a 32225 protein binds or interacts in nature. In an exemplary embodiment, 32225 is an enzyme for a substrate that contains a nucleophilic carbon atom bound to a relatively electrophilic atom, e.g., an oxygen, nitrogen, or chlorine atom. Such substrates are present in many small molecules, e.g., hydroxymuconic semialdehyde, 1,2-dichloroethane, acetylcholine, and triacylglycerols.

[2276] A 32225 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 32225 protein with a 32225 receptor. The features of the 32225 molecules of the present invention can provide similar biological activities as α/β hydrolase family members. For example, the 32225 proteins of the present invention can have one or more of the following activities: (1) hydrolysis of acetylcholine and other neurotransmitters; (2) hydrolysis of epoxides and other toxic chemicals; (3) hydrolysis of lipid substrates; (4) hydrolysis of cholesterol; (5) protease activity; and (6) thioesterase activity. In addition, the 32225 protein may have a critical function in one or more of the following physiological processes: (1) neurotransmitter function; (2) detoxification; (3) metabolite regulation and degradation; and (4) toxin removal and neutralization. Selected 32225 polypeptides can antagonize, e.g., by competitive inhibition, one of the above-mentioned activities.

[2277] Thus, the 32225 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell growth and/or differentiative disorders, neurological disorders, liver disorders, metabolic disorders, and cardiovascular and/or endothelial cell disorders.

[2278] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2279] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[2280] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2281] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[2282] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[2283] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2284] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[2285] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2286] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2287] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[2288] A 32225 polypeptide may be involved with neuron outgrowth, central nervous system (CNS) development, psychiatric function, and neuronal repair. Neurological disorders include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states, global cerebral ischemia and focal cerebral ischemia, infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[2289] Disorders of the liver include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. Additional liver disorders include hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[2290] 32225 may have a critical role in removing xenobiotic epoxides and other toxins from the body. Furthermore, it may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 32225 gene should provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 32225 may contribute to population differences in sensitivity to environmental toxins.

[2291] Additionally, 32225 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. For example, many α/β hydrolase family members have lipase activity and cholesterol esterase activity. The human hormone sensitive lipase is responsible for metabolizing fat stored in adipocytes.

[2292] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[2293] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[2294] The 32225 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:34 are collectively referred to as “polypeptides or proteins of the invention” or “32225 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “32225 nucleic acids.” 32225 molecules refer to 32225 nucleic acids, polypeptides, and antibodies.

[2295] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[2296] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[2297] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.

[2298] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:33 or SEQ ID NO:35, corresponds to a naturally-occurring nucleic acid molecule.

[2299] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 32225 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 32225 protein or derivative thereof.

[2300] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 32225 protein is at least 10% pure. In a preferred embodiment, the preparation of 32225 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-32225 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-32225 chemicals. When the 32225 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[2301] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 32225 without abolishing or substantially altering a 32225 activity. Preferably the alteration does not substantially alter the 32225 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 32225, results in abolishing a 32225 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 32225 are predicted to be particularly unamenable to alteration.

[2302] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 32225 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 32225 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 32225 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:33 or SEQ ID NO:35, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
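The side-chain families listed above lend themselves to a simple lookup. The sketch below is offered as an illustration only: it encodes exactly the families named in this paragraph using one-letter amino acid codes, and the function names `shared_families` and `is_conservative` are hypothetical. A substitution is treated as conservative when the two residues differ and share at least one listed family.

```python
# Side-chain families as enumerated in the text, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine ... histidine
}


def shared_families(residue_a, residue_b):
    """Return the names of every listed family that contains both residues."""
    a, b = residue_a.upper(), residue_b.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    ]


def is_conservative(residue_a, residue_b):
    """True when two different residues share at least one side-chain family."""
    if residue_a.upper() == residue_b.upper():
        return False  # identical residues are not a substitution
    return bool(shared_families(residue_a, residue_b))


print(is_conservative("K", "R"))   # both basic
print(is_conservative("D", "K"))   # acidic versus basic
print(shared_families("H", "Y"))   # histidine and tyrosine are both aromatic
```

Because some residues appear in more than one family (for example, histidine is listed as both basic and aromatic), the lookup returns every shared family rather than a single label.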

[2303] As used herein, a “biologically active portion” of a 32225 protein includes a fragment of a 32225 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 32225 molecule and a non-32225 molecule or between a first 32225 molecule and a second 32225 molecule (e.g., a dimerization interaction). Biologically active portions of a 32225 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 32225 protein, e.g., the amino acid sequence shown in SEQ ID NO:34, which include fewer amino acids than the full length 32225 proteins, and exhibit at least one activity of a 32225 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 32225 protein, e.g., hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. A biologically active portion of a 32225 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 32225 protein can be used as targets for developing agents which modulate a 32225 mediated activity, e.g., hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols.

[2304] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[2305] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[2306] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
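As an illustration of the calculation described above, the following minimal sketch computes percent identity from two sequences that have already been aligned. It assumes one common convention, namely that identical positions are divided by the number of alignment columns, so that every gap position lowers the result. The function name `percent_identity` and the use of `-` as the gap character are assumptions made for this example; alignment programs may use a different denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be the same length, with `gap` marking introduced gaps.
    Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if x == y:
            identical += 1

    if columns == 0:
        return 0.0
    return 100.0 * identical / columns
```

With this convention, aligning `ACDE-G` against `ACDEFG` gives five identical positions over six columns.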

[2307] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
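The role of the gap weight and length weight in a global alignment can be sketched as follows. This is not the GAP program; it is a minimal score-only illustration of Needleman and Wunsch style dynamic programming with separate gap opening and gap extension costs. It assumes that a gap of length L costs `gap_open + L * gap_extend`, and it substitutes a simple match/mismatch score for a substitution matrix such as BLOSUM 62. The function name and default values are assumptions made for this example.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=1, mismatch=-1, gap_open=12, gap_extend=4):
    """Best global alignment score with separate gap open/extend costs.

    Three tables are kept:
      M[i][j]  best score with a[i-1] aligned to b[j-1]
      X[i][j]  best score with a[i-1] aligned to a gap in b
      Y[i][j]  best score with b[j-1] aligned to a gap in a
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)  # leading gap in b
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)  # leading gap in a

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays gap_open plus one extension unit;
            # continuing an existing gap pays only the extension unit.
            X[i][j] = max(
                M[i - 1][j] - gap_open - gap_extend,
                X[i - 1][j] - gap_extend,
                Y[i - 1][j] - gap_open - gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] - gap_open - gap_extend,
                Y[i][j - 1] - gap_extend,
                X[i][j - 1] - gap_open - gap_extend,
            )

    return max(M[n][m], X[n][m], Y[n][m])
```

Because opening a gap costs more than extending one, the sketch favors a single long gap over several short gaps, which is the behavior the two separate weights are intended to produce.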

[2308] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[2309] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 32225 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 32225 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[2310] Particular 32225 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:34. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:34 are termed substantially identical.

[2311] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:33 or 35 are termed substantially identical.

[2312] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[2313] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[2314] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[2315] Various aspects of the invention are described in further detail below.

[2316] Isolated Nucleic Acid Molecules of 32225

[2317] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 32225 polypeptide described herein, e.g., a full-length 32225 protein or a fragment thereof, e.g., a biologically active portion of 32225 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 32225 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[2318] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:33, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 32225 protein (i.e., “the coding region” of SEQ ID NO:33, as shown in SEQ ID NO:35), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:33 (e.g., SEQ ID NO:35) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 95 to 338.

[2319] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:33 or 35, thereby forming a stable duplex.

[2320] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35, or a portion, preferably of the same length, of any of these nucleotide sequences.

[2321] 32225 Nucleic Acid Fragments

[2322] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:33 or 35. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 32225 protein, e.g., an immunogenic or biologically active portion of a 32225 protein. A fragment can comprise those nucleotides of SEQ ID NO:33, which encode an α/β hydrolase domain of human 32225. The nucleotide sequence determined from the cloning of the 32225 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 32225 family members, or fragments thereof, as well as 32225 homologues, or fragments thereof, from other species.

[2323] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 120 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention, e.g., BE007972, BE007963, BE008013, or SEQ ID NO:74 from WO 99/06548.

[2324] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 32225 nucleic acid fragment can include a sequence corresponding to an α/β hydrolase domain.

[2325] 32225 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:33 or SEQ ID NO:35, or of a naturally occurring allelic variant or mutant of SEQ ID NO:33 or SEQ ID NO:35.

[2326] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10 bases from, a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2327] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an α/β hydrolase domain, e.g., about nucleotides 315 to 1058 of SEQ ID NO:33.

[2328] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 32225 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: an α/β hydrolase domain, from about amino acid residues 95 to 342 of SEQ ID NO:34; a fragment of an α/β hydrolase domain that includes the catalytic histidine residue, from about amino acid residues 285 to 342 of SEQ ID NO:34; or a fragment of an α/β hydrolase domain that includes a nucleophile motif, from about amino acid residues 130 to 170 of SEQ ID NO:34.
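When designing primers for a region defined by amino acid residues, the residue range has to be converted to nucleotide coordinates. The following minimal sketch performs that arithmetic. It assumes that the coding region begins at nucleotide 33 of SEQ ID NO:33; this value is an inference from the correspondence between residues 95 to 342 and nucleotides 315 to 1058, not a figure stated in this section. The function name `residues_to_nucleotides` is also an assumption made for this example.

```python
def residues_to_nucleotides(first_residue, last_residue, cds_start):
    """Convert a 1-based inclusive residue range to nucleotide coordinates.

    `cds_start` is the 1-based position of the first nucleotide of the
    initiation codon within the full-length sequence.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    # Residue k occupies nucleotides cds_start + 3*(k-1) through
    # cds_start + 3*k - 1.
    start = cds_start + 3 * (first_residue - 1)
    end = cds_start + 3 * last_residue - 1
    return start, end


# Assumed start of the coding region within SEQ ID NO:33 (inferred, see above).
ASSUMED_CDS_START = 33

domain = residues_to_nucleotides(95, 342, ASSUMED_CDS_START)
histidine_fragment = residues_to_nucleotides(285, 342, ASSUMED_CDS_START)
nucleophile_fragment = residues_to_nucleotides(130, 170, ASSUMED_CDS_START)
```

Under the stated assumption, the domain range evaluates to nucleotides 315 to 1058, which agrees with the nucleotide range given for the domain in this section.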

[2329] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[2330] A nucleic acid fragment encoding a “biologically active portion of a 32225 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:33 or 35, which encodes a polypeptide having a 32225 biological activity (e.g., the biological activities of the 32225 proteins are described herein), expressing the encoded portion of the 32225 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 32225 protein. For example, a nucleic acid fragment encoding a biologically active portion of 32225 includes an α/β hydrolase domain, e.g., amino acid residues about 95 to 338 of SEQ ID NO:34, or fragments of an α/β hydrolase domain, e.g., about amino acid residues 135 to 155, 285 to 305, or 315 to 330 of SEQ ID NO:34. A nucleic acid fragment encoding a biologically active portion of a 32225 polypeptide may comprise a nucleotide sequence which is greater than 300, 350, 400, or more nucleotides in length.

[2331] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:33, or SEQ ID NO:35. In a preferred embodiment, a nucleic acid includes at least one contiguous nucleotide from the region of about nucleotides 1-200, 33-314, 251-450, 315-500, 400-600, 501-750, 601-850, 751-1000, 851-1058, 900-1200, 1059-1500, 1201-1700, 1400-2000, 1650-2100, 1800-2200, 1950-2305.

[2332] 32225 Nucleic Acid Variants

[2333] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:33 or SEQ ID NO:35. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 32225 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:34. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2334] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
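The proportion of altered codons referred to above can be checked with a short calculation. The following is a minimal sketch that compares an original coding sequence with a codon-altered version, codon by codon, and reports the fraction that differ. The function name `codon_change_fraction` and the example sequences are hypothetical and are not taken from SEQ ID NO:33 or SEQ ID NO:35.

```python
def codon_change_fraction(original_cds, altered_cds):
    """Fraction of codons that differ between two in-frame coding sequences.

    Both sequences must have the same length, and that length must be a
    non-zero multiple of three.
    """
    if len(original_cds) != len(altered_cds):
        raise ValueError("sequences must have equal length")
    if len(original_cds) == 0 or len(original_cds) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of three")

    a, b = original_cds.upper(), altered_cds.upper()
    total = len(a) // 3
    changed = sum(1 for i in range(0, len(a), 3) if a[i:i + 3] != b[i:i + 3])
    return changed / total


# Hypothetical example: Met-Ala-Lys encoded two ways (GCT -> GCC, AAA -> AAG).
fraction = codon_change_fraction("ATGGCTAAA", "ATGGCCAAG")
```

In the hypothetical example, two of the three codons are altered while the encoded amino acids are unchanged, so the fraction exceeds the 10% and 20% levels mentioned above.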

[2335] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[2336] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:33 or 35, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
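The counting rule described above, in which mismatches and “looped” out positions are both treated as differences, can be sketched as follows. The sketch assumes the two sequences have already been aligned for maximum homology and that `-` marks a looped-out position; the function names and the exclusive upper limit parameter are assumptions made for this example.

```python
def count_differences(aligned_a, aligned_b):
    """Count alignment columns that differ, including gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A gap character never equals a base, so looped-out positions
    # are counted in the same way as mismatches.
    return sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x != y)


def differs_within_limit(aligned_a, aligned_b, upper_limit):
    """True when the sequences differ by at least one but fewer than
    `upper_limit` positions."""
    return 1 <= count_differences(aligned_a, aligned_b) < upper_limit
```

For instance, `ACGT-A` aligned against `ACCTGA` contains one mismatch and one looped-out position, giving two differences.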

[2337] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:34 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:34 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 32225 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 32225 gene.

[2338] Preferred variants include those that are correlated with the ability to hydrolyze a chemical bond, e.g., a chemical bond present in a substrate, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols.

[2339] Allelic variants of 32225, e.g., human 32225, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 32225 protein within a population that maintain the ability to hydrolyze a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:34, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally occurring amino acid sequence variants of the 32225, e.g., human 32225, protein within a population that do not have the ability to hydrolyze a chemical bond present in a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:34, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[2340] Moreover, nucleic acid molecules encoding other 32225 family members and, thus, which have a nucleotide sequence which differs from the 32225 sequences of SEQ ID NO:33 or SEQ ID NO:35 are intended to be within the scope of the invention.

[2341] Antisense Nucleic Acid Molecules, Ribozymes and Modified 32225 Nucleic Acid Molecules

[2342] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 32225. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 32225 coding strand, or to only a portion thereof (e.g., the coding region of human 32225 corresponding to SEQ ID NO:35). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 32225 (e.g., the 5′ and 3′ untranslated regions).

[2343] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 32225 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 32225 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 32225 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
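Deriving an antisense oligonucleotide for the region surrounding the translation start site can be sketched as follows. This is a minimal illustration, not a design protocol. It assumes that position +1 is the first nucleotide of the initiation codon, that position −1 is the nucleotide immediately upstream (with no position 0), and that the oligonucleotide is written as DNA in the 5′ to 3′ direction. The function name, parameters, and example sequence are hypothetical.

```python
# Watson-Crick complements for a DNA oligonucleotide.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start_codon_index, upstream=10, downstream=10):
    """Return a DNA oligonucleotide antisense to the window around the start site.

    `start_codon_index` is the 0-based index of the first nucleotide of the
    initiation codon. The window spans `upstream` nucleotides before it and
    `downstream` nucleotides beginning with it.
    """
    begin = start_codon_index - upstream
    end = start_codon_index + downstream
    if begin < 0 or end > len(mrna):
        raise ValueError("window extends beyond the sequence")

    # Accept RNA or cDNA input by writing the target in the DNA alphabet.
    target = mrna[begin:end].upper().replace("U", "T")
    # The antisense strand is the reverse complement of the target.
    return target.translate(_DNA_COMPLEMENT)[::-1]


# Hypothetical transcript; the initiation codon begins at index 12.
example_mrna = "GGGCCCAAAUUUAUGGCCAAAGGGUUU"
oligo = antisense_oligo(example_mrna, 12, upstream=3, downstream=6)
```

In the hypothetical example the targeted window is `UUUAUGGCC`, and the resulting antisense oligonucleotide reads `GGCCATAAA`.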

[2344] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[2345] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 32225 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[2346] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[2347] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 32225-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 32225 cDNA disclosed herein (i.e., SEQ ID NO:33 or SEQ ID NO:35), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 32225-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 32225 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[2348] 32225 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 32225 gene (e.g., the 32225 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 32225 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[2349] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[2350] A 32225 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[2351] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[2352] PNAs of 32225 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 32225 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[2353] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[2354] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 32225 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 32225 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[2355] Isolated 32225 Polypeptides

[2356] In another aspect, the invention features an isolated 32225 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-32225 antibodies. 32225 protein can be isolated from cells or tissue sources using standard protein purification techniques. 32225 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[2357] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[2358] In a preferred embodiment, a 32225 polypeptide has one or more of the following characteristics:

[2359] (i) it has the ability to hydrolyze a chemical bond present in a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols;

[2360] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:34;

[2361] (iii) it has an overall sequence similarity of at least 50%, preferably at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:34;

[2362] (iv) it can be found in brain, liver, breast, or ovary;

[2363] (v) it has an α/β hydrolase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 95 to 338 of SEQ ID NO:34;

[2364] (vi) it has an α/β hydrolase domain nucleophile motif;

[2365] (vii) it has an α/β hydrolase domain catalytic histidine residue;

[2366] (viii) it has one, two, three, four, five, preferably six predicted Protein Kinase C phosphorylation sites (PS00005);

[2367] (ix) it has one, two, three, preferably four predicted Casein Kinase II phosphorylation sites (PS00006);

[2368] (x) it has one, preferably two predicted N-myristoylation sites (PS00008); and

[2369] (xi) it has one predicted amidation site (PS00009).

[2370] In a preferred embodiment the 32225 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:34. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:34 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:34. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the α/β hydrolase domain, e.g., located at about amino acid residues 95 to 342 of SEQ ID NO:34. In another preferred embodiment one or more differences are in the α/β hydrolase domain, e.g., located at about amino acid residues 95 to 342 of SEQ ID NO:34.
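The difference count described in the preceding paragraph can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external tool and that gaps are written as "-"; the function names and the short example sequences are hypothetical placeholders and are not taken from SEQ ID NO:34. Expressing the percentage relative to the number of reference residues is likewise an assumption, since the text does not fix the denominator.

```python
def count_differences(aligned_ref: str, aligned_var: str) -> int:
    """Count alignment columns that differ.

    A column counts as a difference when the two characters are not equal,
    which covers both mismatches and "looped out" (gapped) positions.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_ref, aligned_var) if a != b)


def percent_different(aligned_ref: str, aligned_var: str, gap: str = "-") -> float:
    """Differences as a percentage of the reference residues (assumed denominator)."""
    ref_len = sum(1 for a in aligned_ref if a != gap)
    if ref_len == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_var) / ref_len


# Hypothetical aligned pair: one insertion, one substitution, one deletion.
ref = "MKT-LLVA"
var = "MKTGLIV-"
n_diff = count_differences(ref, var)
pct_diff = percent_different(ref, var)
```

A variant would then be compared against whichever threshold applies, e.g., fewer than 15, 10 or 5 differing residues, or less than 20%, 15%, 10% or 5% of residues.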

[2371] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 32225 proteins differ in amino acid sequence from SEQ ID NO:34, yet retain biological activity.

[2372] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:34.

[2373] A 32225 protein or fragment is provided which varies from the sequence of SEQ ID NO:34 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment. A 32225 protein or fragment is provided which varies from the sequence of SEQ ID NO:34 e.g., in regions defined by amino acids about 1 to 140 and 151 to 280 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment, but which does not differ from SEQ ID NO:34 in regions defined by amino acids about 141 to 150 and 281 to 342. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
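As an illustration of the region-restricted embodiment above, the following Python sketch checks whether a variant differs from a reference only within the permitted regions (about residues 1 to 140 and 151 to 280). It is a simplified sketch under stated assumptions: the two sequences are taken to be of equal length with substitutions only (no insertions or deletions), positions are 1-based, and the all-alanine reference is a placeholder rather than SEQ ID NO:34. The function names and the default limit of 15 are hypothetical choices.

```python
PERMITTED_REGIONS = [(1, 140), (151, 280)]  # regions in which differences are allowed


def difference_positions(ref: str, var: str) -> list:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(ref) != len(var):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [i for i, (a, b) in enumerate(zip(ref, var), start=1) if a != b]


def in_regions(position: int, regions) -> bool:
    """True if a 1-based position falls inside any inclusive (start, end) region."""
    return any(start <= position <= end for start, end in regions)


def differs_only_in_permitted_regions(ref: str, var: str, max_diffs: int = 15) -> bool:
    """At least one but fewer than max_diffs differences, all in permitted regions."""
    diffs = difference_positions(ref, var)
    return 1 <= len(diffs) < max_diffs and all(
        in_regions(p, PERMITTED_REGIONS) for p in diffs
    )


# Placeholder 342-residue reference; not the sequence of SEQ ID NO:34.
reference = "A" * 342
variant_ok = reference[:9] + "G" + reference[10:]       # substitution at position 10
variant_bad = reference[:144] + "G" + reference[145:]   # substitution at position 145
```

Here `variant_ok` satisfies the check, whereas `variant_bad` does not, because position 145 lies in a region (about 141 to 150) that is required to match the reference.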

[2374] In one embodiment, a biologically active portion of a 32225 protein includes an α/β hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 32225 protein.

[2375] In a preferred embodiment, the 32225 protein has an amino acid sequence shown in SEQ ID NO:34. In other embodiments, the 32225 protein is substantially identical to SEQ ID NO:34. In yet another embodiment, the 32225 protein is substantially identical to SEQ ID NO:34 and retains the functional activity of the protein of SEQ ID NO:34, as described in detail in the subsections above.

[2376] 32225 Chimeric or Fusion Proteins

[2377] In another aspect, the invention provides 32225 chimeric or fusion proteins. As used herein, a 32225 “chimeric protein” or “fusion protein” includes a 32225 polypeptide linked to a non-32225 polypeptide. A “non-32225 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 32225 protein, e.g., a protein which is different from the 32225 protein and which is derived from the same or a different organism. The 32225 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 32225 amino acid sequence. In a preferred embodiment, a 32225 fusion protein includes at least one (or two) biologically active portion of a 32225 protein. The non-32225 polypeptide can be fused to the N-terminus or C-terminus of the 32225 polypeptide.

[2378] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-32225 fusion protein in which the 32225 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 32225. Alternatively, the fusion protein can be a 32225 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 32225 can be increased through use of a heterologous signal sequence.

[2379] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[2380] The 32225 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 32225 fusion proteins can be used to affect the bioavailability of a 32225 substrate. 32225 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 32225 protein; (ii) mis-regulation of the 32225 gene; and (iii) aberrant post-translational modification of a 32225 protein.

[2381] Moreover, the 32225-fusion proteins of the invention can be used as immunogens to produce anti-32225 antibodies in a subject, to purify 32225 ligands and in screening assays to identify molecules which inhibit the interaction of 32225 with a 32225 substrate.

[2382] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 32225-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 32225 protein.

[2383] Variants of 32225 Proteins

[2384] In another aspect, the invention also features a variant of a 32225 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 32225 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 32225 protein. An agonist of the 32225 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 32225 protein. An antagonist of a 32225 protein can inhibit one or more of the activities of the naturally occurring form of the 32225 protein by, for example, competitively modulating a 32225-mediated activity of a 32225 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 32225 protein.

[2385] Variants of a 32225 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 32225 protein for agonist or antagonist activity.

[2386] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 32225 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 32225 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
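The three fragment classes named above (N terminal, C terminal, and internal) can be enumerated in silico before any library is constructed. The following Python sketch is illustrative only: it treats the sequence as a plain string, the function name `fragment_library` and the `min_len` parameter are hypothetical, and the ten-letter input is a placeholder rather than a 32225 sequence.

```python
def fragment_library(seq: str, min_len: int = 8) -> dict:
    """Enumerate truncation fragments of at least min_len characters.

    N-terminal fragments keep the first residue, C-terminal fragments keep
    the last residue, and internal fragments touch neither end.
    """
    n = len(seq)
    n_terminal = [seq[:k] for k in range(min_len, n)]
    c_terminal = [seq[n - k:] for k in range(min_len, n)]
    internal = [
        seq[i:j]
        for i in range(1, n - 1)            # start after the first residue
        for j in range(i + min_len, n)      # end before the last residue
    ]
    return {"n_terminal": n_terminal, "c_terminal": c_terminal, "internal": internal}


library = fragment_library("ABCDEFGHIJ", min_len=8)
```

The number of internal fragments grows roughly with the square of the sequence length, so a practical library would normally sample this space rather than enumerate it exhaustively.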

[2387] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 32225 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 32225 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[2388] Cell based assays can be exploited to analyze a variegated 32225 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 32225 in a substrate-dependent manner. The transfected cells are then contacted with 32225 and the effect of the expression of the mutant on signaling by the 32225 substrate can be detected, e.g., by measuring the hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 32225 substrate, and the individual clones further characterized.

[2389] In another aspect, the invention features a method of making a 32225 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 32225 polypeptide. The method includes: altering the sequence of a 32225 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[2390] In another aspect, the invention features a method of making a fragment or analog of a 32225 polypeptide having a biological activity of a naturally occurring 32225 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 32225 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[2391] Anti-32225 Antibodies

[2392] In another aspect, the invention provides an anti-32225 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[2393] The anti-32225 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[2394] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 KDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 KDa or 446 amino acids), are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[2395] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 32225 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-32225 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[2396] The anti-32225 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[2397] Phage display and combinatorial methods for generating anti-32225 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[2398] In one embodiment, the anti-32225 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[2399] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[2400] An anti-32225 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[2401] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[2402] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 32225 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto. As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
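The definition of a consensus sequence given above can be sketched in a few lines of Python. The sketch assumes that the family members have already been aligned to a common length; the function name `consensus_sequence` and the four short example sequences are hypothetical. Because the text allows either residue to be used when two occur equally often, the sketch resolves ties by taking the alphabetically first residue, purely so that the result is reproducible.

```python
from collections import Counter


def consensus_sequence(aligned_family) -> str:
    """Return the most frequent residue at each position of an aligned family.

    Ties are broken alphabetically; this is an arbitrary but deterministic
    choice, since either tied residue may be included in the consensus.
    """
    if not aligned_family:
        raise ValueError("at least one sequence is required")
    length = len(aligned_family[0])
    if any(len(s) != length for s in aligned_family):
        raise ValueError("all sequences must be aligned to the same length")

    residues = []
    for column in zip(*aligned_family):
        counts = Counter(column)
        top = max(counts.values())
        residues.append(min(r for r, c in counts.items() if c == top))
    return "".join(residues)


# Hypothetical family of aligned framework fragments.
family = ["QVQLV", "EVQLV", "QVKLL", "EVQLV"]
consensus = consensus_sequence(family)
```

In this example the first position is tied between Q and E, so the alphabetical tie-break selects E; every other position has a single most frequent residue.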

[2403] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 32225 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[2404] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[2405] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[2406] In preferred embodiments an antibody can be made by immunizing with purified 32225 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosolic fractions.

[2407] A full-length 32225 protein or an antigenic peptide fragment of 32225 can be used as an immunogen or can be used to identify anti-32225 antibodies made with other immunogens, e.g., cells, cytosolic fractions, and the like. The antigenic peptide of 32225 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:34 and encompass an epitope of 32225. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[2408] Fragments of 32225 which include residues about 106 to 123, about 220 to 236, or about 299 to 316 can be used to make antibodies against hydrophilic regions of the 32225 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 32225 which include residues about 71 to 88, about 135 to 157, or about 186 to 199 can be used to make an antibody against a hydrophobic region of the 32225 protein; and a fragment of 32225 which includes residues about 130 to 155, about 280 to 300, or about 310 to 342 can be used to make an antibody against the α/β hydrolase region of the 32225 protein.
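The residue ranges recited above are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. The following Python sketch shows one way to extract such fragments and to check them against the minimum antigenic peptide length of 8 residues. The function names are hypothetical, and the 342-residue placeholder string merely stands in for SEQ ID NO:34, which is not reproduced here.

```python
HYDROPHILIC_REGIONS = [(106, 123), (220, 236), (299, 316)]  # ranges from the text


def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq, using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def meets_minimum_length(peptide: str, min_len: int = 8) -> bool:
    """Check the minimum length stated for an antigenic peptide."""
    return len(peptide) >= min_len


# Placeholder sequence of 342 residues; not the sequence of SEQ ID NO:34.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 18)[:342]
peptides = [fragment(placeholder, s, e) for s, e in HYDROPHILIC_REGIONS]
all_long_enough = all(meets_minimum_length(p) for p in peptides)
```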

[2409] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[2410] Antibodies which bind only native 32225 protein, only denatured or otherwise non-native 32225 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 32225 protein.

[2411] Preferred epitopes encompassed by the antigenic peptide are regions of 32225 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 32225 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 32225 protein and are thus likely to constitute surface residues useful for targeting antibody production.
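The Emini surface probability analysis mentioned above is not reproduced here. As a simpler stand-in for locating hydrophilic stretches, the following Python sketch computes a sliding-window average of the Kyte-Doolittle hydropathy scale, in which lower values indicate more hydrophilic residues. This is a different technique from the Emini method; the window size of 7, the function names, and the example sequence are all hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7) -> list:
    """Average hydropathy for every window of consecutive residues."""
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[r] for r in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(seq: str, window: int = 7):
    """Return (start, end, score) of the lowest-scoring window, 1-based inclusive."""
    profile = hydropathy_profile(seq, window)
    i = min(range(len(profile)), key=profile.__getitem__)
    return i + 1, i + window, profile[i]


# Hypothetical sequence: a lysine-rich stretch flanked by leucines.
start, end, score = most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL")
```

Windows scoring well below zero are candidate hydrophilic regions; whether such a region is actually surface-exposed or antigenic would still need to be confirmed experimentally or with a dedicated prediction method.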

[2412] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., cytosolic fractions.

[2413] The anti-32225 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 32225 protein.

[2414] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[2415] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[2416] In a preferred embodiment, an anti-32225 antibody alters (e.g., increases or decreases) the hydrolase activity of a 32225 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 310 to 342 of SEQ ID NO:34.

[2417] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or active fragment thereof, or a radioactive nucleus, or imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[2418] An anti-32225 antibody (e.g., monoclonal antibody) can be used to isolate 32225 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-32225 antibody can be used to detect 32225 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-32225 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

[2419] The invention also includes a nucleic acid which encodes an anti-32225 antibody, e.g., an anti-32225 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[2420] The invention also includes cell lines, e.g., hybridomas, which make an anti-32225 antibody, e.g., an antibody described herein, and methods of using said cells to make a 32225 antibody.

[2421] 32225 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[2422] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[2423] A vector can include a 32225 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 32225 proteins, mutant forms of 32225 proteins, fusion proteins, and the like).

[2424] The recombinant expression vectors of the invention can be designed for expression of 32225 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[2425] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[2426] Purified fusion proteins can be used in 32225 activity assays (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 32225 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[2427] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
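The codon-replacement strategy described above can be sketched as follows. This is a minimal illustration only: the preferred-codon table is an assumption made for the example (one commonly cited high-frequency E. coli codon per amino acid) and is not taken from Wada et al., and the function names `translate` and `recode` are hypothetical. The sketch rewrites each codon of an open reading frame to the preferred synonymous codon without changing the encoded amino acid sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: illustrative preferred E. coli codons, one per amino acid.
# A real design would use a measured codon usage table for the host strain.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def _codons(dna: str):
    """Split an in-frame DNA sequence into upper-case codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[c] for c in _codons(dna))


def recode(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[CODON_TO_AA[c]] for c in _codons(dna))


if __name__ == "__main__":
    print(recode("ATGAAGTTATGA"))  # prints ATGAAACTGTAA
```

Because only synonymous substitutions are made, the translation of the recoded sequence is identical to that of the input sequence.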

[2428] The 32225 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[2429] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[2430] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[2431] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[2432] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[2433] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 32225 nucleic acid molecule within a recombinant expression vector or a 32225 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[2434] A host cell can be any prokaryotic or eukaryotic cell. For example, a 32225 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2435] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2436] A host cell of the invention can be used to produce (i.e., express) a 32225 protein. Accordingly, the invention further provides methods for producing a 32225 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 32225 protein has been introduced) in a suitable medium such that a 32225 protein is produced. In another embodiment, the method further includes isolating a 32225 protein from the medium or the host cell.

[2437] In another aspect, the invention features a cell or purified preparation of cells which include a 32225 transgene, or which otherwise misexpress 32225. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 32225 transgene, e.g., a heterologous form of a 32225, e.g., a gene derived from humans (in the case of a non-human cell). The 32225 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 32225, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 32225 alleles or for use in drug screening.

[2438] In another aspect, the invention features a human cell, e.g., a hepatic or hematopoietic stem cell, transformed with nucleic acid which encodes a subject 32225 polypeptide.

[2439] Also provided are cells, preferably human cells, e.g., human hepatic, hematopoietic or fibroblast cells, in which an endogenous 32225 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 32225 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 32225 gene. For example, an endogenous 32225 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2440] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 32225 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 32225 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 32225 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2441] 32225 Transgenic Animals

[2442] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 32225 protein and for identifying and/or evaluating modulators of 32225 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 32225 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2443] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 32225 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 32225 transgene in its genome and/or expression of 32225 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 32225 protein can further be bred to other transgenic animals carrying other transgenes.

[2444] 32225 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2445] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2446] Uses of 32225

[2447] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[2448] The isolated nucleic acid molecules of the invention can be used, for example, to express a 32225 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 32225 mRNA (e.g., in a biological sample) or a genetic alteration in a 32225 gene, and to modulate 32225 activity, as described further below. 32225 proteins can be used, in vitro, to hydrolyze substrate compounds as part of a synthetic process. The 32225 proteins can also be used to treat disorders characterized by insufficient or excessive production of a 32225 substrate or production of 32225 inhibitors. In addition, the 32225 proteins can be used to screen for naturally occurring 32225 substrates, to screen for drugs or compounds which modulate 32225 activity, as well as to treat disorders characterized by insufficient or excessive production of 32225 protein or production of 32225 protein forms which have decreased, aberrant or unwanted activity compared to 32225 wild type protein (e.g., cellular proliferative and/or differentiative disorders, neural disorders, liver disorders, metabolic disorders, or cardiovascular disorders). Moreover, the anti-32225 antibodies of the invention can be used to detect and isolate 32225 proteins, regulate the bioavailability of 32225 proteins, and modulate 32225 activity.

[2449] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 32225 polypeptide is provided. The method includes: contacting the compound with the subject 32225 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 32225 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 32225 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 32225 polypeptide. Screening methods are discussed in more detail below.

[2450] 32225 Screening Assays

[2451] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 32225 proteins, have a stimulatory or inhibitory effect on, for example, 32225 expression or 32225 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 32225 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 32225 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2452] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 32225 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 32225 protein or polypeptide or a biologically active portion thereof.

[2453] In one embodiment, an activity of a 32225 protein can be assayed in vitro. First, 32225 protein can be expressed in a bacterial cell and then purified, e.g., by means of a covalently attached affinity tag, e.g., a His-6 tag. Purified 32225 can then be incubated in buffer, e.g., 100 mM Tris-HCl, pH 8.5, along with a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. By measuring the optical density of the solution at various time points, the loss of substrate or increase in product can be monitored, which is a direct measure of 32225 activity. The appropriate wavelength used to measure optical density will depend upon the light absorption spectra of the substrate and product molecules. An example of such an assay, used to monitor the activity of a 2-Hydroxymuconic Semialdehyde Dehydrogenase enzyme, is provided in Inoue et al. (1995), J of Bacteriology 177(5):1196-1201, the contents of which are incorporated herein by reference.
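By way of illustration only, the optical density time course described above can be reduced to a reaction rate as sketched below. The sketch assumes that absorbance changes linearly over the sampled interval and that the Beer-Lambert law applies; the function name `initial_rate`, the molar extinction coefficient, and the path length are hypothetical inputs that would have to be determined for the particular substrate or product and wavelength used.

```python
def initial_rate(times_min, absorbances, epsilon_per_M_cm, path_cm=1.0):
    """Estimate a reaction rate in mol/L per minute from absorbance readings.

    The least-squares slope of absorbance versus time is converted to a
    concentration change using the Beer-Lambert law (A = epsilon * c * l).
    A negative result indicates loss of the absorbing species (substrate);
    a positive result indicates its accumulation (product).
    """
    n = len(times_min)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired time/absorbance readings")
    if epsilon_per_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")

    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))

    slope = sxy / sxx  # absorbance units per minute
    return slope / (epsilon_per_M_cm * path_cm)


if __name__ == "__main__":
    # Hypothetical readings; epsilon is an assumed value, not a measured one.
    rate = initial_rate([0, 1, 2, 3], [1.00, 0.90, 0.80, 0.70], 10_000)
    print(f"{rate:.2e} M/min")
```

Only readings from the initial, approximately linear portion of the time course should be supplied; later points, where substrate depletion flattens the curve, would bias the slope.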

[2454] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2455] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2456] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[2457] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 32225 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 32225 activity is determined. Determining the ability of the test compound to modulate 32225 activity can be accomplished by monitoring, for example, the hydrolysis of a substrate molecule, e.g., hydroxymuconic semialdehyde, acetylcholine, 1,2-dichloroethane, or triacylglycerols. The cell, for example, can be of mammalian origin, e.g., human.

[2458] The ability of the test compound to modulate 32225 binding to a compound, e.g., a 32225 substrate, or to bind to 32225 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 32225 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 32225 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 32225 binding to a 32225 substrate in a complex. For example, compounds (e.g., 32225 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2459] The ability of a compound (e.g., a 32225 substrate) to interact with 32225 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 32225 without the labeling of either the compound or the 32225. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 32225.

[2460] In yet another embodiment, a cell-free assay is provided in which a 32225 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 32225 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 32225 proteins to be used in assays of the present invention include fragments which participate in interactions with non-32225 molecules, e.g., fragments with high surface probability scores.

[2461] Soluble and/or membrane-bound forms of isolated proteins (e.g., 32225 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2462] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2463] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
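The distance dependence referred to above is commonly described by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer efficiency is 50%. This relation is supplied here as a standard textbook formula and is not recited in the present description; the function name and the example distances below are hypothetical. A minimal sketch:

```python
def fret_efficiency(r, r0):
    """Energy transfer efficiency for donor-acceptor distance r.

    r and r0 must be in the same length unit (e.g., nanometers). r0 is the
    Forster distance of the donor/acceptor pair, i.e., the separation at
    which half of the donor excitation energy is transferred.
    """
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


if __name__ == "__main__":
    # Assumed Forster distance of 5 nm, for illustration only.
    for distance_nm in (2.5, 5.0, 10.0):
        print(distance_nm, round(fret_efficiency(distance_nm, 5.0), 4))
```

The steep sixth-power dependence is why acceptor emission is near maximal when the labeled molecules are bound to one another and falls off sharply once they separate by more than roughly twice R0.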

[2464] In another embodiment, determining the ability of the 32225 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2465] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2466] It may be desirable to immobilize either 32225, an anti-32225 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 32225 protein, or interaction of a 32225 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/32225 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 32225 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 32225 binding or activity determined using standard techniques.

[2467] Other techniques for immobilizing either a 32225 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 32225 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[2468] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2469] In one embodiment, this assay is performed utilizing antibodies reactive with 32225 protein or target molecules but which do not interfere with binding of the 32225 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 32225 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 32225 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 32225 protein or target molecule.

[2470] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2471] In a preferred embodiment, the assay includes contacting the 32225 protein or biologically active portion thereof with a known compound which binds 32225 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 32225 protein, wherein determining the ability of the test compound to interact with a 32225 protein includes determining the ability of the test compound to preferentially bind to 32225 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2472] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 32225 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 32225 protein through modulation of the activity of a downstream effector of a 32225 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2473] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
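As an illustrative sketch only, the comparison between the control reaction and the reaction containing the test compound can be reduced to a percent-inhibition calculation on the measured complex signal. The function names `percent_inhibition` and `interferes`, and the 50% cutoff, are assumptions introduced here for illustration and are not specified in the description above.

```python
def percent_inhibition(control_signal, test_signal):
    """Percent reduction in complex signal relative to the control reaction.

    control_signal: complex formation measured without the test compound.
    test_signal: complex formation measured with the test compound.
    """
    if control_signal <= 0:
        raise ValueError("control reaction must show detectable complex formation")
    return 100.0 * (1.0 - test_signal / control_signal)


def interferes(control_signal, test_signal, cutoff_percent=50.0):
    """Flag a compound as interfering when inhibition meets an assumed cutoff."""
    return percent_inhibition(control_signal, test_signal) >= cutoff_percent
```

The same two functions could be applied separately to reaction mixtures containing normal and mutant target gene product in order to compare the effect of a compound on each.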

[2474] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2475] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2476] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2477] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2478] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2479] In yet another aspect, the 32225 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 32225 (“32225-binding proteins” or “32225-bp”) and are involved in 32225 activity. Such 32225-bps can be activators or inhibitors of signals by the 32225 proteins or 32225 targets as, for example, downstream elements of a 32225-mediated signaling pathway.

[2480] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 32225 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 32225 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 32225-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 32225 protein.

[2481] In another embodiment, modulators of 32225 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 32225 mRNA or protein evaluated relative to the level of expression of 32225 mRNA or protein in the absence of the candidate compound. When expression of 32225 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 32225 mRNA or protein expression. Alternatively, when expression of 32225 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 32225 mRNA or protein expression. The level of 32225 mRNA or protein expression can be determined by methods described herein for detecting 32225 mRNA or protein.
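As a minimal sketch of the comparison described in the preceding paragraph, replicate expression measurements taken with and without the candidate compound can be classified as follows. The function names, the choice of Welch's t statistic, and the cutoff of 2.0 are assumptions made for illustration; the description requires only that the difference be statistically significant and does not prescribe a test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # No spread in either group: treat any difference as decisive.
        return 0.0 if diff == 0 else float("inf") * (1 if diff > 0 else -1)
    return diff / se


def classify_modulator(with_compound, without_compound, t_cutoff=2.0):
    """Label a candidate compound from replicate expression levels.

    t_cutoff is an assumed, illustrative significance threshold.
    """
    t = welch_t(with_compound, without_compound)
    if t > t_cutoff:
        return "stimulator"
    if t < -t_cutoff:
        return "inhibitor"
    return "no significant effect"
```

Each group needs at least two replicate measurements, since the sample variance is undefined for a single value.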

[2482] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 32225 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative and/or differentiative disorder, a neural disorder, a liver disorder, a metabolic disorder, or a cardiovascular disorder.

[2483] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 32225 modulating agent, an antisense 32225 nucleic acid molecule, a 32225-specific antibody, or a 32225-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2484] 32225 Detection Assays

[2485] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome e.g., to locate gene regions associated with genetic disease or to associate 32225 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2486] 32225 Chromosome Mapping

[2487] The 32225 nucleotide sequences or portions thereof can be used to map the location of the 32225 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 32225 sequences with genes associated with disease.

[2488] Briefly, 32225 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 32225 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 32225 sequences will yield an amplified fragment.

[2489] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).

[2490] Other mapping strategies e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 32225 to a chromosomal location.

[2491] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2492] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2493] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2494] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 32225 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[2495] 32225 Tissue Typing

[2496] 32225 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[2497] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 32225 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[2498] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:33 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:35 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2499] If a panel of reagents from 32225 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[2500] Use of Partial 32225 Sequences in Forensic Biology

[2501] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2502] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:33 (e.g., fragments derived from the noncoding regions of SEQ ID NO:33 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2503] The 32225 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 32225 probes can be used to identify tissue by species and/or by organ type.

[2504] In a similar fashion, these reagents, e.g., 32225 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[2505] Predictive Medicine of 32225

[2506] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2507] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 32225.

[2508] Such disorders include, e.g., a disorder associated with the misexpression of 32225 gene; a disorder of the lung, breast, colon, ovary, brain, or liver; a disorder characterized by unwanted cell proliferation, e.g., cancer, of the lung, breast, colon, ovary, brain, or liver.

[2509] The method includes one or more of the following:

[2510] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 32225 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2511] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 32225 gene;

[2512] detecting, in a tissue of the subject, the misexpression of the 32225 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2513] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 32225 polypeptide.

[2514] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 32225 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2515] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:33, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 32225 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2516] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 32225 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 32225.

[2517] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2518] In preferred embodiments the method includes determining the structure of a 32225 gene, an abnormal structure being indicative of risk for the disorder.

[2519] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 32225 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2520] Diagnostic and Prognostic Assays of 32225

[2521] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 32225 molecules and for identifying variations and mutations in the sequence of 32225 molecules.

[2522] Expression Monitoring and Profiling:

[2523] The presence, level, or absence of 32225 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 32225 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 32225 protein such that the presence of 32225 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum or lung, breast, colon, ovary, brain, or liver tissue. The level of expression of the 32225 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 32225 genes; measuring the amount of protein encoded by the 32225 genes; or measuring the activity of the protein encoded by the 32225 genes.

[2524] The level of mRNA corresponding to the 32225 gene in a cell can be determined both by in situ and by in vitro formats.

[2525] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 32225 nucleic acid, such as the nucleic acid of SEQ ID NO:33, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 32225 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2526] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 32225 genes.

[2527] The level of mRNA in a sample that is encoded by 32225 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
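The primer definition in the preceding paragraph can be sketched as a simple check on a candidate primer pair. This is an illustrative sketch under stated assumptions: the function names are hypothetical, annealing is modeled as an exact string match on the plus strand (real primer design also considers melting temperature, mismatches, and secondary structure), and only the bases A, C, G, and T are handled.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def flanked_region_length(template, forward, reverse):
    """Length of the plus-strand region between the two primer sites.

    The forward primer must match the plus strand exactly; the reverse
    primer must match the minus strand downstream of the forward site.
    Returns None if either primer has no exact site.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start < 0:
        return None
    fwd_end = fwd_start + len(forward)
    rev_start = template.find(reverse_complement(reverse), fwd_end)
    if rev_start < 0:
        return None
    return rev_start - fwd_end


def is_typical_primer_pair(template, forward, reverse):
    """Check the stated ranges: 10-30 nt primers flanking 50-200 nt."""
    flanked = flanked_region_length(template, forward, reverse)
    lengths_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return flanked is not None and lengths_ok and 50 <= flanked <= 200
```

The ranges are written as hard limits for simplicity, although the text gives them as approximate ("about 10 to 30", "about 50 to 200").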

[2528] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 32225 gene being analyzed.

[2529] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 32225 mRNA, or genomic DNA, and comparing the presence of 32225 mRNA or genomic DNA in the control sample with the presence of 32225 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 32225 transcript levels.

[2530] A variety of methods can be used to determine the level of protein encoded by 32225. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2531] The detection methods can be used to detect 32225 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 32225 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 32225 protein include introducing into a subject a labeled anti-32225 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-32225 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2532] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 32225 protein, and comparing the presence of 32225 protein in the control sample with the presence of 32225 protein in the test sample.

[2533] The invention also includes kits for detecting the presence of 32225 in a biological sample. For example, the kit can include a compound or agent capable of detecting 32225 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 32225 protein or nucleic acid.

[2534] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2535] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2536] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 32225 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or deregulated cell proliferation.

[2537] In one embodiment, a disease or disorder associated with aberrant or unwanted 32225 expression or activity is identified. A test sample is obtained from a subject and 32225 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 32225 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 32225 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2538] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 32225 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cell proliferative and/or differentiative disorder.

[2539] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 32225 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 32225 (e.g., other genes associated with a 32225-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
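One way to picture such data records is as rows of a relational table. The sketch below uses Python's built-in `sqlite3` module rather than the Oracle or Sybase environments named above, purely so that it is self-contained; the table name `expression_record`, its column names, and the helper functions are hypothetical and chosen for illustration.

```python
import sqlite3


def build_expression_store(records):
    """Create an in-memory table of expression records.

    records: iterable of (sample_id, gene, expression_level, descriptor).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            gene             TEXT NOT NULL,  -- e.g., '32225' or another gene
            expression_level REAL NOT NULL,  -- value for that gene in the sample
            descriptor       TEXT,           -- e.g., subject, diagnosis, treatment
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    conn.executemany(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?)", records
    )
    conn.commit()
    return conn


def levels_for_gene(conn, gene):
    """Return (sample_id, expression_level) pairs for one gene."""
    cursor = conn.execute(
        "SELECT sample_id, expression_level FROM expression_record "
        "WHERE gene = ? ORDER BY sample_id",
        (gene,),
    )
    return cursor.fetchall()
```

Storing one row per sample and gene lets a record carry values for genes other than 32225 without changing the table layout.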

[2540] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 32225 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cellular proliferative disorder in a subject wherein an increase in 32225 expression is an indication that the subject has or is disposed to having a cellular proliferative disorder, e.g., lung cancer, or some forms of breast, colon, ovary, or brain cancer. Alternatively, a decrease in 32225 expression could be an indication that the subject has or is predisposed to having a cellular proliferative disorder, e.g., some form of ovary or brain cancer. The method can be used to monitor a treatment for a cellular proliferative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2541] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 32225 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than is an expression profile obtained from an uncontacted cell.
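
The favorable-evaluation criterion can be sketched as follows. This is a minimal sketch that assumes each profile is a sequence of expression values in a shared gene order and that similarity is measured by Euclidean distance; the text does not prescribe a similarity measure, and the function name is hypothetical.

```python
import math


def evaluate_compound(contacted, uncontacted, target):
    """Return True if the test compound is evaluated favorably.

    The compound is favorable when the profile of the contacted cell
    lies closer to the target profile than the profile of the
    uncontacted cell does. Euclidean distance is an assumed measure.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```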

[2542] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 32225 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
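
The distance-vector metric and the selection of step d) can be written out as below. The sketch assumes that all profiles list the same genes in the same order; the function names and the labels used as dictionary keys are hypothetical.

```python
import math


def distance(profile_a, profile_b):
    """Length of the difference vector between two profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of dimensions")
    # Each dimension is one value in the profile, e.g. the level of 32225.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def most_similar(subject, references):
    """Return the label of the reference profile closest to the subject.

    `references` maps a label to a reference profile.
    """
    return min(references, key=lambda label: distance(subject, references[label]))
```

Other routine measures, such as a correlation coefficient, could be substituted for `distance` without changing `most_similar`, provided that smaller values are made to mean greater similarity.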

[2543] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2544] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile and the reference expression profiles each include a value representing the level of 32225 expression.

[2545] 32225 Arrays and Uses Thereof

[2546] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 32225 molecule (e.g., a 32225 nucleic acid or a 32225 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In another preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2547] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 32225 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 32225. Each address of the subset can include a capture probe that hybridizes to a different region of a 32225 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 32225 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 32225 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 32225 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2548] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2549] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 32225 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 32225 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-32225 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2550] In another aspect, the invention features a method of analyzing the expression of 32225. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 32225-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2551] In another embodiment, the array can be used to assay gene expression in a tissue, e.g., lung, breast, colon, ovary, brain, or liver tissue, to ascertain tissue specificity of genes in the array, particularly the expression of 32225. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 32225. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
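
As a simple illustration of identifying genes co-regulated with 32225 from quantitative array data, the sketch below ranks genes by the Pearson correlation of their expression levels with those of 32225 across samples. This is a deliberately simpler stand-in for the hierarchical, k-means, or Bayesian clustering named above, not an implementation of any of them; the threshold, the function names, and the data layout are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / math.sqrt(var_x * var_y)


def coregulated(expression, target, threshold=0.9):
    """Genes whose levels across samples correlate with those of `target`.

    `expression` maps a gene name to a list of levels, one per sample,
    with samples in the same order for every gene.
    """
    reference = expression[target]
    return sorted(
        gene
        for gene, levels in expression.items()
        if gene != target and pearson(reference, levels) >= threshold
    )
```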

[2552] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 32225 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2553] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
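
The comparison of contacted and uncontacted cells can be sketched as a per-gene log2 ratio, with genes other than the target flagged when the change exceeds a threshold. The pseudocount, the two-fold default threshold, and all names below are assumptions made for illustration only.

```python
import math


def log2_fold_changes(treated, untreated, pseudocount=1.0):
    """Per-gene log2 ratio of treated to untreated expression.

    Both arguments map a gene name to a non-negative expression level.
    The pseudocount (an assumption) avoids division by zero.
    """
    shared = treated.keys() & untreated.keys()
    return {
        gene: math.log2(
            (treated[gene] + pseudocount) / (untreated[gene] + pseudocount)
        )
        for gene in shared
    }


def off_target_genes(treated, untreated, target, min_abs_log2=1.0):
    """Genes other than `target` whose expression changes after contact.

    With the default threshold, a gene is reported when its level
    at least doubles or halves (after adding the pseudocount).
    """
    changes = log2_fold_changes(treated, untreated)
    return sorted(
        gene
        for gene, change in changes.items()
        if gene != target and abs(change) >= min_abs_log2
    )
```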

[2554] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 32225-associated disease or disorder; and processes, such as a cellular transformation associated with a 32225-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 32225-associated disease or disorder.

[2555] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 32225) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2556] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 32225 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 32225 polypeptide or fragment thereof. For example, multiple variants of a 32225 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2557] The polypeptide array can be used to detect a 32225 binding compound, e.g., an antibody in a sample from a subject with specificity for a 32225 polypeptide or the presence of a 32225-binding protein or ligand.

[2558] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 32225 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2559] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 32225 or from a cell or subject in which a 32225-mediated response has been elicited, e.g., by contact of the cell with 32225 nucleic acid or protein, or administration to the cell or subject of 32225 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 32225 (or does not express as highly as in the case of the 32225 positive plurality of capture probes) or from a cell or subject in which a 32225-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than a 32225 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2560] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a first sample from a cell or subject which expresses or mis-expresses 32225 or from a cell or subject in which a 32225-mediated response has been elicited, e.g., by contact of the cell with 32225 nucleic acid or protein, or administration to the cell or subject of 32225 nucleic acid or protein, or a normal or diseased lung, breast, colon, ovary, brain, or liver tissue; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 32225 (or does not express as highly as in the case of the 32225 positive plurality of capture probes) or from a cell or subject in which a 32225-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[2561] In another aspect, the invention features a method of analyzing 32225, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 32225 nucleic acid or amino acid sequence; and comparing the 32225 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 32225.
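
One way to compare a query sequence with a collection of sequences is sketched below using an alignment-free measure, the Jaccard index of shared k-mers. This is an assumed, simplified stand-in for the alignment-based database searches normally used for such comparisons; the function names and the default k are hypothetical.

```python
def kmers(seq, k):
    """Set of all substrings of length k in `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(a, b, k=3):
    """Jaccard index of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0  # a sequence shorter than k has no k-mers to compare
    return len(ka & kb) / len(ka | kb)


def rank_collection(query, collection, k=3):
    """Return (name, similarity) pairs, most similar first.

    `collection` maps a sequence name to a sequence string.
    """
    scored = [
        (name, kmer_similarity(query, seq, k))
        for name, seq in collection.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```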

[2562] Detection of 32225 Variations or Mutations

[2563] The methods of the invention can also be used to detect genetic alterations in a 32225 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 32225 protein activity or nucleic acid expression, such as a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or deregulated cell proliferation. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 32225-protein, or the mis-expression of the 32225 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 32225 gene; 2) an addition of one or more nucleotides to a 32225 gene; 3) a substitution of one or more nucleotides of a 32225 gene; 4) a chromosomal rearrangement of a 32225 gene; 5) an alteration in the level of a messenger RNA transcript of a 32225 gene; 6) aberrant modification of a 32225 gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 32225 gene; 8) a non-wild type level of a 32225-protein; 9) allelic loss of a 32225 gene; and 10) inappropriate post-translational modification of a 32225-protein.

[2564] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 32225-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 32225 gene under conditions such that hybridization and amplification of the 32225-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2565] In another embodiment, mutations in a 32225 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
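
The fragment-length comparison can be illustrated in silico as below. The sketch models one strand of a linear molecule and a single enzyme; the default recognition site is that of EcoRI (GAATTC, cut after the first G), chosen only as a familiar example, and the function names are hypothetical.

```python
def digest(dna, site, cut_offset):
    """Return fragment lengths after cutting `dna` at every `site`.

    `cut_offset` is the number of bases into the recognition site at
    which the strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(len(dna) - start)  # remainder after the last cut
    return fragments


def patterns_differ(sample, control, site="GAATTC", cut_offset=1):
    """True if sample and control give different fragment length sets."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(control, site, cut_offset)
    )
```

A substitution that destroys or creates a recognition site changes the fragment lengths; a substitution elsewhere does not, which is why this approach detects only a subset of mutations.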

[2566] In other embodiments, genetic mutations in 32225 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 32225 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 32225 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 32225 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
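
The two probe designs described above, sequential overlapping probes and paired wild-type and mutant probes, can be sketched as follows. Probe sequences are returned in the orientation of the reference strand, so a hybridizing probe would be the reverse complement; the probe length, step, and function names are illustrative assumptions and are not taken from the cited references.

```python
def tiling_probes(reference, probe_len=25, step=1):
    """Sequential overlapping probes covering `reference`.

    Returns (start, probe) pairs, with `start` a 0-based position.
    """
    return [
        (i, reference[i:i + probe_len])
        for i in range(0, len(reference) - probe_len + 1, step)
    ]


def mutation_probe_pair(reference, position, variant_base, probe_len=25):
    """Wild-type and mutant probes covering a 0-based `position`.

    The probe window is centered on the position where possible and
    shifted inward near the ends of the reference.
    """
    half = probe_len // 2
    start = max(0, min(position - half, len(reference) - probe_len))
    wild_type = reference[start:start + probe_len]
    offset = position - start
    mutant = wild_type[:offset] + variant_base + wild_type[offset + 1:]
    return wild_type, mutant
```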

[2567] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 32225 gene and detect mutations by comparing the sequence of the sample 32225 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[2568] Other methods for detecting mutations in the 32225 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[2569] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 32225 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[2570] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 32225 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 32225 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[2571] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[2572] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[2573] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
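
The design in which the mutation of interest sits at the extreme 3′ end of a primer can be sketched as below. The sketch builds a pair of forward primers, one matching the wild-type base and one matching the variant base, from a reference strand written 5′ to 3′; the primer length and function name are assumptions, and practical primer design would also weigh melting temperature and specificity.

```python
def allele_specific_primers(reference, position, variant_base, length=20):
    """Forward primers whose 3' terminal base lies on `position`.

    `position` is 0-based on `reference` (written 5'->3'). Returns
    (wild_type_primer, mutant_primer); the two primers differ only
    at their final (3') base.
    """
    if position + 1 < length:
        raise ValueError("not enough upstream sequence for this primer length")
    start = position + 1 - length
    wild_type = reference[start:position + 1]
    mutant = wild_type[:-1] + variant_base
    return wild_type, mutant
```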

[2574] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 32225 nucleic acid.
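
Percent complementarity between an oligonucleotide and a target region of equal length can be computed as sketched below. Both sequences are assumed to be DNA written 5′ to 3′ and ungapped, so the oligonucleotide is compared with the target read in the antiparallel direction; the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo, target_region):
    """Percent of oligo positions that base-pair with `target_region`.

    Both strings are written 5'->3' and must have the same non-zero
    length. Characters other than A, C, G, T never count as paired.
    """
    if not oligo or len(oligo) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for o, t in zip(oligo, reversed(target_region))
        if COMPLEMENT.get(o) == t
    )
    return 100.0 * paired / len(oligo)
```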

[2575] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:33 or the complement of SEQ ID NO:33. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[2576] The set can be useful, e.g., for identifying SNP's, or identifying specific alleles of 32225. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[2577] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
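
A set of four oligonucleotides that differ only at the interrogation position can be generated as in the following sketch. The template sequence, the 0-based position convention, and the function name are assumptions made for illustration.

```python
def interrogation_set(template, position):
    """Four oligonucleotides identical except at `position`.

    Returns a dict mapping each base (A, C, G, T) to the oligonucleotide
    carrying that base at the 0-based interrogation position.
    """
    if not 0 <= position < len(template):
        raise IndexError("interrogation position lies outside the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }
```

Each member of the set could then be given a distinguishable label or placed at its own array address, as described above.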

[2578] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 32225 nucleic acid.

[2579] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 32225 gene.

[2580] Use of 32225 Molecules as Surrogate Markers

[2581] The 32225 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 32225 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 32225 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[2582] The 32225 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 32225 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-32225 antibodies may be employed in an immune-based detection system for a 32225 protein marker, or 32225-specific radiolabeled probes may be used to detect a 32225 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[2583] The 32225 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 32225 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 32225 DNA may correlate with 32225 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[2584] Pharmaceutical Compositions of 32225

[2585] The nucleic acid and polypeptides, fragments thereof, as well as anti-32225 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[2586] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[2587] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[2588] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[2589] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[2590] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[2591] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[2592] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[2593] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[2594] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[2595] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
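The ratio described above can be illustrated with a minimal sketch. The function name, the compound labels, and all numeric values below are hypothetical and chosen only for illustration; they are not data from this application.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
compounds = {
    "compound_A": (200.0, 10.0),
    "compound_B": (50.0, 25.0),
}

# Rank candidates so that the highest therapeutic index comes first.
ranked = sorted(
    compounds,
    key=lambda name: therapeutic_index(*compounds[name]),
    reverse=True,
)

for name in ranked:
    print(name, therapeutic_index(*compounds[name]))
```

Under these assumed values, the compound with the larger LD50/ED50 ratio is listed first, consistent with the stated preference for high therapeutic indices.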

[2596] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[2597] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
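As a worked illustration of the weight-based ranges above, the following sketch converts a mg/kg range into an absolute amount per administration and a cumulative amount over a once-weekly schedule. The function name, the 70 kg body weight, and the 4-week schedule are assumptions made for the example; the 1 to 10 mg/kg range is one of the ranges recited above.

```python
def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float,
                  high_mg_per_kg: float) -> tuple[float, float]:
    """Convert a mg/kg dosage range into an absolute range in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 < low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("expected 0 < low <= high")
    return (low_mg_per_kg * body_weight_kg,
            high_mg_per_kg * body_weight_kg)


# Assumed subject weight (kg) and the recited 1 to 10 mg/kg range.
per_dose = dose_range_mg(70.0, 1.0, 10.0)

# Assumed schedule: one administration per week for 4 weeks.
weeks = 4
cumulative = tuple(weeks * amount for amount in per_dose)

print("per administration (mg):", per_dose)
print("cumulative over", weeks, "weeks (mg):", cumulative)
```

The calculation is arithmetic only; as noted above, the dosage and timing actually required depend on factors such as disease severity, previous treatments, and the general health of the subject.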

[2598] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[2599] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[2600] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[2601] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[2602] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[2603] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[2604] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[2605] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[2606] Methods of Treatment for 32225

[2607] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 32225 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[2608] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 32225 molecules of the present invention or 32225 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[2609] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 32225 expression or activity, by administering to the subject a 32225 or an agent which modulates 32225 expression or at least one 32225 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 32225 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 32225 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 32225 aberrance, for example, a 32225, 32225 agonist or 32225 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[2610] It is possible that some 32225 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[2611] The 32225 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, neural disorders, liver disorders, metabolic disorders, or cardiovascular disorders, as discussed above. In addition, the 32225 molecules of the invention may act as novel diagnostic targets and therapeutic agents for controlling disorders associated with bone metabolism, immune disorders, viral diseases, or pain disorders.

[2612] Aberrant expression and/or activity of 32225 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes effects mediated by 32225 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 32225 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 32225 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[2613] The 32225 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[2614] 32225 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 32225 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 32225 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[2615] Additionally, 32225 may play an important role in the regulation of pain disorders. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[2616] As discussed, successful treatment of 32225 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 32225 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[2617] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[2618] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[2619] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 32225 expression is through the use of aptamer molecules specific for 32225 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 32225 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[2620] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 32225 disorders. For a description of antibodies, see the Antibody section above.

[2621] In circumstances wherein injection of an animal or a human subject with a 32225 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 32225 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 32225 protein. Vaccines directed to a disease characterized by 32225 expression may also be generated in this fashion.

[2622] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[2623] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 32225 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[2624] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
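The IC50 referred to above can be estimated from cell culture dose-response data in several ways. The following is a minimal sketch, assuming a monotonically increasing inhibition curve and using log-linear interpolation between the two measurements that bracket 50% inhibition. The function name and the data points are hypothetical; curve fitting (e.g., a four-parameter logistic model) is a common alternative that is not shown here.

```python
import math


def estimate_ic50(concentrations, inhibition_pct):
    """Estimate the IC50 by interpolating on log10(concentration).

    concentrations: positive concentrations (any consistent unit).
    inhibition_pct: percent inhibition measured at each concentration.
    """
    if len(concentrations) != len(inhibition_pct):
        raise ValueError("inputs must have the same length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")

    # Sort the measurements by increasing concentration.
    points = sorted(zip(concentrations, inhibition_pct))

    # Find the adjacent pair of points that brackets 50% inhibition.
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 50.0 <= r_hi and r_hi > r_lo:
            frac = (50.0 - r_lo) / (r_hi - r_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))

    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical cell culture data (concentration in micromolar).
conc = [0.1, 1.0, 10.0, 100.0]
inhibition = [5.0, 25.0, 75.0, 95.0]

ic50 = estimate_ic50(conc, inhibition)
print(f"estimated IC50: {ic50:.3f} uM")
```

With these assumed data, 50% inhibition falls midway between the 1.0 and 10.0 micromolar points on a logarithmic scale, so the estimate is the geometric mean of those two concentrations.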

[2625] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 32225 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al. (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al. (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 32225 can be readily monitored and used in calculations of IC50.

[2626] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al. (1995) Analytical Chemistry 67:2142-2144.

[2627] Another aspect of the invention pertains to methods of modulating 32225 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 32225 molecule or an agent that modulates one or more of the activities of the 32225 protein associated with the cell. An agent that modulates 32225 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 32225 protein (e.g., a 32225 substrate or receptor), a 32225 antibody, a 32225 agonist or antagonist, a peptidomimetic of a 32225 agonist or antagonist, or other small molecule.

[2628] In one embodiment, the agent stimulates one or more 32225 activities. Examples of such stimulatory agents include active 32225 protein and a nucleic acid molecule encoding 32225. In another embodiment, the agent inhibits one or more 32225 activities. Examples of such inhibitory agents include antisense 32225 nucleic acid molecules, anti-32225 antibodies, and 32225 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 32225 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents, that modulates (e.g., upregulates or downregulates) 32225 expression or activity. In another embodiment, the method involves administering a 32225 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 32225 expression or activity.

[2629] Stimulation of 32225 activity is desirable in situations in which 32225 is abnormally downregulated and/or in which increased 32225 activity is likely to have a beneficial effect. Likewise, inhibition of 32225 activity is desirable in situations in which 32225 is abnormally upregulated and/or in which decreased 32225 activity is likely to have a beneficial effect.

[2630] 32225 Pharmacogenomics

[2631] The 32225 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 32225 activity (e.g., 32225 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 32225-associated disorders, e.g., a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cellular proliferative disorder, associated with aberrant or unwanted 32225 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 32225 molecule or 32225 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 32225 molecule or 32225 modulator.

[2632] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[2633] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
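
The grouping of individuals into genetic categories by SNP pattern can be sketched as follows. This is a minimal illustration only: the marker names, subject identifiers, genotype calls, and trial outcomes are hypothetical, and a real association study would add statistical testing over a far larger marker set.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group subject IDs by their genotype calls at the chosen markers."""
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[m] for m in markers)
        groups[pattern].append(subject)
    return dict(groups)


def response_rate(groups, responders):
    """Fraction of subjects in each genetic category who responded."""
    return {
        pattern: sum(s in responders for s in members) / len(members)
        for pattern, members in groups.items()
    }


# Hypothetical genotype calls and drug-trial outcomes (illustrative only).
genotypes = {
    "P1": {"snp_a": "A/A", "snp_b": "C/T"},
    "P2": {"snp_a": "A/A", "snp_b": "C/T"},
    "P3": {"snp_a": "A/G", "snp_b": "C/C"},
    "P4": {"snp_a": "A/G", "snp_b": "C/C"},
}
responders = {"P1", "P2", "P3"}

groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
rates = response_rate(groups, responders)
for pattern, rate in rates.items():
    print(pattern, f"{rate:.0%} responders")
```

A marked difference in response rate between categories would flag the corresponding markers for follow-up.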

[2634] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 32225 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[2635] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 32225 molecule or 32225 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[2636] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 32225 molecule or 32225 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[2637] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 32225 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 32225 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[2638] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 32225 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 32225 gene expression, protein levels, or upregulate 32225 activity, can be monitored in clinical trials of subjects exhibiting decreased 32225 gene expression, protein levels, or downregulated 32225 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 32225 gene expression, protein levels, or downregulate 32225 activity, can be monitored in clinical trials of subjects exhibiting increased 32225 gene expression, protein levels, or upregulated 32225 activity. In such clinical trials, the expression or activity of a 32225 gene, and preferably, other genes that have been implicated in, for example, a 32225-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[2639] 32225 Informatics

[2640] The sequence of a 32225 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 32225 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 32225 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[2641] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[2642] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[2643] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
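
The two-table layout described above can be sketched with an in-memory SQLite database, used here only as a convenient stand-in for the relational databases named in the text. The table names, column names, the identifier "demo_seq", the short placeholder sequence, and the annotation descriptors are all assumptions made for illustration; none is taken from the sequences of the invention.

```python
import sqlite3


def build_database():
    """Create the sequence table and the annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequence (
            seq_id   TEXT PRIMARY KEY,   -- identifier for the sequence
            residues TEXT NOT NULL       -- nucleotide or amino acid string
        );
        CREATE TABLE annotation (
            seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
            descriptor TEXT NOT NULL,    -- annotation text
            start_pos  INTEGER NOT NULL, -- initial position (1-based)
            end_pos    INTEGER NOT NULL  -- ultimate position (inclusive)
        );
        """
    )
    return conn


def annotated_regions(conn, seq_id):
    """Return (descriptor, subsequence) pairs for one stored sequence."""
    rows = conn.execute(
        """
        SELECT a.descriptor,
               substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
        FROM annotation AS a
        JOIN sequence AS s ON s.seq_id = a.seq_id
        WHERE a.seq_id = ?
        ORDER BY a.start_pos
        """,
        (seq_id,),
    )
    return rows.fetchall()


conn = build_database()
# Placeholder data (illustrative only).
conn.execute("INSERT INTO sequence VALUES (?, ?)", ("demo_seq", "ATGGCCAAGTAA"))
conn.execute("INSERT INTO annotation VALUES (?, ?, ?, ?)",
             ("demo_seq", "start codon", 1, 3))
conn.execute("INSERT INTO annotation VALUES (?, ?, ?, ?)",
             ("demo_seq", "stop codon", 10, 12))
regions = annotated_regions(conn, "demo_seq")
print(regions)
```

Keeping annotations in their own table lets any number of features refer to one stored sequence without duplicating the sequence text.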

[2644] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[2645] Thus, in one aspect, the invention features a method of analyzing 32225, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 32225 nucleic acid or amino acid sequence; and comparing the 32225 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 32225. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[2646] The method can include evaluating the sequence identity between a 32225 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
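
One simple way to express sequence identity, once two sequences have already been aligned, is the percentage of alignment columns holding the same residue. The sketch below assumes that the alignment has been produced elsewhere and that gaps are written as "-"; the function name and the example strings are hypothetical, and counting gapped columns in the denominator is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences share a residue.

    Both inputs must be pre-aligned strings of equal length, with "-"
    marking gaps. Gapped columns count toward the length but never match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        a == b and a != "-" for a, b in zip(aligned_a, aligned_b)
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments (illustrative only).
identity = percent_identity("ATGC-A", "ATGCTA")
print(f"{identity:.1f}% identical")
```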

[2647] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
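
The relationship between target length and chance matches can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, a database in which every residue is independent and equally likely; real sequence databases are not uniform, so the figures indicate scale only. The function name and the database size are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected chance occurrences of one specific target in random data.

    Assumes independent, equally likely residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** -target_length


DATABASE_LENGTH = 1_000_000  # hypothetical database size, in residues

for length in (6, 30):
    hits = expected_random_hits(length, DATABASE_LENGTH, alphabet_size=4)
    print(f"{length}-nucleotide target: {hits:.3g} expected chance matches")
```

Under these assumptions a six-nucleotide target is expected to occur by chance a few hundred times in a million-residue database, whereas a 30-nucleotide target is expected essentially never.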

[2648] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[2649] Thus, the invention features a method of making a computer readable record of a 32225 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
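
A record that carries an ORF identification alongside the sequence might be assembled as in the following sketch. The record layout, the feature dictionary keys, and the placeholder sequence are assumptions made for illustration, and the ORF search is deliberately simple: it scans the forward strand only, from the first ATG that is followed by an in-frame stop codon.

```python
from dataclasses import dataclass, field

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class SequenceRecord:
    identifier: str
    sequence: str
    features: list = field(default_factory=list)


def find_first_orf(sequence):
    """Return (start, end) of the first forward-strand ORF, or None.

    Indices are 0-based and half-open; the span runs from an ATG through
    the first in-frame stop codon.
    """
    seq = sequence.upper()
    start = seq.find("ATG")
    while start != -1:
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop after this ATG; try the next one.
        start = seq.find("ATG", start + 1)
    return None


def make_record(identifier, sequence):
    """Build a record and annotate it with the first ORF, if any."""
    record = SequenceRecord(identifier, sequence)
    orf = find_first_orf(sequence)
    if orf is not None:
        record.features.append({"type": "ORF", "start": orf[0], "end": orf[1]})
    return record


# Placeholder sequence (illustrative only).
record = make_record("demo_seq", "CCATGGCCAAGTAAGG")
print(record.features)
```

Further feature types listed in the paragraph above, such as domains or transcription start sites, could be appended to the same features list in the same form.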

[2650] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 32225 sequence, or record, in machine-readable form; and comparing a second sequence to the 32225 sequence, thereby analyzing a sequence. Comparison can include comparing the sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 32225 sequence includes a sequence being compared. In a preferred embodiment the 32225 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 32225 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[2651] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, wherein the method comprises the steps of determining 32225 sequence information associated with the subject and based on the 32225 sequence information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[2652] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a disease associated with a 32225, wherein the method comprises the steps of determining 32225 sequence information associated with the subject, and based on the 32225 sequence information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, and comparing the 32225 sequence of the subject to the 32225 sequences in the database to thereby determine whether the subject has a 32225-associated disease or disorder, or a pre-disposition for such.

[2653] The present invention also provides, in a network, a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, said method comprising the steps of receiving 32225 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 32225 and/or corresponding to a 32225-associated disease or disorder (e.g., a neurological disorder, a liver disorder, a metabolic disorder, a cardiovascular disorder, or a cellular proliferative and/or differentiative disorder), and based on one or more of the phenotypic information, the 32225 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2654] The present invention also provides a method for determining whether a subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder, said method comprising the steps of receiving information related to 32225 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 32225 and/or related to a 32225-associated disease or disorder, and based on one or more of the phenotypic information, the 32225 information, and the acquired information, determining whether the subject has a 32225-associated disease or disorder or a pre-disposition to a 32225-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[2655] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 47508 Invention

[2656] Chromatin is a complex of DNA and proteins. The protein component of chromatin includes five different proteins, known collectively as histones. The histones are designated H1, H2A, H2B, H3 and H4 and are highly conserved across genera. Two of each of the histones H2A, H2B, H3, and H4 form a core octamer complex around which DNA is wrapped, or spooled, leading to its condensation.

[2657] Histones undergo numerous alterations during the cell cycle (reviewed in Mahlknecht and Hoelzer (2000), Molecular Medicine 6(8):623-44). For example, histone H4 is acted upon by a battery of enzymes that carry out various covalent modifications. In particular, four lysine residues in the N-terminal region of histone H4 undergo reversible acetylation. This enzymatic reaction is catalyzed by a histone acetylase, which adds acetyl groups donated by acetyl CoA. The acetyl groups are removed by a hydrolysis reaction catalyzed by a histone deacetylase. Histones H2A, H2B, and H3 are likewise acetylated and deacetylated.

[2658] Acetylation and other covalent modifications alter the net charge of histones. For example, the unmodified histone H4 has a net charge of +5, while the fully modified protein has a net charge of −2. This change in net charge alters the affinity of histones or particular histone domains for DNA and for other proteins. Histone acetylation, which shifts the net charge of histones in the negative direction, leads to the unraveling of chromatin, which is often accompanied by an increase in gene transcription in the unraveled region. Conversely, histone deacetylation shifts the net charge of histones in the positive direction, strengthening the interaction between histones and DNA, and leading to chromosomal condensation. Chromosomal condensation is associated with the silencing of DNA transcription. Thus, the enzymes involved in the covalent modification of histones, histone acetylases and histone deacetylases, play a regulatory role in DNA replication and transcription. Through these activities, histone acetylases and deacetylases are believed to influence major cellular processes, such as differentiation and proliferation, as well as abnormal processes like tumor growth (Mahlknecht and Hoelzer (2000), supra).

Summary of the 47508 Invention

[2659] The present invention is based, in part, on the discovery of a novel histone deacetylase family member, referred to herein as “47508”. The nucleotide sequence of a cDNA encoding 47508 is recited in SEQ ID NO:41, and the amino acid sequence of a 47508 polypeptide is recited in SEQ ID NO:42 (see also Example 29, below). In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:43.

[2660] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 47508 protein or polypeptide, e.g., a biologically active portion of the 47508 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:42. In other embodiments, the invention provides isolated 47508 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 47508 protein or an active fragment thereof.

[2661] In a related aspect, the invention further provides nucleic acid constructs that include a 47508 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 47508 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 47508 nucleic acid molecules and polypeptides.

[2662] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 47508-encoding nucleic acids.

[2663] In still another related aspect, isolated nucleic acid molecules that are antisense to a 47508 encoding nucleic acid molecule are provided.

[2664] In another aspect, the invention features 47508 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 47508-mediated or -related disorders. In another embodiment, the invention provides 47508 polypeptides having a 47508 activity. Preferred polypeptides are 47508 proteins including at least one histone deacetylase domain, and, preferably, having a 47508 activity, e.g., a 47508 activity as described herein.

[2665] In other embodiments, the invention provides 47508 polypeptides, e.g., a 47508 polypeptide having the amino acid sequence shown in SEQ ID NO:42 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:42 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:41, SEQ ID NO:43, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 47508 protein or an active fragment thereof.

[2666] In a related aspect, the invention provides 47508 polypeptides or fragments operatively linked to non-47508 polypeptides to form fusion proteins.

[2667] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 47508 polypeptides or fragments thereof, e.g., a histone deacetylase domain of 47508.

[2668] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 47508 polypeptides or nucleic acids.

[2669] In still another aspect, the invention provides a process for modulating 47508 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 47508 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[2670] The invention also provides assays for determining the activity of or the presence or absence of 47508 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[2671] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 47508-expressing cell, e.g., a hyper-proliferative 47508-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 47508 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. Preferably, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion. In another preferred embodiment, the cell or tumor is found in, e.g., a breast, ovary, lung, colon, or liver tissue.

[2672] In a preferred embodiment, the compound is an inhibitor of a 47508 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 47508 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[2673] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[2674] In another embodiment, the compound is an activator of a 47508 polypeptide. Preferably, the activator is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule, and an antibody. In another embodiment, the compound is an activator of a 47508 nucleic acid.

[2675] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 47508-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 47508 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[2676] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 47508 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 47508 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 47508 nucleic acid or polypeptide expression can be detected by any method described herein.

[2677] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 47508 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[2678] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, or a cytotoxic agent) and, evaluating the expression of 47508 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 47508 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 47508 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue, e.g., a cancerous breast, ovary, lung, colon, or liver tissue.

[2679] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 47508 polypeptide or nucleic acid molecule, including for disease diagnosis.

[2680] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 47508 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 47508 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 47508 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[2681] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 47508

[2682] The human 47508 sequence (see SEQ ID NO:41, as recited in Example 29), which is approximately 1579 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1242 nucleotides, including the termination codon. The coding sequence encodes a 413 amino acid protein (see SEQ ID NO:42, as recited in Example 29).
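
The lengths stated above can be cross-checked with simple arithmetic: a 413 amino acid protein corresponds to 413 codons plus one termination codon. The following Python sketch is illustrative only; the variable names are assumptions introduced here, and the untranslated-region figure is merely the difference between the two approximate lengths given above.

```python
# Figures taken from the paragraph above; all names are illustrative.
TRANSCRIPT_LENGTH_NT = 1579  # approximate, including untranslated regions
PROTEIN_LENGTH_AA = 413
NT_PER_CODON = 3

# Coding sequence = one codon per amino acid plus one termination codon.
coding_length_nt = (PROTEIN_LENGTH_AA + 1) * NT_PER_CODON

# Whatever remains is attributed to the 5' and 3' untranslated regions combined.
utr_length_nt = TRANSCRIPT_LENGTH_NT - coding_length_nt

print(coding_length_nt)  # 1242
print(utr_length_nt)     # 337
```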

[2683] Human 47508 contains the following regions or other structural features:

[2684] a histone deacetylase domain (PFAM accession number PF00850) located at about amino acid residues 83 to 392 of SEQ ID NO:42;

[2685] a histone deacetylase zinc-binding triad having two conserved aspartic acid residues, located at about amino acid residues 247 and 327 of SEQ ID NO:42, and one conserved histidine residue, located at about amino acid residue 249 of SEQ ID NO:42;

[2686] a first charge-relay system formed by a conserved histidine residue, located at about amino acid residue 208 of SEQ ID NO:42, and a conserved aspartic acid residue, located at about amino acid residue 245 of SEQ ID NO:42;

[2687] a second charge-relay system formed by a conserved histidine residue, located at about amino acid residue 209 of SEQ ID NO:42, and a conservatively substituted asparagine residue, located at about amino acid residue 251 of SEQ ID NO:42;

[2688] four Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 87 to 89, 142 to 144, 212 to 214, and 374 to 376 of SEQ ID NO:42;

[2689] five Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 133 to 136, 157 to 160, 242 to 245, 295 to 298, and 311 to 314 of SEQ ID NO:42;

[2690] seven N-myristylation sites (PS00008) located at about amino acid residues 2 to 7, 37 to 42, 55 to 60, 66 to 71, 186 to 191, 215 to 220, and 237 to 242 of SEQ ID NO:42; and

[2691] one amidation site (PS00009) located at about amino acid residues 356 to 359.
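
Predicted sites of the kind listed above are typically found by matching short sequence patterns against the protein sequence. The following Python sketch shows one way such a scan could be written. The regular expressions are transcribed here from the PROSITE pattern syntax for the four cited entries and should be checked against the PROSITE records before use; the function name and the toy peptide are hypothetical, and the toy peptide is not SEQ ID NO:42.

```python
import re

# PROSITE accession -> regular expression (transcription is an assumption; verify against PROSITE).
PROSITE_PATTERNS = {
    "PS00005": r"[ST].[RK]",                      # protein kinase C phosphorylation site
    "PS00006": r"[ST].{2}[DE]",                   # casein kinase II phosphorylation site
    "PS00008": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",  # N-myristoylation site
    "PS00009": r".G[RK][RK]",                     # amidation site
}


def scan_motifs(sequence):
    """Return {accession: [(start, end), ...]} using 1-based inclusive residue numbers."""
    hits = {}
    for accession, pattern in PROSITE_PATTERNS.items():
        # Wrapping the pattern in a lookahead lets overlapping sites all be reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[accession] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)
        ]
    return hits


# Hypothetical 17-residue peptide used only to exercise the scan.
toy_peptide = "MGAASGTSLKDEAGRKW"
toy_hits = scan_motifs(toy_peptide)
for accession, sites in toy_hits.items():
    print(accession, sites)
```

Because these patterns are short, matches are frequent by chance, which is why such sites are described as predicted rather than confirmed.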

[2692] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[2693] A plasmid containing the nucleotide sequence encoding human 47508 (clone “Fbh47508FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[2694] The 47508 protein contains a significant number of structural characteristics in common with members of the histone deacetylase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[2695] A histone deacetylase family of proteins is characterized by a common fold, which is a single domain open α/β fold. The fold consists of a central eight-stranded parallel β-sheet with four α-helices packed against each face of the β-sheet. There are an additional eight α-helices, most of which are clustered near one edge of the β-sheet. Together with loops that arise from the carboxy-terminal ends of the β-strands of the β-sheet, the additional helices help form a deep, narrow pocket and an internal cavity adjacent to the pocket. At the bottom of the pocket, two aspartic acid residues and a histidine residue are involved in coordinating a zinc ion. The bottom of the pocket contains an additional five highly conserved residues, including two histidine residues, two aspartic acid residues, and a tyrosine residue. Each of the histidine residues interacts with an aspartic acid residue, creating a charge-relay pair and thereby increasing the basicity of the imidazole Nε atom. It is believed that the substrate lysine residue, which is to be deacetylated, inserts into the pocket such that the acetylated end of the lysine residue coordinates with the zinc ion, and the aliphatic carbon atoms of the lysine substrate form van der Waals contacts with conserved hydrophobic residues that line the wall of the pocket. Once the acetylated lysine residue is positioned correctly, the two histidine/aspartic acid charge-relay pairs help catalyze the removal of the acetyl group, giving rise to a deacetylated lysine residue and a free acetate molecule. The structure of a histone deacetylase domain has been described, e.g., in Finnin et al. (1999), Nature 401(6749):188-93, the contents of which are incorporated herein by reference.

[2696] A 47508 polypeptide can include a “histone deacetylase domain” or regions homologous with a “histone deacetylase domain”.

[2697] As used herein, the term “histone deacetylase domain” includes an amino acid sequence of about 250 to 500 amino acid residues in length and having a bit score for the alignment of the sequence to the histone deacetylase domain profile (Pfam HMM) of at least 50. Preferably, a histone deacetylase domain includes at least about 275 to 450 amino acids, more preferably about 300 to 425 amino acid residues, or about 305 to 400 amino acids and has a bit score for the alignment of the sequence to the histone deacetylase domain (HMM) of at least 75, 80, 82, 83, 84, or greater. The histone deacetylase domain (HMM) has been assigned the PFAM Accession Number PF00850 (http://genome.wustl.edu/Pfam/.html). An alignment of the histone deacetylase domain (amino acids 83 to 392 of SEQ ID NO:42) of human 47508 with a consensus amino acid sequence (SEQ ID NO:44) derived from a hidden Markov model is depicted in FIG. 22.
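
The length and bit-score criteria of this definition can be expressed as a simple predicate. The Python sketch below is a minimal illustration, not a prescribed implementation; the function and parameter names are assumptions. The coordinates (residues 83 to 392) and the bit score of 84.6 are the values reported in this description for the Pfam alignment of human 47508.

```python
def meets_domain_definition(start, end, bit_score,
                            min_len=250, max_len=500, min_score=50.0):
    """Check a candidate region against length and bit-score thresholds.

    start and end are 1-based inclusive residue positions; the default
    thresholds mirror the broadest definition given in the text.
    """
    length = end - start + 1
    return min_len <= length <= max_len and bit_score >= min_score


# Human 47508: residues 83 to 392 (310 residues), Pfam bit score 84.6.
broad = meets_domain_definition(83, 392, 84.6)

# Narrowest preferred ranges stated in the text: about 305 to 400 residues, score of at least 84.
strict = meets_domain_definition(83, 392, 84.6,
                                 min_len=305, max_len=400, min_score=84.0)

print(broad, strict)  # True True
```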

[2698] In a preferred embodiment, a 47508 polypeptide or protein has a “histone deacetylase domain” or a region which includes at least about 250 to 500, more preferably about 275 to 450, or 305 to 400 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “histone deacetylase domain,” e.g., the histone deacetylase domain of human 47508 (e.g., residues 83 to 392 of SEQ ID NO:42).

[2699] To identify the presence of a “histone deacetylase” domain in a 47508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “histone deacetylase” domain in the amino acid sequence of human 47508 at about residues 83 to 392 of SEQ ID NO:42 (see FIG. 22).

[2700] Alternatively, to identify the presence of a “histone deacetylase” domain in a 47508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of domains, e.g., the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of three portions of a “histone deacetylase” domain in the amino acid sequence of human 47508, located at about residues 71 to 115, 120 to 258, and 251 to 371 of SEQ ID NO:42 (see FIGS. 23A-23C). Taken together, the three portions of a histone deacetylase domain which were identified in human 47508, by searching the 47508 sequence against the ProDom database with BLAST, constitute a domain that has similar length and endpoints (about amino acid residues 71 to 372 of SEQ ID NO:42) as the histone deacetylase domain identified in human 47508 by searching the PFAM database (about amino acid residues 83 to 392 of SEQ ID NO:42). Furthermore, functionally important amino acid residues (described below) that display conservation in the PFAM alignment (see FIG. 22) display similar conservation in the ProDom alignments (see FIGS. 23A-23C), and the bit scores for two of the three ProDom alignments (FIGS. 23A and 23C, with bit scores of 85.3 and 143.8, respectively) actually exceed the bit score for the PFAM alignment (FIG. 22, with a bit score of 84.6).
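
The comparison between the combined ProDom segments and the Pfam domain can be reproduced with a short interval calculation. The Python sketch below is illustrative only; the helper names are assumptions, and the coordinates are the approximate residue ranges given above. With the segment endpoints as listed, the outer span computed here runs from residue 71 to residue 371, close to the approximate endpoints quoted for the combined domain.

```python
# Approximate residue ranges from the text (1-based, inclusive).
prodom_segments = [(71, 115), (120, 258), (251, 371)]
pfam_domain = (83, 392)


def outer_span(segments):
    """Smallest single interval that contains every segment."""
    return (min(start for start, _ in segments),
            max(end for _, end in segments))


def overlap_length(a, b):
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


combined = outer_span(prodom_segments)
shared = overlap_length(combined, pfam_domain)
pfam_length = pfam_domain[1] - pfam_domain[0] + 1

print(combined)             # (71, 371)
print(shared, pfam_length)  # 289 310
```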

[2701] In one embodiment, a 47508 protein includes at least one histone deacetylase zinc-binding triad. As used herein, a “histone deacetylase zinc-binding triad” includes two conserved aspartic acid residues and a conserved histidine residue located at the bottom of the substrate-binding pocket of a histone deacetylase domain. A “histone deacetylase zinc-binding triad”, as defined, can be involved in the coordination of an ion, e.g., a zinc or cobalt ion, and through said ion, in the interaction with an acetylated lysine substrate. Histone deacetylase zinc-binding triads have been described in Finnin et al. (1999), supra, the contents of which are incorporated herein by reference.

[2702] In a preferred embodiment, a 47508 polypeptide or protein has at least one histone deacetylase zinc-binding triad, or a sequence in which not more than one of the two aspartic acid residues is conservatively substituted, e.g., substituted with asparagine, glutamic acid, or glutamine, while the other aspartic acid residue and the histidine residue are absolutely conserved. The residues of human 47508 that constitute the histone deacetylase zinc-binding triad are the aspartic acid residues located at about amino acid residues 247 and 327 of SEQ ID NO:42 and the histidine residue located at about amino acid residue 249 of SEQ ID NO:42.
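
The substitution rule described above (histidine strictly conserved, at most one of the two aspartic acid residues replaced by asparagine, glutamic acid, or glutamine) can be sketched as a residue check. The Python example below is a hypothetical illustration: the function names are assumptions, and the test sequences are artificial poly-alanine strings carrying only the residues of interest, not SEQ ID NO:42.

```python
# Conservative replacements for aspartic acid named in the text: Asn, Glu, Gln.
CONSERVATIVE_ASP_SUBSTITUTES = {"N", "E", "Q"}


def has_zinc_binding_triad(sequence, asp_positions=(247, 327), his_position=249):
    """Apply the triad rule at the given 1-based positions of a one-letter sequence."""
    if len(sequence) < max(*asp_positions, his_position):
        return False
    if sequence[his_position - 1] != "H":
        return False  # the histidine must be absolutely conserved
    substituted = 0
    for position in asp_positions:
        residue = sequence[position - 1]
        if residue == "D":
            continue
        if residue in CONSERVATIVE_ASP_SUBSTITUTES:
            substituted += 1
        else:
            return False  # non-conservative replacement
    return substituted <= 1  # not more than one aspartic acid may be substituted


def make_toy_sequence(residues, length=413):
    """Build an artificial poly-alanine sequence with chosen residues at 1-based positions."""
    seq = ["A"] * length
    for position, amino_acid in residues.items():
        seq[position - 1] = amino_acid
    return "".join(seq)


print(has_zinc_binding_triad(make_toy_sequence({247: "D", 249: "H", 327: "D"})))  # True
print(has_zinc_binding_triad(make_toy_sequence({247: "N", 249: "H", 327: "E"})))  # False
```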

[2703] In another embodiment, a 47508 protein includes at least two charge-relay systems. As used herein, a “charge-relay system” includes a conserved histidine residue and either an aspartic acid residue or an asparagine residue, which interact with one another such that the imidazole Nε atom of the histidine residue has an increased basicity, and the two residues are located at the bottom of the substrate binding pocket of a histone deacetylase domain. A “charge-relay system”, as defined, can be involved in the enzymatic deacetylation of an acetylated lysine residue. Charge-relay systems, as found in histone deacetylases, have been described in Finnin et al. (1999), supra, the contents of which are incorporated herein by reference.

[2704] In a preferred embodiment, a 47508 polypeptide or protein includes at least two charge-relay systems, or at least two conserved histidine residues that each interact with a second amino acid residue selected from the group of aspartic acid, asparagine, glutamine, and glutamic acid, whereby the imidazole Nε atom of the histidine residue has an increased basicity. The residues of human 47508 that constitute the two charge-relay systems are: 1) the histidine residue located at about amino acid residue 208 of SEQ ID NO:42 and the aspartic acid residue located at about amino acid residue 247 of SEQ ID NO:42; and 2) the histidine residue located at about amino acid residue 209 of SEQ ID NO:42 and the asparagine residue located at about amino acid residue 252 of SEQ ID NO:42.

[2705] A 47508 family member can include at least one histone deacetylase domain. Furthermore, a 47508 family member can include at least one histone deacetylase zinc-binding triad; at least two charge-relay systems; at least one, two, three, and preferably four predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, and preferably five predicted casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five, six, and preferably seven predicted N-myristylation sites (PS00008); and at least one predicted amidation site (PS00009).

[2706] As the 47508 polypeptides of the invention may modulate 47508-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 47508-mediated or related disorders, as described below.

[2707] As used herein, a “47508 activity”, “biological activity of 47508” or “functional activity of 47508”, refers to an activity exerted by a 47508 protein, polypeptide or nucleic acid molecule. For example, a 47508 activity can be an activity exerted by 47508 in a physiological milieu on, e.g., a 47508-responsive cell or on a 47508 substrate, e.g., a protein substrate. A 47508 activity can be determined in vivo or in vitro. In one embodiment, a 47508 activity is a direct activity, such as an association with a 47508 target molecule. A “target molecule” or “binding partner” is a molecule with which a 47508 protein binds or interacts in nature. In an exemplary embodiment, 47508 is an enzyme that catalyzes the removal of an acetyl group from an acetylated lysine residue of a substrate protein, e.g., a histone protein, e.g., H2A, H2B, H3, or H4.

[2708] A 47508 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 47508 protein with a 47508 receptor. The features of the 47508 molecules of the present invention can provide similar biological activities as histone deacetylase family members. For example, the 47508 proteins of the present invention can have one or more of the following activities: (1) catalytic removal of acetyl groups from proteins, e.g., histones, e.g., histone H2A, H2B, H3, or H4; (2) catalytic removal of acetyl groups from lysine residues present in proteins, e.g., histones, e.g., histone H2A, H2B, H3, or H4; (3) regulation of the association of histones with DNA; (4) regulation of chromosomal condensation; (5) interaction with transcription factors, e.g., transcriptional repressors; (6) regulation of transcription, e.g., transcriptional repression; (7) regulation of cellular differentiation, e.g., suppression or induction of cellular differentiation; (8) regulation of the cell cycle, e.g., cell-cycle progression or arrest; (9) regulation of tumor growth; (10) localization to the nucleus; and (11) susceptibility to inhibition by histone deacetylase inhibitors, e.g., sodium butyrate, trichostatin, suberoylanilide hydroxamic acid, MS-27-275, or FR901228.

[2709] Thus, the 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling cellular proliferative and/or differentiative disorders, as well as disorders of the breast, ovary, lung, colon, or liver, and cardiovascular disorders.

[2710] The 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders associated with bone metabolism, immune disorders (e.g., inflammatory disorders), cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[2711] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[2712] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[2713] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[2714] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[2715] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[2716] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2717] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[2718] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2719] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2720] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[2721] Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[2722] Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[2723] Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), Bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[2724] Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[2725] The 47508 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:42 are collectively referred to as “polypeptides or proteins of the invention” or “47508 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “47508 nucleic acids.” 47508 molecules refer to 47508 nucleic acids, polypeptides, and antibodies.

[2726] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[2727] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[2728] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
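The four hybridization and wash conditions enumerated above can be collected into a small lookup table. The following is a minimal sketch in Python; the names `StringencyCondition`, `STRINGENCY_CONDITIONS`, `DEFAULT_STRINGENCY`, and `get_condition` are hypothetical and introduced only for illustration. The values are transcribed from the paragraph above, with hybridization temperatures that the text gives as "about 45° C." stored as 45.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyCondition:
    """One hybridization/wash condition as enumerated in the text (illustrative only)."""
    hybridization_buffer: str
    hybridization_temp_c: int   # degrees C; "about 45" in the text is stored as 45
    wash_buffer: str
    wash_temp_c: int            # degrees C; for "low" this is the stated minimum
    washes: str                 # number of washes as worded in the text


# Hypothetical table name; values copied from the four conditions listed above.
STRINGENCY_CONDITIONS = {
    # Low stringency: wash temperature may be raised to 55 degrees C per the text.
    "low": StringencyCondition("6x SSC", 45, "0.2x SSC, 0.1% SDS", 50, "two"),
    "medium": StringencyCondition("6x SSC", 45, "0.2x SSC, 0.1% SDS", 60, "one or more"),
    "high": StringencyCondition("6x SSC", 45, "0.2x SSC, 0.1% SDS", 65, "one or more"),
    "very_high": StringencyCondition(
        "0.5 M sodium phosphate, 7% SDS", 65, "0.2x SSC, 1% SDS", 65, "one or more"
    ),
}

# The text states that very high stringency is used unless otherwise specified.
DEFAULT_STRINGENCY = "very_high"


def get_condition(name=None):
    """Return the named condition, falling back to the stated default."""
    return STRINGENCY_CONDITIONS[name or DEFAULT_STRINGENCY]
```

Calling `get_condition()` with no argument returns the very high stringency entry, reflecting the statement that condition (4) applies unless otherwise specified.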

[2729] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:41 or SEQ ID NO:43, corresponds to a naturally-occurring nucleic acid molecule.

[2730] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally occurring nucleic acid molecule can encode a natural protein.

[2731] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 47508 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 47508 protein or derivative thereof.

[2732] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 47508 protein is at least 10% pure. In a preferred embodiment, the preparation of 47508 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-47508 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-47508 chemicals. When the 47508 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[2733] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 47508 without abolishing or substantially altering a 47508 activity. Preferably the alteration does not substantially alter the 47508 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 47508, results in abolishing a 47508 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 47508 are predicted to be particularly unamenable to alteration.

[2734] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 47508 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 47508 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 47508 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:41 or SEQ ID NO:43, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
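The side-chain families listed above can be used to check whether a given substitution is conservative. The following is a minimal sketch in Python, assuming one-letter amino acid codes; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical. Because the listed families overlap (e.g., threonine is both uncharged polar and beta-branched), a substitution is treated here as conservative when the two residues share at least one family, which is an assumption about how the definition would be applied.

```python
# Side-chain families exactly as listed in the paragraph above (one-letter codes).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"), # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # Tyr, Phe, Trp, His
}


def shared_families(a, b):
    """Return the sorted names of the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True if the residues differ and share at least one side-chain family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))
```

For example, `is_conservative("K", "R")` is true because both residues have basic side chains, whereas `is_conservative("K", "D")` is false.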

[2735] As used herein, a “biologically active portion” of a 47508 protein includes a fragment of a 47508 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 47508 molecule and a non-47508 molecule or between a first 47508 molecule and a second 47508 molecule (e.g., a dimerization interaction). Biologically active portions of a 47508 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 47508 protein, e.g., the amino acid sequence shown in SEQ ID NO:42, which include fewer amino acids than the full length 47508 proteins, and exhibit at least one activity of a 47508 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 47508 protein, e.g., histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4. A biologically active portion of a 47508 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 47508 protein can be used as targets for developing agents which modulate a 47508 mediated activity, e.g., histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4.

[2736] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[2737] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[2738] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
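The percent identity calculation described above can be illustrated once two sequences have been aligned. The following is a minimal sketch in Python; the function name `percent_identity` and the use of `-` as the gap character are assumptions. The sketch divides the number of identical positions by the number of alignment columns, so that every gap position lowers the result. This is one common convention and is not necessarily the exact formula used by any particular alignment program.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps shown as `gap`).
    Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    # A column is identical only when both residues match and neither is a gap.
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

For example, the alignment of `AC-T` against `ACGT` has three identical positions in four columns, giving 75% identity.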

[2739] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
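The global alignment approach of Needleman and Wunsch referenced above can be sketched as a dynamic-programming routine. The following Python sketch is deliberately simplified and is not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix, and a single linear gap penalty in place of separate gap and gap-extension penalties. The function name `needleman_wunsch` and the default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (simplified illustration).

    Returns (score, aligned_a, aligned_b), with gaps written as '-'.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill: each cell takes the best of diagonal (pair), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With the default scores, aligning `ACGT` against `ACT` yields `ACGT` over `AC-T`. Substituting a residue-pair scoring table for `pair` and adding affine gap handling would bring the sketch closer to the parameters recited above.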

[2740] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Myers and W. Miller ((1988) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[2741] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 47508 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 47508 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[2742] Particularly preferred 47508 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:42. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:42 are termed substantially identical.

[2743] In the context of nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:41 or 43 are termed substantially identical.

[2744] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[2745] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[2746] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[2747] Various aspects of the invention are described in further detail below.

[2748] Isolated Nucleic Acid Molecules of 47508

[2749] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 47508 polypeptide described herein, e.g., a full-length 47508 protein or a fragment thereof, e.g., a biologically active portion of 47508 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 47508 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[2750] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:41, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 47508 protein (i.e., “the coding region” of SEQ ID NO:41, as shown in SEQ ID NO:43), as well as 5′untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:41 (e.g., SEQ ID NO:43) and, e.g., no flanking sequences which normally accompany the subject sequence. In yet another embodiment, the nucleic acid includes one or more nucleotides from 1-234 or 852-862 of SEQ ID NO:41. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 8 to 392 of SEQ ID NO:42. In other embodiments, the nucleic acid molecule includes a nucleotide sequence encoding one or more of amino acids 1-57 or 263-266 of SEQ ID NO:42.

[2751] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:41 or 43, thereby forming a stable duplex.
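The notion of a complement described above can be illustrated computationally. The following is a minimal sketch in Python for DNA sequences written 5′ to 3′; the names `reverse_complement` and `is_complement_of` are hypothetical. The sketch tests only for perfect Watson-Crick complementarity over the full length, which is a stricter criterion than the "sufficiently complementary" hybridization standard recited above.

```python
# Translation table for Watson-Crick base pairing in DNA (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_DNA_COMPLEMENT)[::-1]


def is_complement_of(candidate, target):
    """True if `candidate` is the exact reverse complement of `target`."""
    return candidate.upper() == reverse_complement(target).upper()
```

For example, `reverse_complement("ATGC")` returns `GCAT`, and `is_complement_of("GCAT", "ATGC")` is true.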

[2752] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43, or a portion, preferably of the same length, of any of these nucleotide sequences.

[2753] 47508 Nucleic Acid Fragments

[2754] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:41 or 43. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 47508 protein, e.g., an immunogenic or biologically active portion of a 47508 protein. A fragment can comprise those nucleotides of SEQ ID NO:41 (e.g., one or more of nucleotides 1-234 of SEQ ID NO:41), which encode the N-terminus of 47508, e.g., about amino acid residues 1 to 57 of SEQ ID NO:42, or a portion thereof. Alternatively, a fragment can comprise those nucleotides of SEQ ID NO:41 which encode amino acids 263 to 266 of SEQ ID NO:42 (e.g., nucleotides 852-862 of SEQ ID NO:41). The nucleotide sequence determined from the cloning of the 47508 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 47508 family members, or fragments thereof, as well as 47508 homologues, or fragments thereof, from other species.

[2755] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100, 200, 220, 240, 250, 275, 300, 320, or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention, e.g., SEQ ID NO: 4375 of WO 00/58473, SEQ ID NOS: 2079 and 2462 of WO 01/02568, or sequences having NCBI accession numbers AL137362 or AU079696.

[2756] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 47508 nucleic acid fragment can include a sequence corresponding to a histone deacetylase domain, e.g., about amino acid residues 83 to 392 of SEQ ID NO:42 (or a fragment thereof, e.g., amino acids 83-150, 150-200, 200-250, 250-300, 300-350, or 350-392 of SEQ ID NO:42), and at least one amino acid residue from amino acid residues 1 to 57 of SEQ ID NO:42.

[2757] 47508 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:41 or SEQ ID NO:43, or of a naturally occurring allelic variant or mutant of SEQ ID NO:41 or SEQ ID NO:43.

[2758] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or less than in 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2759] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: the 5′UTR of 47508, e.g., about nucleotides 1 to 65 of SEQ ID NO:41; N-terminal portions of human 47508, e.g., about amino acid residues 1 to 57 of SEQ ID NO:42; the histone deacetylase domain of human 47508, e.g., about amino acid residues 83 to 392; catalytically important motifs of the histone deacetylase domain of human 47508, e.g. about amino acid residues 205 to 215, or 240 to 255, or 325 to 335 of SEQ ID NO:42; or other fragments of the histone deacetylase domain of human 47508, e.g., about amino acid residues 260 to 270 of SEQ ID NO:42.

[2760] In another embodiment, a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 47508 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base, from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a 5′UTR of human 47508, e.g., about nucleotides 1 to 65 of SEQ ID NO:41; a region which encodes an N-terminal portion of human 47508, e.g., about amino acids 1 to 57 of SEQ ID NO:42; a region which encodes a histone deacetylase domain and at least one amino acid from about amino acid residues 1 to 57 of SEQ ID NO:42, e.g., from about amino acid 57 to 392 of SEQ ID NO:42; a region which encodes a catalytically important fragment of the histone deacetylase domain of human 47508, e.g., about amino acid residues 205 to 215, 240 to 255, or 325 to 335; a region which encodes other portions of the histone deacetylase domain of human 47508, e.g., about amino acid residues 260 to 270 of SEQ ID NO:42.

[2761] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[2762] A nucleic acid fragment encoding a “biologically active portion of a 47508 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:41 or 43, which encodes a polypeptide having a 47508 biological activity (e.g., the biological activities of the 47508 proteins are described herein), expressing the encoded portion of the 47508 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 47508 protein. For example, a nucleic acid fragment encoding a biologically active portion of 47508 includes a histone deacetylase domain and at least one amino acid from about amino acid residues 1 to 57 of SEQ ID NO:42, e.g., from about amino acid 57 to 392 of SEQ ID NO:42. A nucleic acid fragment encoding a biologically active portion of a 47508 polypeptide may comprise a nucleotide sequence which is greater than 300, 350, 400, 450, 500, 550, 600, 650, or more nucleotides in length.

[2763] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1550, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:41, or SEQ ID NO:43. In a preferred embodiment, a nucleic acid of human 47508 includes at least one contiguous nucleotide from the region of about nucleotides 1-65, 1-234, 65-234, 65-311, 234-500, 312-700, 312-1307, 600-850, 750-850, 750-1000, 852-862, 900-1182, 1000-1100, 1150-1307, 1183-1307, 1183-1579, 1308-1579.

[2764] 47508 Nucleic Acid Variants

[2765] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:41 or SEQ ID NO:43. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 47508 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs, by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues, from that shown in SEQ ID NO:42. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[2766] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. For example, the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.

[2767] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[2768] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:41 or 43, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
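The counting rule described above, in which looped-out sequences and mismatches are both considered differences, can be sketched as follows. This minimal Python sketch assumes the variant and reference have already been aligned for maximum homology, with looped-out positions written as `-`; the names `count_differences` and `within_difference_limit` are hypothetical.

```python
def count_differences(aligned_ref, aligned_var):
    """Count alignment columns where the two sequences differ.

    A gap opposite a residue (a looped-out position) counts as one difference,
    as does a mismatch. Columns with the same character in both do not count.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, v in zip(aligned_ref, aligned_var) if r != v)


def within_difference_limit(aligned_ref, aligned_var, max_exclusive):
    """True if the variant differs by at least one but fewer than `max_exclusive` positions."""
    n = count_differences(aligned_ref, aligned_var)
    return 1 <= n < max_exclusive
```

For example, `ACGTACGT` aligned against `ACCTA-GT` shows one mismatch and one looped-out position, for a total of two differences, which satisfies a limit of "at least one but less than 10".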

[2769] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90%, 95%, 98%, 99%, or more identical to the nucleotide sequence shown in SEQ ID NO:42 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:42 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 47508 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 47508 gene.

[2770] Preferred variants include those that are correlated with histone deacetylase activity, e.g., deacetylation of histones H2A, H2B, H3, or H4.

[2771] Allelic variants of 47508, e.g., human 47508, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 47508 protein within a population that maintain the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4, or to interact with transcription factors, e.g., transcriptional repressors. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:42, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 47508, e.g., human 47508, protein within a population that do not have the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4, or to interact with transcription factors, e.g., transcriptional repressors. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:42, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[2772] Moreover, nucleic acid molecules encoding other 47508 family members and, thus, which have a nucleotide sequence which differs from the 47508 sequences of SEQ ID NO:41 or SEQ ID NO:43 are intended to be within the scope of the invention.

[2773] Antisense Nucleic Acid Molecules, Ribozymes and Modified 47508 Nucleic Acid Molecules

[2774] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 47508. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 47508 coding strand, or to only a portion thereof (e.g., the coding region of human 47508 corresponding to SEQ ID NO:43). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 47508 (e.g., the 5′ and 3′ untranslated regions).

[2775] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 47508 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 47508 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 47508 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
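The design of an antisense oligonucleotide complementary to the region surrounding the translation start site can be sketched as follows. This minimal Python sketch rests on several assumptions: the function name `antisense_oligo` is hypothetical; the start position is supplied by the caller as a 0-based index of the first base of the start codon; position +1 is taken to be that base and position −1 the base immediately upstream, so that the −10 to +10 window spans 20 nucleotides; and the oligonucleotide is returned as DNA. The example sequence below is invented for illustration and is not taken from SEQ ID NO:41.

```python
def antisense_oligo(mrna, start_index, upstream=10, downstream=10):
    """Return a DNA oligonucleotide antisense to the window around the start codon.

    mrna        -- sense-strand sequence (A/C/G/T or A/C/G/U), written 5' to 3'
    start_index -- 0-based index of the first base of the start codon (assumed known)
    upstream    -- number of bases taken 5' of the start codon
    downstream  -- number of bases taken from the start codon onward
    """
    begin = start_index - upstream
    end = start_index + downstream
    if begin < 0 or end > len(mrna):
        raise ValueError("window extends beyond the sequence")
    # Normalize RNA letters to DNA letters so the result is a DNA oligonucleotide.
    target = mrna[begin:end].upper().replace("U", "T")
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    # Complement each base and reverse so the result reads 5' to 3'.
    return "".join(pairs[base] for base in reversed(target))
```

For an invented sequence of ten G residues followed by `ATG` and seven C residues, with `start_index=10`, the function returns seven G residues, then `CAT`, then ten C residues.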

[2776] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[2777] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 47508 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[2778] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[2779] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 47508-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 47508 cDNA disclosed herein (i.e., SEQ ID NO:41 or SEQ ID NO:43), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 47508-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 47508 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[2780] 47508 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 47508 (e.g., the 47508 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 47508 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[2781] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[2782] A 47508 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[2783] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[2784] PNAs of 47508 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 47508 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[2785] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[2786] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 47508 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 47508 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[2787] Isolated 47508 Polypeptides

[2788] In another aspect, the invention features an isolated 47508 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-47508 antibodies. 47508 protein can be isolated from cells or tissue sources using standard protein purification techniques. 47508 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[2789] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[2790] In a preferred embodiment, a 47508 polypeptide has one or more of the following characteristics:

[2791] (i) it has the ability to deacetylate histones, e.g., histones H2A, H2B, H3, or H4;

[2792] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 47508 polypeptide, e.g., a polypeptide of SEQ ID NO:42;

[2793] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:42;

[2794] (iv) it can be found in the nucleus of a cell;

[2795] (v) it has a histone deacetylase domain which is preferably about 70%, 80%, 90%, 95%, 98%, 99%, or more homologous to amino acid residues about 83 to 392 of SEQ ID NO:42;

[2796] (vi) it has at least one histone deacetylase zinc-binding triad;

[2797] (vii) it has at least two charge-relay systems;

[2798] (viii) it can colocalize with transcription factors, e.g., transcriptional repressors;

[2799] (ix) it has at least one, two, three, preferably four predicted Protein kinase C phosphorylation sites (PS00005);

[2800] (x) it has at least one, two, three, four, preferably five predicted Casein kinase II phosphorylation sites (PS00006);

[2801] (xi) it has at least one, two, three, four, five, six, preferably seven predicted N-myristoylation sites (PS00008); or

[2802] (xii) it has at least one predicted amidation site (PS00009).

[2803] In a preferred embodiment the 47508 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:42. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:42 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:42. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the histone deacetylase domain, e.g., about amino acid residues 83 to 392 of SEQ ID NO:42. In another preferred embodiment one or more differences are in the histone deacetylase domain, e.g., about amino acids 83 to 392 of SEQ ID NO:42.
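
The counting rule in the preceding paragraph (mismatches and “looped” out positions both count as differences) can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned for maximum homology by some external tool and are supplied as equal-length strings with `-` marking gaps. The function names, the choice of the variant's residue count as the denominator, and the short example sequences are assumptions for this sketch and are not taken from SEQ ID NO:42.

```python
def count_differences(aligned_ref: str, aligned_variant: str) -> int:
    """Count alignment columns where the two sequences differ.

    Mismatches and gap ("looped out") columns both count as differences.
    """
    if len(aligned_ref) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, v in zip(aligned_ref, aligned_variant) if r != v)


def percent_different(aligned_ref: str, aligned_variant: str, gap: str = "-") -> float:
    """Differences as a percentage of the residues in the variant.

    Using the variant's residue count as the denominator is an assumption.
    """
    variant_residues = sum(1 for v in aligned_variant if v != gap)
    if variant_residues == 0:
        raise ValueError("variant contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_variant) / variant_residues


# Placeholder pre-aligned sequences (not SEQ ID NO:42).
reference_aln = "MKTAYIAKQR"
variant_aln = "MKTAHIA-QR"
n_diff = count_differences(reference_aln, variant_aln)
pct_diff = percent_different(reference_aln, variant_aln)
```

In this placeholder pair, one substitution and one deleted residue are each counted as a difference. Restricting the comparison to a region such as the histone deacetylase domain would amount to slicing the aligned strings before counting.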

[2804] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 47508 proteins differ in amino acid sequence from SEQ ID NO:42, yet retain biological activity.

[2805] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more homologous to SEQ ID NO:42.

[2806] A 47508 protein or fragment is provided which varies from the sequence of SEQ ID NO:42 in regions defined by amino acids about 83 to 392 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:42 in regions defined by amino acids about 263 to 266. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[2807] In one embodiment, a biologically active portion of a 47508 protein includes a histone deacetylase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 47508 protein.

[2808] In a preferred embodiment, the 47508 protein has an amino acid sequence shown in SEQ ID NO:42. In other embodiments, the 47508 protein is substantially identical to SEQ ID NO:42. In another embodiment, the 47508 protein is substantially identical to SEQ ID NO:42 and retains the functional activity of the protein of SEQ ID NO:42, as described in detail in the subsections above. In yet another embodiment, the 47508 protein or fragment thereof includes at least one amino acid from amino acid residues 1 to 57, or 263 to 266 of SEQ ID NO:42.

[2809] 47508 Chimeric or Fusion Proteins

[2810] In another aspect, the invention provides 47508 chimeric or fusion proteins. As used herein, a 47508 “chimeric protein” or “fusion protein” includes a 47508 polypeptide linked to a non-47508 polypeptide. A “non-47508 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 47508 protein, e.g., a protein which is different from the 47508 protein and which is derived from the same or a different organism. The 47508 polypeptide of the fusion protein can correspond to all or a portion e.g., a fragment described herein of a 47508 amino acid sequence. In a preferred embodiment, a 47508 fusion protein includes at least one (or two) biologically active portion of a 47508 protein. The non-47508 polypeptide can be fused to the N-terminus or C-terminus of the 47508 polypeptide.

[2811] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-47508 fusion protein in which the 47508 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 47508. Alternatively, the fusion protein can be a 47508 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 47508 can be increased through use of a heterologous signal sequence.

[2812] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[2813] The 47508 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 47508 fusion proteins can be used to affect the bioavailability of a 47508 substrate. 47508 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 47508 protein; (ii) mis-regulation of the 47508 gene; and (iii) aberrant post-translational modification of a 47508 protein.

[2814] Moreover, the 47508-fusion proteins of the invention can be used as immunogens to produce anti-47508 antibodies in a subject, to purify 47508 ligands and in screening assays to identify molecules which inhibit the interaction of 47508 with a 47508 substrate.

[2815] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 47508-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 47508 protein.

[2816] Variants of 47508 Proteins

[2817] In another aspect, the invention also features a variant of a 47508 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 47508 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 47508 protein. An agonist of the 47508 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 47508 protein. An antagonist of a 47508 protein can inhibit one or more of the activities of the naturally occurring form of the 47508 protein by, for example, competitively modulating a 47508-mediated activity of a 47508 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 47508 protein.

[2818] Variants of a 47508 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 47508 protein for agonist or antagonist activity.

[2819] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 47508 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 47508 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[2820] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 47508 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 47508 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[2821] Cell based assays can be exploited to analyze a variegated 47508 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 47508 in a substrate-dependent manner. The transfected cells are then contacted with 47508 and the effect of the expression of the mutant on signaling by the 47508 substrate can be detected, e.g., by measuring histone deacetylase activity. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 47508 substrate, and the individual clones further characterized.

[2822] In another aspect, the invention features a method of making a 47508 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 47508 polypeptide. The method includes: altering the sequence of a 47508 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[2823] In another aspect, the invention features a method of making a fragment or analog of a 47508 polypeptide having a biological activity of a naturally occurring 47508 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 47508 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[2824] Anti-47508 Antibodies

[2825] In another aspect, the invention provides an anti-47508 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more-conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[2826] The anti-47508 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[2827] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[2828] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 47508 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-47508 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[2829] The anti-47508 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[2830] Phage display and combinatorial methods for generating anti-47508 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[2831] In one embodiment, the anti-47508 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[2832] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[2833] An anti-47508 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[2834] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[2835] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 47508 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[2836] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (See e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
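
The definition of a consensus sequence given above can be sketched in Python as follows. This minimal example assumes the family members are already aligned to equal length. The function name, the alphabetical tie-breaking rule (chosen only so the output is deterministic, since either residue may be used when two occur equally frequently), and the short placeholder sequences are assumptions for illustration and are not actual immunoglobulin framework sequences.

```python
from collections import Counter
from typing import Sequence


def consensus_sequence(aligned_family: Sequence[str]) -> str:
    """Return the most frequent residue at each position of an aligned family.

    Ties are broken alphabetically; this is an arbitrary choice for the sketch.
    """
    if not aligned_family:
        raise ValueError("at least one sequence is required")
    length = len(aligned_family[0])
    if any(len(seq) != length for seq in aligned_family):
        raise ValueError("all sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_family):
        counts = Counter(column)
        top = max(counts.values())
        # Pick the alphabetically first residue among those tied for most frequent.
        result.append(min(res for res, n in counts.items() if n == top))
    return "".join(result)


# Placeholder family of aligned sequences (hypothetical, for illustration only).
family = ["EVQLVES", "QVQLVQS", "EVKLVES"]
family_consensus = consensus_sequence(family)
```

A consensus framework, in this sense, would be obtained by applying the same per-position rule only to the framework positions of the aligned immunoglobulin sequences.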

[2837] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 47508 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[2838] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[2839] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[2840] In preferred embodiments an antibody can be made by immunizing with purified 47508 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., nuclear cytosol.

[2841] A full-length 47508 protein or an antigenic peptide fragment of 47508 can be used as an immunogen or can be used to identify anti-47508 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 47508 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:42 and encompass an epitope of 47508. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[2842] Fragments of 47508 can be used as immunogens or used to characterize the specificity of an antibody. For example, fragments of 47508 which include residues about 19 to 35, about 245 to 263, or about 286 to 310 can be used to make, e.g., antibodies against hydrophilic regions of the 47508 protein. Similarly, fragments of 47508 which include residues about 151 to 172, about 215 to 232, or about 377 to 393 can be used to make an antibody against a hydrophobic region of the 47508 protein; fragments of 47508 which include residues about 1 to 57, or about 50 to 82 can be used to make an antibody against an N-terminal portion of the 47508 protein; and fragments of 47508 which include residues about 90 to 110, about 200 to 220, about 240 to 260, about 320 to 340, or about 360 to 375 can be used to make an antibody against the histone deacetylase region of the 47508 protein.

[2843] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[2844] Antibodies which bind only native 47508 protein, only denatured or otherwise non-native 47508 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 47508 protein.

[2845] Preferred epitopes encompassed by the antigenic peptide are regions of 47508 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 47508 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 47508 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[2846] In preferred embodiments antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., nuclear cytoplasm.

[2847] The anti-47508 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 47508 protein.

[2848] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[2849] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[2850] In a preferred embodiment, an anti-47508 antibody alters (e.g., increases or decreases) the histone deacetylase activity of a 47508 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 90 to 110, about 200 to 220, about 240 to 260, about 320 to 340, or about 360 to 375 of SEQ ID NO:42.

[2851] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[2852] An anti-47508 antibody (e.g., monoclonal antibody) can be used to isolate 47508 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-47508 antibody can be used to detect 47508 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-47508 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

[2853] The invention also includes a nucleic acid which encodes an anti-47508 antibody, e.g., an anti-47508 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[2854] The invention also includes cell lines, e.g., hybridomas, which make an anti-47508 antibody, e.g., an antibody described herein, and methods of using said cells to make a 47508 antibody.

[2855] 47508 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[2856] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[2857] A vector can include a 47508 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 47508 proteins, mutant forms of 47508 proteins, fusion proteins, and the like).

[2858] The recombinant expression vectors of the invention can be designed for expression of 47508 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[2859] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[2860] Purified fusion proteins can be used in 47508 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 47508 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[2861] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
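
The codon-alteration strategy can be sketched as a simple back-translation, in which each amino acid is assigned a single codon from a preference table. The table below is an illustrative assumption listing one commonly used E. coli codon per amino acid; it is not taken from Wada et al. or from this application, and an actual design would rely on a measured codon-usage table for the chosen host strain. The names `PREFERRED_CODON` and `back_translate` are hypothetical.

```python
# Illustrative (assumed) preference table: one codon per amino acid.
# Every entry is a valid codon for its residue in the standard genetic
# code; the choice among synonymous codons is an assumption of this sketch.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein, table=PREFERRED_CODON, stop_codon="TAA"):
    """Encode a protein sequence using one preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in table:
            raise ValueError(f"no codon defined for residue {residue!r}")
        codons.append(table[residue])
    codons.append(stop_codon)  # terminate the open reading frame
    return "".join(codons)


# Hypothetical tripeptide, for illustration only.
print(back_translate("MKL"))
```

Using a single codon per residue is the simplest form of the idea; it changes only synonymous codons, so the encoded amino acid sequence is unchanged.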

[2862] The 47508 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[2863] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[2864] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[2865] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[2866] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[2867] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 47508 nucleic acid molecule within a recombinant expression vector or a 47508 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[2868] A host cell can be any prokaryotic or eukaryotic cell. For example, a 47508 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[2869] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[2870] A host cell of the invention can be used to produce (i.e., express) a 47508 protein. Accordingly, the invention further provides methods for producing a 47508 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 47508 protein has been introduced) in a suitable medium such that a 47508 protein is produced. In another embodiment, the method further includes isolating a 47508 protein from the medium or the host cell.

[2871] In another aspect, the invention features a cell or purified preparation of cells which include a 47508 transgene, or which otherwise misexpress 47508. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 47508 transgene, e.g., a heterologous form of a 47508, e.g., a gene derived from humans (in the case of a non-human cell). The 47508 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that mis-expresses an endogenous 47508, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or mis-expressed 47508 alleles or for use in drug screening.

[2872] In another aspect, the invention features a human cell, e.g., a hematopoietic or hepatic stem cell, transformed with nucleic acid which encodes a subject 47508 polypeptide.

[2873] Also provided are cells, preferably human cells, e.g., human hematopoietic, hepatic, or fibroblast cells, in which an endogenous 47508 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 47508 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 47508 gene. For example, an endogenous 47508 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[2874] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 47508 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 47508 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 47508 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[2875] 47508 Transgenic Animals

[2876] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 47508 protein and for identifying and/or evaluating modulators of 47508 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 47508 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[2877] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 47508 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 47508 transgene in its genome and/or expression of 47508 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 47508 protein can further be bred to other transgenic animals carrying other transgenes.

[2878] 47508 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[2879] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[2880] Uses of 47508

[2881] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[2882] The isolated nucleic acid molecules of the invention can be used, for example, to express a 47508 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 47508 mRNA (e.g., in a biological sample) or a genetic alteration in a 47508 gene, and to modulate 47508 activity, as described further below. The 47508 proteins can be used to treat disorders characterized by insufficient or excessive production of a 47508 substrate or production of 47508 inhibitors. In addition, the 47508 proteins can be used to screen for naturally occurring 47508 substrates, to screen for drugs or compounds which modulate 47508 activity, as well as to treat disorders characterized by insufficient or excessive production of 47508 protein or production of 47508 protein forms which have decreased, aberrant or unwanted activity compared to 47508 wild type protein (e.g., cellular proliferative and/or differentiative disorder). Moreover, the anti-47508 antibodies of the invention can be used to detect and isolate 47508 proteins, regulate the bioavailability of 47508 proteins, and modulate 47508 activity.

[2883] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 47508 polypeptide is provided. The method includes: contacting the compound with the subject 47508 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 47508 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 47508 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 47508 polypeptide. Screening methods are discussed in more detail below.

[2884] 47508 Screening Assays

[2885] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 47508 proteins, have a stimulatory or inhibitory effect on, for example, 47508 expression or 47508 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 47508 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 47508 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[2886] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 47508 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 47508 protein or polypeptide or a biologically active portion thereof.

[2887] In one embodiment, an activity of a 47508 protein can be assayed by expressing a 47508 nucleic acid in a cell, e.g., a mammalian cell, such that 47508 protein is produced, purifying the 47508 protein, e.g., by means of an affinity tag, e.g., a HIS6 tag, mixing the purified 47508 protein with histone proteins, e.g., histones H2A, H2B, H3, and H4, which are acetylated with 3H-labeled acetyl groups, and monitoring cleavage of the labeled acetyl groups from the histone proteins over time. An example of this method is shown in Hu et al. (2000), J Biol Chem 275(20):15254-64, the contents of which are incorporated herein by reference.
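
Monitoring release of labeled acetyl groups over time yields a series of time points and released counts, from which a rate can be estimated. The following is a minimal sketch of one way to do so, assuming the readings fall in the linear range of the reaction; it fits a least-squares line and reports the slope. The function name `release_rate` and the example readings are hypothetical and are not data from this application or from Hu et al.

```python
def release_rate(times_min, released_cpm):
    """Least-squares slope of released label (cpm) versus time (min).

    Assumes the supplied points lie in the linear phase of the reaction.
    """
    n = len(times_min)
    if n < 2 or n != len(released_cpm):
        raise ValueError("need two or more paired time/count readings")

    mean_t = sum(times_min) / n
    mean_y = sum(released_cpm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_min, released_cpm))
    return sxy / sxx


# Made-up readings, for illustration only.
times = [0, 10, 20, 30]
counts = [0, 500, 1000, 1500]
print(release_rate(times, counts))  # slope in cpm per minute
```

Comparing the slope obtained with and without a test compound gives a simple measure of whether the compound alters the deacetylase activity.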

[2888] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[2889] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[2890] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[2891] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 47508 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 47508 activity is determined. Determining the ability of the test compound to modulate 47508 activity can be accomplished by monitoring, for example, histone deacetylation, e.g., deacetylation of histone H2A, H2B, H3, or H4. The cell, for example, can be of mammalian origin, e.g., human.

[2892] The ability of the test compound to modulate 47508 binding to a compound, e.g., a 47508 substrate, or to bind to 47508 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 47508 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 47508 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 47508 binding to a 47508 substrate in a complex. For example, compounds (e.g., 47508 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[2893] The ability of a compound (e.g., a 47508 substrate) to interact with 47508 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 47508 without the labeling of either the compound or the 47508. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 47508.

[2894] In yet another embodiment, a cell-free assay is provided in which a 47508 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 47508 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 47508 proteins to be used in assays of the present invention include fragments which participate in interactions with non-47508 molecules, e.g., fragments with high surface probability scores.

[2895] Soluble and/or membrane-bound forms of isolated proteins (e.g., 47508 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecypoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[2896] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[2897] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
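
The distance dependence described above can be illustrated with a short sketch. It assumes the standard Förster relation, E = 1 / (1 + (r/R0)^6), which the text does not itself state; the function name and the 5 nm Förster radius used in the example are hypothetical values chosen only for illustration.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency under the Förster relation (an assumption).

    E = 1 / (1 + (r / R0)**6), where r is the donor-acceptor distance and
    R0 is the distance at which transfer efficiency is 50%.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical donor/acceptor pair with R0 = 5 nm.
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, 5.0):.3f}")
```

Because efficiency falls off with the sixth power of distance, acceptor emission rises sharply when the labeled molecules bind and drops toward zero when they are separated by more than about twice R0.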

[2898] In another embodiment, determining the ability of the 47508 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[2899] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[2900] It may be desirable to immobilize either 47508, an anti-47508 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 47508 protein, or interaction of a 47508 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/47508 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 47508 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 47508 binding or activity determined using standard techniques.

[2901] Other techniques for immobilizing either a 47508 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 47508 protein or target molecules can be prepared from biotin-NHS(N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[2902] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[2903] In one embodiment, this assay is performed utilizing antibodies reactive with 47508 protein or target molecules but which do not interfere with binding of the 47508 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 47508 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 47508 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 47508 protein or target molecule.

[2904] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[2905] In a preferred embodiment, the assay includes contacting the 47508 protein or biologically active portion thereof with a known compound which binds 47508 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 47508 protein, wherein determining the ability of the test compound to interact with a 47508 protein includes determining the ability of the test compound to preferentially bind to 47508 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[2906] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 47508 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 47508 protein through modulation of the activity of a downstream effector of a 47508 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[2907] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
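
The comparison of control and test reaction mixtures can be expressed numerically. The following is a minimal sketch, assuming that complex formation is read out as a single signal per well (for example, counts from a labeled binding partner) and that a no-complex background reading is available; the function name and the example numbers are hypothetical and are not taken from the specification.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex formation relative to the control mixture.

    control_signal: signal from the mixture incubated without test compound.
    test_signal:    signal from the mixture containing the test compound.
    background:     signal with no complex present (assumed measurable).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


# Hypothetical readings: control 1100, test 350, background 100.
print(percent_inhibition(1100.0, 350.0, background=100.0))
```

A value near 100 corresponds to a complex that forms in the control but not in the presence of the test compound; a value near 0 indicates no interference.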

[2908] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[2909] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[2910] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[2911] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[2912] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[2913] In yet another aspect, the 47508 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 47508 (“47508-binding proteins” or “47508-bp”) and are involved in 47508 activity. Such 47508-bps can be activators or inhibitors of signals by the 47508 proteins or 47508 targets as, for example, downstream elements of a 47508-mediated signaling pathway.

[2914] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 47508 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 47508 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 47508-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 47508 protein.

[2915] In another embodiment, modulators of 47508 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 47508 mRNA or protein evaluated relative to the level of expression of 47508 mRNA or protein in the absence of the candidate compound. When expression of 47508 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 47508 mRNA or protein expression. Alternatively, when expression of 47508 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 47508 mRNA or protein expression. The level of 47508 mRNA or protein expression can be determined by methods described herein for detecting 47508 mRNA or protein.
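
The decision rule in this paragraph can be sketched as follows. The specification does not name a statistical test, so this example assumes an exact two-sided permutation test on replicate expression measurements and a 0.05 significance threshold applied in both directions; the function names, threshold, and replicate values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(untreated))
    indices = range(len(pooled))
    total = extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative 47508 mRNA levels, four replicates per condition.
print(classify_modulator([2.1, 2.3, 2.2, 2.4], [1.0, 1.1, 0.9, 1.05]))
```

Exhaustive enumeration is practical only for small replicate numbers; with larger data sets a conventional parametric test would normally be substituted.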

[2916] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 47508 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative and/or differentiative disorder.

[2917] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 47508 modulating agent, an antisense 47508 nucleic acid molecule, a 47508-specific antibody, or a 47508-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[2918] 47508 Detection Assays

[2919] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 47508 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[2920] 47508 Chromosome Mapping

[2921] The 47508 nucleotide sequences or portions thereof can be used to map the location of the 47508 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 47508 sequences with genes associated with disease.

[2922] Briefly, 47508 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 47508 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 47508 sequences will yield an amplified fragment.

[2923] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
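
The logic of assigning a gene to a chromosome from a hybrid panel can be sketched as a concordance check: the gene is assigned to the human chromosome whose presence across the panel matches the pattern of PCR-positive hybrids. The panel composition, hybrid names, and PCR results below are hypothetical and serve only to illustrate the bookkeeping.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel:        dict mapping hybrid name -> set of human chromosomes retained.
    pcr_positive: dict mapping hybrid name -> True if a fragment was amplified.
    """
    all_chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(all_chromosomes):
        if all((chrom in retained) == pcr_positive[hybrid]
               for hybrid, retained in panel.items()):
            hits.append(chrom)
    return hits


# Hypothetical four-line panel and PCR screening results.
panel = {
    "hybrid_A": {"1", "7"},
    "hybrid_B": {"7", "X"},
    "hybrid_C": {"1", "X"},
    "hybrid_D": {"12"},
}
pcr_positive = {"hybrid_A": True, "hybrid_B": True,
                "hybrid_C": False, "hybrid_D": False}
print(concordant_chromosomes(panel, pcr_positive))
```

If more than one chromosome is returned, the panel cannot discriminate between them and additional hybrids would be needed.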

[2924] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 47508 to a chromosomal location.

[2925] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[2926] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[2927] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[2928] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 47508 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[2929] 47508 Tissue Typing

[2930] 47508 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
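
The principle behind RFLP can be illustrated with an in silico digestion: a sequence difference that creates or destroys a restriction site changes the set of fragment lengths. The sketch below assumes linear DNA and uses the EcoRI recognition site (GAATTC, cut after the first G) as an example enzyme; the two short allele sequences are invented for illustration and are not 47508 sequences.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1):
    """Return fragment lengths after cutting linear DNA at every site.

    cut_offset is the position within the recognition site where the
    top strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return [len(fragment) for fragment in fragments]


# Hypothetical alleles differing by one base in the second EcoRI site.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"
print(digest(allele_a))
print(digest(allele_b))
```

The differing fragment-length patterns are what would appear as distinct band patterns after separation and probing.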

[2931] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 47508 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[2932] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:41 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:43 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[2933] If a panel of reagents from 47508 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[2934] Use of Partial 47508 Sequences in Forensic Biology

[2935] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[2936] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:41 (e.g., fragments derived from the noncoding regions of SEQ ID NO:41 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[2937] The 47508 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 47508 probes can be used to identify tissue by species and/or by organ type.

[2938] In a similar fashion, these reagents, e.g., 47508 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[2939] Predictive Medicine of 47508

[2940] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[2941] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 47508.

[2942] Such disorders include, e.g., a disorder associated with the misexpression of 47508 gene, e.g., a cellular proliferative and/or differentiative disorder.

[2943] The method includes one or more of the following:

[2944] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 47508 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[2945] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 47508 gene;

[2946] detecting, in a tissue of the subject, the misexpression of the 47508 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[2947] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 47508 polypeptide.

[2948] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 47508 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[2949] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:41, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 47508 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[2950] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 47508 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 47508.

[2951] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[2952] In preferred embodiments the method includes determining the structure of a 47508 gene, an abnormal structure being indicative of risk for the disorder.

[2953] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 47508 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[2954] Diagnostic and Prognostic Assays of 47508

[2955] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 47508 molecules and for identifying variations and mutations in the sequence of 47508 molecules.

[2956] Expression Monitoring and Profiling:

[2957] The presence, level, or absence of 47508 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 47508 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 47508 protein such that the presence of 47508 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 47508 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 47508 genes; measuring the amount of protein encoded by the 47508 genes; or measuring the activity of the protein encoded by the 47508 genes.

[2958] The level of mRNA corresponding to the 47508 gene in a cell can be determined both by in situ and by in vitro formats.

[2959] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 47508 nucleic acid, such as the nucleic acid of SEQ ID NO:41, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 47508 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[2960] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 47508 genes.

[2961] The level of mRNA in a sample that is encoded by 47508 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
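
By way of illustration only, the length ranges stated above for amplification primers can be checked computationally. The following Python sketch is not part of the disclosed methods; the function names `reverse_complement` and `check_primer_pair` and any sequences supplied to them are hypothetical. It assumes the template is given as the plus strand and that each primer anneals as a perfect match.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flank_len=(50, 200)):
    """Check a primer pair against the approximate ranges in the text.

    template: plus-strand sequence, 5'->3' (hypothetical input).
    forward:  primer identical to a stretch of the plus strand.
    reverse:  primer complementary to the plus strand, written 5'->3'.
    Returns (ok, flanked_length); flanked_length is None when a primer
    has no perfect-match site on the template.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return False, None
    # The reverse primer pairs with the plus strand, so its reverse
    # complement is what appears in the plus-strand sequence.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return False, None
    flanked = end - (start + len(forward))
    ok = (primer_len[0] <= len(forward) <= primer_len[1]
          and primer_len[0] <= len(reverse) <= primer_len[1]
          and flank_len[0] <= flanked <= flank_len[1])
    return ok, flanked
```

A real primer design would also weigh melting temperature, self-complementarity, and specificity, none of which this sketch models.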

[2962] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 47508 gene being analyzed.

[2963] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 47508 mRNA, or genomic DNA, and comparing the presence of 47508 mRNA or genomic DNA in the control sample with the presence of 47508 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 47508 transcript levels.

[2964] A variety of methods can be used to determine the level of protein encoded by 47508. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[2965] The detection methods can be used to detect 47508 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 47508 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 47508 protein include introducing into a subject a labeled anti-47508 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-47508 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[2966] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 47508 protein, and comparing the presence of 47508 protein in the control sample with the presence of 47508 protein in the test sample.

[2967] The invention also includes kits for detecting the presence of 47508 in a biological sample. For example, the kit can include a compound or agent capable of detecting 47508 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 47508 protein or nucleic acid.

[2968] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[2969] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[2970] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 47508 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a cellular proliferative and/or differentiative disorder.

[2971] In one embodiment, a disease or disorder associated with aberrant or unwanted 47508 expression or activity is identified. A test sample is obtained from a subject and 47508 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 47508 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 47508 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[2972] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 47508 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for, e.g., a cellular proliferative and/or differentiative disorder.

[2973] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 47508 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 47508 (e.g., other genes associated with a 47508-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
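
For illustration, one possible layout of such data records in a relational database is sketched below in Python using the standard `sqlite3` module. The table names, column names, and helper functions (`build_store`, `add_record`, `level_of`) are assumptions made for this example and are not specified by the disclosure; SQLite merely stands in for any SQL database environment.

```python
import sqlite3


def build_store():
    """Create an in-memory database holding sample descriptors and values."""
    conn = sqlite3.connect(":memory:")
    # One row per sample: the descriptor fields named in the text.
    conn.execute(
        "CREATE TABLE sample ("
        " sample_id TEXT PRIMARY KEY,"
        " subject_id TEXT,"
        " diagnosis TEXT,"
        " treatment TEXT)"
    )
    # One row per (sample, gene): the expression value for that gene.
    conn.execute(
        "CREATE TABLE expression ("
        " sample_id TEXT REFERENCES sample(sample_id),"
        " gene TEXT,"
        " level REAL,"
        " PRIMARY KEY (sample_id, gene))"
    )
    return conn


def add_record(conn, sample_id, subject_id, diagnosis, treatment, levels):
    """Store one data record; `levels` maps gene name -> expression value."""
    conn.execute(
        "INSERT INTO sample VALUES (?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment),
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [(sample_id, gene, value) for gene, value in levels.items()],
    )


def level_of(conn, sample_id, gene="47508"):
    """Return the stored expression value, or None if no record exists."""
    row = conn.execute(
        "SELECT level FROM expression WHERE sample_id = ? AND gene = ?",
        (sample_id, gene),
    ).fetchone()
    return None if row is None else row[0]
```

Keeping the descriptor and the per-gene values in separate tables lets a record carry values for genes other than 47508 without changing the schema.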

[2974] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 47508 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a cellular proliferative and/or differentiative disorder in a subject wherein an increase in 47508 expression is an indication that the subject has or is disposed to having a cellular proliferative and/or differentiative disorder. The method can be used to monitor a treatment for a cellular proliferative and/or differentiative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[2975] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 47508 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
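
The favorable-evaluation criterion above can be sketched as follows. The text does not fix a similarity measure at this point, so this Python example assumes Euclidean distance between profiles of equal length; the function name `compound_evaluated_favorably` is hypothetical.

```python
import math


def compound_evaluated_favorably(contacted, uncontacted, target):
    """Return True when the contacted-cell profile is closer to the target.

    Each profile is a sequence of expression values in the same gene
    order, one of which would represent 47508. Euclidean distance is an
    assumed similarity measure; a smaller distance means more similar.
    """
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```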

[2976] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 47508 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
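
The distance metric and the selection of step d) can be sketched as below. This Python example is illustrative; the function names `profile_distance` and `most_similar_reference` and the dictionary layout of the reference profiles are assumptions, not part of the disclosure.

```python
import math


def profile_distance(a, b):
    """Length of the difference vector between two expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar_reference(subject, references):
    """Return the name of the reference profile nearest to the subject.

    references: dict mapping a descriptor (e.g., a diagnosis) to a profile.
    """
    return min(references,
               key=lambda name: profile_distance(subject, references[name]))
```

Because every dimension contributes equally to the length, values measured on very different scales would normally be normalized before comparison; that step is omitted here.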

[2977] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[2978] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 47508 expression.

[2979] 47508 Arrays and Uses Thereof

[2980] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 47508 molecule (e.g., a 47508 nucleic acid or a 47508 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2981] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 47508 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 47508. Each address of the subset can include a capture probe that hybridizes to a different region of a 47508 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 47508 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 47508 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 47508 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[2982] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[2983] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 47508 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 47508 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-47508 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[2984] In another aspect, the invention features a method of analyzing the expression of 47508. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 47508-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[2985] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 47508. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 47508. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
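
As a minimal sketch of the clustering mentioned above, the following Python example groups genes whose expression profiles across samples lie close together. It uses single-linkage agglomerative grouping with a distance cutoff, which is only one of many possible hierarchical schemes; the function name `single_linkage_groups`, the Euclidean metric, and the cutoff parameter are assumptions made for the example.

```python
import math


def single_linkage_groups(profiles, threshold):
    """Group genes by single-linkage clustering cut at `threshold`.

    profiles: dict mapping gene name -> list of expression values, one
    value per sample, in the same sample order for every gene.
    Two clusters are merged whenever their closest members are within
    `threshold` of each other; merging repeats until no pair qualifies.
    The result equals cutting a single-linkage hierarchy at that height.
    """
    clusters = [{name} for name in sorted(profiles)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                closest = min(
                    math.dist(profiles[a], profiles[b])
                    for a in clusters[i] for b in clusters[j]
                )
                if closest <= threshold:
                    clusters[i] |= clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return sorted(sorted(cluster) for cluster in clusters)
```

Genes that end up in the same group as 47508 would be candidates for co-regulation, subject to confirmation with a sufficient number of diverse samples.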

[2986] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 47508 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[2987] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[2988] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 47508-associated disease or disorder; and processes, such as a cellular transformation associated with a 47508-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 47508-associated disease or disorder.

[2989] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 47508) that could serve as a molecular target for diagnosis or therapeutic intervention.

[2990] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 47508 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 47508 polypeptide or fragment thereof. For example, multiple variants of a 47508 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[2991] The polypeptide array can be used to detect a 47508 binding compound, e.g., an antibody in a sample from a subject with specificity for a 47508 polypeptide or the presence of a 47508-binding protein or ligand.

[2992] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 47508 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[2993] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 47508 or from a cell or subject in which a 47508-mediated response has been elicited, e.g., by contact of the cell with 47508 nucleic acid or protein, or administration to the cell or subject of 47508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 47508 (or does not express as highly as in the case of the 47508 positive plurality of capture probes) or from a cell or subject in which a 47508-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 47508 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[2994] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 47508 or from a cell or subject in which a 47508-mediated response has been elicited, e.g., by contact of the cell with 47508 nucleic acid or protein, or administration to the cell or subject of 47508 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 47508 (or does not express as highly as in the case of the 47508 positive plurality of capture probes) or from a cell or subject in which a 47508-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[2995] In another aspect, the invention features a method of analyzing 47508, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 47508 nucleic acid or amino acid sequence; comparing the 47508 sequence with one or more preferably a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 47508.

[2996] Detection of 47508 Variations or Mutations

[2997] The methods of the invention can also be used to detect genetic alterations in a 47508 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 47508 protein activity or nucleic acid expression, such as a cellular proliferative and/or differentiative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 47508-protein, or the mis-expression of the 47508 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 47508 gene; 2) an addition of one or more nucleotides to a 47508 gene; 3) a substitution of one or more nucleotides of a 47508 gene, 4) a chromosomal rearrangement of a 47508 gene; 5) an alteration in the level of a messenger RNA transcript of a 47508 gene, 6) aberrant modification of a 47508 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 47508 gene, 8) a non-wild type level of a 47508-protein, 9) allelic loss of a 47508 gene, and 10) inappropriate post-translational modification of a 47508-protein.

[2998] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 47508-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 47508 gene under conditions such that hybridization and amplification of the 47508-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[2999] In another embodiment, mutations in a 47508 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
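
The comparison of cleavage patterns can be illustrated in silico. The following Python sketch digests a linear sequence at every occurrence of a recognition site and compares the resulting fragment lengths between sample and control. The function names `fragment_lengths` and `patterns_differ` are hypothetical, any sequences passed in are made up for the example, and only one strand of a linear molecule is modeled.

```python
def fragment_lengths(dna, site, cut_offset):
    """Return the sorted fragment lengths from a complete digest.

    dna:        linear sequence, 5'->3'.
    site:       recognition sequence of the restriction endonuclease.
    cut_offset: number of bases into the site at which the strand is cut
                (e.g., 1 for EcoRI, which cuts G^AATTC).
    """
    dna = dna.upper()
    site = site.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    # Skip zero-length pieces that arise when a cut falls on an end.
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def patterns_differ(sample, control, site, cut_offset):
    """True when sample and control give different fragment lengths."""
    return (fragment_lengths(sample, site, cut_offset)
            != fragment_lengths(control, site, cut_offset))
```

A substitution that creates or destroys a recognition site changes the number of fragments, while an insertion or deletion between two sites changes a fragment's length; a substitution outside any site leaves the pattern unchanged and would not be detected this way.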

[3000] In other embodiments, genetic mutations in 47508 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 47508 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 47508 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 47508 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
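
The first scanning step, in which sequential overlapping probes locate a base change, can be sketched as follows. This Python example is a deliberately idealized model: a probe is treated as hybridizing only on a perfect match, probes are written as the target sequence they detect rather than as its complement, and a single substitution is assumed. The function names `tile_probes` and `localize_change` are hypothetical.

```python
def tile_probes(reference, length=8, step=1):
    """Return (offset, sequence) pairs of overlapping probes along reference."""
    return [(i, reference[i:i + length])
            for i in range(0, len(reference) - length + 1, step)]


def localize_change(sample, probes):
    """Return the positions (0-based) covered by every non-matching probe.

    A probe "fails" when the sample differs from it at the probe's
    offset. For a single substitution well inside the tiled region, the
    overlap of all failing probes narrows to the changed position. An
    empty list means no probe failed, or the failing probes share no
    common position (e.g., several distant changes).
    """
    failed = [(i, i + len(p)) for i, p in probes
              if sample[i:i + len(p)] != p]
    if not failed:
        return []
    lo = max(start for start, _ in failed)
    hi = min(end for _, end in failed)
    return list(range(lo, hi))
```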

[3001] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 47508 gene and detect mutations by comparing the sequence of the sample 47508 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[3002] Other methods for detecting mutations in the 47508 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3003] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 47508 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3004] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 47508 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA: 86:2766, see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 47508 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3005] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3006] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci. USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3007] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
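
The placement of the allele-discriminating base at the extreme 3′ end of a primer can be illustrated with a minimal sketch. The function name, the example sequence, and the primer length below are hypothetical and are not taken from the 47508 sequence; the sketch assumes a forward primer that matches the sense strand and ends exactly at the variant position.

```python
def allele_specific_primers(sense_strand: str, variant_index: int,
                            alleles: str, length: int) -> dict[str, str]:
    """Build one forward primer per allele, each ending at the variant base.

    sense_strand  : template sequence written 5'->3' (hypothetical example input)
    variant_index : 0-based index of the variant position in sense_strand
    alleles       : candidate bases at the variant position, e.g. "CT"
    length        : total primer length, including the 3'-terminal allele base
    """
    sense_strand = sense_strand.upper()
    start = variant_index - length + 1
    if start < 0 or variant_index >= len(sense_strand):
        raise ValueError("primer does not fit on the template")
    # Shared 5' portion of the primer, upstream of the variant position.
    upstream = sense_strand[start:variant_index]
    # Only the 3'-terminal base differs between the allele-specific primers.
    return {allele: upstream + allele for allele in alleles.upper()}


if __name__ == "__main__":
    primers = allele_specific_primers("ACGTACGTAC", variant_index=7,
                                      alleles="TC", length=5)
    for allele, primer in primers.items():
        print(allele, primer)
```

Under appropriate conditions, only the primer whose 3′-terminal base matches the sample allele would be expected to extend efficiently, so the presence or absence of product reports the allele.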

[3008] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 47508 nucleic acid.
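
As an illustration of the percent complementarity recited above, the following minimal sketch scores an oligonucleotide against an equal-length target site. It assumes simple position-by-position Watson-Crick pairing without gaps; the function name and the example sequences are hypothetical and are not drawn from SEQ ID NO:41.

```python
# Watson-Crick pairing for DNA; any other symbol counts as a mismatch.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo: str, target_site: str) -> float:
    """Return the percentage of oligo positions that pair with the target site.

    Both sequences are written 5'->3'. Because the strands pair in an
    antiparallel orientation, the oligo is compared with the reversed site.
    """
    oligo, target_site = oligo.upper(), target_site.upper()
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        COMPLEMENT.get(o) == t
        for o, t in zip(oligo, reversed(target_site))
    )
    return 100.0 * paired / len(oligo)


if __name__ == "__main__":
    site = "AAGGCCTTAC"                                  # hypothetical target site
    print(percent_complementary("GTAAGGCCTT", site))     # fully complementary oligo
    print(percent_complementary("ATAAGGCCTT", site))     # one mismatched position
```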

[3009] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:41 or the complement of SEQ ID NO:41. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3010] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 47508. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3011] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
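
A set of four oligonucleotides that differ only at the interrogation position can be sketched as follows. The function name and the six-base template are hypothetical placeholders chosen for illustration, and the sketch ignores labels, length variation, and destabilizing changes at other positions.

```python
def interrogation_set(template: str, position: int) -> dict[str, str]:
    """Return four oligonucleotides, one per base, at a single interrogation position.

    template : oligonucleotide sequence written 5'->3' (hypothetical example input)
    position : 0-based index of the interrogation position within the template
    """
    template = template.upper()
    if not 0 <= position < len(template):
        raise IndexError("interrogation position lies outside the template")
    # Keep the flanking sequence fixed and substitute each base in turn.
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


if __name__ == "__main__":
    for base, oligo in interrogation_set("GGATCC", 2).items():
        print(base, oligo)
```

Each member of such a set could then be given a distinguishable label or a distinct address on an array, as described above.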

[3012] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 47508 nucleic acid.

[3013] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 47508 gene.

[3014] Use of 47508 Molecules as Surrogate Markers

[3015] The 47508 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 47508 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 47508 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3016] The 47508 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 47508 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-47508 antibodies may be employed in an immune-based detection system for a 47508 protein marker, or 47508-specific radiolabeled probes may be used to detect a 47508 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3017] The 47508 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 47508 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 47508 DNA may correlate with 47508 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3018] Pharmaceutical Compositions of 47508

[3019] The nucleic acid and polypeptides, fragments thereof, as well as anti-47508 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3020] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3021] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[3022] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3023] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3024] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3025] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3026] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3027] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3028] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3029] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
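
The therapeutic index described above is a simple ratio, and a short sketch makes the arithmetic explicit. The function name and the LD50 and ED50 values below are hypothetical and serve only to illustrate the calculation; they are not measured values for any compound of the invention.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50/ED50.

    ld50 : dose lethal to 50% of the population
    ed50 : dose therapeutically effective in 50% of the population
    Both doses must be positive and given in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values: LD50 = 200 mg/kg, ED50 = 10 mg/kg.
    print(therapeutic_index(200.0, 10.0))
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.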

[3030] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[3031] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
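
The weight-based dosing arithmetic above can be sketched as follows. The function names, the 70 kg body weight, and the 5 mg/kg dose are hypothetical values chosen for illustration; the stated range of about 0.001 to 30 mg/kg is taken from the paragraph above, and the sketch assumes the same dose is given at each weekly administration.

```python
# Broadest range recited above for a protein or polypeptide, in mg/kg body weight.
STATED_RANGE_MG_PER_KG = (0.001, 30.0)


def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-based dose (mg/kg) into a total dose (mg) for one subject."""
    if dose_mg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def in_stated_range(dose_mg_per_kg: float) -> bool:
    """Check whether a weight-based dose lies within the broadest recited range."""
    low, high = STATED_RANGE_MG_PER_KG
    return low <= dose_mg_per_kg <= high


def weekly_schedule(dose_mg_per_kg: float, body_weight_kg: float,
                    weeks: int) -> list[tuple[int, float]]:
    """List (week number, total dose in mg) for once-weekly administration."""
    dose = total_dose_mg(dose_mg_per_kg, body_weight_kg)
    return [(week, dose) for week in range(1, weeks + 1)]


if __name__ == "__main__":
    # Hypothetical subject: 70 kg, 5 mg/kg once per week for 4 weeks.
    print(in_stated_range(5.0))
    print(weekly_schedule(5.0, 70.0, 4))
```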

[3032] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3033] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3034] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3035] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[3036] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3037] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3038] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3039] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3040] Methods of Treatment for 47508

[3041] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 47508 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3042] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 47508 molecules of the present invention or 47508 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3043] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 47508 expression or activity, by administering to the subject a 47508 or an agent which modulates 47508 expression or at least one 47508 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 47508 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 47508 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 47508 aberrance, for example, a 47508, 47508 agonist or 47508 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3044] It is possible that some 47508 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3045] The 47508 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, disorders of the breast, ovary, lung, colon, and liver, as well as cardiovascular disorders, as has been described above. In addition, the 47508 molecules of the invention can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with bone metabolism, immune disorders, viral diseases, and pain or metabolic disorders.

[3046] Aberrant expression and/or activity of 47508 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 47508 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 47508 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 47508 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3047] The 47508 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[3048] Additionally, 47508 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 47508 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 47508 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[3049] Additionally, 47508 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3050] Additional disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3051] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[3052] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[3053] As discussed, successful treatment of 47508 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 47508 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[3054] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3055] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3056] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 47508 expression is through the use of aptamer molecules specific for 47508 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 47508 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3057] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 47508 disorders. For a description of antibodies, see the Antibody section above.

[3058] In circumstances wherein injection of an animal or a human subject with a 47508 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 47508 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 47508 protein. Vaccines directed to a disease characterized by 47508 expression may also be generated in this fashion.

[3059] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3060] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 47508 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[3061] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.
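
As a minimal sketch of how a half-maximal inhibitory concentration might be estimated from cell culture data, the following Python fragment interpolates on a log-concentration scale between the two measured points that bracket 50% inhibition. The function name `estimate_ic50` and all data values are hypothetical illustrations and are not measurements from this disclosure; in practice a fitted dose-response model may be preferred.

```python
import math

def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by log-linear interpolation.

    concentrations: ascending, positive concentrations (any consistent unit).
    inhibitions: fractional inhibition (0.0 to 1.0) at each concentration.
    Returns None if the data never cross 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            # Fraction of the way from the lower to the upper bracketing point.
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical dose-response data (concentration in nM, fraction inhibited).
doses = [1.0, 10.0, 100.0, 1000.0]
response = [0.05, 0.25, 0.75, 0.95]
ic50 = estimate_ic50(doses, response)
```

With these illustrative values, 50% inhibition falls midway between 10 nM and 100 nM on the log scale, so the estimate is close to 31.6 nM.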

[3062] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 47508 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 47508 can be readily monitored and used in calculations of IC50.

[3063] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[3064] Another aspect of the invention pertains to methods of modulating 47508 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 47508 molecule or an agent that modulates one or more of the activities of 47508 protein associated with the cell. An agent that modulates 47508 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 47508 protein (e.g., a 47508 substrate or receptor), a 47508 antibody, a 47508 agonist or antagonist, a peptidomimetic of a 47508 agonist or antagonist, or other small molecule.

[3065] In one embodiment, the agent stimulates one or more 47508 activities. Examples of such stimulatory agents include active 47508 protein and a nucleic acid molecule encoding 47508. In another embodiment, the agent inhibits one or more 47508 activities. Examples of such inhibitory agents include antisense 47508 nucleic acid molecules, anti-47508 antibodies, and 47508 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 47508 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 47508 expression or activity. In another embodiment, the method involves administering a 47508 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 47508 expression or activity.

[3066] Stimulation of 47508 activity is desirable in situations in which 47508 is abnormally downregulated and/or in which increased 47508 activity is likely to have a beneficial effect. Likewise, inhibition of 47508 activity is desirable in situations in which 47508 is abnormally upregulated and/or in which decreased 47508 activity is likely to have a beneficial effect.

[3067] 47508 Pharmacogenomics

[3068] The 47508 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 47508 activity (e.g., 47508 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 47508 associated disorders (e.g., cellular proliferative and/or differentiative disorders) associated with aberrant or unwanted 47508 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 47508 molecule or 47508 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 47508 molecule or 47508 modulator.

[3069] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3070] One pharmacogenomics approach to identifying genes that predict drug response, known as “a genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
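
The grouping of individuals into genetic categories by SNP pattern can be illustrated with a short Python sketch. The function `group_by_snp_pattern`, the marker names, and the genotypes are all hypothetical; the sketch assumes one recorded allele per marker per individual, which simplifies real diploid genotype data.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals that share the same alleles at the chosen markers.

    genotypes: dict mapping individual ID -> dict of marker ID -> allele.
    marker_ids: ordered list of marker IDs that define the pattern.
    """
    groups = defaultdict(list)
    for individual, alleles in genotypes.items():
        pattern = tuple(alleles[m] for m in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)

# Hypothetical bi-allelic genotypes at two hypothetical markers.
genotypes = {
    "patient_1": {"snp_a": "A", "snp_b": "C"},
    "patient_2": {"snp_a": "G", "snp_b": "C"},
    "patient_3": {"snp_a": "A", "snp_b": "C"},
}
categories = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Here `patient_1` and `patient_3` fall into one category and `patient_2` into another, so an observed drug response could then be compared between the two categories.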

[3071] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 47508 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3072] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 47508 molecule or 47508 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3073] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 47508 molecule or 47508 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[3074] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 47508 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 47508 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[3075] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 47508 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 47508 gene expression, protein levels, or upregulate 47508 activity, can be monitored in clinical trials of subjects exhibiting decreased 47508 gene expression, protein levels, or downregulated 47508 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 47508 gene expression, protein levels, or downregulate 47508 activity, can be monitored in clinical trials of subjects exhibiting increased 47508 gene expression, protein levels, or upregulated 47508 activity. In such clinical trials, the expression or activity of a 47508 gene, and preferably, other genes that have been implicated in, for example, a 47508-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3076] 47508 Informatics

[3077] The sequence of a 47508 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 47508. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 47508 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3078] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3079] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3080] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotation to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples for annotations to amino acid sequence include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
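
The two-table layout described above can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module as a stand-in for Sybase or Oracle, and the table names, column names, position conventions (1-based, inclusive), and the placeholder sequence are all assumptions made for the example rather than part of the disclosure.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for Sybase or Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sequences (
        residues TEXT NOT NULL,      -- first column: the sequence itself
        seq_id   TEXT PRIMARY KEY    -- second column: sequence identifier
    );
    CREATE TABLE annotations (
        seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
        descriptor TEXT NOT NULL,    -- annotation text
        start_pos  INTEGER NOT NULL, -- initial position (assumed 1-based)
        end_pos    INTEGER NOT NULL  -- ultimate position (assumed inclusive)
    );
""")

# Placeholder data: a made-up 12-nucleotide string, not a 47508 sequence.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("ATGGCCAAGTGA", "demo_seq_1"))
conn.execute(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    ("demo_seq_1", "hypothetical feature", 4, 9),
)

# Retrieve each annotation together with the subsequence it refers to.
rows = conn.execute("""
    SELECT a.descriptor,
           substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
    FROM annotations AS a
    JOIN sequences AS s ON s.seq_id = a.seq_id
""").fetchall()
```

Keeping annotations in their own table lets any number of annotations refer to one stored sequence through the shared identifier.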

[3081] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
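
A search of a stored sequence for regions matching a target motif might look like the following Python sketch. It uses a regular expression in place of a BLAST search, which is a much simpler technique suited only to exact or pattern-based matching. The function name `find_motif`, the example sequence, and the example pattern are hypothetical and are not taken from any 47508 sequence.

```python
import re

def find_motif(sequence, motif_pattern):
    """Return (start, end, match) tuples for every match of motif_pattern.

    Positions are 1-based and inclusive. A lookahead is used so that
    overlapping matches are also reported.
    """
    hits = []
    for m in re.finditer(f"(?=({motif_pattern}))", sequence):
        start = m.start(1) + 1
        hits.append((start, start + len(m.group(1)) - 1, m.group(1)))
    return hits

# Made-up amino acid string and a made-up five-residue pattern.
stored = "MKTAYGSAGDSLLGHSMG"
hits = find_motif(stored, "G[A-Z]S[A-Z]G")
```

For this made-up input the only match is the five-residue stretch at positions 14 to 18.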

[3082] Thus, in one aspect, the invention features a method of analyzing 47508, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 47508 nucleic acid or amino acid sequence; comparing the 47508 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 47508. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3083] The method can include evaluating the sequence identity between a 47508 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
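
One simple way to evaluate sequence identity between two already-aligned sequences is sketched below. The function name `percent_identity` and the example strings are hypothetical, and the convention used here (ignoring columns where either sequence has a gap) is only one of several in use; other conventions, such as counting gap columns in the denominator, give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gap columns under this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Made-up aligned nucleotide strings: 9 gap-free columns, 8 of them identical.
identity = percent_identity("ATG-CCAAGT", "ATGGCCTAGT")
```

The sketch assumes the alignment itself has been produced beforehand by a separate alignment step.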

[3084] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
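
The relationship between target length and chance occurrence can be made concrete with a rough estimate. Assuming, as a simplification not stated in the text, that database residues are independent and uniformly distributed over an alphabet of size k, a specific target of length L is expected to occur about (N - L + 1) / k^L times by chance in a database of N residues. The function name and the database size below are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected chance occurrences of one specific target in a random database.

    Assumes independent, uniformly distributed residues (a simplification).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)

# Hypothetical database of one million nucleotides (alphabet size 4).
short_target = expected_random_hits(6, 1_000_000, 4)   # roughly 244
long_target = expected_random_hits(30, 1_000_000, 4)   # far below 1
```

Under these assumptions a six-nucleotide target is expected to appear hundreds of times by chance, whereas a 30-nucleotide target is very unlikely to appear at all.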

[3085] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3086] Thus, the invention features a method of making a computer readable record of a 47508 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3087] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 47508 sequence, or record, in machine-readable form; comparing a second sequence to the 47508 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 47508 sequence includes a sequence being compared. In a preferred embodiment the 47508 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 47508 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
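Determining whether one sequence is included within another can be illustrated with a simple exact-match search. The Python sketch below is an assumption-laden example (the function name is hypothetical, matching is exact and case-insensitive, and only the given strand is searched); it reports every position at which a query occurs in a subject sequence.

```python
from typing import List


def find_occurrences(query: str, subject: str) -> List[int]:
    """Return 1-based start positions of every occurrence of query in subject.

    Overlapping occurrences are reported. Matching is exact and
    case-insensitive; reverse-complement searching is not attempted here.
    """
    query, subject = query.upper(), subject.upper()
    positions: List[int] = []
    if not query:
        return positions
    start = subject.find(query)
    while start != -1:
        positions.append(start + 1)               # convert to 1-based coordinates
        start = subject.find(query, start + 1)    # step by one to allow overlaps
    return positions
```

An empty result indicates that the query is not included in the subject sequence.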

[3088] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, wherein the method comprises the steps of determining 47508 sequence information associated with the subject and based on the 47508 sequence information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3089] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a disease associated with a 47508 wherein the method comprises the steps of determining 47508 sequence information associated with the subject, and based on the 47508 sequence information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 47508 sequence of the subject to the 47508 sequences in the database to thereby determine whether the subject has a 47508-associated disease or disorder, or a pre-disposition for such.

[3090] The present invention also provides in a network, a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, said method comprising the steps of receiving 47508 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 47508 and/or corresponding to a 47508-associated disease or disorder (e.g., cellular proliferative and/or differentiative disorders), and based on one or more of the phenotypic information, the 47508 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3091] The present invention also provides a method for determining whether a subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder, said method comprising the steps of receiving information related to 47508 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 47508 and/or related to a 47508-associated disease or disorder, and based on one or more of the phenotypic information, the 47508 information, and the acquired information, determining whether the subject has a 47508-associated disease or disorder or a pre-disposition to a 47508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3092] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 56939 Invention

[3093] Hydrolases are a large class of enzymes which catalyze the cleavage of a bond with the addition of water. Hydrolases play important roles in the synthesis and breakdown of nearly all major metabolic intermediates, including polypeptides, nucleic acids, and lipids. In particular, the α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that have a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Str Bio 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Members of the hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorus-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorus bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, © 1992 Academic Press, Inc., San Diego, Calif.).

[3094] α/β hydrolases vary widely in primary sequence, substrate specificity, and physical properties. However, despite the lack of sequence homology, hydrolase family members display structural similarities, e.g., conservation of a catalytic site framework. In particular, one conserved feature of the alpha/beta hydrolase fold is a nucleophile-histidine-acid catalytic triad. The identities of the triad residues in alpha/beta hydrolase fold enzymes are quite variable in that serine, aspartate, and cysteine have all been identified as catalytic nucleophiles (Schrag, J. et al. (1997) Meth. Enzymol. 284:85-107).

[3095] One particular class of α/β hydrolases is the acyl-CoA thioesterase enzymes, which catalyze the hydrolysis of acyl-CoAs of various carbon chain lengths to free fatty acids and CoA-SH. These enzymes are predicted to have a domain which adopts the α/β hydrolase fold. As a class, the acyl-CoA thioesterase enzymes have a conserved nucleophile-histidine-acid catalytic triad consisting of the amino acids serine, histidine, and aspartic acid. Some examples of acyl-CoA thioesterases are the murine enzymes CTE-I, MTE-I, and PTE-Ia/b, which are localized to the cytoplasm, mitochondria, and peroxisomes of cells, respectively (Hunt et al. (1999) J Biol Chem 274:34317-26). PTE-Ia/b transcription is controlled by the peroxisome proliferator activated receptor-α (PPARα), which is itself a key regulator of lipid metabolism (Id). Given their catalytic activity and their transcriptional regulation, the acyl-CoA thioesterase enzymes are evidently closely linked to lipid metabolism.

Summary of the 56939 Invention

[3096] The present invention is based, in part, on the discovery of a novel acyl-CoA thioesterase family member, referred to herein as “56939”. The nucleotide sequence of a cDNA encoding 56939 is shown in SEQ ID NO:48, and the amino acid sequence of a 56939 polypeptide is shown in SEQ ID NO:49. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:50.

[3097] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 56939 protein or polypeptide, e.g., a biologically active portion of the 56939 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:49. In other embodiments, the invention provides isolated 56939 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 56939 protein or an active fragment thereof.

[3098] In a related aspect, the invention further provides nucleic acid constructs that include a 56939 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 56939 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 56939 nucleic acid molecules and polypeptides.

[3099] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 56939-encoding nucleic acids.

[3100] In still another related aspect, isolated nucleic acid molecules that are antisense to a 56939 encoding nucleic acid molecule are provided.

[3101] In another aspect, the invention features 56939 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 56939-mediated or -related disorders. In another embodiment, the invention provides 56939 polypeptides having a 56939 activity. Preferred polypeptides are 56939 proteins including at least one acyl-CoA thioesterase domain, and, preferably, having a 56939 activity, e.g., a 56939 activity as described herein.

[3102] In other embodiments, the invention provides 56939 polypeptides, e.g., a 56939 polypeptide having the amino acid sequence shown in SEQ ID NO:49 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:49 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:48, SEQ ID NO:50, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 56939 protein or an active fragment thereof.

[3103] In a related aspect, the invention further provides nucleic acid constructs that include a 56939 nucleic acid molecule described herein.

[3104] In a related aspect, the invention provides 56939 polypeptides or fragments operatively linked to non-56939 polypeptides to form fusion proteins.

[3105] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 56939 polypeptides or fragments thereof. In one embodiment, the antibodies or antigen-binding fragment thereof competitively inhibit the binding of a second antibody to a 56939 polypeptide or a fragment thereof.

[3106] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 56939 polypeptides or nucleic acids.

[3107] In still another aspect, the invention provides a process for modulating 56939 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 56939 polypeptides or nucleic acids, such as conditions involving aberrant or deficient metabolism, detoxification, and cellular proliferation or differentiation.

[3108] The invention also provides assays for determining the activity of or the presence or absence of 56939 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[3109] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 56939-expressing cell, e.g., a hyper-proliferative 56939-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 56939 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[3110] In a preferred embodiment, the compound is an inhibitor of a 56939 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 56939 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3111] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3112] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 56939-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 56939 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[3113] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder or a differentiative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 56939 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 56939 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 56939 nucleic acid or polypeptide expression can be detected by any method described herein.

[3114] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 56939 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[3115] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 56939 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 56939 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 56939 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or kidney, liver, heart, brain, colon, breast, ovary, prostate, lung, skin or skeletal muscle tissue.

[3116] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 56939 polypeptide or nucleic acid molecule, including for disease diagnosis.

[3117] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 56939 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 56939 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 56939 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[3118] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 56939

[3119] The human 56939 sequence (Example 34; SEQ ID NO:48), which is approximately 1391 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1266 nucleotides, including the termination codon (nucleotides indicated as coding of SEQ ID NO:48 in Example 34; SEQ ID NO:50). The coding sequence encodes a 421 amino acid protein (SEQ ID NO:49), the human 56939 protein of SEQ ID NO:49 and Example 35.
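The lengths stated above can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures given in the text (which are themselves stated as approximate); the constant and function names are chosen for this example.

```python
# Figures as stated in the text for human 56939 (approximate per the text).
CDNA_LENGTH_NT = 1391       # cDNA length, including untranslated regions
CODING_LENGTH_NT = 1266     # coding sequence, including the termination codon
PROTEIN_LENGTH_AA = 421     # encoded protein length


def coding_length(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein: three per residue, plus a stop codon."""
    return 3 * protein_length_aa + (3 if include_stop else 0)


# 421 codons x 3 nt + 3 nt for the stop codon = 1266 nt, matching the stated length.
consistent = coding_length(PROTEIN_LENGTH_AA) == CODING_LENGTH_NT

# Nucleotides left over for the 5' and 3' untranslated regions combined.
untranslated_nt = CDNA_LENGTH_NT - CODING_LENGTH_NT
```

On these figures the coding length is consistent with a 421 amino acid protein, and about 125 nucleotides remain for the untranslated regions combined.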

[3120] Human 56939 contains the following regions or other structural features:

[3121] an acyl-CoA thioesterase domain (ProDomain Accession PD006914) located at about amino acid residues 1 to 415 of SEQ ID NO:49;

[3122] three predicted N-glycosylation sites located at about amino acids 202 to 205, 247 to 250, and 255 to 258 of SEQ ID NO:49;

[3123] five predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 33 to 35, about 37 to 39, about 138 to 140, about 339 to 341, and about 413 to 415 of SEQ ID NO:49;

[3124] one cAMP/cGMP dependent protein kinase phosphorylation site (PS00004) at about amino acids 135 to 138 of SEQ ID NO:49;

[3125] two predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 37 to 40, and about 337 to 340 of SEQ ID NO:49; and

[3126] six predicted N-myristoylation sites (PS00008) from about amino acids 69 to 74, about 79 to 84, about 165 to 170, about 230 to 235, about 256 to 261, and about 412 to 417 of SEQ ID NO:49.
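Predicted sites of the kinds listed above are typically located by scanning a protein sequence for short consensus patterns. The Python sketch below is illustrative only: the regular expressions are transcriptions of the commonly cited PROSITE consensus patterns and should be verified against the current PROSITE entries, the dictionary keys and function name are hypothetical, and the approach shown is not necessarily the one used to generate the predictions in the text.

```python
import re
from typing import Dict, List, Tuple

# Consensus patterns written as Python regular expressions (to be verified
# against PROSITE before any real use; treated here as assumptions).
SITE_PATTERNS: Dict[str, str] = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "PS00004": r"[RK]{2}.[ST]",                  # cAMP/cGMP-dependent kinase site
    "PS00005": r"[ST].[RK]",                     # protein kinase C site
    "PS00006": r"[ST]..[DE]",                    # casein kinase II site
    "PS00008": r"G[^EDRKHPFYW]..[STAGCN][^P]",   # N-myristoylation site
}


def scan_sites(protein: str) -> Dict[str, List[Tuple[int, int]]]:
    """Return 1-based inclusive (start, end) coordinates of every pattern match.

    A lookahead is used so that overlapping matches are also reported.
    """
    protein = protein.upper()
    hits: Dict[str, List[Tuple[int, int]]] = {}
    for name, pattern in SITE_PATTERNS.items():
        regex = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start(1) + 1, m.end(1)) for m in regex.finditer(protein)]
    return hits
```

Because such short patterns occur frequently by chance, matches found this way are predictions rather than evidence that a site is actually modified.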

[3127] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[3128] A plasmid containing the nucleotide sequence encoding human 56939 (clone “Fbh56939FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[3129] The 56939 protein contains a significant number of structural characteristics in common with members of the acyl-CoA thioesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[3130] Members of the acyl-CoA thioesterase family of proteins are characterized by a common fold, which, in particular, includes an α/β hydrolase fold of an eight-stranded β-sheet with surrounding α-helices. Family members also frequently have common catalytic residues. These residues are found within the α/β hydrolase fold. The most critical catalytic residues are those of the catalytic triad, which are positioned to favor nucleophilic attack of bound substrates. Within the acyl-CoA thioesterase family, the nucleophilic side chain is preferably a serine, but can be a cysteine; the remaining residues of the triad are preferably a histidine and an aspartic acid. Family members preferably catalyze the hydrolysis of acyl-CoA substrates, e.g., a fatty acid esterified with coenzyme A (CoA), e.g., palmitoyl-CoA, and bile salts conjugated with CoA.

[3131] A 56939 polypeptide can include an “acyl-CoA thioesterase domain” or regions homologous with an “acyl-CoA thioesterase domain.”

[3132] As used herein, the term “acyl-CoA thioesterase domain” includes an amino acid sequence of about 300 to 500 amino acid residues in length and having a score for the alignment of the sequence to the acyl-CoA thioesterase domain (PD006914) of at least 400. Preferably, an acyl-CoA thioesterase domain includes at least one, two, and preferably three conserved catalytic residues positioned to favor the nucleophilic attack of bound substrates. The nucleophilic side chain is preferably a serine, but can be a cysteine; the remaining residues of the triad are preferably a histidine and an aspartic acid. For example, a 56939 polypeptide has a serine residue located at about residue 232 of SEQ ID NO:49, an aspartic acid residue located at about residue 325 of SEQ ID NO:49, and a histidine residue located at about residue 360 of SEQ ID NO:49, corresponding to the conserved catalytic triad characteristic of acyl-CoA thioesterase domains. Preferably, an acyl-CoA thioesterase domain includes at least about 300 to 500 amino acids, more preferably about 350 to 450 amino acid residues, or about 390 to 420 amino acids and has a score for the alignment of the sequence to the acyl-CoA thioesterase domain (PD006914) of at least 400, preferably 600, more preferably 800, more preferably 1000 or greater. The acyl-CoA thioesterase domain has been assigned the ProDomain Accession number PD006914 (http://protein.toulouse.inra.fr/prodom.html). An alignment of the acyl-CoA thioesterase domain (amino acids about 1 to 415 of SEQ ID NO:49) of human 56939 with a consensus amino acid sequence is depicted in FIG. 25.
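Whether a candidate sequence carries the expected triad residues at the stated positions can be checked mechanically. The Python sketch below is a hypothetical helper: the function and constant names are chosen for this example, the positions are those given in the text as approximate ("about residue"), and an exact-position check such as this would need to be relaxed or preceded by an alignment for real sequences.

```python
from typing import Dict

# Positions (1-based) and residues as stated in the text for SEQ ID NO:49.
TRIAD_56939: Dict[int, str] = {232: "S", 325: "D", 360: "H"}


def check_catalytic_triad(protein: str, triad: Dict[int, str]) -> Dict[int, bool]:
    """Report, for each 1-based position, whether the expected residue is present.

    Positions beyond the end of the sequence are reported as not matching.
    """
    protein = protein.upper()
    result: Dict[int, bool] = {}
    for position, expected in triad.items():
        in_range = 1 <= position <= len(protein)
        observed = protein[position - 1] if in_range else None
        result[position] = observed == expected
    return result
```

A sequence passes when every value in the returned mapping is true, e.g., `all(check_catalytic_triad(sequence, TRIAD_56939).values())`.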

[3133] In a preferred embodiment, a 56939 polypeptide or protein has an “acyl-CoA thioesterase domain” or a region which includes at least about 300 to 500 amino acids, more preferably about 350 to 450 amino acids, more preferably about 380 to 430 or 400 to 415 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “acyl-CoA thioesterase domain,” e.g., the acyl-CoA thioesterase domain of human 56939 (e.g., residues 1 to 415 of SEQ ID NO:49).

[3134] To identify the presence of an “acyl-CoA thioesterase” domain in a 56939 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of domains, e.g., the ProDom database (Corpet et al. (1999) Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of an “acyl-CoA thioesterase” domain in the amino acid sequence of human 56939 at about residues 1 to 415 of SEQ ID NO:49 (see FIG. 25).

[3135] A 56939 family member can include at least one acyl-CoA thioesterase domain (PD006914). Furthermore, a 56939 family member can include at least one, two, three, four, preferably five protein kinase C phosphorylation sites (PS00005); at least one, preferably two predicted casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five and preferably six predicted N-myristoylation sites (PS00008); at least one, two, preferably three predicted N-glycosylation sites; and at least one cAMP/cGMP dependent protein kinase phosphorylation site (PS00004).

[3136] As the 56939 polypeptides of the invention may modulate 56939-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 56939-mediated or related disorders, as described below.

[3137] As used herein, a “56939 activity”, “biological activity of 56939” or “functional activity of 56939”, refers to an activity exerted by a 56939 protein, polypeptide or nucleic acid molecule. For example, a 56939 activity can be an activity exerted by 56939 in a physiological milieu on, e.g., a 56939-responsive cell or on a 56939 substrate, e.g., a protein substrate. A 56939 activity can be determined in vivo or in vitro. In one embodiment, a 56939 activity is a direct activity, such as an association with a 56939 target molecule. A “target molecule” or “binding partner” is a molecule with which a 56939 protein binds or interacts in nature. In an exemplary embodiment, 56939 is an enzyme for lipid substrates, e.g., fatty acid-CoAs and/or bile acid-CoAs.

[3138] A 56939 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 56939 protein with a 56939 receptor. The features of the 56939 molecules of the present invention can provide similar biological activities as acyl-CoA thioesterase family members. For example, the 56939 proteins of the present invention can have or mediate one or more of the following activities: (1) hydrolysis of fatty acid substrates, e.g., fatty acid-CoA conjugates; (2) hydrolysis of lipids; (3) hydrolysis of bile salt-CoA conjugates; or (4) N-acyltransferase activity, e.g., mediating the formation of N-acyl bile acid conjugates.

[3139] Based on the above-described sequence similarities, the 56939 molecules of the present invention are predicted to have similar biological activities as acyl-CoA thioesterase family members. For example, the 56939 protein may bind fatty acids, e.g., palmitoyl, and/or other lipid metabolic enzymes, e.g., fatty acid synthases. Accordingly, the 56939 protein may mediate one or more of the following physiological processes: metabolite regulation, detoxification, cardio-vascular function, liver function, and/or kidney function. Furthermore, as described in Example 35, the 56939 protein is highly expressed in tissues including kidney, liver, adipose, brain, and tumor cells.

[3140] The 56939 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative, metabolic, cardio-vascular, hepatic, kidney, and/or brain disorders.

[3141] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[3142] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[3143] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[3144] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[3145] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[3146] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[3147] As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[3148] Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

[3149] Additional examples of vascular disorders include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[3150] As 56939 is highly expressed in liver tissue, 56939 may be involved in disorders involving the liver. Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3151] Disorders which may be treated or diagnosed by methods described herein also include disorders of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. For example, many acyl-CoA thioesterase family members have lipase activity and can oxidize fatty acids. 56939 may further be involved in hereditary diseases, e.g., neuronal ceroid lipofuscinosis (Batten's disease). Infantile neuronal ceroid lipofuscinosis can be caused by a defect in a number of genes, including PPT1, a palmitoyl protein thioesterase.

[3152] 56939 may have a role in removing xenobiotic epoxides and/or other toxins from the body. Furthermore, 56939 may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 56939 gene can provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 56939 may contribute to population differences in sensitivity to environmental toxins.

[3153] 56939 is highly expressed in the brain cortex and hypothalamus. Thus, 56939 may be involved in disorders involving these tissues, e.g., brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and infarction from obstruction of local blood supply (focal cerebral ischemia), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal 
meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, 
and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[3154] As 56939 is strongly expressed in kidney tissue, 56939 may be involved in disorders involving the kidney. Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, 
including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

[3155] The 56939 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:49 are collectively referred to as “polypeptides or proteins of the invention” or “56939 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “56939 nucleic acids.” 56939 molecules refer to 56939 nucleic acids, polypeptides, and antibodies.

[3156] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[3157] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[3158] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
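The four sets of conditions recited above can be collected into a small lookup table. The following Python sketch is illustrative only: the class name `StringencyCondition`, the dictionary `CONDITIONS`, and the helper `get_condition` are hypothetical names introduced here, and the field values simply restate the conditions given in the preceding paragraph, with very high stringency used as the default when no level is specified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringencyCondition:
    """One hybridization/wash regime, restated from the text above."""
    hybridization: str
    wash: str
    wash_temp_c: int  # wash temperature in degrees Celsius


# Keys are hypothetical labels; values restate conditions 1) through 4).
CONDITIONS = {
    "low": StringencyCondition(
        "6x SSC at about 45 C",
        "two washes in 0.2x SSC, 0.1% SDS",
        50,  # minimum; the text allows raising the wash to 55 C
    ),
    "medium": StringencyCondition(
        "6x SSC at about 45 C",
        "one or more washes in 0.2x SSC, 0.1% SDS",
        60,
    ),
    "high": StringencyCondition(
        "6x SSC at about 45 C",
        "one or more washes in 0.2x SSC, 0.1% SDS",
        65,
    ),
    "very_high": StringencyCondition(
        "0.5 M sodium phosphate, 7% SDS at 65 C",
        "one or more washes in 0.2x SSC, 1% SDS",
        65,
    ),
}

# The text names very high stringency as the default.
DEFAULT_LEVEL = "very_high"


def get_condition(level=None):
    """Return the condition for a level, defaulting to very high stringency."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

For example, `get_condition()` returns the very high stringency entry, while `get_condition("low")` returns the entry with a 50° C. minimum wash temperature.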

[3159] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:48 or SEQ ID NO:50, corresponds to a naturally-occurring nucleic acid molecule.

[3160] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example, a naturally-occurring nucleic acid molecule can encode a natural protein.

[3161] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 56939 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 56939 protein or derivative thereof.

[3162] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 56939 protein is at least 10% pure. In a preferred embodiment, the preparation of 56939 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-56939 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-56939 chemicals. When the 56939 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[3163] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 56939 without abolishing or substantially altering a 56939 activity. Preferably the alteration does not substantially alter the 56939 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 56939, results in abolishing a 56939 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 56939 are predicted to be particularly unamenable to alteration.

[3164] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 56939 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 56939 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 56939 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:48 or SEQ ID NO:50, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
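The side-chain families listed above lend themselves to a simple membership check. The following Python sketch is a minimal illustration, not a method described in this document: the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical, and the one-letter residue codes are the standard abbreviations for the amino acids named in the paragraph. A substitution is treated as conservative when the two residues differ and share at least one listed family.

```python
# Side-chain families as listed in the text, using standard one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),
}


def shared_families(residue_a, residue_b):
    """Return the sorted names of families containing both residues."""
    a, b = residue_a.upper(), residue_b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(residue_a, residue_b):
    """True if the residues differ and share at least one side-chain family."""
    if residue_a.upper() == residue_b.upper():
        return False  # identical residues are not a substitution
    return bool(shared_families(residue_a, residue_b))
```

Under these assumptions, `is_conservative("K", "R")` is true (both basic), whereas `is_conservative("D", "K")` is false (acidic versus basic). Because some residues appear in more than one family, `shared_families` can return several names.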

[3165] As used herein, a “biologically active portion” of a 56939 protein includes a fragment of a 56939 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 56939 molecule and a non-56939 molecule or between a first 56939 molecule and a second 56939 molecule (e.g., a dimerization interaction). Biologically active portions of a 56939 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 56939 protein, e.g., the amino acid sequence shown in SEQ ID NO:49, which include fewer amino acids than the full length 56939 proteins, and exhibit at least one activity of a 56939 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 56939 protein, e.g., hydrolysis of fatty acid-CoA substrates. A biologically active portion of a 56939 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 56939 protein can be used as targets for developing agents which modulate a 56939 mediated activity, e.g., hydrolysis of fatty acid-CoA substrates.

[3166] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[3167] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[3168] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
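As a worked illustration of the calculation described above, the following Python sketch computes percent identity from two sequences that have already been aligned, with gaps marked by a hyphen. The function name `percent_identity` is hypothetical, and the choice of denominator (every alignment column that is not a gap in both rows, so that each gap position lowers the result) is one common convention assumed here; it is not necessarily the convention applied by any particular alignment program.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns containing a gap in one row count toward the
    denominator, so the number and length of gaps reduce the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    identical = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column that is empty in both rows carries no information
        columns += 1
        if x == y:
            identical += 1  # same residue or nucleotide at this position

    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * identical / columns
```

For example, aligning `AC-T` against `ACGT` gives three identical positions out of four columns, i.e., 75% identity under this convention.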

[3169] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
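The global alignment approach of Needleman and Wunsch referenced above can be sketched in a few lines of Python. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and length weights. The function name `needleman_wunsch` and the default scores are hypothetical choices made for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming (simplified scoring).

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    Assumptions: flat match/mismatch scores and a linear gap penalty.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With the default scores, aligning `ACGT` with `ACT` yields a score of 1 and the alignment `ACGT` over `AC-T`. Replacing `sub` with a lookup into a substitution matrix and splitting the gap penalty into opening and extension terms would bring the sketch closer to the parameters recited in the text.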

[3170] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[3171] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 56939 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 56939 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[3172] Particular 56939 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:49. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:49 are termed substantially identical.

[3173] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:48 or 50 are termed substantially identical.

[3174] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[3175] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[3176] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[3177] Various aspects of the invention are described in further detail below.

[3178] Isolated Nucleic Acid Molecules of 56939

[3179] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 56939 polypeptide described herein, e.g., a full-length 56939 protein or a fragment thereof, e.g., a biologically active portion of 56939 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 56939 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[3180] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:48, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 56939 protein (i.e., “the coding region” of SEQ ID NO:48, as shown in SEQ ID NO:50), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:48 (e.g., SEQ ID NO:50) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of protein from about amino acid 1 to amino acid 415 of SEQ ID NO:49.

[3181] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:48 or 50, thereby forming a stable duplex.

[3182] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50, or a portion, preferably of the same length, of any of these nucleotide sequences.

[3183] 56939 Nucleic Acid Fragments

[3184] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:48 or 50. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 56939 protein, e.g., an immunogenic or biologically active portion of a 56939 protein. A fragment can comprise those nucleotides of SEQ ID NO:48 which encode an acyl-CoA thioesterase domain of human 56939. The nucleotide sequence determined from the cloning of the 56939 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 56939 family members, or fragments thereof, as well as 56939 homologues, or fragments thereof, from other species.

[3185] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 100 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[3186] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 56939 nucleic acid fragment can include a sequence corresponding to an acyl-CoA thioesterase domain.

[3187] 56939 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:48 or SEQ ID NO:50, or of a naturally occurring allelic variant or mutant of SEQ ID NO:48 or SEQ ID NO:50.

[3188] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
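
As a minimal sketch of the counting rule just stated, the differences between a probe and a disclosed sequence can be tallied column by column once the two have been aligned. The sketch assumes the sequences are supplied already aligned, with "-" marking a looped-out position; the function name and the example sequences are hypothetical and are not taken from SEQ ID NO:48 or 50.

```python
def count_differences(aligned_probe, aligned_reference):
    """Count mismatches and looped-out (gap) positions between two
    pre-aligned sequences of equal length; "-" marks a gap."""
    if len(aligned_probe) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for p, r in zip(aligned_probe, aligned_reference) if p != r)


# Hypothetical aligned pair: one mismatch (G vs C) and one looped-out base.
n_diff = count_differences("ACGT-A", "ACCTGA")
print(n_diff, n_diff < 5)
```

A gap in either sequence is counted once per looped-out base, so a three-base deletion contributes three differences under this convention.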

[3189] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes an acyl-CoA thioesterase domain from about amino acid 1 to about amino acid 415 of SEQ ID NO:49.

[3190] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 56939 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of the following region are provided: an acyl-CoA thioesterase domain from about amino acid 1 to 415 of SEQ ID NO:49.

[3191] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[3192] A nucleic acid fragment encoding a “biologically active portion of a 56939 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:48 or 50, which encodes a polypeptide having a 56939 biological activity (e.g., the biological activities of the 56939 proteins are described herein), expressing the encoded portion of the 56939 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 56939 protein. For example, a nucleic acid fragment encoding a biologically active portion of 56939 includes an acyl-CoA thioesterase domain, e.g., amino acid residues about 1 to 415 of SEQ ID NO:49. A nucleic acid fragment encoding a biologically active portion of a 56939 polypeptide may comprise a nucleotide sequence which is greater than 300, 150, 400 or more nucleotides in length.

[3193] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 100, 200, 300, 350, 400, 450, or 500 nucleotides from nucleotides 1-668, 1-795, 1-828, 1-440, 1008-1391, or 1-986 of SEQ ID NO:48.

[3194] In preferred embodiments, the fragment includes the nucleotide sequence of SEQ ID NO:50 and at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, or 500 consecutive nucleotides of SEQ ID NO:48.

[3195] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, 500, 1000, 1100, 1200, or 1300 nucleotides encoding a protein including 5, 10, 15, 20, 25, 30, 40, 50, 100, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, or 420 consecutive amino acids of SEQ ID NO:49.

[3196] In preferred embodiments, a nucleic acid includes a nucleotide sequence that is other than the sequence of A1830209, AA056177, AA639200, AW090080, AW959783, AW959780, or V88752.

[3197] In preferred embodiments, the fragment comprises the coding region of 56939, e.g., the nucleotide sequence of SEQ ID NO:50.

[3198] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:48, or SEQ ID NO:50.

[3199] 56939 Nucleic Acid Variants

[3200] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:48 or SEQ ID NO:50. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 56939 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:49. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3201] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
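
For illustration, the proportion of codons altered between an original coding sequence and a recoded version can be computed as sketched below. The sketch assumes both sequences are in frame and of equal length, and it only counts changed codons; it does not check that the recoded sequence still encodes the same protein. The function name and the short example sequences are hypothetical.

```python
def codon_change_fraction(original_cds, recoded_cds):
    """Fraction of codons that differ between two in-frame coding
    sequences of equal length (0.0 = none changed, 1.0 = all changed)."""
    if len(original_cds) != len(recoded_cds):
        raise ValueError("coding sequences must have the same length")
    if len(original_cds) == 0 or len(original_cds) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of 3")
    codons_a = [original_cds[i:i + 3] for i in range(0, len(original_cds), 3)]
    codons_b = [recoded_cds[i:i + 3] for i in range(0, len(recoded_cds), 3)]
    changed = sum(1 for a, b in zip(codons_a, codons_b) if a != b)
    return changed / len(codons_a)


# Hypothetical example: GCT -> GCG (both alanine codons), 1 of 4 codons altered.
print(codon_change_fraction("ATGGCTCTGTAA", "ATGGCGCTGTAA"))
```

A result of 0.10 or 0.20 would correspond to the 10% and 20% figures mentioned above.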

[3202] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[3203] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:48 or 50, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3204] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:49 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:49 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 56939 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 56939 gene.

[3205] Preferred variants include those that are correlated with hydrolase catalytic activity including the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs.

[3206] Allelic variants of 56939, e.g., human 56939, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 56939 protein within a population that maintain the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:49, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 56939, e.g., human 56939, protein within a population that do not have the ability to hydrolyze fatty acids, e.g., fatty acid-CoAs or bile acid-CoAs. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:49, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[3207] Moreover, nucleic acid molecules encoding other 56939 family members and, thus, which have a nucleotide sequence which differs from the 56939 sequences of SEQ ID NO:48 or SEQ ID NO:50 are intended to be within the scope of the invention.

[3208] Antisense Nucleic Acid Molecules, Ribozymes and Modified 56939 Nucleic Acid Molecules

[3209] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 56939. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 56939 coding strand, or to only a portion thereof (e.g., the coding region of human 56939 corresponding to SEQ ID NO:50). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 56939 (e.g., the 5′ and 3′ untranslated regions).

[3210] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 56939 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 56939 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 56939 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
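
As a sketch of the design just described, an antisense oligonucleotide complementary to the region surrounding the translation start site can be derived by taking the reverse complement of the corresponding stretch of mRNA. The sketch assumes the mRNA is given 5′ to 3′ as a string over A, C, G and U, that the position of the A of the start codon is known, and that the oligonucleotide is written as DNA; the function name and the example mRNA are hypothetical and are not the 56939 sequence.

```python
# Map each mRNA base to the complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start_index, upstream=10, downstream=10):
    """Return a DNA oligonucleotide (5'->3') antisense to the mRNA window
    running from -upstream to +downstream, where +1 is the A of the
    start codon located at 0-based index start_index."""
    begin = max(0, start_index - upstream)
    end = min(len(mrna), start_index + downstream)
    target = mrna[begin:end]
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


# Hypothetical mRNA; the start codon AUG begins at index 8.
mrna = "CCGCCACCAUGGCUAGC"
print(antisense_oligo(mrna, mrna.find("AUG"), upstream=4, downstream=6))
```

With the default arguments the window covers positions −10 through +10 (there is no position 0), giving a 20-nucleotide oligonucleotide when the mRNA is long enough on both sides.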

[3211] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[3212] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 56939 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[3213] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids. Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[3214] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 56939-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 56939 cDNA disclosed herein (i.e., SEQ ID NO:48 or SEQ ID NO:50), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 56939-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 56939 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[3215] 56939 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 56939 gene (e.g., the 56939 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 56939 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[3216] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[3217] A 56939 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[3218] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[3219] PNAs of 56939 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 56939 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[3220] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[3221] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 56939 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 56939 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[3222] Isolated 56939 Polypeptides

[3223] In another aspect, the invention features an isolated 56939 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-56939 antibodies. 56939 protein can be isolated from cells or tissue sources using standard protein purification techniques. 56939 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[3224] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[3225] In a preferred embodiment, a 56939 polypeptide has one or more of the following characteristics:

[3226] (i) it has the ability to hydrolyze acyl-CoA molecules to free fatty acids and CoA;

[3227] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post-translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:49;

[3228] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, or 95%, with a polypeptide of SEQ ID NO:49;

[3229] (iv) it has an acyl-CoA thioesterase domain which is preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 1 to 415 of SEQ ID NO:49;

[3230] (v) it has a serine corresponding to the conserved serine of the catalytic triad located at about position 232 of SEQ ID NO:49;

[3231] (vi) it has a histidine corresponding to the conserved histidine of the catalytic triad located at about position 360 of SEQ ID NO:49;

[3232] (vii) it has an aspartic acid corresponding to the conserved aspartic acid of the catalytic triad located at about position 325 of SEQ ID NO:49; or

[3233] (viii) it has at least 70%, preferably 80%, and most preferably 95% of the cysteines found in the amino acid sequence of the native protein.
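
Characteristics (v) through (vii) above can be checked for a candidate sequence with a sketch such as the following. The sketch assumes the candidate uses the same residue numbering as SEQ ID NO:49 and tests the exact positions 232, 360 and 325, whereas the text says "about" those positions; a sequence of different length would first need to be aligned to SEQ ID NO:49. The function name and the placeholder test sequence are hypothetical, and the placeholder is not SEQ ID NO:49.

```python
# 1-based positions and one-letter residue codes of the catalytic triad.
CATALYTIC_TRIAD = {232: "S", 325: "D", 360: "H"}


def has_catalytic_triad(protein, triad=CATALYTIC_TRIAD):
    """True if the protein carries each expected residue at its 1-based
    position; False if any position is absent or holds another residue."""
    return all(
        len(protein) >= position and protein[position - 1] == residue
        for position, residue in triad.items()
    )


# Placeholder 421-residue sequence of alanines with the triad written in.
residues = ["A"] * 421
for position, residue in CATALYTIC_TRIAD.items():
    residues[position - 1] = residue
print(has_catalytic_triad("".join(residues)))
```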

[3234] In a preferred embodiment the 56939 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:49. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:49 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:49. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the acyl-CoA thioesterase domain. In another preferred embodiment one or more differences are in the acyl-CoA thioesterase domain.

[3235] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 56939 proteins differ in amino acid sequence from SEQ ID NO:49, yet retain biological activity.

[3236] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:49.

[3237] A 56939 protein or fragment is provided which varies from the sequence of SEQ ID NO:49 in regions defined by amino acids about 415 to 421 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:49 in regions defined by amino acids about 1 to about 415. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[3238] In one embodiment, a biologically active portion of a 56939 protein includes an acyl-CoA thioesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 56939 protein.

[3239] In a preferred embodiment, the 56939 protein has an amino acid sequence shown in SEQ ID NO:49. In other embodiments, the 56939 protein is substantially identical to SEQ ID NO:49. In yet another embodiment, the 56939 protein is substantially identical to SEQ ID NO:49 and retains the functional activity of the protein of SEQ ID NO:49, as described in detail in the subsections above.

[3240] 56939 Chimeric or Fusion Proteins

[3241] In another aspect, the invention provides 56939 chimeric or fusion proteins. As used herein, a 56939 “chimeric protein” or “fusion protein” includes a 56939 polypeptide linked to a non-56939 polypeptide. A “non-56939 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 56939 protein, e.g., a protein which is different from the 56939 protein and which is derived from the same or a different organism. The 56939 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 56939 amino acid sequence. In a preferred embodiment, a 56939 fusion protein includes at least one (or two) biologically active portions of a 56939 protein. The non-56939 polypeptide can be fused to the N-terminus or C-terminus of the 56939 polypeptide.

[3242] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-56939 fusion protein in which the 56939 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 56939. Alternatively, the fusion protein can be a 56939 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 56939 can be increased through use of a heterologous signal sequence.

[3243] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[3244] The 56939 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 56939 fusion proteins can be used to affect the bioavailability of a 56939 substrate. 56939 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 56939 protein; (ii) mis-regulation of the 56939 gene; and (iii) aberrant post-translational modification of a 56939 protein.

[3245] Moreover, the 56939-fusion proteins of the invention can be used as immunogens to produce anti-56939 antibodies in a subject, to purify 56939 ligands and in screening assays to identify molecules which inhibit the interaction of 56939 with a 56939 substrate.

[3246] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 56939-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 56939 protein.

[3247] Variants of 56939 Proteins

[3248] In another aspect, the invention also features a variant of a 56939 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 56939 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 56939 protein. An agonist of the 56939 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 56939 protein. An antagonist of a 56939 protein can inhibit one or more of the activities of the naturally occurring form of the 56939 protein by, for example, competitively modulating a 56939-mediated activity of a 56939 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 56939 protein.

[3249] Variants of a 56939 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 56939 protein for agonist or antagonist activity.

[3250] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 56939 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 56939 protein. Variants in which a cysteine residue is added or deleted, or in which a residue which is glycosylated is added or deleted, are particularly preferred.
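As an illustration of what a fragment library of this kind contains, the following minimal sketch enumerates N-terminal, C-terminal, and internal fragments of a sequence in silico. It is not a description of any laboratory method in the disclosure; the function name, the `min_len` parameter, and the toy sequence are hypothetical.

```python
def enumerate_fragments(seq, min_len):
    """Enumerate proper fragments of seq that are at least min_len long.

    N-terminal fragments keep the first residue, C-terminal fragments keep
    the last residue, and internal fragments contain neither terminus.
    """
    n = len(seq)
    n_terminal = [seq[:k] for k in range(min_len, n)]
    c_terminal = [seq[n - k:] for k in range(min_len, n)]
    internal = [
        seq[i:j]
        for i in range(1, n - 1)          # start after the first residue
        for j in range(i + min_len, n)    # stop before the last residue
    ]
    return {
        "n_terminal": n_terminal,
        "c_terminal": c_terminal,
        "internal": internal,
    }


fragments = enumerate_fragments("MKTAY", 3)  # toy sequence
```

The same enumeration applies to a coding sequence if fragment boundaries are restricted to codon positions.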

[3251] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 56939 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 56939 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[3252] Cell based assays can be exploited to analyze a variegated 56939 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 56939 in a substrate-dependent manner. The transfected cells are then contacted with 56939 and the effect of the expression of the mutant on signaling by the 56939 substrate can be detected, e.g., by measuring hydrolysis of acyl-CoAs. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 56939 substrate, and the individual clones further characterized.

[3253] In another aspect, the invention features a method of making a 56939 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 56939 polypeptide. The method includes: altering the sequence of a 56939 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[3254] In another aspect, the invention features a method of making a fragment or analog of a 56939 polypeptide having a biological activity of a naturally occurring 56939 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 56939 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[3255] Anti-56939 Antibodies

[3256] In another aspect, the invention provides an anti-56939 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[3257] The anti-56939 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[3258] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[3259] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 56939 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-56939 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[3260] The anti-56939 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[3261] Phage display and combinatorial methods for generating anti-56939 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[3262] In one embodiment, the anti-56939 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[3263] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[3264] An anti-56939 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[3265] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[3266] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 56939 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto. As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
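The definition of a consensus sequence given above can be illustrated with a short sketch. It assumes the family members have already been aligned to equal length; the function name and the toy sequences are hypothetical. Where two residues occur equally frequently the definition permits either, so the sketch arbitrarily takes the alphabetically first residue to keep the result deterministic.

```python
from collections import Counter


def consensus_sequence(aligned_seqs):
    """Return the most frequent residue at each position of an alignment."""
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break (an assumption of this sketch): alphabetically first residue.
        tied = sorted(res for res, n in counts.items() if n == top)
        consensus.append(tied[0])
    return "".join(consensus)


# Toy family of aligned framework segments (hypothetical sequences).
family = ["EVQLV", "QVQLV", "EVKLV"]
framework_consensus = consensus_sequence(family)
```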

[3267] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 56939 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[3268] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[3269] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[3270] In preferred embodiments an antibody can be made by immunizing with purified 56939 antigen, or a fragment thereof, e.g., a fragment described herein.

[3271] A full-length 56939 protein, or an antigenic peptide fragment of 56939, can be used as an immunogen or can be used to identify anti-56939 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 56939 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:49 and encompass an epitope of 56939. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[3272] Fragments of 56939 which include residues 110 to 115, about 323 to 330, or about 339 to 349 can be used to make antibodies against hydrophilic regions of the 56939 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 56939 which include residues 78 to 83, about 100 to 104, or about 223 to 231 can be used to make an antibody against a hydrophobic region of the 56939 protein. A fragment of 56939 which includes residues about 1 to 415 can be used to make an antibody against the acyl-CoA thioesterase region of the 56939 protein.

[3273] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[3274] Antibodies which bind only native 56939 protein, only denatured or otherwise non-native 56939 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 56939 protein.

[3275] Preferred epitopes encompassed by the antigenic peptide are regions of 56939 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 56939 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 56939 protein and are thus likely to constitute surface residues useful for targeting antibody production.
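A sliding-window scan of the kind used to locate hydrophilic stretches can be sketched as follows. This is an illustration only: it substitutes the Kyte-Doolittle hydropathy scale for the Emini surface probability analysis named above, which is a different calculation, and the function names, the window size, and the toy sequence are hypothetical. Lower mean hydropathy indicates a more hydrophilic window.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=7):
    """Return (start, mean hydropathy) for each window; start is 1-based."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        (i + 1, sum(scores[i:i + window]) / window)
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(seq, window=7):
    """Return the (start, mean) pair with the lowest mean hydropathy."""
    return min(window_hydropathy(seq, window), key=lambda pair: pair[1])


# Toy sequence (hypothetical): a lysine stretch flanked by isoleucines.
start, mean = most_hydrophilic_window("IIIIIKKKKKIIIII", window=5)
```

Windows flagged this way are only candidates; whether a stretch is actually exposed depends on the folded structure.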

[3276] The anti-56939 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 56939 protein.

[3277] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[3278] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[3279] In a preferred embodiment, an anti-56939 antibody alters (e.g., increases or decreases) the activity of a 56939 polypeptide.

[3280] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or a radioactive nucleus, or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[3281] An anti-56939 antibody (e.g., monoclonal antibody) can be used to isolate 56939 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-56939 antibody can be used to detect 56939 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-56939 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

[3282] The invention also includes a nucleic acid which encodes an anti-56939 antibody, e.g., an anti-56939 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[3283] The invention also includes cell lines, e.g., hybridomas, which make an anti-56939 antibody, e.g., an antibody described herein, and methods of using said cells to make a 56939 antibody.

[3284] 56939 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[3285] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[3286] A vector can include a 56939 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 56939 proteins, mutant forms of 56939 proteins, fusion proteins, and the like).

[3287] The recombinant expression vectors of the invention can be designed for expression of 56939 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[3288] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[3289] Purified fusion proteins can be used in 56939 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 56939 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[3290] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
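The codon-replacement strategy described above can be illustrated with a minimal sketch that recodes a coding sequence codon by codon while leaving the encoded protein unchanged. The function names, the toy coding sequence, and the small preferred-codon table are assumptions for illustration only; the table is a placeholder and not a vetted E. coli codon usage table, which would be taken from published data such as the cited reference.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Illustrative placeholder table: amino acid -> codon assumed to be preferred.
PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA"}


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TO_AA[codon]
        out.append(preferred.get(aa, codon))  # unlisted amino acids stay as-is
    return "".join(out)


original = "ATGTTAAGAAAGTAA"  # toy sequence encoding M-L-R-K-stop
optimized = recode(original, PREFERRED)
```

Because only synonymous codons are exchanged, translating the original and the recoded sequences gives the same protein, which is a useful check before synthesis.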

[3291] The 56939 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[3292] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[3293] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[3294] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[3295] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[3296] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 56939 nucleic acid molecule within a recombinant expression vector or a 56939 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[3297] A host cell can be any prokaryotic or eukaryotic cell. For example, a 56939 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[3298] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[3299] A host cell of the invention can be used to produce (i.e., express) a 56939 protein. Accordingly, the invention further provides methods for producing a 56939 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 56939 protein has been introduced) in a suitable medium such that a 56939 protein is produced. In another embodiment, the method further includes isolating a 56939 protein from the medium or the host cell.

[3300] In another aspect, the invention features a cell or purified preparation of cells which include a 56939 transgene, or which otherwise misexpress 56939. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 56939 transgene, e.g., a heterologous form of a 56939, e.g., a gene derived from humans (in the case of a non-human cell). The 56939 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 56939, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 56939 alleles or for use in drug screening.

[3301] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 56939 polypeptide.

[3302] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 56939 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 56939 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 56939 gene. For example, an endogenous 56939 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[3303] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 56939 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 56939 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 56939 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[3304] 56939 Transgenic Animals

[3305] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 56939 protein and for identifying and/or evaluating modulators of 56939 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 56939 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[3306] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 56939 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 56939 transgene in its genome and/or expression of 56939 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 56939 protein can further be bred to other transgenic animals carrying other transgenes.

[3307] 56939 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[3308] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[3309] Uses of 56939

[3310] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[3311] The isolated nucleic acid molecules of the invention can be used, for example, to express a 56939 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 56939 mRNA (e.g., in a biological sample) or a genetic alteration in a 56939 gene, and to modulate 56939 activity, as described further below. The 56939 proteins can be used to treat disorders characterized by insufficient or excessive production of a 56939 substrate or production of 56939 inhibitors. In addition, the 56939 proteins can be used to screen for naturally occurring 56939 substrates, to screen for drugs or compounds which modulate 56939 activity, as well as to treat disorders characterized by insufficient or excessive production of 56939 protein or production of 56939 protein forms which have decreased, aberrant or unwanted activity compared to 56939 wild type protein (e.g., metabolic disorders). Moreover, the anti-56939 antibodies of the invention can be used to detect and isolate 56939 proteins, regulate the bioavailability of 56939 proteins, and modulate 56939 activity.

[3312] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 56939 polypeptide is provided. The method includes: contacting the compound with the subject 56939 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 56939 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 56939 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 56939 polypeptide. Screening methods are discussed in more detail below.

[3313] 56939 Screening Assays

[3314] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 56939 proteins, have a stimulatory or inhibitory effect on, for example, 56939 expression or 56939 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 56939 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 56939 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[3315] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 56939 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 56939 protein or polypeptide or a biologically active portion thereof.

[3316] In one embodiment, an activity of a 56939 protein can be assayed using an assay system acceptable for detecting thioesterase activity. For example, thioesterase activity can be assessed radiochemically by extracting and assaying the [14C]palmitic acid formed from [1-14C]palmitoyl-CoA during an incubation with a 56939 protein, fragment, or variant. For example, the assay system can contain 25 mM potassium phosphate buffer (pH 8), 20 µg/ml bovine serum albumin, 10 µM [1-14C]palmitoyl-CoA (20 nCi), and a 56939 protein in a final volume of about 0.1 ml. The duration of the incubation can be approximately three minutes. See, e.g., Smith, S. (1981) Methods Enzymol. 71C:181-188 and Joshi et al. (1993) J. Biol. Chem. 268:22508-22513 for additional details regarding methods of assaying for thioesterase activity.
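
The quantities in the exemplary assay above can be related by simple arithmetic. The following is a minimal sketch, not part of the described method: it assumes that counts have already been corrected for counting efficiency (i.e., are expressed in dpm), and the function names and the example count and blank values are hypothetical.

```python
# Illustrative arithmetic for the radiochemical thioesterase assay.
# Assay conditions (10 uM substrate, 20 nCi, 0.1 ml, 3 min) come from the text;
# function names and the example dpm values are hypothetical.

DPM_PER_NCI = 2220.0  # 1 nCi = 37 Bq = 2220 disintegrations per minute


def substrate_nmol(conc_um: float, volume_ml: float) -> float:
    """Amount of substrate in nmol (uM multiplied by ml gives nmol)."""
    return conc_um * volume_ml


def specific_radioactivity(total_nci: float, nmol: float) -> float:
    """Specific radioactivity of the substrate in dpm per nmol."""
    return total_nci * DPM_PER_NCI / nmol


def product_nmol(extracted_dpm: float, blank_dpm: float, dpm_per_nmol: float) -> float:
    """nmol of [14C]palmitic acid formed, after subtracting a no-enzyme blank."""
    return (extracted_dpm - blank_dpm) / dpm_per_nmol


def rate_nmol_per_min(nmol: float, minutes: float) -> float:
    """Average hydrolysis rate over the incubation."""
    return nmol / minutes


if __name__ == "__main__":
    amount = substrate_nmol(10.0, 0.1)             # nmol palmitoyl-CoA per assay
    sra = specific_radioactivity(20.0, amount)     # dpm per nmol
    formed = product_nmol(4640.0, 200.0, sra)      # hypothetical counts
    print(amount, sra, formed, rate_nmol_per_min(formed, 3.0))
```

Dividing the resulting rate by the mass of 56939 protein in the assay would give a specific activity; that mass is not stated in the text and is therefore left out of the sketch.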

[3317] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[3318] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[3319] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[3320] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 56939 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 56939 activity is determined. Determining the ability of the test compound to modulate 56939 activity can be accomplished by monitoring, for example, hydrolytic activity. The cell, for example, can be of mammalian origin, e.g., human.

[3321] The ability of the test compound to modulate 56939 binding to a compound, e.g., a 56939 substrate, or to bind to 56939 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 56939 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 56939 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 56939 binding to a 56939 substrate in a complex. For example, compounds (e.g., 56939 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[3322] The ability of a compound (e.g., a 56939 substrate) to interact with 56939 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 56939 without the labeling of either the compound or the 56939. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 56939.

[3323] In yet another embodiment, a cell-free assay is provided in which a 56939 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 56939 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 56939 proteins to be used in assays of the present invention include fragments which participate in interactions with non-56939 molecules, e.g., fragments with high surface probability scores.

[3324] Soluble and/or membrane-bound forms of isolated proteins (e.g., 56939 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[3325] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[3326] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
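
The distance dependence of energy transfer noted above is commonly described by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. The following is a minimal sketch under the assumption that Förster-type transfer applies to the chosen label pair; the function names and the example R0 value are hypothetical and are not taken from the cited patents.

```python
# Sketch of the Forster distance dependence of energy transfer efficiency.
# R0 (the Forster distance) depends on the donor/acceptor pair; the value
# used in the example is a placeholder, not a measured quantity.


def fret_efficiency(r: float, r0: float) -> float:
    """Transfer efficiency E = 1 / (1 + (r / r0)**6) for separation r."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(e: float, r0: float) -> float:
    """Invert the relation: r = r0 * (1/E - 1)**(1/6), valid for 0 < E < 1."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Forster distance, in angstroms
    for r in (25.0, 50.0, 100.0):
        print(r, fret_efficiency(r, r0))
```

Because efficiency falls off with the sixth power of distance, acceptor emission is strong when the labeled molecules are bound and drops sharply once they are separated by more than about twice R0.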

[3327] In another embodiment, determining the ability of the 56939 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[3328] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[3329] It may be desirable to immobilize either 56939, an anti-56939 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 56939 protein, or interaction of a 56939 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/56939 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 56939 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 56939 binding or activity determined using standard techniques.

[3330] Other techniques for immobilizing either a 56939 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 56939 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[3331] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[3332] In one embodiment, this assay is performed utilizing antibodies reactive with 56939 protein or target molecules but which do not interfere with binding of the 56939 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 56939 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 56939 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 56939 protein or target molecule.

[3333] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P. (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H. (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[3334] In a preferred embodiment, the assay includes contacting the 56939 protein or biologically active portion thereof with a known compound which binds 56939 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 56939 protein, wherein determining the ability of the test compound to interact with a 56939 protein includes determining the ability of the test compound to preferentially bind to 56939 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[3335] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 56939 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 56939 protein through modulation of the activity of a downstream effector of a 56939 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[3336] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
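
The comparison between control and test reaction mixtures described above can be expressed as a percent inhibition of complex formation. The following is a minimal sketch, assuming that complex formation is read out as a numeric signal (e.g., counts or absorbance) and that a background reading without one binding component is available; the function names and all numeric values are hypothetical.

```python
# Sketch: percent inhibition of complex formation relative to a control
# reaction incubated without test compound. All signal values are hypothetical.


def percent_inhibition(control: float, test: float, background: float = 0.0) -> float:
    """Return 100 * (1 - (test - background) / (control - background))."""
    window = control - background
    if window <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test - background) / window)


def mutant_selective(inhib_mutant: float, inhib_normal: float,
                     min_mutant: float = 50.0, max_normal: float = 10.0) -> bool:
    """Flag compounds that disrupt mutant but not normal complexes.

    The two thresholds are arbitrary placeholders chosen for illustration.
    """
    return inhib_mutant >= min_mutant and inhib_normal <= max_normal


if __name__ == "__main__":
    normal = percent_inhibition(control=1000.0, test=950.0, background=50.0)
    mutant = percent_inhibition(control=1000.0, test=240.0, background=50.0)
    print(normal, mutant, mutant_selective(mutant, normal))
```

In practice, replicate wells and an appropriate statistical comparison would be needed before a compound is scored as an inhibitor.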

[3337] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[3338] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[3339] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[3340] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[3341] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[3342] In yet another aspect, the 56939 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 56939 (“56939-binding proteins” or “56939-bp”) and are involved in 56939 activity. Such 56939-bps can be activators or inhibitors of signals by the 56939 proteins or 56939 targets as, for example, downstream elements of a 56939-mediated signaling pathway.

[3343] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 56939 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 56939 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 56939-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 56939 protein.

[3344] In another embodiment, modulators of 56939 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 56939 mRNA or protein evaluated relative to the level of expression of 56939 mRNA or protein in the absence of the candidate compound. When expression of 56939 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 56939 mRNA or protein expression. Alternatively, when expression of 56939 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 56939 mRNA or protein expression. The level of 56939 mRNA or protein expression can be determined by methods described herein for detecting 56939 mRNA or protein.
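The comparison described above can be sketched in code. This is a minimal illustration under stated assumptions, not a procedure taken from the specification: the function names, the use of replicate measurements, the choice of Welch's t statistic, and the cutoff of 2.0 are all hypothetical stand-ins for whatever significance test is actually applied.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(treated, untreated):
    """Welch's t statistic for two independent sets of replicate measurements."""
    diff = mean(treated) - mean(untreated)
    se = sqrt(variance(treated) / len(treated) + variance(untreated) / len(untreated))
    if se == 0:
        # No spread in either group: any difference in means is treated as unbounded.
        return float("inf") if diff > 0 else float("-inf") if diff < 0 else 0.0
    return diff / se

def classify_modulator(treated, untreated, t_cutoff=2.0):
    """Label a candidate compound from expression levels measured with and without it.

    t_cutoff is a hypothetical threshold, not a value from the specification.
    """
    t = welch_t(treated, untreated)
    if t > t_cutoff:
        return "stimulator"
    if t < -t_cutoff:
        return "inhibitor"
    return "no significant effect"

# Illustrative replicate values only (arbitrary units).
print(classify_modulator([10.1, 9.8, 10.3], [5.0, 5.2, 4.9]))
```

Each group needs at least two replicates, because the sample variance is undefined for a single measurement.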

[3345] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 56939 protein can be confirmed in vivo, e.g., in an animal such as an animal model for metabolic disorders.

[3346] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 56939 modulating agent, an antisense 56939 nucleic acid molecule, a 56939-specific antibody, or a 56939-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[3347] 56939 Detection Assays

[3348] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 56939 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[3349] 56939 Chromosome Mapping

[3350] The 56939 nucleotide sequences or portions thereof can be used to map the location of the 56939 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 56939 sequences with genes associated with disease.

[3351] Briefly, 56939 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 56939 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 56939 sequences will yield an amplified fragment.

[3352] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
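The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check: the gene is assigned to the chromosome whose presence or absence across the hybrids matches the pattern of PCR-positive and PCR-negative results. The sketch below is illustrative only; the function name, the hybrid names, and the chromosome contents of each hybrid are invented for the example.

```python
def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes whose presence/absence matches the PCR result in every hybrid.

    panel: dict mapping hybrid cell line name -> set of human chromosomes it carries.
    pcr_positive: set of hybrid names that yielded an amplified fragment.
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        if all((chrom in chroms) == (name in pcr_positive)
               for name, chroms in panel.items()):
            matches.append(chrom)
    return matches

# Hypothetical panel: hybrid names and chromosome contents are invented for illustration.
panel = {
    "hybrid_A": {"1", "7"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X"},
}
print(concordant_chromosomes(panel, pcr_positive={"hybrid_A", "hybrid_B"}))
```

An empty result means that no single chromosome explains the amplification pattern, which in practice would prompt a re-check of the panel or the PCR results.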

[3353] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 56939 to a chromosomal location.

[3354] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[3355] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[3356] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[3357] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 56939 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[3358] 56939 Tissue Typing

[3359] 56939 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).

[3360] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 56939 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[3361] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:48 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:50, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[3362] If a panel of reagents from 56939 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[3363] Use of Partial 56939 Sequences in Forensic Biology

[3364] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[3365] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:48 (e.g., fragments derived from the noncoding regions of SEQ ID NO:48 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[3366] The 56939 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 56939 probes can be used to identify tissue by species and/or by organ type.

[3367] In a similar fashion, these reagents, e.g., 56939 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[3368] Predictive Medicine of 56939

[3369] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[3370] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 56939.

[3371] Such disorders include, e.g., a disorder associated with the misexpression of 56939 gene; a disorder of metabolism, a disorder of the cardiovascular or hepatic system; or any other 56939-related disorder described herein.

[3372] The method includes one or more of the following:

[3373] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 56939 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[3374] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 56939 gene;

[3375] detecting, in a tissue of the subject, the misexpression of the 56939 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[3376] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 56939 polypeptide.

[3377] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 56939 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[3378] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:48, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 56939 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[3379] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 56939 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 56939.

[3380] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[3381] In preferred embodiments the method includes determining the structure of a 56939 gene, an abnormal structure being indicative of risk for the disorder.

[3382] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 56939 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[3383] Diagnostic and Prognostic Assays of 56939

[3384] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 56939 molecules and for identifying variations and mutations in the sequence of 56939 molecules.

[3385] Expression Monitoring and Profiling:

[3386] The presence, level, or absence of 56939 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 56939 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 56939 protein such that the presence of 56939 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 56939 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 56939 genes; measuring the amount of protein encoded by the 56939 genes; or measuring the activity of the protein encoded by the 56939 genes.

[3387] The level of mRNA corresponding to the 56939 gene in a cell can be determined both by in situ and by in vitro formats.

[3388] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 56939 nucleic acid, such as the nucleic acid of SEQ ID NO:48, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 56939 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[3389] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 56939 genes.

[3390] The level of mRNA in a sample that is encoded by 56939 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
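The primer geometry described above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked in silico. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the matching is exact-string only (no mismatches or melting-temperature considerations), and the template is a synthetic sequence made up for the example, not a portion of SEQ ID NO:48.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def flanked_region(template, forward, reverse):
    """Return the sequence between the primer sites, or None if a primer does not anneal.

    forward matches the plus strand; reverse is written 5'->3' on the minus strand.
    """
    start = template.find(forward)
    if start == -1:
        return None
    inner_start = start + len(forward)
    rev_site = template.find(reverse_complement(reverse), inner_start)
    if rev_site == -1:
        return None
    return template[inner_start:rev_site]

def meets_stated_ranges(template, forward, reverse):
    """Check the ranges given in the text: primers of 10-30 nt flanking 50-200 nt."""
    region = flanked_region(template, forward, reverse)
    if region is None:
        return False
    primers_ok = all(10 <= len(p) <= 30 for p in (forward, reverse))
    return primers_ok and 50 <= len(region) <= 200

# Synthetic template for illustration only.
forward_primer = "ATGCATGCATGC"
reverse_primer = reverse_complement("GGCCGGCCTTAA")
template = "TT" + forward_primer + "C" * 60 + "GGCCGGCCTTAA" + "TT"
print(meets_stated_ranges(template, forward_primer, reverse_primer))
```

The check treats the "about" in the stated ranges as hard bounds for simplicity; a practical primer design tool would relax them and add thermodynamic criteria.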

[3391] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 56939 gene being analyzed.

[3392] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 56939 mRNA, or genomic DNA, and comparing the presence of 56939 mRNA or genomic DNA in the control sample with the presence of 56939 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 56939 transcript levels.

[3393] A variety of methods can be used to determine the level of protein encoded by 56939. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[3394] The detection methods can be used to detect 56939 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 56939 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 56939 protein include introducing into a subject a labeled anti-56939 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-56939 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[3395] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 56939 protein, and comparing the presence of 56939 protein in the control sample with the presence of 56939 protein in the test sample.

[3396] The invention also includes kits for detecting the presence of 56939 in a biological sample. For example, the kit can include a compound or agent capable of detecting 56939 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 56939 protein or nucleic acid.

[3397] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[3398] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[3399] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 56939 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a metabolism disorder or deregulated cell proliferation.

[3400] In one embodiment, a disease or disorder associated with aberrant or unwanted 56939 expression or activity is identified. A test sample is obtained from a subject and 56939 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 56939 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 56939 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[3401] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 56939 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell metabolism disorder.

[3402] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 56939 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 56939 (e.g., other genes associated with a 56939-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
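A data record of this kind might be laid out as follows. This is a sketch only: SQLite (via Python's standard `sqlite3` module) is used here in place of the Oracle or Sybase environments named above, and the table names, column names, and inserted values are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the relational database; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expression_record (
        sample_id   TEXT PRIMARY KEY,   -- descriptor: identifier of the sample
        subject     TEXT,               -- descriptor: subject the sample came from
        diagnosis   TEXT,
        treatment   TEXT,
        level_56939 REAL NOT NULL       -- value representing the 56939 expression level
    )
    """
)
conn.execute(
    """
    CREATE TABLE other_gene_level (
        sample_id TEXT NOT NULL REFERENCES expression_record(sample_id),
        gene      TEXT NOT NULL,
        level     REAL NOT NULL,
        PRIMARY KEY (sample_id, gene)
    )
    """
)

# Illustrative values only; they are not measurements from the specification.
conn.execute(
    "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?)",
    ("S001", "patient-1", None, None, 2.5),
)
conn.execute("INSERT INTO other_gene_level VALUES (?, ?, ?)", ("S001", "gene_B", 0.8))

def record_for(sample_id):
    """Return (level_56939, {gene: level}) for one sample."""
    (level,) = conn.execute(
        "SELECT level_56939 FROM expression_record WHERE sample_id = ?", (sample_id,)
    ).fetchone()
    others = dict(
        conn.execute(
            "SELECT gene, level FROM other_gene_level WHERE sample_id = ?", (sample_id,)
        ).fetchall()
    )
    return level, others

print(record_for("S001"))
```

Keeping the levels of other genes in a second table, one row per gene, lets a record carry any number of additional expression values without changing the schema.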

[3403] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 56939 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder in a subject wherein an increase or decrease in 56939 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for disorders, e.g. metabolism disorders in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[3404] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 56939 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[3405] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 56939 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
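The distance metric mentioned above, the length of the vector that is the difference between two profiles, is the Euclidean distance. A minimal sketch of that metric and of selecting the most similar reference profile follows; the function names, profile labels, and numeric values are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt

def profile_distance(a, b):
    """Length of the difference vector between two profiles (Euclidean distance)."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar_reference(subject, references):
    """Return the name of the reference profile closest to the subject profile.

    references: dict mapping a label to a profile (a list of expression values).
    """
    return min(references, key=lambda name: profile_distance(subject, references[name]))

# Illustrative profiles; each position is the expression level of one gene.
subject_profile = [1.0, 2.0, 3.0]
reference_profiles = {
    "reference_1": [1.0, 2.0, 2.5],
    "reference_2": [4.0, 0.5, 6.0],
}
print(most_similar_reference(subject_profile, reference_profiles))
```

Both profiles must list the same genes in the same order for the distance to be meaningful, which is why the sketch rejects profiles of different lengths.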

[3406] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[3407] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 56939 expression.

[3408] 56939 Arrays and Uses Thereof

[3409] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 56939 molecule (e.g., a 56939 nucleic acid or a 56939 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to addresses of the plurality can be disposed on the array.

[3410] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 56939 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 56939. Each address of the subset can include a capture probe that hybridizes to a different region of a 56939 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 56939 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 56939 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 56939 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[3411] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[3412] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 56939 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 56939 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-56939 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[3413] In another aspect, the invention features a method of analyzing the expression of 56939. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 56939 molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[3414] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 56939. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 56939. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
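
The grouping of genes by expression pattern described above can be illustrated with a minimal sketch. It does not implement hierarchical, k-means, or Bayesian clustering; it substitutes a simpler correlation ranking that flags genes whose signal across samples tracks that of 56939. The gene names other than 56939, the signal values, and the 0.9 threshold are hypothetical placeholders chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def co_regulated(expression, target, threshold=0.9):
    """Names of genes whose profile correlates with the target at or above threshold."""
    reference = expression[target]
    return sorted(
        gene
        for gene, profile in expression.items()
        if gene != target and pearson(reference, profile) >= threshold
    )


# Hypothetical array signal per gene across five tissue samples.
expression = {
    "56939": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_A": [1.1, 2.1, 2.9, 4.2, 5.1],  # rises with 56939
    "gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as 56939 rises
    "gene_C": [2.0, 2.1, 1.9, 2.0, 2.2],  # roughly flat
}

candidates = co_regulated(expression, "56939")
```

Here `candidates` holds only `gene_A`. A real analysis would need many more samples and a proper clustering method before any claim of co-regulation could be made.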

[3415] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 56939 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[3416] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
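
As a rough sketch of the comparison described above, the following computes a log2 ratio of treated to untreated signal for each gene and lists genes other than the intended target whose expression shifts substantially. The gene names other than 56939, the signal values, the pseudocount, and the two-fold cutoff are all assumptions made for illustration, not values taken from this disclosure.

```python
from math import log2


def fold_changes(treated, untreated, pseudocount=1.0):
    """log2(treated / untreated) for every gene present in both profiles.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detectable signal.
    """
    return {
        gene: log2((treated[gene] + pseudocount) / (untreated[gene] + pseudocount))
        for gene in treated
        if gene in untreated
    }


def off_target_genes(treated, untreated, target, min_abs_log2=1.0):
    """Genes other than the target whose expression changes at least two-fold."""
    changes = fold_changes(treated, untreated)
    return sorted(
        gene
        for gene, change in changes.items()
        if gene != target and abs(change) >= min_abs_log2
    )


# Hypothetical array signals for like cells with and without the agent.
untreated = {"56939": 799.0, "gene_A": 99.0, "gene_B": 49.0, "gene_C": 199.0}
treated = {"56939": 99.0, "gene_A": 399.0, "gene_B": 51.0, "gene_C": 49.0}

changes = fold_changes(treated, untreated)
side_effects = off_target_genes(treated, untreated, target="56939")
```

In this toy data the agent lowers 56939 signal as intended, while `gene_A` and `gene_C` also change and would be candidates for follow-up as possible contributors to an undesired effect.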

[3417] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 56939-associated disease or disorder; and processes, such as a cellular transformation associated with a 56939-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 56939-associated disease or disorder.

[3418] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 56939) that could serve as a molecular target for diagnosis or therapeutic intervention.

[3419] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 56939 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 56939 polypeptide or fragment thereof. For example, multiple variants of a 56939 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3420] The polypeptide array can be used to detect a 56939 binding compound, e.g., an antibody in a sample from a subject with specificity for a 56939 polypeptide or the presence of a 56939-binding protein or ligand.

[3421] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 56939 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[3422] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 56939 or from a cell or subject in which a 56939-mediated response has been elicited, e.g., by contact of the cell with 56939 nucleic acid or protein, or administration to the cell or subject of 56939 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 56939 (or does not express as highly as in the case of the 56939 positive plurality of capture probes) or from a cell or subject in which a 56939-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 56939 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[3423] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 56939 or from a cell or subject in which a 56939-mediated response has been elicited, e.g., by contact of the cell with 56939 nucleic acid or protein, or administration to the cell or subject of 56939 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a second sample from a cell or subject which does not express 56939 (or does not express as highly as in the case of the 56939 positive plurality of capture probes) or from a cell or subject in which a 56939-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[3424] In another aspect, the invention features a method of analyzing 56939, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 56939 nucleic acid or amino acid sequence; and comparing the 56939 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 56939.
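
One simple way to compare a query sequence against a collection of sequences is sketched below. It uses an alignment-free measure (the Jaccard index over shared k-mers) rather than any particular database search tool, which this disclosure does not specify. The query and database entries are short placeholder strings, not actual 56939 sequence, and the entry names and the choice of k = 4 are hypothetical.

```python
def kmers(sequence, k=4):
    """Set of all overlapping substrings of length k in the sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(first, second, k=4):
    """Shared k-mers divided by total distinct k-mers (0.0 to 1.0)."""
    first_kmers, second_kmers = kmers(first, k), kmers(second, k)
    if not first_kmers and not second_kmers:
        return 0.0
    return len(first_kmers & second_kmers) / len(first_kmers | second_kmers)


def rank_database(query, database, k=4):
    """Database entries sorted from most to least similar to the query."""
    scores = [(name, jaccard(query, seq, k)) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Placeholder sequences for illustration only.
query = "ATGGCCAAGCTTGGATCCGAT"
database = {
    "entry_1": "ATGGCCAAGCTTGGATCCGAT",  # identical to the query
    "entry_2": "ATGGCCAAGCTTGGTTCCGAT",  # one substitution
    "entry_3": "TTTTTTTTTTTTTTTTTTTTT",  # unrelated
}

ranking = rank_database(query, database)
```

The ranking places the identical entry first and the unrelated entry last. For real sequences an alignment-based comparison would normally be preferred, since k-mer overlap ignores the order and spacing of the shared segments.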

[3425] Detection of 56939 Variations or Mutations

[3426] The methods of the invention can also be used to detect genetic alterations in a 56939 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 56939 protein activity or nucleic acid expression, such as a metabolic disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 56939-protein, or the mis-expression of the 56939 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 56939 gene; 2) an addition of one or more nucleotides to a 56939 gene; 3) a substitution of one or more nucleotides of a 56939 gene, 4) a chromosomal rearrangement of a 56939 gene; 5) an alteration in the level of a messenger RNA transcript of a 56939 gene, 6) aberrant modification of a 56939 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 56939 gene, 8) a non-wild type level of a 56939-protein, 9) allelic loss of a 56939 gene, and 10) inappropriate post-translational modification of a 56939-protein.

[3427] An alteration can be detected using a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 56939 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 56939 gene under conditions such that hybridization and amplification of the 56939 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[3428] In another embodiment, mutations in a 56939 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
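
The fragment-length comparison above can be sketched in silico as follows. The example assumes a linear DNA molecule and the EcoRI recognition site GAATTC, which is cut after the first G; the control and sample sequences are short invented strings, not 56939 sequence.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Fragment lengths from cutting a linear sequence at every site.

    cut_offset is the number of bases into the site at which the top
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented control sequence with two EcoRI sites.
control = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
# Invented sample in which a point mutation destroys the second site.
sample = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

control_fragments = digest(control)
sample_fragments = digest(sample)
patterns_differ = sorted(control_fragments) != sorted(sample_fragments)
```

The control yields three fragments and the sample two, so the patterns differ, which is the kind of difference a gel would reveal. A mutation that neither creates nor destroys a recognition site would go undetected by this approach.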

[3429] In other embodiments, genetic mutations in 56939 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 56939 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 56939 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 56939 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
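
The first scanning step, using sequential overlapping probes, can be sketched under a strong simplification: hybridization is modeled as an exact match, so a probe gives signal only if its perfectly complementary target occurs in the sample. The reference and sample sequences, the probe length of 8, and all function names are hypothetical and chosen only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def tile_probes(reference, length=8, step=1):
    """(start, probe) pairs; each probe is complementary to one reference window."""
    return [
        (start, reverse_complement(reference[start:start + length]))
        for start in range(0, len(reference) - length + 1, step)
    ]


def hybridizing_addresses(probes, sample):
    """Starts of probes whose exact target occurs in the sample (idealized binding)."""
    sample = sample.upper()
    return [start for start, probe in probes if reverse_complement(probe) in sample]


def candidate_positions(reference, sample, length=8):
    """Reference positions covered by every probe that failed to hybridize."""
    probes = tile_probes(reference, length)
    bound = set(hybridizing_addresses(probes, sample))
    failed = [start for start, _ in probes if start not in bound]
    if not failed:
        return []
    return list(range(max(failed), min(failed) + length))


# Invented reference and a sample carrying a C-to-T change at index 12.
reference = "ATGGCTAGCTTACGGATCCATGCA"
sample = "ATGGCTAGCTTATGGATCCATGCA"

positions = candidate_positions(reference, sample)
```

With a single interior substitution, the run of failed probes overlaps at exactly one position (index 12 here, zero-based). Real hybridization is not all-or-nothing, so actual arrays compare signal intensities rather than presence or absence.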

[3430] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 56939 gene and detect mutations by comparing the sequence of the sample 56939 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.
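
Once a sample has been sequenced, the comparison with the wild-type sequence can be as simple as the following sketch. It assumes the two sequences are already the same length and in register, so it reports substitutions only; insertions and deletions would require an alignment step that is not shown. The sequences are invented placeholders.

```python
def substitutions(wild_type, sample):
    """Substitutions as strings such as 'A7T' (wild-type base, 1-based position, sample base)."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be the same length; align them first")
    return [
        f"{wt_base}{index + 1}{sample_base}"
        for index, (wt_base, sample_base) in enumerate(
            zip(wild_type.upper(), sample.upper())
        )
        if wt_base != sample_base
    ]


# Invented wild-type and sample reads for illustration only.
wild_type = "ATGGCTAGCT"
sample_read = "ATGGCTTGCT"

found = substitutions(wild_type, sample_read)
```

Here `found` contains the single entry `A7T`, meaning the adenine at position 7 of the wild-type sequence is replaced by thymine in the sample.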

[3431] Other methods for detecting mutations in the 56939 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3432] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 56939 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3433] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 56939 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 56939 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3434] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3435] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3436] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[3437] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 56939 nucleic acid.
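
Percent complementarity between an oligonucleotide and a target site of the same length can be computed as in the sketch below. It assumes both sequences are written 5′ to 3′, that pairing is antiparallel, and that only Watson-Crick pairs count; gaps and wobble pairs are not considered. The target and oligonucleotide sequences are invented examples.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def percent_complementary(oligo, target_site):
    """Percentage of oligo positions that base-pair with an equal-length target site."""
    if len(oligo) != len(target_site):
        raise ValueError("oligo and target site must be the same length")
    # The sequence the oligo would pair with perfectly, read 5' to 3'.
    perfect_partner = reverse_complement(oligo)
    paired = sum(
        1 for a, b in zip(perfect_partner, target_site.upper()) if a == b
    )
    return 100.0 * paired / len(oligo)


# Invented 10-nucleotide target site and two candidate oligonucleotides.
target_site = "ATGGCTAGCT"
perfect_oligo = reverse_complement(target_site)
mismatched_oligo = "AGCTAGCCAA"  # differs from the perfect oligo at its 3' end

perfect_score = percent_complementary(perfect_oligo, target_site)
mismatch_score = percent_complementary(mismatched_oligo, target_site)
```

The perfect oligonucleotide scores 100% and the one with a single terminal mismatch scores 90%, which falls within the ranges of partial complementarity listed above.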

[3438] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:48 or the complement of SEQ ID NO:48. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3439] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 56939. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3440] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
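
A set of four oligonucleotides that differ only at the interrogation position can be sketched as follows. For readability the oligonucleotides are written in the sense of the target strand (a physical probe would be the reverse complement), and hybridization is idealized as an exact match. The reference sequence, the interrogation index, the flank length, and the function names are hypothetical.

```python
def interrogation_set(reference, position, flank=5):
    """Four sequences identical except for the base at the central position.

    position is a zero-based index into the reference; flank is the number
    of reference bases kept on each side of it.
    """
    if position - flank < 0 or position + flank >= len(reference):
        raise ValueError("interrogation position is too close to an end")
    left = reference[position - flank:position]
    right = reference[position + 1:position + flank + 1]
    return {base: left + base + right for base in "ACGT"}


def call_alleles(oligos, sample_alleles):
    """Bases whose oligo matches at least one sample allele exactly (idealized)."""
    return sorted(
        base
        for base, oligo in oligos.items()
        if any(oligo in allele.upper() for allele in sample_alleles)
    )


# Invented reference with the interrogation position at index 12 (a C).
reference = "ATGGCTAGCTTACGGATCCATGCA"
variant = "ATGGCTAGCTTATGGATCCATGCA"  # C-to-T at index 12

oligos = interrogation_set(reference, position=12)
homozygous_call = call_alleles(oligos, [reference, reference])
heterozygous_call = call_alleles(oligos, [reference, variant])
```

A sample carrying only the reference allele lights up the C oligonucleotide alone, while a heterozygous sample lights up both C and T. With differential labels, as described above, each of the four signals could be read from a single hybridization.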

[3441] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 56939 nucleic acid.

[3442] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 56939 gene.

[3443] Use of 56939 Molecules as Surrogate Markers

[3444] The 56939 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 56939 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 56939 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3445] The 56939 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 56939 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-56939 antibodies may be employed in an immune-based detection system for a 56939 protein marker, or 56939-specific radiolabeled probes may be used to detect a 56939 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3446] The 56939 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 56939 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 56939 DNA may correlate with 56939 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3447] Pharmaceutical Compositions of 56939

[3448] The nucleic acid and polypeptides, fragments thereof, as well as anti-56939 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3449] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3450] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[3451] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filter sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3452] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3453] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3454] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3455] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3456] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3457] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3458] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
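The therapeutic index described above is a simple ratio, and a minimal sketch of the calculation may be useful. The function name, compound names, and dose values below are hypothetical and chosen only for illustration; they are not data from this application.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50. Both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_a": (500.0, 5.0),
    "compound_b": (120.0, 10.0),
}

# Rank compounds so that the highest therapeutic index comes first.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Under these assumed values, the compound with the larger LD50/ED50 ratio is ranked first, which mirrors the stated preference for compounds exhibiting high therapeutic indices.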

[3459] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[3460] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
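Because the ranges above are stated per kilogram of body weight, converting a range into an absolute amount for a given subject is a single multiplication. The following is a minimal sketch of that arithmetic only; the function name and the 70 kg body weight are assumptions for illustration, and the sketch is not a dosing recommendation.

```python
def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float,
                  high_mg_per_kg: float) -> tuple[float, float]:
    """Convert a per-kilogram dose range into absolute amounts in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low end of range exceeds high end")
    return (body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg)


# The 1 to 10 mg/kg range from the text, applied to an assumed 70 kg subject.
low_mg, high_mg = dose_range_mg(70.0, 1.0, 10.0)
```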

[3461] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3462] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3463] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3464] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[3465] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3466] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3467] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3468] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3469] Methods of Treatment for 56939

[3470] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 56939 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3471] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 56939 molecules of the present invention or 56939 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3472] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 56939 expression or activity, by administering to the subject a 56939 or an agent which modulates 56939 expression or at least one 56939 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 56939 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 56939 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 56939 aberrance, for example, a 56939, 56939 agonist or 56939 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3473] It is possible that some 56939 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3474] The 56939 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more disorders associated with bone metabolism, immune disorders, viral diseases, and pain or metabolic disorders.

[3475] Aberrant expression and/or activity of 56939 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 56939 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 56939 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 56939 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3476] The 56939 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[3477] Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3478] As discussed, successful treatment of 56939 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 56939 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[3479] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3480] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3481] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 56939 expression is through the use of aptamer molecules specific for 56939 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 56939 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3482] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 56939 disorders. For a description of antibodies, see the Antibody section above.

[3483] In circumstances wherein injection of an animal or a human subject with a 56939 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 56939 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 56939 protein. Vaccines directed to a disease characterized by 56939 expression may also be generated in this fashion.

[3484] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3485] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 56939 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[3486] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 56939 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 56939 can be readily monitored and used in calculations of IC50.
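As a minimal sketch of how an IC50 might be estimated from measured concentrations, the following interpolates on a logarithmic concentration scale between the two data points that bracket 50% inhibition. The interpolation method, the function name, and the sample values are assumptions for illustration; the application does not specify a particular calculation.

```python
import math


def estimate_ic50(concentrations, inhibition_percent):
    """Estimate the concentration giving 50% inhibition.

    Interpolates linearly in log10(concentration) between the two
    adjacent points whose inhibition values bracket 50%.
    """
    points = sorted(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical free concentrations (arbitrary units) and percent inhibition.
ic50 = estimate_ic50([1.0, 100.0], [0.0, 100.0])
```

In practice a full dose-response curve fit would usually be preferred over two-point interpolation; the sketch only illustrates the quantity being estimated.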

[3487] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[3488] Another aspect of the invention pertains to methods of modulating 56939 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 56939 or agent that modulates one or more of the activities of 56939 protein activity associated with the cell. An agent that modulates 56939 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 56939 protein (e.g., a 56939 substrate or receptor), a 56939 antibody, a 56939 agonist or antagonist, a peptidomimetic of a 56939 agonist or antagonist, or other small molecule.

[3489] In one embodiment, the agent stimulates one or more 56939 activities. Examples of such stimulatory agents include active 56939 protein and a nucleic acid molecule encoding 56939. In another embodiment, the agent inhibits one or more 56939 activities. Examples of such inhibitory agents include antisense 56939 nucleic acid molecules, anti-56939 antibodies, and 56939 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 56939 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 56939 expression or activity. In another embodiment, the method involves administering a 56939 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 56939 expression or activity.

[3490] Stimulation of 56939 activity is desirable in situations in which 56939 is abnormally downregulated and/or in which increased 56939 activity is likely to have a beneficial effect. Likewise, inhibition of 56939 activity is desirable in situations in which 56939 is abnormally upregulated and/or in which decreased 56939 activity is likely to have a beneficial effect.

[3491] 56939 Pharmacogenomics

[3492] The 56939 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 56939 activity (e.g., 56939 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 56939 associated disorders, e.g. metabolic disorders, associated with aberrant or unwanted 56939 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 56939 molecule or 56939 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 56939 molecule or 56939 modulator.

[3493] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3494] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
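The grouping of individuals into genetic categories by SNP pattern can be sketched as follows. This is a minimal illustration only: the marker identifiers, genotype calls, and subject labels are hypothetical, and an actual analysis would involve far more markers and a statistical treatment of the resulting groups.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals that share the same calls at the given markers.

    genotypes maps an individual's label to a dict of marker -> call.
    Returns a dict mapping each observed pattern (a tuple of calls, in
    marker_ids order) to the list of individuals carrying it.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical bi-allelic marker calls for three subjects.
genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_3": {"snp_a": "A/G", "snp_b": "C/C"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Each resulting group could then be compared for observed drug response, which is the basis for tailoring regimens to genetically similar individuals.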

[3495] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 56939 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3496] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 56939 molecule or 56939 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3497] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 56939 molecule or 56939 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[3498] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 56939 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 56939 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent to which the unmodified target cells were resistant.

[3499] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 56939 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 56939 gene expression, protein levels, or upregulate 56939 activity, can be monitored in clinical trials of subjects exhibiting decreased 56939 gene expression, protein levels, or downregulated 56939 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 56939 gene expression, protein levels, or downregulate 56939 activity, can be monitored in clinical trials of subjects exhibiting increased 56939 gene expression, protein levels, or upregulated 56939 activity. In such clinical trials, the expression or activity of a 56939 gene, and preferably, other genes that have been implicated in, for example, a 56939-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3500] 56939 Informatics

[3501] The sequence of a 56939 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 56939. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 56939 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3502] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3503] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3504] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
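
A minimal sketch of such a two-table layout is given below. It uses Python's built-in sqlite3 module as a stand-in for the relational databases named above; the table names, column names, demonstration sequence, and annotation are all hypothetical and are not taken from any SEQ ID NO of the present application. Positions are assumed to be 1-based and inclusive.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequence (
    seq_id   TEXT PRIMARY KEY,   -- identifier field
    residues TEXT NOT NULL       -- sequence field
);
CREATE TABLE annotation (
    seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
    descriptor TEXT NOT NULL,    -- annotation text
    start_pos  INTEGER NOT NULL, -- initial position (1-based)
    end_pos    INTEGER NOT NULL  -- ultimate position (inclusive)
);
""")

# Placeholder data, not a sequence of the invention.
conn.execute("INSERT INTO sequence VALUES (?, ?)",
             ("demo_seq", "ATGGCCAAGTGA"))
conn.execute("INSERT INTO annotation VALUES (?, ?, ?, ?)",
             ("demo_seq", "hypothetical region", 4, 9))

# Join the two tables and extract the annotated sub-sequence.
rows = conn.execute("""
SELECT a.descriptor,
       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1)
FROM annotation AS a
JOIN sequence AS s ON s.seq_id = a.seq_id
""").fetchall()
print(rows)
```

Keeping annotations in a separate table keyed by the sequence identifier allows any number of annotations per sequence without altering the sequence row.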

[3505] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
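
As a simplified illustration of searching stored sequence information for a target motif, the following Python sketch scans a sequence with a regular expression. It is not BLAST and performs no alignment or scoring; the function name, the example sequence, and the pattern are hypothetical placeholders.

```python
import re

def find_motif(sequence, motif_regex):
    """Return (start, matched_text) for every match, including
    overlapping ones. Start positions are 0-based."""
    # A lookahead with a capture group reports overlapping matches.
    pattern = re.compile(f"(?=({motif_regex}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Placeholder sequence and pattern ("." matches any one residue).
hits = find_motif("GASGASAG", "G.S")
print(hits)
```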

[3506] Thus, in one aspect, the invention features a method of analyzing 56939, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 56939 nucleic acid or amino acid sequence; and comparing the 56939 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 56939. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3507] The method can include evaluating the sequence identity between a 56939 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
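
One simple way to evaluate sequence identity between two sequences that have already been aligned can be sketched as follows. The sketch assumes equal-length aligned strings with "-" as the gap character, and it counts a residue aligned against a gap as a non-identical position; other conventions for the denominator exist, and this choice is an assumption of the example rather than a definition drawn from the present application.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared

print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```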

[3508] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
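
The relationship between target length and chance occurrence can be made concrete with a rough estimate. Assuming, purely for illustration, that database residues are independent and uniformly distributed, a target of length n is expected to occur by chance about (N - n + 1) x (1/k)^n times in a database of N residues over an alphabet of k symbols (k = 4 for nucleotides, k = 20 for amino acids). Real sequences are not uniform, so the following Python sketch gives only an order-of-magnitude guide; the function name and database sizes are hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size):
    """Expected number of chance exact matches under a uniform,
    independent-residue model (an illustrative assumption)."""
    positions = max(database_len - target_len + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_len

# Hypothetical database of one billion nucleotides.
short_target = expected_random_hits(6, 10**9, 4)
long_target = expected_random_hits(30, 10**9, 4)
print(short_target, long_target)
```

Under this model a six-nucleotide target is expected to recur by chance many thousands of times in such a database, whereas a thirty-nucleotide target is expected to recur far less than once.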

[3509] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3510] Thus, the invention features a method of making a computer readable record of a 56939 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
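
A computer readable record of this kind might, for example, be structured and serialized as sketched below. The class name, field names, and JSON encoding are assumptions made for illustration, and the sequence and coordinates are placeholders rather than data for 56939.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional, Tuple

@dataclass
class SequenceRecord:
    identifier: str
    sequence: str
    # 1-based inclusive start and end of an identified ORF, if any.
    orf: Optional[Tuple[int, int]] = None
    # Each feature notes a domain, region, or site and its location.
    features: List[dict] = field(default_factory=list)

record = SequenceRecord(
    identifier="demo_seq",
    sequence="ATGGCCAAGTGA",
    orf=(1, 12),
    features=[{"kind": "site", "start": 4, "end": 9}],
)

# Serialize to text suitable for storage on any machine-readable medium.
serialized = json.dumps(asdict(record), sort_keys=True)
restored = json.loads(serialized)
print(serialized)
```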

[3511] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 56939 sequence, or record, in machine-readable form; and comparing a second sequence to the 56939 sequence, thereby analyzing a sequence. Comparison can include comparing the sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 56939 sequence includes a sequence being compared. In a preferred embodiment the 56939 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. E.g., the 56939 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3512] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, wherein the method comprises the steps of determining 56939 sequence information associated with the subject and based on the 56939 sequence information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3513] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a disease associated with a 56939 wherein the method comprises the steps of determining 56939 sequence information associated with the subject, and based on the 56939 sequence information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 56939 sequence of the subject to the 56939 sequences in the database to thereby determine whether the subject has a 56939-associated disease or disorder, or a pre-disposition for such.

[3514] The present invention also provides in a network, a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, said method comprising the steps of receiving 56939 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 56939 and/or corresponding to a 56939-associated disease or disorder (e.g., metabolic disorders), and based on one or more of the phenotypic information, the 56939 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3515] The present invention also provides a method for determining whether a subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder, said method comprising the steps of receiving information related to 56939 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 56939 and/or related to a 56939-associated disease or disorder, and based on one or more of the phenotypic information, the 56939 information, and the acquired information, determining whether the subject has a 56939-associated disease or disorder or a pre-disposition to a 56939-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3516] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 33410 Invention

[3517] Higher eukaryotes have many distinct carboxylesterases. Among the different types of carboxylesterases are those that act on carboxylic esters. Carboxylesterases have been classified into three categories (A, B and C) on the basis of differential patterns of inhibition by organophosphates (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). The sequences of a number of type-B carboxylesterases indicate that the majority are evolutionarily related. Members of the type-B carboxylesterase family include acetylcholinesterases from vertebrates and Drosophila, mammalian cholinesterases, and mammalian bile salt activated lipases, among others.

[3518] Neuroligins are cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to carboxylesterases, a linker domain between the transmembrane region and the carboxylesterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Sequence comparisons place the neuroligins in the large family of carboxylesterase homology domain proteins that includes thyroglobulin, acetylcholinesterase, and gliotactin. However, neuroligins are only distantly related to these proteins, and thus appear to form a unique subset of the carboxylesterase family. At least three neuroligins have been cloned, namely Neuroligins 1, 2 and 3 (Ichtchenko, K. et al. (1995) Cell 81:435-443; Ichtchenko, K. et al. (1996) supra). These three neuroligins are expressed at high levels in the brain, primarily in neurons. Neuroligins have been shown to mediate cell adhesion events associated with neuronal development and/or maintenance. For example, Neuroligin 1 has been found to be enriched in postsynaptic densities where it may recruit receptors, channels, and signal transduction molecules at synaptic sites of cell adhesion (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5).

[3519] Functionally, neuroligins bind tightly, in a calcium-dependent manner, to the extracellular domains of the polymorphic cell surface proteins known as β-neurexins. Neurexins are neuronal cell surface proteins that exhibit a high degree of diversity (Ushkaryov et al. (1994) J. Biol. Chem. 269: 11987-11992). Neuroligin-β-neurexin interactions have been implicated in mediating recognition processes between neurons that give rise to neuronal developmental events such as synaptogenesis (e.g., specification of excitatory synapses) (Brose, N. (1999) Naturwissenschaften 86(11):516-24).

Summary of the 33410 Invention

[3520] The present invention is based, in part, on the discovery of a novel carboxylesterase family member, referred to herein as “33410”. The nucleotide sequence of a cDNA encoding 33410 is shown in SEQ ID NO:53, and the amino acid sequence of a 33410 polypeptide is shown in SEQ ID NO:54. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:55.

[3521] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 33410 protein or polypeptide, e.g., a biologically active portion of the 33410 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:54. In other embodiments, the invention provides isolated 33410 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:53, SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:53, SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:53 or 55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33410 protein or an active fragment thereof.

[3522] In a related aspect, the invention further provides nucleic acid constructs that include a 33410 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 33410 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 33410 nucleic acid molecules and polypeptides.

[3523] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 33410-encoding nucleic acids.

[3524] In still another related aspect, isolated nucleic acid molecules that are antisense to a 33410 encoding nucleic acid molecule are provided.

[3525] In another aspect, the invention features 33410 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 33410-mediated or -related disorders, e.g., a cell adhesion disorder; a disorder involving aberrant cellular proliferation or differentiation (e.g., a cancer); a CNS disorder, such as a neurodegenerative disorder, e.g., Alzheimer's disease, dementias related to Alzheimer's disease (such as Pick's disease), Parkinson's and other diffuse Lewy body diseases, multiple sclerosis, amyotrophic lateral sclerosis, progressive supranuclear palsy, epilepsy, Creutzfeldt-Jakob disease, AIDS related dementia, familial infantile convulsions, paroxysmal choreoathetosis; a psychiatric disorder (e.g., depression, schizophrenic disorders, Korsakoff's psychosis, mania, anxiety disorders, or phobic disorders); a learning or memory disorder (e.g., amnesia or age-related memory loss); and migraine.

[3526] In another embodiment, the invention provides 33410 polypeptides having a 33410 activity. Preferred polypeptides are 33410 proteins including at least one carboxylesterase domain, a signal peptide and at least one transmembrane domain, and, preferably, having a 33410 activity, e.g., a 33410 activity as described herein (e.g., the modulation of one or more of: a cell-cell (e.g., neuron-neuron, or neuron-glia) recognition event or adhesion; synaptogenesis; membrane excitability; neurite outgrowth; signal transduction; or cell (e.g., neural or a cancer cell) proliferation, growth, differentiation, or migration).

[3527] In other embodiments, the invention provides 33410 polypeptides, e.g., a 33410 polypeptide having the amino acid sequence shown in SEQ ID NO:54; the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:54; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under stringent hybridization conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:53 or SEQ ID NO:55, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33410 protein or an active fragment thereof.

[3528] In a related aspect, the invention further provides nucleic acid constructs that include a 33410 nucleic acid molecule described herein.

[3529] In a related aspect, the invention provides 33410 polypeptides or fragments operatively linked to non-33410 polypeptides to form fusion proteins.

[3530] In another aspect, the invention features antibodies, and antigen-binding fragments thereof, that react with, or more preferably specifically bind, 33410 polypeptides.

[3531] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 33410 polypeptides or nucleic acids. In yet another aspect, the invention features a method of evaluating, or identifying, an agent, e.g., an agent as described herein, e.g., a compound (e.g., a polypeptide, peptide, a peptide fragment, a peptidomimetic, a small molecule), for the ability to modulate, e.g. inhibit, the activity or expression of a 33410 polypeptide. Such agents are useful for treating or preventing cardiovascular disorders (e.g., an endothelial cell disorder) or proliferation-related disorders, e.g., cancer, as described herein. The method includes:

[3532] providing a test agent, and a 33410 nucleic acid or polypeptide, or a cell expressing a 33410 (e.g., a cancer cell or cell line);

[3533] contacting said test agent, and said 33410 nucleic acid or polypeptide, or said cell expressing said 33410, under conditions that allow an interaction (e.g., activity or expression) between said 33410 nucleic acid or polypeptide and said test agent to occur; and

[3534] determining whether said test agent interacts with, e.g., binds, the 33410 nucleic acid or polypeptide, or modulates, e.g., inhibits, the expression or activity of said 33410 polypeptide, e.g., wherein interaction with or binding to the 33410 nucleic acid or polypeptide, or a change, e.g., a decrease, in the level of activity or expression of said 33410 polypeptide in the presence of the test agent relative to the activity or expression in the absence of the test agent, is indicative of modulation, e.g., inhibition, of 33410 activity or expression.

[3535] In a preferred embodiment, the method further comprises the step of evaluating the test agent in the 33410-expressing cell, e.g., an endothelial or a cancer cell, in vitro, or in vivo (e.g., in a subject, e.g., a patient having a cancer or a cardiovascular disorder), to thereby determine the effect of the test agent on the activity or expression of the 33410.

[3536] In a preferred embodiment, the contacting step occurs in vitro or ex vivo. For example, a sample, e.g., a blood, biopsy or tissue sample, is obtained from the subject. Preferably, the sample contains a 33410-expressing cell.

[3537] In a preferred embodiment, the contacting step occurs in vivo. For example, a detectably labeled agent that interacts with the 33410 nucleic acid or polypeptide is administered to the subject, such that a signal is generated relative to the level of activity or expression of the 33410 nucleic acid or polypeptide.

[3538] In a preferred embodiment, the test agent is an inhibitor (partial or complete inhibitor) of the 33410 polypeptide activity or expression.

[3539] In preferred embodiments, the test agent is a peptide, a small molecule, e.g., a member of a combinatorial library (e.g., a peptide or organic combinatorial library, or a natural product library), or an antibody, or any combination thereof.

[3540] In additional preferred embodiments, the test agent is an antisense, a ribozyme, a triple helix molecule, or any combination thereof. In a preferred embodiment, a plurality of test agents, e.g., library members, is tested. In a preferred embodiment, the plurality of test agents, e.g., library members, includes at least 10, 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷, or 10⁸ compounds. In a preferred embodiment, the plurality of test agents, e.g., library members, share a structural or functional characteristic.

[3541] In a preferred embodiment, the test agent is a peptide or a small organic molecule. In a preferred embodiment, the method is performed in cell-free conditions (e.g., a reconstituted system).

[3542] In a preferred embodiment, the method further includes: contacting said agent with a test cell, or a test animal, to evaluate the effect of the test agent on the activity or expression of 33410.

[3543] In a preferred embodiment, the ability of the agent to modulate the activity or expression of 33410 is evaluated in a second system, e.g., a cell-free, cell-based, or an animal system.

[3544] In a preferred embodiment, the ability of the agent to modulate the activity or expression of 33410 is evaluated in a cell based system, e.g., a two-hybrid assay.

[3545] In still another aspect, the invention provides a process for modulating 33410 polypeptide or nucleic acid expression or activity, e.g. using one or more of the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33410 polypeptides or nucleic acids, such as neurological conditions. For example, one or more of the screened compounds can be used to modulate one or more of cell-cell adhesion events, membrane excitability, neurite outgrowth, synaptogenesis, signal transduction, cell (e.g., neural cell) proliferation, growth, differentiation, or migration. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33410 polypeptides or nucleic acids, such as neurodegenerative conditions, and aberrant or deficient cellular proliferation or differentiation.

[3546] In yet another aspect, the invention provides methods for inhibiting the proliferation or inducing the killing, of a 33410-expressing cell, e.g., a 33410-expressing hyperproliferative cell, comprising contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33410 polypeptide or nucleic acid.

[3547] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[3548] In a preferred embodiment, the 33410-expressing cell is found in a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the tumor is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cell is found in a cancerous or pre-cancerous tissue, e.g., a cancerous or pre-cancerous tissue where a 33410 polypeptide or nucleic acid is expressed. In a preferred embodiment, the 33410-expressing cell is an endothelial cell, e.g., a blood vessel associated cell.

[3549] In a preferred embodiment, the compound is an inhibitor of a 33410 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a peptidomimetic, e.g., a phosphonate analog of a peptide substrate, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion).

[3550] In a preferred embodiment, the compound is an inhibitor of a 33410 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3551] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3552] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 33410-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33410 polypeptide or nucleic acid.

[3553] In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition. Most preferably, the disorder is a cancer, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancer is a sarcoma, a carcinoma, or an adenocarcinoma. Preferably, the cancer is found in a tissue where a 33410 polypeptide or nucleic acid is expressed, e.g., breast, ovarian, colon, liver, lung, kidney, or brain cancer. Most preferably, the cancer is found in the breast, ovary, colon, liver and lung.

[3554] In a preferred embodiment, the disorder is an endothelial cell disorder, i.e., a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or by aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis. Examples of endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[3555] In a preferred embodiment, the compound is an inhibitor of a 33410 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). The inhibitor can also be a trypsin inhibitor or a derivative thereof, or a peptidomimetic, e.g., a phosphonate analog of a peptide substrate.

[3556] In a preferred embodiment, the compound is an inhibitor of a 33410 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[3557] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[3558] In a preferred embodiment, the subject is a mammal, e.g., a human; a patient, e.g., a patient with a cancer or a cardiovascular condition.

[3559] The invention also provides assays for determining the activity of or the presence or absence of 33410 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. In other embodiments, the biological sample includes endothelial cells.

[3560] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 33410 polypeptide or nucleic acid molecule, including for disease diagnosis. Preferably, the biological sample includes a cancerous or pre-cancerous cell or tissue. For example, the cancerous tissue can be a solid tumor, a soft tissue tumor, or a metastatic lesion. Preferably, the cancerous tissue is a sarcoma, a carcinoma, or an adenocarcinoma. In other embodiments, the biological sample includes endothelial cells.

[3561] In another aspect, the invention features a method of diagnosing, or staging, a 33410-mediated disorder, e.g., a neurological disorder, or a cancer disorder, in a subject. The method includes evaluating the expression or activity of a 33410 nucleic acid or polypeptide, thereby diagnosing or staging the disorder. In a preferred embodiment, the expression or activity is compared with a reference value, e.g., a difference in the expression or activity level of the 33410 nucleic acid or polypeptide relative to a reference, e.g., a normal subject or a cohort of normal subjects, is indicative of the disorder, or a stage in the disorder.

[3562] In a preferred embodiment, the subject is a human. For example, the subject is a human suffering from, or at risk of, a cardiovascular or a cancer disorder as described herein.

[3563] In a preferred embodiment, the evaluating step occurs in vitro or ex vivo. For example, a sample, e.g., a blood or tissue sample, a biopsy, is obtained from the subject. Preferably, the sample contains a cancer or an endothelial cell.

[3564] In a preferred embodiment, the evaluating step occurs in vivo. For example, the evaluating step can be performed by administering to the subject a detectably labeled agent that interacts with the 33410-associated nucleic acid or polypeptide, such that a signal is generated relative to the level of activity or expression of the 33410 nucleic acid or polypeptide.

[3565] In preferred embodiments, the method is performed: on a sample from a subject, e.g., a sample from a human subject, e.g., a sample from a patient suffering from, or at risk of, a cardiovascular or a cancer disorder as described herein; to determine if the individual from which the target nucleic acid or protein is taken should receive a drug or other treatment; to diagnose an individual for a disorder or for predisposition to resistance to treatment; or to stage a disease or disorder.

[3566] In a still further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., proliferative disorder, e.g., a cancer (e.g., breast, ovarian, colon, liver or lung cancer); or an endothelial cell disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 33410 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 33410 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder.
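
The before-and-after comparison described above can be sketched in Python as follows. This is only an illustration: the function name, the use of a log2 fold change, and the `min_fold` cutoff are assumptions, since the text does not specify units or a threshold for what counts as a change.

```python
import math


def expression_change(before: float, after: float, min_fold: float = 2.0) -> str:
    """Classify the change in a 33410 expression level after treatment.

    `before` and `after` are positive measurements in the same arbitrary
    units. `min_fold` is an assumed cutoff and is not taken from the text.
    """
    if before <= 0 or after <= 0:
        raise ValueError("expression levels must be positive")
    log2_fold = math.log2(after / before)
    cutoff = math.log2(min_fold)
    if log2_fold >= cutoff:
        return "increase"
    if log2_fold <= -cutoff:
        return "decrease"
    return "no change"


# Hypothetical mRNA levels measured before and after treatment.
print(expression_change(before=10.0, after=2.0))
```

Either direction of change may be informative, which is why the sketch reports the direction rather than a yes/no result.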

[3567] In a preferred embodiment, the disorder is a cancer of the breast, ovary, colon, lung, or liver. In other embodiments, the disorder is an endothelial cell disorder. The level of 33410 nucleic acid or polypeptide expression can be detected by any method described herein.

[3568] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of a 33410 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[3569] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression or activity of a 33410 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 33410 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 33410 nucleic acid or polypeptide expression can be detected by any method described herein.

[3570] In a preferred embodiment, the sample includes cells obtained from a cancerous tissue where a 33410 polypeptide or nucleic acid is expressed, e.g., a cancer of the breast, ovary, colon, lung, or liver.

[3571] In a preferred embodiment, the sample is a tissue sample (e.g., a biopsy), a bodily fluid, or cultured cells (e.g., a tumor cell line).

[3572] In a preferred embodiment, the sample includes endothelial cells.

[3573] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 33410 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 33410 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 33410 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
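
As a rough illustration of the array described above, the following Python sketch models each address as a (row, column) position holding one unique capture probe. The probe names and helper functions are hypothetical and stand in for real nucleic acid or antibody probes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """A positionally distinguishable location on the array."""
    row: int
    col: int


def build_array(probes: list[list[str]]) -> dict[Address, str]:
    """Map each address to its capture probe, rejecting duplicate probes."""
    array: dict[Address, str] = {}
    seen: set[str] = set()
    for r, row in enumerate(probes):
        for c, probe in enumerate(row):
            if probe in seen:
                raise ValueError(f"duplicate capture probe: {probe}")
            seen.add(probe)
            array[Address(r, c)] = probe
    return array


def addresses_for(array: dict[Address, str], target: str) -> list[Address]:
    """Return the addresses whose probe name starts with the target label."""
    return [addr for addr, probe in array.items() if probe.startswith(target)]


# Hypothetical 2 x 2 layout with one 33410-specific probe.
layout = [["33410_probe", "control_1"], ["control_2", "control_3"]]
array = build_array(layout)
print(addresses_for(array, "33410"))
```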

[3574] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

[3575] Detailed Description of 33410

[3576] The human 33410 sequence (Example 39; SEQ ID NO:53), which is approximately 4667 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2508 nucleotides, including the termination codon (nucleotides indicated as coding of SEQ ID NO:53; SEQ ID NO:55). The coding sequence encodes an 835 amino acid protein (SEQ ID NO:54). The human 33410 protein of SEQ ID NO:54 and FIG. 27 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 14 amino acids (from amino acid 1 to about amino acid 14 of SEQ ID NO:54), which upon cleavage results in the production of a mature protein form.
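
The lengths quoted above are mutually consistent, which can be checked with a few lines of Python. The function names are illustrative only; the numbers are those stated in the paragraph.

```python
def coding_length(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues at three nucleotides per codon."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def mature_length(n_residues: int, signal_length: int) -> int:
    """Residues remaining after the signal sequence is cleaved."""
    return n_residues - signal_length


# 835 codons plus one termination codon.
print(coding_length(835))
# Residues 15 to 835 of SEQ ID NO:54.
print(mature_length(835, 14))
```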

[3577] Human 33410 contains the following regions or other structural features: a carboxylesterase domain (PFAM Accession Number PF00135) located at about amino acid residues 42 to 601 of SEQ ID NO:54 which includes a carboxylesterase type-B signature 2 domain (Prosite Accession Number PS00941) at about amino acid 139 to 149 of SEQ ID NO:54; and one predicted transmembrane domain at about amino acid 676 to 698 of SEQ ID NO:54.

[3578] The 33410 protein also includes the following domains: four predicted N-glycosylation sites (PS00001) at about amino acids 98 to 101, 136 to 139, 522 to 525, and 823 to 826 of SEQ ID NO:54; three predicted glycosaminoglycan attachment sites (PS00002) at about amino acids 264 to 267, 718 to 721 and 793 to 796 of SEQ ID NO:54; three predicted cAMP- and cGMP-dependent protein kinase phosphorylation sites (PS00004) at about amino acids 331 to 334, 447 to 450, and 710 to 713 of SEQ ID NO:54; six predicted Protein Kinase C phosphorylation sites (PS00005) at about amino acids 156 to 158, 430 to 432, 467 to 469, 619 to 621, 640 to 642, and 832 to 834 of SEQ ID NO:54; five predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 35 to 38, 187 to 190, 223 to 226, 404 to 407, and 524 to 527 of SEQ ID NO:54; one predicted Tyrosine kinase phosphorylation site (PS00007) at about amino acids 664 to 672 of SEQ ID NO:54; fourteen predicted N-myristoylation sites (PS00008) from about amino acids 10 to 15, 18 to 23, 30 to 35, 71 to 76, 95 to 100, 112 to 117, 198 to 203, 263 to 268, 292 to 297, 381 to 386, 400 to 405, 636 to 641, 716 to 721, and 750 to 755 of SEQ ID NO:54; one predicted amidation site (PS00009) at about amino acids 174 to 177 of SEQ ID NO:54; and one predicted prokaryotic membrane lipoprotein lipid attachment site (PS00013) at about amino acids 260 to 270 of SEQ ID NO:54.

[3579] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[3580] A plasmid containing the nucleotide sequence encoding human 33410 (clone Fbh33410FL) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[3581] The 33410 protein contains a significant number of structural characteristics in common with members of the carboxylesterase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics. Carboxylesterase family members, such as acetylcholinesterase, are known to act on carboxylic esters. Based on the differential patterns of inhibition by organophosphates, carboxylesterases have been classified into three categories (A, B and C) (Myers, M. et al. (1988) Mol. Biol. Evol. 5(2):113-119). 33410 proteins of the invention include a carboxylesterase type B signature 2 domain located at about amino acids 139 to 149 of SEQ ID NO:54, which suggests that the 33410 proteins belong to the carboxylesterase type B family.

[3582] Carboxylesterase family members are characterized by a catalytic triad of amino acids: a serine, a glutamate or aspartate and a histidine. The sequence around the active site serine is well conserved and can be used as a signature pattern. A second signature pattern is located in the N-terminal section and contains a cysteine involved in disulfide bond formation. Typical consensus patterns of carboxylesterases are F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G (SEQ ID NO:59) (where S is the active site residue) and [ED]-D-C-L-[YT]-[LIV]-[DNS]-[LIV]-[LIVFYW]-x-[PQR] (where C is involved in a disulfide bond). 33410 proteins have a similar pattern starting at about amino acid 252 to 267 of SEQ ID NO:54 as follows: FGGDPERITIFGSGAG.
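
To make the comparison with the first consensus pattern concrete, the following Python sketch expands a simple PROSITE-style pattern into one regular-expression token per residue position and lists where a fragment departs from it. The helper names are illustrative, and the parser is an assumption that handles only the elements used in the patterns above (single residues, bracketed alternatives, and x with an optional repeat count).

```python
import re

ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def expand(pattern: str) -> list[str]:
    """Expand a PROSITE-style pattern into one regex token per position."""
    tokens: list[str] = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element}")
        token, count = m.groups()
        tokens.extend(["." if token == "x" else token] * int(count or 1))
    return tokens


def mismatches(pattern: str, fragment: str) -> list[int]:
    """Return 1-based fragment positions that do not fit the pattern."""
    tokens = expand(pattern)
    if len(tokens) != len(fragment):
        raise ValueError("fragment and pattern differ in length")
    return [
        i + 1
        for i, (token, residue) in enumerate(zip(tokens, fragment))
        if re.fullmatch(token, residue) is None
    ]


SIGNATURE_1 = "F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G"
print(mismatches(SIGNATURE_1, "FGGDPERITIFGSGAG"))
```

Run on the fragment as printed above, this reports a single difference at position 14 of the fragment, where the consensus expects S. That is consistent with the text describing the 33410 pattern as similar to, rather than identical with, the consensus.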

[3583] 33410 proteins of the invention are homologous to rat neuroligin proteins, in particular, the rat neuroligin-2 protein (FIG. 29). Thus, the proteins of the invention are members of the neuroligin subfamily of carboxylesterase type B proteins. As used herein, the term “neuroligin” refers to cell surface molecules composed of five domains: an N-terminal cleaved signal sequence, a large extracellular domain homologous to esterases, a linker domain between the transmembrane region and the esterase homology domain, a single transmembrane region, and a cytoplasmic tail (Ichtchenko, K. et al. (1996) J. Biol. Chem. 271(5):2676-2682). Neurexin binding is known to be calcium-dependent and is mediated by an EF hand motif. 33410 proteins contain several structural features of neuroligin family members. For example, 33410 proteins have an extracellular domain located at about amino acids 1-675 of SEQ ID NO:54 (which includes a carboxylesterase domain located at amino acids 42 to 601 of SEQ ID NO:54), a transmembrane domain located at about amino acids 676-698, and a short cytoplasmic domain located at about amino acids 699-835 of SEQ ID NO:54. 33410 proteins further include an EF hand motif from about amino acid 387 to 416 of SEQ ID NO:54, and six conserved cysteine residues located at about amino acids 106, 141, 270, 317, 328, 487 and 521 of SEQ ID NO:54.
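
The predicted topology can be written down as a list of inclusive residue ranges and checked for gaps or overlaps. The sketch below is illustrative only: it uses the approximate coordinates given in this description, splits the extracellular region at the predicted signal sequence cleavage site (residues 1 to 14), and the function name is an assumption.

```python
# (name, first residue, last residue), inclusive, on SEQ ID NO:54.
SEGMENTS = [
    ("signal sequence", 1, 14),
    ("extracellular domain (mature)", 15, 675),
    ("transmembrane domain", 676, 698),
    ("cytoplasmic domain", 699, 835),
]


def segment_lengths(segments, protein_length: int) -> dict[str, int]:
    """Return each segment's length, checking the segments tile the protein."""
    lengths: dict[str, int] = {}
    expected_start = 1
    for name, start, end in segments:
        if start != expected_start or end < start:
            raise ValueError(f"gap or overlap at {name}")
        lengths[name] = end - start + 1
        expected_start = end + 1
    if expected_start != protein_length + 1:
        raise ValueError("segments do not cover the full protein")
    return lengths


for name, length in segment_lengths(SEGMENTS, 835).items():
    print(f"{name}: {length} residues")
```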

[3584] Typically, neuroligins are expressed at high levels in the brain, primarily in neurons. Typically, neuroligins are capable of mediating cell adhesion events associated with development and/or maintenance, e.g., neural events such as synaptogenesis, recruitment of receptors, channels, and signal transduction molecules at synaptic sites (e.g., at excitatory synapses) (Song et al. (1999) Proc. Natl. Acad. Sci. 96(3):1100-5). Typically, neuroligins are capable of interacting with a cell surface protein, e.g., a neurexin (e.g., a β-neurexin). Typically, neuroligin-β-neurexin interactions mediate cell adhesion events, e.g., neuron-neuron, or neuron-glia cell adhesion events.

[3585] A 33410 polypeptide can include a “carboxylesterase domain” or regions homologous with a “carboxylesterase” domain. A 33410 polypeptide can optionally further include at least one transmembrane domain, at least one extracellular domain, and at least one intracellular domain. A 33410 polypeptide can optionally further include at least one, two, three, preferably four N-glycosylation sites; at least one, two, preferably three cAMP/cGMP phosphorylation sites; at least one, two, three, four, five, preferably six protein kinase C sites; at least one, two, three, four, preferably five casein kinase II sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, and preferably fourteen N-myristoylation sites; at least one tyrosine phosphorylation site; at least one amidation site; and at least one, two, and preferably three glycosaminoglycan attachment sites.

[3586] As used herein, the term “carboxylesterase domain” refers to a protein domain which includes a carboxylesterase type B signature 2 domain. Preferably, the carboxylesterase type B signature 2 domain is about 5 to 20 amino acids, more preferably 8-15, most preferably 11 amino acids and includes the sequence [EDX(0, 1)CLYX] (SEQ ID NO:60). Most preferably, the carboxylesterase type B signature 2 domain has the amino acid sequence: EDCLYNIYVP located at about amino acids 139 to 149 of SEQ ID NO:54. Preferably, the carboxylesterase domain has an amino acid sequence of about 450 to about 650 amino acid residues and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 100. Preferably, a carboxylesterase domain includes at least about 450 to about 600 amino acids, more preferably about 500 to about 575 amino acid residues, about 550 to 570, or about 559 amino acids and has a bit score for the alignment of the sequence to the carboxylesterase domain (HMM) of at least 200, preferably 300, more preferably 400 or greater. The carboxylesterase domain (HMM) has been assigned the PFAM Accession (PF00135) (http://genome.wustl.edu/Pfam/html). An alignment of the carboxylesterase domain (from about amino acids 42 to about 601 of SEQ ID NO:54) of human 33410 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 28.

[3587] In a preferred embodiment, a 33410 polypeptide or protein has a “carboxylesterase domain” or a region which includes at least about 400 to about 650 amino acids, preferably 450 to about 600 amino acids, more preferably about 500 to about 570 amino acid residues, about 550 to 570, or about 559 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “carboxylesterase domain,” e.g., the carboxylesterase domain of human 33410 (e.g., residues 42 to 601 of SEQ ID NO:54).

[3588] To identify the presence of a “carboxylesterase” domain in a 33410 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “carboxylesterase domain” in the amino acid sequence of human 33410 at about residues 42 to 601 of SEQ ID NO:54 (see FIG. 26).
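
The effect of the threshold score can be illustrated with a small Python sketch. It does not call HMMER; it only filters a list of (domain name, bit score) pairs. The hit names and scores are hypothetical, and treating a score equal to the threshold as a hit is an assumption.

```python
def filter_hits(hits, threshold_bits: float = 15.0):
    """Keep the (name, bit_score) hits scoring at or above the threshold."""
    return [(name, score) for name, score in hits if score >= threshold_bits]


# Hypothetical search results, not taken from the search described above.
hits = [("carboxylesterase", 412.0), ("weak_domain", 9.5), ("noise", 3.0)]

print(filter_hits(hits))                      # default threshold of 15 bits
print(filter_hits(hits, threshold_bits=8.0))  # lowered threshold of 8 bits
```

Lowering the threshold admits weaker matches, which trades specificity for sensitivity.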

[3589] In one embodiment, a 33410 protein includes at least one transmembrane domain. As used herein, the term “transmembrane domain” includes an amino acid sequence of about 15 amino acid residues in length that spans a phospholipid membrane. More preferably, a transmembrane domain includes at least about 16, 18, 20, 21, 23, 25, 30, 35 or 40 amino acid residues and spans a phospholipid membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an α-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are described in, for example, http://pfam.wustl.edu/cgi-bin/getdesc?name=7tm-1, and Zagotta W. N. et al. (1996) Annual Rev. Neurosci. 19:235-63, the contents of which are incorporated herein by reference.

[3590] In a preferred embodiment, a 33410 polypeptide or protein has a “transmembrane domain” or a region which includes at least about 1 to 45, more preferably about 10 to 35, even more preferably about 20 to 25 or 23 amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., the transmembrane domain of human 33410 (e.g., residues 676 to 698 of SEQ ID NO:54).

[3591] A 33410 protein further includes a predicted N-terminal extracellular domain located at about amino acids 1-675 (or 15-675 of the mature protein) of SEQ ID NO:54. As used herein, an “N-terminal extracellular domain” includes an amino acid sequence about 1-800, preferably about 200-700, and even more preferably about 300-680 or 675, amino acid residues in length and is located outside of a cell or extracellularly. The C-terminal amino acid residue of an “N-terminal extracellular domain” is adjacent to an N-terminal amino acid residue of a transmembrane domain in a naturally occurring 33410 or 33410-like protein. For example, an N-terminal extracellular domain is located at about amino acid residues 1-675 of SEQ ID NO:54.

[3592] In a preferred embodiment, a 33410 polypeptide or protein has an “N-terminal extracellular domain” or a region which includes at least about 1-800, preferably about 200-700, and even more preferably about 300-680 or 675 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “N-terminal extracellular domain,” e.g., the N-terminal extracellular domain of human 33410 (e.g., residues 1-675 of SEQ ID NO:54). Preferably, the N-terminal extracellular domain is capable of interacting with (e.g., binding to) an extracellular signal (e.g., a neurexin) and/or modulating cell adhesion.

[3593] In another embodiment, a 33410 protein includes a “C-terminal cytoplasmic domain”, also referred to herein as a C-terminal cytoplasmic tail, in the sequence of the protein. As used herein, a “C-terminal cytoplasmic domain” includes an amino acid sequence having a length of at least about 50 to 200, preferably 100 to 150, and more preferably 136 amino acid residues and is located within a cell or within the cytoplasm of a cell. Accordingly, the N-terminal amino acid residue of a “C-terminal cytoplasmic domain” is adjacent to a C-terminal amino acid residue of a transmembrane domain in a naturally-occurring 33410 or 33410-like protein. For example, a C-terminal cytoplasmic domain is found at about amino acid residues 699-835 of SEQ ID NO:54.

[3594] In a preferred embodiment, a 33410 polypeptide or protein has a C-terminal cytoplasmic domain or a region which includes at least about 50 to 200, preferably 100 to 150, and more preferably 136 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “C-terminal cytoplasmic domain,” e.g., the C-terminal cytoplasmic domain of human 33410 (e.g., residues 699-835 of SEQ ID NO:54).

[3595] A 33410 molecule can further include a signal sequence. As used herein, a “signal sequence” refers to a peptide of about 10-40 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 12-30 amino acid residues, preferably about 13-20 amino acid residues, more preferably about 14 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 33410 protein contains a signal sequence of about amino acids 1-14 of SEQ ID NO:54. The “signal sequence” is cleaved during processing of the mature protein. The mature 33410 protein corresponds to amino acids 15 to 835 of SEQ ID NO:54.
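
The hydrophobic content of a candidate signal sequence can be estimated as in the following Python sketch. The hydrophobic residue set follows the examples listed above (alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, and proline). The 14-residue peptide is a made-up example and is not residues 1-14 of SEQ ID NO:54.

```python
# One-letter codes for the residues named as hydrophobic in the text.
HYDROPHOBIC = set("AVLIFYWP")


def hydrophobic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are in the hydrophobic set."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    residues = peptide.upper()
    return sum(aa in HYDROPHOBIC for aa in residues) / len(residues)


# Hypothetical 14-residue peptide used only to exercise the function.
example = "MKLLAVLGSTGLWA"
print(f"{hydrophobic_fraction(example):.0%} hydrophobic")
```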

[3596] As the 33410 polypeptides of the invention may modulate 33410-mediated activities, they may be useful as or for developing novel diagnostic and therapeutic agents for 33410-mediated or related disorders, as described below.

[3597] As used herein, a “33410 activity”, “biological activity of 33410” or “functional activity of 33410”, refers to an activity exerted by a 33410 protein, polypeptide or nucleic acid molecule on e.g., a 33410-responsive cell or on a 33410 substrate, e.g., a protein substrate, as determined in vivo or in vitro. In one embodiment, a 33410 activity is a direct activity, such as an association with a 33410 target molecule. A “target molecule” or “binding partner” is a molecule with which a 33410 protein binds or interacts in nature, e.g., a cell surface molecule, e.g., a neurexin. A 33410 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 33410 protein with a 33410 receptor. For example, the 33410 proteins of the present invention can have one or more of the following activities: (1) ability to catalyze the hydrolysis of carboxylic esters; (2) ability to mediate cell-cell (e.g., neuron-neuron, or neuron-glia) recognition events, adhesion or attachment; (3) ability to interact with a cell surface protein (e.g., neurexin) or an extracellular component; (4) ability to modulate cell migration; (5) ability to modulate patterning; (6) ability to modulate proliferation, and/or differentiation, of a cell (e.g., a neural or a cancer cell); (7) ability to modulate embryonic development and differentiation; (8) ability to modulate morphogenesis; (9) ability to modulate tissue maintenance; (10) ability to modulate neural development, e.g., axonal growth, synaptogenesis, neurite outgrowth, membrane excitability and/or guidance; or (11) ability to bind a divalent cation, e.g., Zn2+, Mg2+, Cd2+, Mn2+, and/or preferably a Ca2+ ion.

[3598] Based on the above-described sequence similarities, the 33410 molecules of the present invention are predicted to have biological activities similar to those of carboxylesterase family members, in particular neuroligin proteins. Thus, the 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell proliferative and cell differentiative disorders, as well as neural disorders (e.g., neurodegenerative disorders including CNS disorders).

[3599] The 33410 protein may be involved in disorders characterized by aberrant activity of the cells in which it is expressed. 33410 is expressed in cells and tissues derived from heart, arteries, kidney, brain (e.g., cortex and hypothalamus), spinal cord and ovaries (Table 16). Accordingly, the 33410 molecules can serve as novel diagnostic targets and therapeutic agents for controlling disorders involving the cells or tissues where they are expressed. For example, the 33410 molecules can serve as novel diagnostic targets and therapeutic agents for controlling disorders of cell proliferation, cell differentiation, angiogenesis, organogenesis, and cell signaling.

[3600] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicalla-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, 
including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degenration, multiple system atrophy, including striatonigral degenration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telanglectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephatopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal 
tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[3601] The polypeptides and nucleic acids of the invention can also be used to treat, prevent, and/or diagnose cancers and neoplastic conditions. Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[3602] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[3603] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[3604] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[3605] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[3606] Examples of cancers or neoplastic conditions include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma.

[3607] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[3608] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[3609] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors. Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[3610] Examples of cellular proliferative and/or differentiative disorders of the ovary include, but are not limited to, ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[3611] The 33410 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of hematopoietic neoplastic disorders. Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[3612] As the 33410 mRNA is expressed in the normal heart, artery, spinal cord, kidney, brain, and ovary, it is likely that 33410 molecules of the present invention are involved in disorders characterized by aberrant activity of these cells. Thus, the 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders involving aberrant activity of these cells. For example, modulators of 33410 polypeptide or nucleic acid activity or expression can be used to treat or prevent endothelial cell disorders, and more broadly cardiovascular or blood vessel disorders.

[3613] The 33410 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:54 thereof are collectively referred to as “polypeptides or proteins of the invention” or “33410 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “33410 nucleic acids.” 33410 molecules refer to 33410 nucleic acids, polypeptides, and antibodies.

[3614] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA) and RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA generated, e.g., by the use of nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[3615] The term “isolated or purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules that are present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[3616] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and non-aqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
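The four sets of hybridization and wash conditions recited above can be summarized as a small lookup table. The following is only an illustrative sketch: the names `Stringency`, `CONDITIONS`, and `conditions_for` are hypothetical and are not part of the disclosure, and the values are transcribed from the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One set of hybridization/wash conditions (hypothetical container)."""
    hybridization: str
    washes: str
    wash_buffer: str
    wash_temp_c: int  # wash temperature in degrees Celsius


# Values transcribed from the paragraph above; the keys are illustrative names.
CONDITIONS = {
    # For low stringency the text gives "at least 50 C", optionally raised to 55 C.
    "low": Stringency("6x SSC at about 45 C", "two", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C", "one or more", "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C", "one or more", "0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "one or more", "0.2x SSC, 1% SDS", 65),
}

# The text names very high stringency as the condition to use unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def conditions_for(level=None):
    """Return the conditions for a named level, falling back to the default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `conditions_for()` with no argument returns the very high stringency entry, mirroring the statement that condition (4) applies unless otherwise specified.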

[3617] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO:53 or 55, corresponds to a naturally-occurring nucleic acid molecule.

[3618] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

[3619] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include an open reading frame encoding a 33410 protein, preferably a mammalian 33410 protein, and can further include non-coding regulatory sequences, and introns.

[3620] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, the language “substantially free” means preparation of 33410 protein having less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-33410 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-33410 chemicals. When the 33410 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[3621] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 33410 (e.g., the sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______) without abolishing or more preferably, without substantially altering a biological activity, whereas an “essential” amino acid residue results in such a change. For example, amino acid residues that are conserved among the polypeptides of the present invention, e.g., those present in the carboxylesterase domain, are predicted to be particularly unamenable to alteration.

[3622] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 33410 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 33410 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 33410 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
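As an illustration of the side chain families listed above, the following sketch tests whether a substitution stays within at least one family. The function name `is_conservative` and the use of one-letter amino acid codes are assumptions made for the example; the family memberships are taken directly from the preceding paragraph, including the overlapping beta-branched and aromatic groups.

```python
# Side chain families as listed in the text, written with one-letter codes.
FAMILIES = {
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "acidic": set("DE"),                 # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),   # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "nonpolar": set("AVLIPFMW"),         # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta_branched": set("TVI"),         # threonine, valine, isoleucine
    "aromatic": set("YFWH"),             # Tyr, Phe, Trp, His
}


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one listed family."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in members and replacement in members
               for members in FAMILIES.values())
```

Because the families overlap, a pair such as threonine and valine counts as conservative through the beta-branched family even though the two residues sit in different polarity groups.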

[3623] As used herein, a “biologically active portion” of a 33410 protein includes a fragment of a 33410 protein that participates in an interaction between a 33410 molecule and a non-33410 molecule. Biologically active portions of a 33410 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 33410 protein, e.g., the amino acid sequence shown in SEQ ID NO:54, which include fewer amino acids than the full length 33410 proteins, and exhibit at least one activity of a 33410 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 33410 protein, e.g., (1) binding to a cell surface ligand, e.g., neurexins; (2) acting as a cell surface receptor; (3) possessing cell adhesion properties; (4) mediating cell-cell interactions between, e.g., neurons; (5) regulating inter-neuronal recognition pathways for axon pathfinding; (6) regulating neuritogenesis; or (7) binding of a divalent cation, e.g., Zn2+, Ca2+, Mg2+, Cd2+ and/or Mn2+. A biologically active portion of a 33410 protein can be a polypeptide that is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 33410 protein can be used as targets for developing agents that modulate a 33410 mediated activity, e.g., an activity as described herein.

[3624] Particular 33410 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:54. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:54 are termed substantially identical.

[3625] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:53 or 55 are termed substantially identical.

[3626] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[3627] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence (e.g., when aligning a second sequence to the 33410 amino acid sequence of SEQ ID NO:54 having 560 amino acid residues, at least [30%] 168, preferably at least [40%] 224, more preferably at least [50%] 280, even more preferably at least [60%] 336, and even more preferably at least [70%] 392, [80%] 448, or [90%] 504 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
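The arithmetic in the preceding paragraph can be sketched as follows. The function names are hypothetical. The choice of denominator for percent identity (all alignment columns, so that each gap position lowers the value) is one common convention adopted here as an assumption; the text says only that gaps are taken into account.

```python
def min_aligned_residues(reference_length, percents=(30, 40, 50, 60, 70, 80, 90, 100)):
    """Map each percentage to the minimum number of aligned reference residues.

    Uses integer ceiling division so that fractional residues round up.
    """
    return {p: -(-reference_length * p // 100) for p in percents}


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two already-aligned sequences of equal length.

    Assumption: identical positions are divided by the number of alignment
    columns, so every gap position counts against identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]  # ignore columns that are gaps in both
    if not columns:
        return 0.0
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

For the 560-residue reference length used in the example above, `min_aligned_residues(560)` reproduces the recited values 168, 224, 280, 336, 392, 448, and 504 for 30% through 90%.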

[3628] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. 48:444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity or homology limitation of the invention) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
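For orientation, a minimal sketch of Needleman-Wunsch global alignment is given below. It is not the GAP program: it assumes a simple match/mismatch score and a single linear gap penalty in place of a substitution matrix and separate gap and length weights, and the function name and default scores are illustrative only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming (simplified scoring).

    Returns (score, aligned_a, aligned_b). Scores are illustrative
    assumptions, not the parameters recited in the text.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With these default scores, aligning `"ACGT"` against `"AGT"` places a single gap opposite the `C`. A production comparison would instead use a substitution matrix and affine gap penalties such as those recited above.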

[3629] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1988)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[3630] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 33410 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 33410 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[3631] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[3632] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[3633] A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[3634] Various aspects of the invention are described in further detail below.

[3635] Isolated Nucleic Acid Molecules of 33410

[3636] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 33410 polypeptide described herein, e.g., a full-length 33410 protein or a fragment thereof, e.g., a biologically active portion of 33410 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 33410 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[3637] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:53, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 33410 protein (i.e., “the coding region” of SEQ ID NO:53, as shown in SEQ ID NO:55), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:53 (e.g., SEQ ID NO:55) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 42 to 601 of SEQ ID NO:54.

[3638] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ such that it can hybridize to the nucleotide sequence shown in SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, thereby forming a stable duplex.
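As a simple illustration of what a complement of a nucleotide sequence is, the sketch below computes the reverse complement of a DNA string. The function name is hypothetical, and the example handles only the four unambiguous bases; it says nothing about whether a given duplex would be stable under particular conditions.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a quick sanity check on any strand-handling code.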

[3639] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the entire length of the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or a portion, preferably of the same length, of any of these nucleotide sequences.

[3640] 33410 Nucleic Acid Fragments

[3641] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 33410 protein, e.g., an immunogenic or biologically active portion of a 33410 protein. A fragment can comprise those nucleotides of SEQ ID NO:53, which encode a carboxylesterase domain of human 33410. The nucleotide sequence determined from the cloning of the 33410 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 33410 family members, or fragments thereof, as well as 33410 homologues, or fragments thereof, from other species.

[3642] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 800 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[3643] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 33410 nucleic acid fragment can include a sequence corresponding to a carboxylesterase domain.

[3644] In a preferred embodiment, the fragment is at least 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 nucleotides in length.

[3645] 33410 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under stringent conditions to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, or of a naturally occurring allelic variant or mutant of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[3646] In a preferred embodiment the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical, or differ by 1, or by fewer than 5 or 10 bases, from a sequence disclosed herein. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
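The difference count described above, in which mismatches and looped-out insertions or deletions each count as differences, can be sketched for two sequences that have already been aligned. The function names and the particular limits passed as defaults are illustrative assumptions chosen from among the values recited in the text.

```python
def count_differences(aligned_probe, aligned_reference):
    """Count mismatched and gapped columns in a pairwise alignment.

    Gaps are written as '-'; a base opposite a gap counts as one difference.
    """
    if len(aligned_probe) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for p, r in zip(aligned_probe, aligned_reference) if p != r)


def probe_within_limits(aligned_probe, aligned_reference,
                        min_length=10, max_length=50, max_differences=5):
    """Check one illustrative combination of the recited length and difference limits.

    The probe length is its number of bases, excluding gap characters.
    """
    length = len(aligned_probe.replace("-", ""))
    differences = count_differences(aligned_probe, aligned_reference)
    return min_length <= length < max_length and differences < max_differences
```

The thresholds are parameters rather than constants because the text recites several alternative limits (for example, fewer than 5 or fewer than 10 differing bases).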

[3647] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid that encodes a carboxylesterase domain from about amino acid 42 to 601 of SEQ ID NO:54, a carboxylesterase type-B signature 2 domain from about amino acid 139 to 149 of SEQ ID NO:54, or a transmembrane domain from about amino acid 676 to 698 of SEQ ID NO:54.

[3648] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 33410 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a carboxylesterase domain from about amino acid 42 to 601 of SEQ ID NO:54; a carboxylesterase type-B signature 2 domain from about amino acid 139 to 149 of SEQ ID NO:54; or a transmembrane domain from about amino acid 676 to 698 of SEQ ID NO:54.
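
When choosing primers for one of these regions, the amino acid boundaries have to be translated into nucleotide positions. The sketch below shows only that arithmetic. It assumes positions are numbered from the first base of the start codon of an open reading frame; whether that numbering coincides with SEQ ID NO:55 is an assumption, the function name is hypothetical, and the domain boundaries are approximate ("about") in the text.

```python
def codon_span(first_aa: int, last_aa: int) -> tuple:
    """Return the 1-based, inclusive nucleotide span that encodes residues
    first_aa..last_aa, counted from the first base of the start codon."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    return (3 * (first_aa - 1) + 1, 3 * last_aa)


# Domain boundaries (amino acid positions) as stated in the paragraph above.
DOMAINS = {
    "carboxylesterase": (42, 601),
    "carboxylesterase type-B signature 2": (139, 149),
    "transmembrane": (676, 698),
}

spans = {name: codon_span(*aa_range) for name, aa_range in DOMAINS.items()}
```

Under these assumptions the carboxylesterase domain would be encoded by coding-sequence positions 124 to 1803, so a primer pair flanking those positions would amplify the whole domain.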

[3649] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[3650] A nucleic acid fragment encoding a “biologically active portion of a 33410 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:53 or 55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______, which encodes a polypeptide having a 33410 biological activity (e.g., the biological activities of the 33410 proteins are described herein), expressing the encoded portion of the 33410 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 33410 protein. For example, a nucleic acid fragment encoding a biologically active portion of 33410 includes a carboxylesterase domain, e.g., amino acid residues about 42 to 601 of SEQ ID NO:54. A nucleic acid fragment encoding a biologically active portion of a 33410 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[3651] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AW291374. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 927-1424; not include all of the nucleotides of AW291374, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AW291374; or can differ by one or more nucleotides in the region of overlap.

[3652] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AI337820. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 804-1331; not include all of the nucleotides of AI337820, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AI337820; or can differ by one or more nucleotides in the region of overlap.

[3653] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AB037787. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1273-4641; not include all of the nucleotides of AB037787, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AB037787; or can differ by one or more nucleotides in the region of overlap.

[3654] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number C74943. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1422-2345; not include all of the nucleotides of C74943, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of C74943; or can differ by one or more nucleotides in the region of overlap.

[3655] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number V59639. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 3758-4297; not include all of the nucleotides of V59639, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of V59639; or can differ by one or more nucleotides in the region of overlap.

[3656] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number AAA97870. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2372; not include all of the nucleotides of AAA97870, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of AAA97870; or can differ by one or more nucleotides in the region of overlap.

[3657] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank accession number BAA92604. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 1275-2924; not include all of the nucleotides of BAA92604, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of BAA92604; or can differ by one or more nucleotides in the region of overlap.

[3658] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence in WO 01/27277.7. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2320; not include all of the nucleotides of a sequence in WO 01/27277.7, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence in WO 01/27277.7; or can differ by one or more nucleotides in the region of overlap.

[3659] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence in WO 01/27277.8. Differ can include differing in length or sequence identity. E.g., a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:53 or SEQ ID NO:55 outside the region of nucleotides 420-2372; not include all of the nucleotides of a sequence in WO 01/27277.8, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence in WO 01/27277.8; or can differ by one or more nucleotides in the region of overlap.

[3660] In a preferred embodiment, a nucleic acid fragment includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000 or more nucleotides in length and hybridizes under stringent hybridization conditions to a nucleic acid molecule of SEQ ID NO:53, or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______.

[3661] 33410 Nucleic Acid Variants

[3662] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid that encodes the same 33410 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:54. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3663] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons have been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
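
The "at least 10%, or 20% of the codons" criterion can be checked by comparing an altered coding sequence with the starting sequence codon by codon. The following is a minimal sketch under stated assumptions: both sequences are in frame, have the same length, and encode the same protein. The function name and the two toy sequences are hypothetical and are not 33410 sequences.

```python
def fraction_codons_altered(original: str, altered: str) -> float:
    """Return the fraction of codons that differ between two in-frame
    coding sequences of equal length."""
    if len(original) != len(altered):
        raise ValueError("sequences must have equal length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    original, altered = original.upper(), altered.upper()
    n_codons = len(original) // 3
    changed = sum(
        1
        for i in range(0, len(original), 3)
        if original[i:i + 3] != altered[i:i + 3]
    )
    return changed / n_codons


# Toy 10-codon sequences; two codons carry synonymous changes
# (CTA -> CTG, both leucine; GGA -> GGC, both glycine).
toy_original = "ATGGCTCTAGGAAAACGTTCTGAGCCATAA"
toy_altered = "ATGGCTCTGGGCAAACGTTCTGAGCCATAA"
fraction = fraction_codons_altered(toy_original, toy_altered)
```

In this toy case two of ten codons differ, so the altered sequence would meet a 20% threshold.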

[3664] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[3665] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:53 or 55, or the sequence in ATCC Accession Number ______, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 2%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[3666] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:54 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:54 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 33410 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 33410 gene.

[3667] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cell adhesion, cellular proliferation, differentiation, or tumorigenesis; modulating neural cell activity; binding to neurexins on neurons; acting as a cell surface receptor; mediating cell-cell interactions between neurons; regulating inter-neuronal recognition pathways for axon pathfinding; regulating neuritogenesis; or binding divalent cations, e.g., Zn2+, Ca2+, Mg2+, Cd2+ and/or Mn2+.

[3668] Allelic variants of 33410, e.g., human 33410, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 33410 protein within a population that maintain the ability to bind an extracellular component or cell surface protein, e.g., neurexin. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:54, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 33410, e.g., human 33410, protein within a population that do not have the ability to bind to neurexins; act as a cell surface receptor; possess cell adhesion properties; mediate cell-cell interactions between neurons; regulate inter-neuronal recognition pathways for axon pathfinding; regulate neuritogenesis; or bind Zn2+, Ca2+, Mg2+, Cd2+ and/or Mn2+. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:54, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[3669] Moreover, nucleic acid molecules encoding other 33410 family members and, thus, which have a nucleotide sequence which differs from the 33410 sequences of SEQ ID NO:53 or SEQ ID NO:55, or the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number ______ are intended to be within the scope of the invention.

[3670] Antisense Nucleic Acid Molecules, Ribozymes and Modified 33410 Nucleic Acid Molecules

[3671] In another aspect, the invention features an isolated nucleic acid molecule that is antisense to 33410. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 33410 coding strand, or to only a portion thereof (e.g., the coding region of human 33410 corresponding to SEQ ID NO:55). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 33410 (e.g., the 5′ and 3′ untranslated regions).

[3672] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 33410 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 33410 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 33410 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
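
As an illustration of the design described above, the sketch below derives an antisense oligonucleotide complementary to the −10 to +10 region around a translation start site. It is a sketch only: the input is a toy cDNA-style (sense strand, DNA alphabet) sequence rather than a 33410 sequence, the function names are hypothetical, and it assumes the common convention in which +1 is the A of the ATG and there is no position 0.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(sense: str, start_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Return an oligo antisense to positions -upstream..+downstream.

    start_index is the 0-based index of the A of the start codon.
    """
    begin = max(0, start_index - upstream)
    end = min(len(sense), start_index + downstream)
    return reverse_complement(sense[begin:end])


# Toy sense strand: 16 nt of 5' untranslated sequence followed by an ORF start.
toy_sense = "GGCACGAGCCGCCACC" + "ATGGCTCTGCCAGGTAAC"
start = toy_sense.index("ATG")
target = toy_sense[start - 10:start + 10]
oligo = antisense_around_start(toy_sense, start)
```

Here `oligo` is a 20-nucleotide sequence whose reverse complement equals the targeted sense-strand window, which is the property an antisense oligonucleotide needs in order to base pair with the mRNA.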

[3673] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[3674] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 33410 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[3675] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[3676] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 33410-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 33410 cDNA disclosed herein (i.e., SEQ ID NO:53 or SEQ ID NO:55), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 33410-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 33410 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[3677] 33410 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 33410 gene (e.g., the 33410 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 33410 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) BioEssays 14(12):807-15. Potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[3678] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[3679] A 33410 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4 (1): 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra; Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[3680] PNAs of 33410 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 33410 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[3681] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[3682] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 33410 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 33410 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[3683] Isolated 33410 Polypeptides

[3684] In another aspect, the invention features an isolated 33410 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-33410 antibodies. 33410 protein can be isolated from cells or tissue sources using standard protein purification techniques. 33410 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[3685] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems that result in the alteration or omission of post-translational modifications present when expressed in a native cell, e.g., glycosylation or cleavage.

[3686] In a preferred embodiment, a 33410 polypeptide has one or more of the following characteristics:

[3687] (i) it acts on carboxylic esters, e.g., acetylcholinesterase;

[3688] (ii) it modulates cell-cell (e.g., neuron-neuron, or neuron-glia) recognition event or adhesion;

[3689] (iii) it binds an extracellular component or cell surface protein, e.g., neurexin;

[3690] (iv) it modulates neural developmental events, for example, membrane excitability, neurite outgrowth, and/or synaptogenesis;

[3691] (v) it has an amino acid composition of SEQ ID NO:54;

[3692] (vi) it has an overall sequence similarity of at least 60%, preferably at least 70, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO:54;

[3693] (vii) it can be found in human tissue;

[3694] (viii) it can be found in neural tissues, e.g., neurons;

[3695] (ix) it has a carboxylesterase domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 42 to 601 of SEQ ID NO:54;

[3696] (x) it has a carboxylesterase type-B signature 2 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 139 to 149 of SEQ ID NO:54;

[3697] (xi) it has a transmembrane domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues from about 676 to 698 of SEQ ID NO:54; or

[3698] (xii) it has at least 1, preferably 5, and most preferably 6 of the cysteines found in the amino acid sequence of the native protein (i.e., SEQ ID NO:54).

[3699] In a preferred embodiment, the 33410 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:54. In one embodiment, it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another, it differs from the corresponding sequence in SEQ ID NO:54 by at least one residue, but less than 20%, 15%, 10% or 5% of its residues differ from the corresponding sequence in SEQ ID NO:54. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are preferably differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment, the differences are not in the carboxylesterase domain (e.g., amino acids 42 to 601 of SEQ ID NO:54). In another preferred embodiment, one or more differences are in the carboxylesterase domain (e.g., amino acids 42 to 601 of SEQ ID NO:54).

[3700] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 33410 proteins differ in amino acid sequence from SEQ ID NO:54, yet retain biological activity.

[3701] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:54.

[3702] A 33410 protein or fragment is provided which varies from the sequence of SEQ ID NO:54 in regions defined by amino acids about 42 to 601 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:54 in regions defined by amino acids about 42 to 601. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[3703] In one embodiment, a biologically active portion of a 33410 protein includes a carboxylesterase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 33410 protein.

[3704] In a preferred embodiment, the 33410 protein has an amino acid sequence shown in SEQ ID NO:54. In other embodiments, the 33410 protein is substantially identical to SEQ ID NO:54. In yet another embodiment, the 33410 protein is substantially identical to SEQ ID NO:54 and retains the functional activity of the protein of SEQ ID NO:54, as described in detail in the subsections above. Accordingly, in another embodiment, the 33410 protein is a protein which includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more identical to SEQ ID NO:54.

[3705] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues from a sequence in WO 01/27277.7. Differ can include differing in length or sequence identity. E.g., a fragment can: include one or more amino acid residues from SEQ ID NO:54 outside the region encoded by nucleotides 420-2320; not include all of the amino acid residues of a sequence in WO 01/27277.7, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence in WO 01/27277.7; or can differ by one or more amino acid residues in the region of overlap.

[3706] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues from a sequence in WO 01/27277.8. Differ can include differing in length or sequence identity. E.g., a fragment can: include one or more amino acid residues from SEQ ID NO:54 outside the region encoded by nucleotides 420-2372; not include all of the amino acid residues of a sequence in WO 01/27277.8, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence in WO 01/27277.8; or can differ by one or more amino acid residues in the region of overlap.

[3707] 33410 Chimeric or Fusion Proteins

[3708] In another aspect, the invention provides 33410 chimeric or fusion proteins. As used herein, a 33410 “chimeric protein” or “fusion protein” includes a 33410 polypeptide linked to a non-33410 polypeptide. A “non-33410 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 33410 protein, e.g., a protein which is different from the 33410 protein and which is derived from the same or a different organism. The 33410 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 33410 amino acid sequence. In a preferred embodiment, a 33410 fusion protein includes at least one (or two) biologically active portions of a 33410 protein. The non-33410 polypeptide can be fused to the N-terminus or C-terminus of the 33410 polypeptide.

[3709] The fusion protein can include a moiety that has a high affinity for a ligand. For example, the fusion protein can be a GST-33410 fusion protein in which the 33410 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 33410. Alternatively, the fusion protein can be a 33410 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 33410 can be increased through use of a heterologous signal sequence.

[3710] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[3711] The 33410 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 33410 fusion proteins can be used to affect the bioavailability of a 33410 substrate. 33410 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 33410 protein; (ii) mis-regulation of the 33410 gene; and (iii) aberrant post-translational modification of a 33410 protein.

[3712] Moreover, the 33410-fusion proteins of the invention can be used as immunogens to produce anti-33410 antibodies in a subject, to purify 33410 ligands and in screening assays to identify molecules that inhibit the interaction of 33410 with a 33410 substrate.

[3713] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 33410-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 33410 protein.

[3714] Variants of 33410 Proteins

[3715] In another aspect, the invention also features a variant of a 33410 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 33410 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 33410 protein. An agonist of the 33410 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 33410 protein. An antagonist of a 33410 protein can inhibit one or more of the activities of the naturally occurring form of the 33410 protein by, for example, competitively modulating a 33410-mediated activity of a 33410 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 33410 protein.

[3716] Variants of a 33410 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 33410 protein for agonist or antagonist activity.

[3717] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 33410 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 33410 protein.

[3718] Variants in which a cysteine residue is added or deleted or in which a residue that is glycosylated is added or deleted are particularly preferred.

[3719] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property, are known in the art. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 33410 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

[3720] Cell based assays can be exploited to analyze a variegated 33410 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line, which ordinarily responds to 33410 in a substrate-dependent manner. The transfected cells are then contacted with 33410 and the effect of the expression of the mutant on signaling by the 33410 substrate can be detected, e.g., by measuring the amount of binding to neurexins on neurons; the ability to act as a cell surface receptor; the cell adhesion properties; the ability to mediate cell-cell interactions between neurons; the ability to regulate inter-neuronal recognition pathways for axon pathfinding; the ability to regulate neuritogenesis; or the ability to bind Zn2+, Ca2+, Mg2+, Cd2+ and/or Mn2+. Plasmid DNA can then be recovered from those cells that score for inhibition, or alternatively, potentiation of signaling by the 33410 substrate, and the individual clones further characterized.

[3721] In another aspect, the invention features a method of making a 33410 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 33410 polypeptide. The method includes: altering the sequence of a 33410 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[3722] In another aspect, the invention features a method of making a fragment or analog of a 33410 polypeptide having a biological activity of a naturally occurring 33410 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 33410 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[3723] Anti-33410 Antibodies

[3724] In another aspect, the invention provides an anti-33410 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[3725] The anti-33410 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[3726] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kD or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kD or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[3727] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 33410 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-33410 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[3728] The anti-33410 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[3729] Phage display and combinatorial methods for generating anti-33410 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[3730] In one embodiment, the anti-33410 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[3731] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[3732] An anti-33410 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[3733] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J Natl Cancer Inst. 80:1553-1559).

[3734] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The recipient CDR's may be replaced with at least a portion of a non-human CDR, or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 33410 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent) immunoglobulin. The acceptor framework is a naturally occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[3735] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
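
The definition of a consensus sequence given above can be sketched in a few lines of code. The following is an illustration only, not a method from the specification: the function name `consensus_sequence` is hypothetical, the input sequences are assumed to be already aligned to equal length, and the toy sequences are invented. Because either residue may be used when two occur equally often, the sketch arbitrarily picks the alphabetically first one so that the result is reproducible.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Return the per-position most frequent residue of pre-aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break (an assumption of this sketch): alphabetically first residue.
        consensus.append(min(r for r, c in counts.items() if c == top))
    return "".join(consensus)


# Hypothetical toy family, for illustration only.
family = ["EVQLVESGG", "QVQLVQSGA", "EVQLLESGG"]
print(consensus_sequence(family))  # EVQLVESGG
```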

[3736] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region that are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 33410 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[3737] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method that may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[3738] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[3739] In preferred embodiments an antibody can be made by immunizing with purified 33410 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., membrane fractions.

[3740] A full-length 33410 protein or an antigenic peptide fragment of 33410 can be used as an immunogen or can be used to identify anti-33410 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 33410 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:54 and encompasses an epitope of 33410. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[3741] Fragments of 33410 which include, for example, residues 330-350, 480-505, or 695-720 of SEQ ID NO:54 can be used to make, e.g., antibodies against hydrophilic regions of the 33410 protein or used as immunogens or to characterize the specificity of an antibody. Similarly, a fragment of 33410 which includes, for example, residues 60-72, 260-277, or 780-793 of SEQ ID NO:54 can be used to make an antibody against a hydrophobic region of the 33410 protein; a fragment of 33410 which includes residues 42-601 of SEQ ID NO:54 can be used to make an antibody against an extracellular region of the 33410 protein; a fragment of 33410 which includes residues 699-835 of SEQ ID NO:54 can be used to make an antibody against an intracellular region of the 33410 protein; a fragment of 33410 which includes residues 42-601 or 139-149 of SEQ ID NO:54 can be used to make an antibody against a carboxylesterase region of the 33410 protein.
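
As a rough illustration of how the residue ranges listed above map onto a sequence, the sketch below slices fragments using 1-based, inclusive coordinates. The names `REGIONS`, `fragment` and `fragments_for` are hypothetical, and because SEQ ID NO:54 is not reproduced here, a dummy string stands in for it; the only property assumed of the real sequence is that it has at least 835 residues.

```python
# Residue ranges (1-based, inclusive) quoted from the paragraph above.
REGIONS = {
    "hydrophilic": [(330, 350), (480, 505), (695, 720)],
    "hydrophobic": [(60, 72), (260, 277), (780, 793)],
    "extracellular": [(42, 601)],
    "intracellular": [(699, 835)],
}


def fragment(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


def fragments_for(sequence, region):
    """Return every fragment listed for a named region."""
    return [fragment(sequence, s, e) for s, e in REGIONS[region]]


# Placeholder only: a dummy 835-residue string, NOT the real SEQ ID NO:54.
placeholder = "X" * 835
for name in REGIONS:
    print(name, [len(f) for f in fragments_for(placeholder, name)])
```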

[3742] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[3743] Antibodies which bind only native 33410 protein, only denatured or otherwise non-native 33410 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies that bind to native but not denatured 33410 protein.

[3744] Preferred epitopes encompassed by the antigenic peptide are regions of 33410 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 33410 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 33410 protein and are thus likely to constitute surface residues useful for targeting antibody production.
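
The idea of scanning a sequence for hydrophilic, potentially surface-exposed stretches can be sketched as follows. This is not the Emini surface probability analysis named above; it substitutes a simpler sliding-window average over the Kyte-Doolittle hydropathy scale, in which more negative values indicate more hydrophilic residues. The function names, the window size of 7, and the toy sequence are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(seq, window=7):
    """Return (start, end, mean score) of the most hydrophilic window.

    Coordinates are 1-based and inclusive, matching residue numbering.
    """
    profile = hydropathy_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start + 1, start + window, profile[start]


# Toy sequence for illustration only; it is not a 33410 sequence.
print(most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL"))
```

A window flagged this way is only a candidate; hydrophilicity is one of several indicators of surface exposure and antigenicity.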

[3745] The anti-33410 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 33410 protein.

[3746] In a preferred embodiment, the antibody has effector function and can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[3747] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[3748] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or a radionuclide, or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[3749] An anti-33410 antibody (e.g., monoclonal antibody) can be used to isolate 33410 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-33410 antibody can be used to detect 33410 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-33410 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

[3750] The invention also includes a nucleic acid that encodes an anti-33410 antibody, e.g., an anti-33410 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[3751] The invention also includes cell lines, e.g., hybridomas, which make an anti-33410 antibody, e.g., an antibody described herein, and methods of using said cells to make a 33410 antibody.

[3752] 33410 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[3753] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[3754] A vector can include a 33410 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 33410 proteins, mutant forms of 33410 proteins, fusion proteins, and the like).

[3755] The recombinant expression vectors of the invention can be designed for expression of 33410 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[3756] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[3757] Purified fusion proteins can be used in 33410 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 33410 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[3758] One strategy used to maximize recombinant protein expression in E. coli is to express the protein in a host strain with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
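
The second strategy above, rewriting a coding sequence with codons preferentially used in E. coli, can be sketched as a simple back-translation. This is an illustration rather than the procedure of the cited reference: the function name `back_translate` is hypothetical, and the codon choices in `PREFERRED_CODON` are illustrative picks of commonly cited high-frequency E. coli codons that should be checked against a current codon-usage table for the intended strain.

```python
# One illustrative "preferred" codon per amino acid (an assumption of this
# sketch; verify against a codon-usage table for the actual host strain).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop
}


def back_translate(protein):
    """Encode a protein sequence using one fixed codon per amino acid."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]!r}") from None


# Toy peptide for illustration only; it is not a 33410 sequence.
print(back_translate("MKL*"))  # ATGAAACTGTAA
```

Using a single codon per residue keeps the sketch short; practical designs usually also balance codon frequencies and avoid unwanted restriction sites or secondary structure.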

[3759] The 33410 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[3760] When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[3761] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[3762] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus. For a discussion of the regulation of gene expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews—Trends in Genetics, Vol. 1(1) 1986.
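
Cloning a DNA molecule in the antisense orientation places it so that the strand read from the promoter is the reverse complement of the coding strand. A minimal sketch of that relationship is given below; the function name `reverse_complement` and the toy sequence are illustrative and are not taken from the specification.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


sense = "ATGGCC"  # toy coding-strand sequence, for illustration only
antisense = reverse_complement(sense)
print(antisense)  # GGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check on the orientation of an insert.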

[3763] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 33410 nucleic acid molecule within a recombinant expression vector or a 33410 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell, but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[3764] A host cell can be any prokaryotic or eukaryotic cell. For example, a 33410 protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

[3765] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[3766] A host cell of the invention can be used to produce (i.e., express) a 33410 protein. Accordingly, the invention further provides methods for producing a 33410 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 33410 protein has been introduced) in a suitable medium such that a 33410 protein is produced. In another embodiment, the method further includes isolating a 33410 protein from the medium or the host cell.

[3767] In another aspect, the invention features a cell or purified preparation of cells which include a 33410 transgene, or which otherwise misexpress 33410. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 33410 transgene, e.g., a heterologous form of a 33410, e.g., a gene derived from humans (in the case of a non-human cell). The 33410 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 33410, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 33410 alleles or for use in drug screening.

[3768] In another aspect, the invention features a human cell, e.g., a neurological cell, transformed with nucleic acid that encodes a subject 33410 polypeptide.

[3769] Also provided are cells, preferably human cells, e.g., human neurological cells (e.g., neurons), in which an endogenous 33410 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 33410 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 33410 gene. For example, an endogenous 33410 gene that is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[3770] 33410 Transgenic Animals

[3771] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 33410 protein and for identifying and/or evaluating modulators of 33410 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 33410 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[3772] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 33410 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 33410 transgene in its genome and/or expression of 33410 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 33410 protein can further be bred to other transgenic animals carrying other transgenes.

[3773] 33410 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[3774] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[3775] Uses of 33410

[3776] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[3777] The isolated nucleic acid molecules of the invention can be used, for example, to express a 33410 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 33410 mRNA (e.g., in a biological sample) or a genetic alteration in a 33410 gene, and to modulate 33410 activity, as described further below. The 33410 proteins can be used to treat disorders characterized by insufficient or excessive production of a 33410 substrate or production of 33410 inhibitors. In addition, the 33410 proteins can be used to screen for naturally occurring 33410 substrates, to screen for drugs or compounds which modulate 33410 activity, as well as to treat disorders characterized by insufficient or excessive production of 33410 protein or production of 33410 protein forms which have decreased, aberrant or unwanted activity compared to 33410 wild type protein (e.g., neurological disorders and/or carcinomas). Moreover, the anti-33410 antibodies of the invention can be used to detect and isolate 33410 proteins, regulate the bioavailability of 33410 proteins, and modulate 33410 activity.

[3778] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 33410 polypeptide is provided. The method includes: contacting the compound with the subject 33410 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 33410 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 33410 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 33410 polypeptide. Screening methods are discussed in more detail below.

[3779] 33410 Screening Assays

[3780] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 33410 proteins, have a stimulatory or inhibitory effect on, for example, 33410 expression or 33410 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 33410 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 33410 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[3781] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 33410 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a 33410 protein or polypeptide or a biologically active portion thereof.

[3782] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. J. Med. Chem. 1994, 37: 2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

[3783] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233.

[3784] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[3785] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 33410 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 33410 activity is determined. Determining the ability of the test compound to modulate 33410 activity can be accomplished by monitoring, for example, binding to neurexins on neurons; the ability to act as a cell surface receptor; cell adhesion properties; the ability to mediate cell-cell interactions between neurons; the ability to regulate inter-neuronal recognition pathways for axon pathfinding; the ability to regulate neuritogenesis; or binding Zn2+, Ca2+, Mg2+, Cd2+ and/or Mn2+. The cell, for example, can be of mammalian origin, e.g., human.

[3786] The ability of the test compound to modulate 33410 binding to a compound, e.g., a 33410 substrate, or to bind to 33410 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 33410 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 33410 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 33410 binding to a 33410 substrate in a complex. For example, compounds (e.g., 33410 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[3787] The ability of a compound (e.g., a 33410 substrate) to interact with 33410 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 33410 without the labeling of either the compound or the 33410. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 33410.

[3788] In yet another embodiment, a cell-free assay is provided in which a 33410 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 33410 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 33410 proteins to be used in assays of the present invention include fragments that participate in interactions with non-33410 molecules, e.g., fragments with high surface probability scores.

[3789] Soluble and/or membrane-bound forms of isolated proteins (e.g., 33410 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[3790] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[3791] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
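
The distance dependence described above is commonly modeled with the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the separation at which transfer is 50% efficient. That relation is not recited in this application and is used here as an assumption. The following Python sketch is illustrative only, and all function and parameter names are hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency under the Förster model: E = 1 / (1 + (r / R0)**6).

    Assumption: the donor/acceptor pair follows the standard Foerster model;
    forster_radius_nm (R0) must be looked up or measured for the chosen labels.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def apparent_efficiency(donor_alone: float, donor_with_acceptor: float) -> float:
    """Efficiency estimated from donor quenching: E = 1 - F_DA / F_D."""
    if donor_alone <= 0:
        raise ValueError("donor-alone fluorescence must be > 0")
    return 1.0 - donor_with_acceptor / donor_alone


if __name__ == "__main__":
    # Hypothetical separations for a label pair with R0 = 5 nm.
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, 5.0):.3f}")
```

Because the efficiency falls off with the sixth power of distance, a bound pair (short separation) gives near-maximal acceptor emission, while an unbound pair gives almost none.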

[3792] In another embodiment, determining the ability of the 33410 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[3793] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[3794] It may be desirable to immobilize either 33410, an anti-33410 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 33410 protein, or interaction of a 33410 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/33410 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 33410 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 33410 binding or activity determined using standard techniques.

[3795] Other techniques for immobilizing either a 33410 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 33410 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[3796] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[3797] In one embodiment, this assay is performed utilizing antibodies reactive with 33410 protein or target molecules but which do not interfere with binding of the 33410 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 33410 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 33410 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 33410 protein or target molecule.

[3798] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., Trends Biochem Sci 1993 August;18(8):284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., J Mol Recognit 1998 Winter; 11(1-6):141-8; Hage, D. S., and Tweed, S. A. J Chromatogr B Biomed Sci Appl 1997 Oct. 10;699(1-2):499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[3799] In a preferred embodiment, the assay includes contacting the 33410 protein or biologically active portion thereof with a known compound which binds 33410 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 33410 protein, wherein determining the ability of the test compound to interact with a 33410 protein includes determining the ability of the test compound to preferentially bind to 33410 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[3800] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 33410 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 33410 protein through modulation of the activity of a downstream effector of a 33410 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[3801] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
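
The comparison of control and test reaction mixtures described above can be expressed numerically. The following Python sketch is illustrative only: it assumes that complex formation is read out as one signal per reaction (e.g., counts from a labeled binding partner), and the function names, the background correction, and the 50% cutoff are hypothetical choices not taken from this application.

```python
def percent_inhibition(control_signal: float, test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the no-compound control.

    Both signals are corrected by the same background reading (assumption).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


def interferes(control_signal: float, test_signal: float,
               background: float = 0.0, threshold_pct: float = 50.0) -> bool:
    """Flag a compound as interfering when inhibition meets a chosen cutoff.

    threshold_pct is a hypothetical cutoff; a real screen would set it
    from assay variability.
    """
    return percent_inhibition(control_signal, test_signal, background) >= threshold_pct


def mutant_selective(normal_pair: tuple, mutant_pair: tuple,
                     threshold_pct: float = 50.0) -> bool:
    """True when the compound disrupts the mutant complex but not the normal one.

    Each pair is (control_signal, test_signal) for that target gene product.
    """
    return (interferes(*mutant_pair, threshold_pct=threshold_pct)
            and not interferes(*normal_pair, threshold_pct=threshold_pct))
```

The third function mirrors the normal-versus-mutant comparison: it is applied to two pairs of readings obtained with the same test compound.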

[3802] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[3803] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[3804] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[3805] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[3806] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[3807] In yet another aspect, the 33410 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 33410 (“33410-binding proteins” or “33410-bp”) and are involved in 33410 activity. Such 33410-bps can be activators or inhibitors of signals by the 33410 proteins or 33410 targets as, for example, downstream elements of a 33410-mediated signaling pathway.

[3808] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 33410 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 33410 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 33410-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene that encodes the protein that interacts with the 33410 protein.

[3809] In another embodiment, modulators of 33410 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 33410 mRNA or protein evaluated relative to the level of expression of 33410 mRNA or protein in the absence of the candidate compound. When expression of 33410 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 33410 mRNA or protein expression. Alternatively, when expression of 33410 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 33410 mRNA or protein expression. The level of 33410 mRNA or protein expression can be determined by methods described herein for detecting 33410 mRNA or protein.
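
The classification of a candidate compound as a stimulator or inhibitor of expression can be sketched as follows. This Python example is illustrative only: it assumes replicate expression measurements (e.g., normalized mRNA levels) with and without the compound, and it uses an exact permutation test on the difference of means as one possible way to judge statistical significance. The application does not specify a test, and the function names and the 0.05 level are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n = len(treated)
    if n == 0 or n == len(pooled):
        raise ValueError("both groups need at least one measurement")
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    hits = count = 0
    # Enumerate every way of relabeling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n - (total - group_sum) / (len(pooled) - n))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a compound from expression with (treated) and without (untreated) it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With an exact permutation test, at least four replicates per group are needed before a p-value below 0.05 is attainable, since three per group yields a minimum of 2/20 = 0.1.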

[3810] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 33410 protein can be confirmed in vivo, e.g., in an animal such as an animal model for neurological disorders and/or carcinomas.

[3811] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 33410 modulating agent, an antisense 33410 nucleic acid molecule, a 33410-specific antibody, or a 33410-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[3812] 33410 Detection Assays

[3813] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 33410 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[3814] 33410 Chromosome Mapping

[3815] The 33410 nucleotide sequences or portions thereof can be used to map the location of the 33410 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 33410 sequences with genes associated with disease.

[3816] Briefly, 33410 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 33410 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 33410 sequences will yield an amplified fragment.

[3817] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
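The logic of assigning a gene to a chromosome from such a panel can be illustrated with a short sketch. It assumes that the human chromosome content of each hybrid line is known and that a PCR result (fragment amplified or not) has been recorded for each line. The panel contents, hybrid names, and function name below are hypothetical and serve only to show the concordance check.

```python
def map_by_concordance(panel, pcr_results):
    """Return the chromosomes whose presence agrees with the PCR result in every hybrid.

    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    pcr_results: dict mapping hybrid name -> True if a fragment was amplified.
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        # A candidate chromosome must be present exactly in the PCR-positive hybrids.
        if all((chrom in panel[h]) == pcr_results[h] for h in panel):
            matches.append(chrom)
    return matches


# Hypothetical panel, for illustration only.
example_panel = {
    "hybrid_A": {"1", "7", "X"},
    "hybrid_B": {"7", "12"},
    "hybrid_C": {"1", "12"},
    "hybrid_D": {"X"},
}
example_pcr = {"hybrid_A": True, "hybrid_B": True, "hybrid_C": False, "hybrid_D": False}
```

With this hypothetical panel, only chromosome 7 is present in both PCR-positive hybrids and absent from both PCR-negative hybrids.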

[3818] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 33410 to a chromosomal location.

[3819] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

[3820] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to non-coding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[3821] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[3822] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 33410 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[3823] 33410 Tissue Typing

[3824] 33410 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
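The principle behind RFLP can be shown with a simplified in silico digest. The sketch below assumes a linear sequence, considers only one strand, and uses the EcoRI recognition site GAATTC (cut after the first G) as an example enzyme. The two allele sequences are invented for illustration and are not taken from any sequence described herein.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical alleles: the second carries a single-base change that destroys one site.
allele_1 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_2 = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTT" + "TTTT"
```

Here `digest(allele_1)` yields three fragments and `digest(allele_2)` yields two, which is the kind of length difference that would appear as distinct bands.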

[3825] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 33410 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[3826] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:53 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:55 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[3827] If a panel of reagents from 33410 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[3828] Use of Partial 33410 Sequences in Forensic Biology

[3829] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[3830] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:53 (e.g., fragments derived from the noncoding regions of SEQ ID NO:53 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[3831] The 33410 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 33410 probes can be used to identify tissue by species and/or by organ type.

[3832] In a similar fashion, these reagents, e.g., 33410 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[3833] Predictive Medicine of 33410

[3834] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[3835] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 33410.

[3836] Such disorders include, e.g., a disorder associated with the misexpression of the 33410 gene, a neoplasia, or a disorder of the neurological system.

[3837] The method includes one or more of the following:

[3838] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 33410 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[3839] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 33410 gene;

[3840] detecting, in a tissue of the subject, the misexpression of the 33410 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[3841] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 33410 polypeptide.

[3842] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 33410 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[3843] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:53, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 33410 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[3844] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 33410 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 33410.

[3845] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[3846] In preferred embodiments the method includes determining the structure of a 33410 gene, an abnormal structure being indicative of risk for the disorder.

[3847] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 33410 protein or a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.

[3848] Diagnostic and Prognostic Assays of 33410

[3849] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 33410 molecules and for identifying variations and mutations in the sequence of 33410 molecules.

[3850] Expression Monitoring and Profiling:

[3851] The presence, level, or absence of 33410 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 33410 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 33410 protein such that the presence of 33410 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 33410 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 33410 genes; measuring the amount of protein encoded by the 33410 genes; or measuring the activity of the protein encoded by the 33410 genes.

[3852] The level of mRNA corresponding to the 33410 gene in a cell can be determined both by in situ and by in vitro formats.

[3853] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 33410 nucleic acid, such as the nucleic acid of SEQ ID NO:53, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 33410 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[3854] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 33410 genes.

[3855] The level of mRNA in a sample that is encoded by 33410 can be evaluated with nucleic acid amplification, e.g., by RT-PCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
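The primer definition above lends itself to a simple check. The sketch below is illustrative only: it assumes exact matching of primers to a plus-strand template, and the function names and the example template are hypothetical. The length ranges default to the approximate values stated above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides).

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flank_len=(50, 200)):
    """Return the length of the flanked region if the pair fits the ranges, else None.

    template: plus-strand sequence; forward anneals to the minus strand and so
    matches the template directly; reverse matches the template's reverse complement.
    """
    for primer in (forward, reverse):
        if not primer_len[0] <= len(primer) <= primer_len[1]:
            return None
    fwd_start = template.find(forward)
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start == -1 or rev_start == -1:
        return None
    # Region between the 3' end of the forward site and the start of the reverse site.
    flanked = rev_start - (fwd_start + len(forward))
    if flank_len[0] <= flanked <= flank_len[1]:
        return flanked
    return None


# Hypothetical template, for illustration only.
fwd_site = "ATGGCCAAGCTGACC"
rev_site = "GGATCCTTAGCAGTC"
example_template = "TTT" + fwd_site + "ACGT" * 20 + rev_site + "AAA"
```

A real primer design workflow would also consider melting temperature, secondary structure, and specificity, none of which is modeled here.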

[3856] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 33410 gene being analyzed.

[3857] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 33410 mRNA, or genomic DNA, and comparing the presence of 33410 mRNA or genomic DNA in the control sample with the presence of 33410 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 33410 transcript levels.

[3858] A variety of methods can be used to determine the level of protein encoded by 33410. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody with a sample, to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[3859] The detection methods can be used to detect 33410 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 33410 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 33410 protein include introducing into a subject a labeled anti-33410 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-33410 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[3860] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 33410 protein, and comparing the presence of 33410 protein in the control sample with the presence of 33410 protein in the test sample. The invention also includes kits for detecting the presence of 33410 in a biological sample. For example, the kit can include a compound or agent capable of detecting 33410 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 33410 protein or nucleic acid.

[3861] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[3862] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein-stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[3863] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation. In one embodiment, a disease or disorder associated with aberrant or unwanted 33410 expression or activity is identified. A test sample is obtained from a subject and 33410 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 33410 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 33410 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[3864] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 33410 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity.

[3865] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 33410 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 33410 (e.g., other genes associated with a 33410-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
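One possible layout for such data records is sketched below. This is an assumption-laden illustration rather than a prescribed schema: it uses an in-memory SQLite database as a stand-in for the relational database environments named above, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def build_expression_db():
    """Create an in-memory database with one hypothetical table of expression records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE expression_record (
               sample_id        TEXT NOT NULL,  -- descriptor: identifier of the sample
               subject_id       TEXT,           -- descriptor: subject the sample came from
               diagnosis        TEXT,           -- descriptor: diagnosis
               treatment        TEXT,           -- descriptor: treatment
               gene             TEXT NOT NULL,  -- e.g., '33410' or another gene
               expression_level REAL NOT NULL,  -- value representing level of expression
               PRIMARY KEY (sample_id, gene)
           )"""
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one data record (one gene's expression level in one sample)."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def levels_for_sample(conn, sample_id):
    """Return a dict of gene -> expression level for one sample."""
    rows = conn.execute(
        "SELECT gene, expression_level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    )
    return dict(rows.fetchall())
```

Storing one row per sample and gene lets the same table hold values for 33410 and for any other genes measured in the same sample.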

[3866] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 33410 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity in a subject wherein an increase or a decrease in 33410 expression is an indication that the subject has or is disposed to having a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity. The method can be used to monitor a treatment for a disease or disorder associated with misexpressed or aberrant or unwanted 33410 expression or activity in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[3867] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 33410 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from a cell not contacted with the test compound.
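The favorable-evaluation criterion above can be expressed as a comparison of two distances. The sketch below is a minimal illustration assuming each profile is a mapping from gene name to expression value and that Euclidean distance is used as the similarity measure; the measure, the function names, and the example gene names are assumptions made for this sketch.

```python
import math


def profile_distance(a, b):
    """Euclidean distance between two profiles given as gene -> value dicts."""
    genes = sorted(a)
    if sorted(b) != genes:
        raise ValueError("profiles must cover the same genes")
    return math.dist([a[g] for g in genes], [b[g] for g in genes])


def compound_evaluated_favorably(treated, untreated, target):
    """True if the treated-cell profile is closer to the target profile
    than the profile of a cell not contacted with the test compound."""
    return profile_distance(treated, target) < profile_distance(untreated, target)
```

For example, with a target profile of `{"33410": 1.0, "GENE_X": 2.0}`, an untreated profile of `{"33410": 4.0, "GENE_X": 2.0}`, and a treated profile of `{"33410": 1.5, "GENE_X": 2.2}`, the compound would be evaluated favorably under this criterion.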

[3868] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 33410 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
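The distance metric described above, together with the selection of the most similar reference profile, can be sketched as follows. The sketch assumes that all profiles are vectors of equal length with dimensions in the same order; the function name and the reference labels are hypothetical.

```python
import math


def most_similar_reference(subject, references):
    """Return (name, distance) for the reference profile closest to the subject.

    subject: sequence of expression values (one per dimension of the profile).
    references: dict mapping a reference name -> sequence of the same length.
    The distance is the length of the difference vector (Euclidean norm).
    """
    return min(
        ((name, math.dist(subject, ref)) for name, ref in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical profiles, for illustration only.
example_subject = [1.0, 2.0, 3.0]
example_references = {
    "reference_normal": [1.0, 2.0, 2.0],
    "reference_disorder": [4.0, 6.0, 3.0],
}
```

With these hypothetical values the difference vectors have lengths 1.0 and 5.0, so `reference_normal` is selected.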

[3869] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[3870] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 33410 expression.

[3871] 33410 Arrays and Uses Thereof

[3872] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 33410 molecule (e.g., a 33410 nucleic acid or a 33410 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3873] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 33410 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 33410. Each address of the subset can include a capture probe that hybridizes to a different region of a 33410 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 33410 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 33410 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 33410 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[3874] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[3875] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 33410 polypeptide or fragment thereof. The polypeptide can be a naturally occurring interaction partner of 33410 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-33410 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[3876] In another aspect, the invention features a method of analyzing the expression of 33410. The method includes providing an array as described above; contacting the array with a sample and detecting binding of a 33410-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[3877] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 33410. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 33410. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
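As a simple illustration of finding genes whose expression tracks that of 33410 across samples, the sketch below ranks genes by Pearson correlation with the 33410 profile. This is a deliberately simpler stand-in for the clustering methods named above (hierarchical, k-means, Bayesian), not an implementation of them. The function names, the 0.9 threshold, and the example gene names and values are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for a flat profile; treat as uncorrelated
    return cov / (sd_x * sd_y)


def co_regulated_genes(expression, target="33410", threshold=0.9):
    """Return genes whose expression across samples correlates with the target gene.

    expression: dict mapping gene -> list of expression values, one per sample.
    threshold: hypothetical minimum correlation, chosen for illustration.
    """
    reference = expression[target]
    return sorted(
        gene for gene, levels in expression.items()
        if gene != target and pearson(reference, levels) >= threshold
    )


# Hypothetical expression values across four samples.
example_expression = {
    "33410": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.5],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 1.0, 1.0, 1.0],
}
```

In this hypothetical data set only `GENE_A` rises and falls with 33410; `GENE_B` is anti-correlated and `GENE_C` is flat.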

[3878] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 33410 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[3879] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
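The comparison of treated and untreated expression profiles to find effects on genes other than the target can be sketched as a fold-change filter. The sketch assumes positive expression values for each gene in both profiles; the function name, the two-fold cutoff, and the example genes and values are hypothetical.

```python
def off_target_changes(treated, untreated, target="33410", min_fold=2.0):
    """Return gene -> fold change for non-target genes altered by the agent.

    treated, untreated: dicts mapping gene -> expression level in cells
    contacted or not contacted with the agent.
    min_fold: hypothetical cutoff; a gene is reported if its level rises by at
    least this factor or falls to at most 1/min_fold of the untreated level.
    """
    changed = {}
    for gene, baseline in untreated.items():
        level = treated.get(gene, 0)
        if gene == target or baseline <= 0 or level <= 0:
            continue  # skip the target gene and genes without usable values
        fold = level / baseline
        if fold >= min_fold or fold <= 1 / min_fold:
            changed[gene] = fold
    return changed


# Hypothetical profiles, for illustration only.
example_untreated = {"33410": 10.0, "GENE_A": 4.0, "GENE_B": 8.0, "GENE_C": 5.0}
example_treated = {"33410": 2.0, "GENE_A": 12.0, "GENE_B": 2.0, "GENE_C": 5.5}
```

In this example the agent lowers 33410 as intended, while `GENE_A` and `GENE_B` are flagged as changed and `GENE_C` is not.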

[3880] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 33410-associated disease or disorder, and processes, such as a cellular transformation, associated with a 33410-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 33410-associated disease or disorder.

[3881] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 33410) that could serve as a molecular target for diagnosis or therapeutic intervention.

[3882] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 33410 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 33410 polypeptide or fragment thereof. For example, multiple variants of a 33410 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[3883] The polypeptide array can be used to detect a 33410 binding compound, e.g., an antibody in a sample from a subject with specificity for a 33410 polypeptide or the presence of a 33410-binding protein or ligand.

[3884] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 33410 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[3885] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 33410 or from a cell or subject in which a 33410 mediated response has been elicited, e.g., by contact of the cell with 33410 nucleic acid or protein, or administration to the cell or subject of 33410 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 33410 (or does not express as highly as in the case of the 33410 positive plurality of capture probes) or from a cell or subject in which a 33410 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 33410 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[3886] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 33410 or from a cell or subject in which a 33410-mediated response has been elicited, e.g., by contact of the cell with 33410 nucleic acid or protein, or administration to the cell or subject of 33410 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 33410 (or does not express as highly as in the case of the 33410 positive plurality of capture probes) or from a cell or subject in which a 33410 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[3887] In another aspect, the invention features a method of analyzing 33410, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 33410 nucleic acid or amino acid sequence; comparing the 33410 sequence with one or more preferably a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 33410.

[3888] Detection of 33410 Variations or Mutations

[3889] The methods of the invention can also be used to detect genetic alterations in a 33410 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 33410 protein activity or nucleic acid expression, such as a neurological disorder and/or carcinomas. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 33410-protein, or the mis-expression of the 33410 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 33410 gene; 2) an addition of one or more nucleotides to a 33410 gene; 3) a substitution of one or more nucleotides of a 33410 gene, 4) a chromosomal rearrangement of a 33410 gene; 5) an alteration in the level of a messenger RNA transcript of a 33410 gene, 6) aberrant modification of a 33410 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 33410 gene, 8) a non-wild type level of a 33410-protein, 9) allelic loss of a 33410 gene, and 10) inappropriate post-translational modification of a 33410-protein.

[3890] An alteration can be detected with a probe/primer in a polymerase chain reaction (PCR), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 33410 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 33410 gene under conditions such that hybridization and amplification of the 33410 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[3891] In another embodiment, mutations in a 33410 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
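The comparison of restriction fragment lengths can be sketched in a few lines. The example below assumes a linear DNA molecule and uses EcoRI (recognition site GAATTC, cut after the first G) as the illustrative enzyme; the two short sequences are made up for the example and are not 33410 sequences.

```python
def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting linear DNA at every occurrence of site.

    cut_offset is the position within the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical control and sample sequences (toy data).
control = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCCC" + "GACTTC" + "TTTT"  # A->C destroys the second site

control_fragments = fragment_lengths(control)
sample_fragments = fragment_lengths(sample)
pattern_differs = sorted(control_fragments) != sorted(sample_fragments)

print(control_fragments, sample_fragments, pattern_differs)
```

A differing fragment pattern only shows that a site was gained or lost; base changes outside any recognition site leave the pattern unchanged, which is one reason the other detection methods described herein are useful.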

[3892] In other embodiments, genetic mutations in 33410 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 33410 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 33410 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 33410 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
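The first scanning step, in which sequential overlapping probes localize a base change, can be illustrated with a simplified model. The sketch below treats hybridization as exact string matching at the same offset, which is a strong simplification of real hybridization behavior, and handles substitutions only. The sequences, probe length, and function names are hypothetical.

```python
def tile_probes(seq, length=8):
    """Sequential overlapping probes: one probe starting at every offset of seq."""
    return [(i, seq[i:i + length]) for i in range(len(seq) - length + 1)]


def locate_change(control, sample, length=8):
    """Return the (first, last) positions covered by every failing probe, or None.

    A control-derived probe "fails" when the sample differs from it at the
    same offset. Assumes equal-length sequences (substitutions only).
    """
    if len(control) != len(sample):
        raise ValueError("this sketch handles substitutions only")
    failing = [i for i, probe in tile_probes(control, length)
               if sample[i:i + length] != probe]
    if not failing:
        return None
    first = max(failing)               # the last failing probe starts here
    last = min(failing) + length - 1   # the first failing probe ends here
    return (first, last)


# Hypothetical control sequence and a sample with one substitution at index 10.
control = "ATGGCCATTGTAATGGGCCGC"
sample = control[:10] + "C" + control[11:]

print(locate_change(control, sample))
```

For a single substitution away from the sequence ends, the returned interval collapses to the changed position; a second, smaller array as described above would then characterize the specific variant.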

[3893] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 33410 gene and detect mutations by comparing the sequence of the sample 33410 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[3894] Other methods for detecting mutations in the 33410 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[3895] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 33410 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[3896] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 33410 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 33410 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[3897] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[3898] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[3899] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[3900] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 33410 nucleic acid.
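One simple way to read "percent complementary" is the fraction of oligonucleotide positions that form Watson-Crick pairs with an equal-length target region. The sketch below computes that figure under this assumption (ungapped, equal lengths, both sequences written 5′ to 3′); the sequences are hypothetical and are not taken from SEQ ID NO:53.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def percent_complementary(oligo, target_region):
    """Percent of oligo positions that base-pair with an equal-length target region."""
    if len(oligo) != len(target_region):
        raise ValueError("this sketch compares equal-length, ungapped sequences")
    perfect = reverse_complement(target_region)
    matches = sum(a == b for a, b in zip(oligo, perfect))
    return 100.0 * matches / len(oligo)


# Hypothetical 10-nucleotide target region and two candidate oligonucleotides.
target_region = "ATGCCGTAAC"
perfect_oligo = reverse_complement(target_region)
one_mismatch_oligo = perfect_oligo[:-1] + "A"  # last base changed from T to A

print(percent_complementary(perfect_oligo, target_region))
print(percent_complementary(one_mismatch_oligo, target_region))
```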

[3901] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:53 or the complement of SEQ ID NO:53. Different locations can be different but overlapping or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[3902] The set can be useful, e.g., for identifying SNP's, or identifying specific alleles of 33410. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[3903] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
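A set of four oligonucleotides that differ only at the interrogation position can be generated mechanically from a reference sequence. The sketch below is illustrative only: the reference sequence and the zero-based position index are hypothetical, and genotype calling is reduced to exact string comparison rather than hybridization signal.

```python
def interrogation_set(reference, position):
    """Return four oligos identical to reference except at the interrogation position.

    position is a zero-based index into reference.
    """
    if not 0 <= position < len(reference):
        raise ValueError("interrogation position lies outside the reference")
    return {base: reference[:position] + base + reference[position + 1:]
            for base in "ACGT"}


def call_base(sample_region, oligo_set):
    """Return the base whose oligo matches the sample region exactly, or None."""
    for base, oligo in oligo_set.items():
        if oligo == sample_region:
            return base
    return None


# Hypothetical reference region with the interrogation position at index 5.
reference = "ACGTTGCATG"
oligos = interrogation_set(reference, 5)

sample_region = "ACGTTACATG"  # carries A instead of G at the interrogation position
print(call_base(sample_region, oligos))
```

Differential labels, as described above, would correspond to attaching a distinct label to each of the four entries in the set.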

[3904] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 33410 nucleic acid.

[3905] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 33410 gene.

[3906] Use of 33410 Molecules as Surrogate Markers

[3907] The 33410 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 33410 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 33410 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker that correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[3908] The 33410 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker that correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 33410 marker) transcription or expression, the amplified marker may be in a quantity that is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-33410 antibodies may be employed in an immune-based detection system for a 33410 protein marker, or 33410-specific radiolabeled probes may be used to detect a 33410 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[3909] The 33410 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 33410 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 33410 DNA may correlate with a 33410 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[3910] Pharmaceutical Compositions of 33410

[3911] The nucleic acids and polypeptides, fragments thereof, as well as anti-33410 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[3912] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[3913] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including an agent in the composition that delays absorption, for example, aluminum monostearate and gelatin.

[3914] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[3915] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[3916] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[3917] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[3918] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[3919] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[3920] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[3921] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
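The ratio described above can be illustrated with a minimal sketch. The function name and the example values are hypothetical and are not taken from the specification; the only relationship assumed is the stated one, therapeutic index = LD50/ED50, with both doses expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be positive and expressed in the same units
    (for example, mg/kg). The values used below are illustrative only.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical compound: LD50 of 200 mg/kg and ED50 of 10 mg/kg.
example_index = therapeutic_index(200.0, 10.0)
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.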

[3922] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography. As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. 
Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
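As a minimal sketch of the arithmetic implied by the weight-based ranges above, the following converts a mg/kg range into absolute amounts for a given body weight and lays out a once-weekly schedule. The function names, the 70 kg body weight, and the chosen range are hypothetical illustrations, not dosing recommendations.

```python
def dose_range_mg(weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float) -> tuple:
    """Convert a per-kilogram dose range into absolute milligram amounts."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


def weekly_schedule(dose_mg: float, weeks: int) -> list:
    """Return (week number, dose in mg) pairs for one administration per week."""
    return [(week, dose_mg) for week in range(1, weeks + 1)]


# Hypothetical 70 kg subject and the "about 1 to 10 mg/kg" range from the text.
low_mg, high_mg = dose_range_mg(70.0, 1.0, 10.0)
# Hypothetical four-week course at the lower bound of that range.
schedule = weekly_schedule(low_mg, 4)
```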

[3923] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[3924] The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[3925] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[3926] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP, cisplatin)), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine).

[3927] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, alpha-interferon, beta-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[3928] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[3929] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[3930] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[3931] Methods of Treatment for 33410

[3932] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 33410 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[3933] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 33410 molecules of the present invention or 33410 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[3934] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 33410 expression or activity, by administering to the subject a 33410 or an agent which modulates 33410 expression or at least one 33410 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 33410 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 33410 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 33410 aberrance, for example, a 33410, 33410 agonist or 33410 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[3935] It is possible that some 33410 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[3936] The 33410 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative and/or differentiative disorders, and neurodegenerative disorders as described above, as well as immune or inflammatory disorders, cardiovascular disorders, disorders associated with bone metabolism, liver disorders, viral diseases, pain or metabolic disorders.

[3937] The 33410 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of hematopoietic disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[3938] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[3939] Aberrant expression and/or activity of 33410 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 33410 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 33410 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 33410 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[3940] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-anticarboxylesterase deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[3941] Additionally, 33410 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 33410 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 33410 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[3942] Additionally, 33410 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[3943] As discussed, successful treatment of 33410 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 33410 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[3944] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[3945] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[3946] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 33410 expression is through the use of aptamer molecules specific for 33410 protein. Aptamers are nucleic acid molecules having a tertiary structure that permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. Curr. Opin. Chem Biol. 1997, 1(1): 5-9; and Patel, D. J. Curr Opin Chem Biol 1997 June; 1(1):32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 33410 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[3947] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 33410 disorders. For a description of antibodies, see the Antibody section above.

[3948] In circumstances wherein injection of an animal or a human subject with a 33410 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 33410 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. Ann Med 1999;31(1):66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. Cancer Treat Res 1998;94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 33410 protein. Vaccines directed to a disease characterized by 33410 expression may also be generated in this fashion.

[3949] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see, e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[3950] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 33410 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders.

[3951] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 33410 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix that contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope labeling, the “free” concentration of compound that modulates the expression or activity of 33410 can be readily monitored and used in calculations of IC50.

[3952] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.
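The IC50 calculations mentioned above are not spelled out in the text. One common way to estimate an IC50 from measured concentration-response points is to interpolate on a logarithmic concentration scale between the two points that bracket 50% inhibition. The sketch below assumes that approach; the function name and the data are hypothetical and are not taken from the specification.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by log-linear interpolation.

    concentrations: positive compound concentrations (any consistent unit).
    inhibitions: percent inhibition (0 to 100) measured at each concentration.
    Assumes inhibition rises with concentration around the 50% point.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            # Fractional position of 50% between the two bracketing points.
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical measurements: 0% inhibition at 1 unit, 100% at 100 units.
ic50_example = estimate_ic50([1.0, 100.0], [0.0, 100.0])
```

In practice a full sigmoidal fit over many points is usually preferred; interpolation is shown here only because it makes the definition of half-maximal inhibition explicit.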

[3953] Another aspect of the invention pertains to methods of modulating 33410 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 33410 or an agent that modulates one or more of the activities of the 33410 protein associated with the cell. An agent that modulates 33410 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 33410 protein (e.g., a 33410 substrate or receptor), a 33410 antibody, a 33410 agonist or antagonist, a peptidomimetic of a 33410 agonist or antagonist, or other small molecule.

[3954] In one embodiment, the agent stimulates one or more 33410 activities. Examples of such stimulatory agents include active 33410 protein and a nucleic acid molecule encoding 33410. In another embodiment, the agent inhibits one or more 33410 activities. Examples of such inhibitory agents include antisense 33410 nucleic acid molecules, anti-33410 antibodies, and 33410 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 33410 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 33410 expression or activity. In another embodiment, the method involves administering a 33410 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 33410 expression or activity.

[3955] Stimulation of 33410 activity is desirable in situations in which 33410 is abnormally down-regulated and/or in which increased 33410 activity is likely to have a beneficial effect. Likewise, inhibition of 33410 activity is desirable in situations in which 33410 is abnormally upregulated and/or in which decreased 33410 activity is likely to have a beneficial effect.

[3956] 33410 Pharmacogenomics

[3957] The 33410 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 33410 activity (e.g., 33410 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 33410 associated disorders (e.g., neurological disorders and/or carcinomas) associated with aberrant or unwanted 33410 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 33410 molecule or 33410 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 33410 molecule or 33410 modulator.

[3958] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[3959] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
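The grouping of individuals into genetic categories by SNP pattern can be sketched as follows. The marker identifiers, subject labels, and genotype calls are hypothetical placeholders, and the grouping rule (identical calls at every selected marker) is one simple assumption among many possible definitions of a "pattern".

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group individuals whose calls match at every selected marker.

    genotypes: dict mapping an individual's label to a dict of marker -> call.
    markers: ordered list of marker identifiers that define the pattern.
    Returns a dict mapping each observed pattern (a tuple) to a list of labels.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in markers)
        groups[pattern].append(individual)
    return dict(groups)


# Hypothetical bi-allelic calls at two hypothetical markers.
example_genotypes = {
    "subject_1": {"snp_a": "A/A", "snp_b": "C/T"},
    "subject_2": {"snp_a": "A/G", "snp_b": "C/T"},
    "subject_3": {"snp_a": "A/A", "snp_b": "C/T"},
}
example_groups = group_by_snp_pattern(example_genotypes, ["snp_a", "snp_b"])
```

Each resulting group could then be examined for a shared drug response, in keeping with the tailoring of regimens to genetically similar individuals described above.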

[3960] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 33410 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[3961] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 33410 molecule or 33410 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[3962] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 33410 molecule or 33410 modulator, such as those identified by one of the exemplary screening assays described herein.

[3963] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 33410 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 33410 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[3964] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 33410 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 33410 gene expression, protein levels, or up-regulate 33410 activity, can be monitored in clinical trials of subjects exhibiting decreased 33410 gene expression, protein levels, or down-regulated 33410 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 33410 gene expression, protein levels, or down-regulate 33410 activity, can be monitored in clinical trials of subjects exhibiting increased 33410 gene expression, protein levels, or upregulated 33410 activity. In such clinical trials, the expression or activity of a 33410 gene, and preferably, other genes that have been implicated in, for example, a 33410-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[3965] 33410 Informatics

[3966] The sequence of a 33410 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 33410. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 33410 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[3967] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[3968] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[3969] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
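The two-table layout described above can be sketched as follows. This is a minimal illustration that assumes SQLite (through Python's standard `sqlite3` module) in place of Sybase or Oracle; the table names, column names, the helper `annotated_region`, and the short placeholder sequence are all hypothetical and are not taken from the sequence listing.

```python
import sqlite3


def build_sequence_db():
    """Create an in-memory database with a sequence table and an annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequences (
            sequence TEXT NOT NULL,     -- first column: the sequence itself
            seq_id   TEXT PRIMARY KEY   -- second column: the sequence identifier
        );
        CREATE TABLE annotations (
            seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
            descriptor TEXT NOT NULL,    -- annotation text
            start_pos  INTEGER NOT NULL, -- initial position (1-based, assumed)
            end_pos    INTEGER NOT NULL  -- ultimate position (inclusive, assumed)
        );
        """
    )
    return conn


def annotated_region(conn, seq_id, descriptor):
    """Return the residues covered by one annotation, or None if it is absent."""
    row = conn.execute(
        "SELECT s.sequence, a.start_pos, a.end_pos "
        "FROM sequences s JOIN annotations a ON a.seq_id = s.seq_id "
        "WHERE s.seq_id = ? AND a.descriptor = ?",
        (seq_id, descriptor),
    ).fetchone()
    if row is None:
        return None
    sequence, start, end = row
    return sequence[start - 1:end]


conn = build_sequence_db()
# Placeholder residues only; this is not a sequence of the invention.
conn.execute("INSERT INTO sequences VALUES (?, ?)",
             ("MKTAYIAKQRQISFVKSHFSRQ", "example_seq"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("example_seq", "example domain", 3, 8))
region = annotated_region(conn, "example_seq", "example domain")
```

Keeping annotations in a separate table keyed on the sequence identifier allows any number of annotations per sequence without duplicating the sequence text.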

[3970] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
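As a simple illustration of searching stored sequences for a target, the sketch below reports every exact occurrence of a target string. It is deliberately not a BLAST search: it assumes exact string matching only, and the function name `find_target` and the example sequences are hypothetical.

```python
def find_target(stored_sequences, target):
    """Return (sequence id, 1-based start) for every exact match of target."""
    hits = []
    for seq_id, sequence in stored_sequences.items():
        start = sequence.find(target)
        while start != -1:
            hits.append((seq_id, start + 1))
            # Resume one position later so overlapping matches are reported.
            start = sequence.find(target, start + 1)
    return hits


# Hypothetical stored sequences.
stored_sequences = {"seq_1": "ATGGCCATGG", "seq_2": "CCCC"}
hits = find_target(stored_sequences, "ATGG")
```

A similarity search such as BLAST would additionally tolerate mismatches and gaps, which this sketch does not attempt.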

[3971] Thus, in one aspect, the invention features a method of analyzing 33410, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 33410 nucleic acid or amino acid sequence; and comparing the 33410 sequence with a second sequence, e.g., one or, more preferably, a plurality of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33410. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[3972] The method can include evaluating the sequence identity between a 33410 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.

[3973] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
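The observation that longer targets are less likely to occur at random can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that residues are independent and uniformly distributed (four equally likely nucleotides or twenty equally likely amino acids); real databases deviate from this, so the figures are order-of-magnitude estimates only. The function name is hypothetical.

```python
def expected_chance_matches(target_length, database_positions, alphabet_size):
    """Expected number of exact chance matches under a uniform random model."""
    # Probability of a match at one position is alphabet_size ** -target_length.
    return database_positions * alphabet_size ** (-target_length)


# A six-nucleotide target in 4**6 = 4096 candidate positions: about one chance hit.
six_mer = expected_chance_matches(6, 4 ** 6, 4)
# A thirty-nucleotide target in the same number of positions: far fewer.
thirty_mer = expected_chance_matches(30, 4 ** 6, 4)
```

Under this model each additional nucleotide in the target reduces the expected number of chance matches fourfold, and each additional amino acid reduces it twentyfold.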

[3974] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[3975] Thus, the invention features a method of making a computer readable record of a 33410 sequence that includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3976] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 33410 sequence, or record, in machine-readable form; comparing a second sequence to the 33410 sequence; thereby analyzing a sequence. Comparison can include comparing the sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 33410 sequence includes a sequence being compared. In a preferred embodiment the 33410 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 33410 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[3977] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, wherein the method comprises the steps of determining 33410 sequence information associated with the subject and based on the 33410 sequence information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[3978] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a disease associated with a 33410 wherein the method comprises the steps of determining 33410 sequence information associated with the subject, and based on the 33410 sequence information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 33410 sequence of the subject to the 33410 sequences in the database to thereby determine whether the subject has a 33410-associated disease or disorder, or a pre-disposition for such.

[3979] The present invention also provides in a network, a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, said method comprising the steps of receiving 33410 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 33410 and/or corresponding to a 33410-associated disease or disorder (e.g., a neurological disorder and/or carcinomas), and based on one or more of the phenotypic information, the 33410 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3980] The present invention also provides a method for determining whether a subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder, said method comprising the steps of receiving information related to 33410 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 33410 and/or related to a 33410-associated disease or disorder, and based on one or more of the phenotypic information, the 33410 information, and the acquired information, determining whether the subject has a 33410-associated disease or disorder or a pre-disposition to a 33410-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[3981] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 33521 Invention

[3982] Rho small GTPases are critical cellular regulators. Rho proteins regulate the formation of actin structures, particularly stress fibers and focal adhesions. For example, the Rho family member Rac regulates membrane ruffling and lamellipodia formation in fibroblasts. Cdc42, another family member, regulates filopodia formation. Rho proteins also have more disparate roles in cell cycle progression, gene transcription through the serum response factor, Ras-mediated transformation, and NADPH oxidase-mediated phagocytosis (see Fleming et al. (1999) J Biol Chem 274:12753 for a review). Rho proteins, like other small GTPases, cycle between two different conformations: one bound to GTP and one to GDP. The different conformations have different regulatory properties (see, e.g., Park et al. (1997) Proc. Natl. Acad. Sci. USA 94:4463-8). The GTP-bound protein can hydrolyze GTP to convert to the GDP-bound form. This event is stimulated by a GTPase-activating protein (GAP). Likewise, the GDP-bound form can convert to the GTP-bound form by releasing the bound GDP and exchanging it for GTP. This step is stimulated by a GDP exchange factor (GEF). Thus, GAPs and GEFs are key regulators in controlling Rho-mediated signalling events.
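The GTPase cycle described above can be pictured as a two-state machine. The sketch below is a deliberately simplified illustration: it treats each regulator as an all-or-nothing switch, whereas the text says only that GAPs and GEFs stimulate the respective steps. The state labels and the function name `step` are hypothetical.

```python
# Transitions stimulated by each regulator class, as described in the text.
TRANSITIONS = {
    ("GTP-bound", "GAP"): "GDP-bound",  # GAP stimulates GTP hydrolysis
    ("GDP-bound", "GEF"): "GTP-bound",  # GEF stimulates exchange of GDP for GTP
}


def step(state, regulator):
    """Return the next nucleotide state; unchanged if the regulator does not apply."""
    return TRANSITIONS.get((state, regulator), state)


# One full cycle starting from the GDP-bound form.
after_gef = step("GDP-bound", "GEF")
after_gap = step(after_gef, "GAP")
```

In this simplified picture a GEF such as a TIAM family member moves the GTPase into its GTP-bound conformation, and a GAP returns it to the GDP-bound conformation.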

[3983] One particular class of Rho GEFs is the TIAM class, which has the hallmark domain organization of, from amino- to carboxy-terminus, a first pleckstrin homology (PH) domain, a Raf-like Ras binding domain (RBD), a PDZ domain, a Rho GEF domain, and a second PH domain. The PH, PDZ, and RBD domains are common small modules of signalling proteins. PH domains, as described below, can potentially interact with lipids, membranes, and inositol signalling molecules. PDZ domains are peptide binding domains that frequently recognize the carboxy-termini of proteins. RBD domains are involved in signalling by Ras, a key mediator of cell proliferation and oncogenesis.

[3984] The TIAM class of GEFs includes TIAM1, TIAM2, Still-life protein type 1 (SIF type 1), and Still-life protein type 2 (SIF type 2). TIAM1 was originally identified as a T-lymphoma invasion and metastasis inducing protein. TIAM1 can induce the invasion of T-lymphoma cells into a fibroblast monolayer by activating Rac1, a Rho class GTPase. Activation of Rac1 activity can modulate the assembly of adherens junctions, and cell adhesion, e.g., E-cadherin-mediated adhesion (Hordijk et al. (1997) Science 278:1464-1466; Habets et al. (1994) Cell 77:537; Michiels et al. (1995) Nature 375:338). TIAM1 can be stimulated by a variety of signals. For example, hyaluronic acid binding to CD44 results in a CD44 interaction with TIAM1 that increases Rac1 activity (Bourguignon (2000) J Biol Chem 275:1829). Thus, extracellular environment and signals can regulate cell motility and morphogenesis through the TIAM1 protein. The Drosophila TIAM1 homolog, SIF, is similarly involved in controlling cell morphogenesis, particularly the morphology of synaptic terminals (Sone et al. (1997) Science 275:543).

[3985] In sum, the TIAM1 class of GEFs are key intracellular signalling molecules which have multiple protein domains, including a GEF domain that regulates Rho GTPases. These proteins are likely to be critical in a variety of physiological systems, including, but not limited to immune cells, epithelial cells, and neurons as well as pathological systems such as tumor cells and metastatic cells.

Summary of the 33521 Invention

[3986] The present invention is based, in part, on the discovery of a novel Rho GDP exchange factor (Rho GEF) family member, referred to herein as “33521”. The nucleotide sequence of a cDNA encoding 33521 is shown in SEQ ID NO:61, and the amino acid sequence of a 33521 polypeptide is shown in SEQ ID NO:62. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:63.

[3987] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 33521 protein or polypeptide, e.g., a biologically active portion of the 33521 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:62. In other embodiments, the invention provides isolated 33521 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33521 protein or an active fragment thereof.

[3988] In a related aspect, the invention further provides nucleic acid constructs that include a 33521 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 33521 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 33521 nucleic acid molecules and polypeptides.

[3989] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 33521-encoding nucleic acids.

[3990] In still another related aspect, isolated nucleic acid molecules that are antisense to a 33521 encoding nucleic acid molecule are provided.

[3991] In another aspect, the invention features 33521 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 33521-mediated or -related disorders. In another embodiment, the invention provides 33521 polypeptides having a 33521 activity. Preferred polypeptides are 33521 proteins including at least one Rho GEF domain; at least one, preferably two PH domains; at least one PDZ domain; and, preferably, having a 33521 activity, e.g., a 33521 activity as described herein.

[3992] In other embodiments, the invention provides 33521 polypeptides, e.g., a 33521 polypeptide having the amino acid sequence shown in SEQ ID NO:62 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:62 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:61, SEQ ID NO:63, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 33521 protein or an active fragment thereof.

[3993] In a related aspect, the invention further provides nucleic acid constructs which include a 33521 nucleic acid molecule described herein.

[3994] In a related aspect, the invention provides 33521 polypeptides or fragments operatively linked to non-33521 polypeptides to form fusion proteins.

[3995] In another aspect, the invention features antibodies and antigen-binding fragments thereof, that react with, or more preferably specifically bind 33521 polypeptides or fragments thereof.

[3996] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 33521 polypeptides or nucleic acids.

[3997] In still another aspect, the invention provides a process for modulating 33521 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 33521 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[3998] The invention also provides assays for determining the activity of or the presence or absence of 33521 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[3999] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 33521-expressing cell, e.g., a hyper-proliferative 33521-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33521 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol. In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion.

[4000] In a preferred embodiment, the compound is an inhibitor of a 33521 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of a 33521 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[4001] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[4002] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 33521-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 33521 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[4003] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 33521 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 33521 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 33521 nucleic acid or polypeptide expression can be detected by any method described herein.
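The before-and-after comparison described above amounts to computing a relative change in expression level. The following sketch is illustrative only: the function name, the units, and the example values are hypothetical, and the text does not specify any particular threshold at which a change is considered indicative of efficacy.

```python
def expression_change(level_before, level_after):
    """Relative change in expression after treatment (negative means a decrease)."""
    if level_before <= 0:
        raise ValueError("the pre-treatment level must be positive")
    return (level_after - level_before) / level_before


# Hypothetical expression levels in arbitrary units.
decrease = expression_change(10.0, 5.0)
increase = expression_change(4.0, 6.0)
```

Whether a decrease or an increase indicates efficacy depends on the disorder and the protocol under evaluation, so the sign of the result must be interpreted in that context.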

[4004] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment and comparing the level of expression of a 33521 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[4005] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 33521 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 33521 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 33521 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[4006] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 33521 polypeptide or nucleic acid molecule, including for disease diagnosis.

[4007] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 33521 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 33521 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 33521 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
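The array described above can be modelled as a mapping from positionally distinct addresses to capture probes. The sketch below is a minimal illustration under assumed conventions: addresses are (row, column) tuples, each probe is a small dictionary, and the probe identifiers and the function name `check_array` are hypothetical.

```python
def check_array(addresses, target="33521"):
    """Check that every probe is unique and at least one recognizes the target."""
    probe_ids = [probe["probe_id"] for probe in addresses.values()]
    probes_unique = len(probe_ids) == len(set(probe_ids))
    has_target_probe = any(
        probe["recognizes"] == target for probe in addresses.values()
    )
    return probes_unique and has_target_probe


# Dictionary keys are (row, column) addresses, so each address is distinct.
array_layout = {
    (0, 0): {"probe_id": "probe_1", "recognizes": "33521"},
    (0, 1): {"probe_id": "probe_2", "recognizes": "control"},
}
layout_ok = check_array(array_layout)
```

Using the address as the dictionary key enforces positional distinguishability by construction, leaving only probe uniqueness and target coverage to be checked.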

[4008] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 33521

[4009] The human 33521 sequence (see SEQ ID NO:61, as recited in Example 43), which is approximately 5437 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 5106 nucleotides, including the termination codon. The coding sequence encodes a 1701 amino acid protein (see SEQ ID NO:62, as recited in Example 43).
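The lengths stated above are mutually consistent, which can be verified with a short calculation. The sketch uses only the figures given in the paragraph; the function name is hypothetical, and because the nucleotide lengths are stated as approximate, the untranslated remainder is likewise approximate.

```python
CODON_LENGTH = 3  # nucleotides per codon


def coding_length(n_amino_acids, include_stop=True):
    """Nucleotides needed to encode a protein, optionally counting the stop codon."""
    n_codons = n_amino_acids + (1 if include_stop else 0)
    return n_codons * CODON_LENGTH


cds_length = coding_length(1701)          # 1701 codons plus the termination codon
untranslated_length = 5437 - cds_length   # remainder attributed to untranslated regions
```

A 1701 amino acid protein plus a termination codon requires 5106 nucleotides, matching the stated coding sequence length and leaving about 331 nucleotides of untranslated sequence.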

[4010] Human 33521 contains the following regions or other structural features:

[4011] two PH (pleckstrin homology) domains (PFAM Accession Number PF00169) located at about amino acid residues 507 to 620 and 1353 to 1455 of SEQ ID NO:62;

[4012] an RBD domain (PFAM Accession Number PF02196) located at about amino acid residues 810 to 881 of SEQ ID NO:62;

[4013] a PDZ domain (PFAM Accession Number PF00595) located at about amino acid residues 890 to 975 of SEQ ID NO:62;

[4014] a Rho guanine nucleotide exchange factor domain (PFAM Accession Number PF00621) located at about amino acid residues 1103 to 1292 of SEQ ID NO:62;

[4015] a predicted coiled-coil domain located at about amino acids 629 to 694 of SEQ ID NO:62;

[4016] 30 predicted protein kinase C phosphorylation sites (PS00005) at about amino acids 29 to 31, 82 to 84, 93 to 95, 129 to 131, 158 to 160, 225 to 227, 237 to 239, 298 to 300, 330 to 332, 370 to 372, 375 to 377, 391 to 393, 440 to 442, 541 to 543, 632 to 634, 755 to 757, 767 to 769, 962 to 964, 1046 to 1048, 1054 to 1056, 1110 to 1112, 1220 to 1222, 1314 to 1316, 1471 to 1473, 1486 to 1488, 1491 to 1493, 1494 to 1496, 1510 to 1512, 1578 to 1580, and 1641 to 1643, of SEQ ID NO:62;

[4017] 45 predicted casein kinase II phosphorylation sites (PS00006) located at about amino acids 129 to 132, 218 to 221, 229 to 232, 248 to 251, 261 to 264, 298 to 301, 337 to 340, 411 to 414, 471 to 474, 478 to 481, 485 to 488, 491 to 494, 577 to 580, 593 to 596, 604 to 607, 741 to 744, 786 to 789, 797 to 800, 898 to 901, 950 to 953, 984 to 987, 1019 to 1022, 1036 to 1039, 1046 to 1049, 1067 to 1070, 1095 to 1098, 1136 to 1139, 1146 to 1149, 1160 to 1163, 1239 to 1242, 1264 to 1267, 1302 to 1305, 1312 to 1315, 1322 to 1325, 1384 to 1387, 1414 to 1417, 1421 to 1424, 1438 to 1441, 1471 to 1474, 1502 to 1505, 1514 to 1517, 1569 to 1572, 1642 to 1645, 1661 to 1664, and 1672 to 1675, of SEQ ID NO:62;

[4018] twelve predicted N-glycosylation sites (PS00001) located at about amino acids 15 to 18, 144 to 147, 291 to 294, 638 to 641, 1003 to 1006, 1069 to 1072, 1131 to 1134, 1382 to 1385, 1412 to 1415, 1500 to 1503, 1640 to 1643, and 1665 to 1668, of SEQ ID NO:62;

[4019] six predicted cAMP/cGMP-dependent protein kinase phosphorylation sites (PS00004) located at about amino acids 215 to 218, 372 to 375, 381 to 384, 748 to 751, 781 to 784, and 1016 to 1019, of SEQ ID NO:62;

[4020] twenty predicted N-myristylation sites (PS00008) from about amino acids 2 to 7, 49 to 54, 79 to 84, 88 to 93, 105 to 110, 180 to 185, 240 to 245, 315 to 320, 324 to 329, 422 to 427, 763 to 768, 821 to 826, 928 to 933, 934 to 939, 1032 to 1037, 1164 to 1169, 1405 to 1410, 1521 to 1526, 1531 to 1536, and 1618 to 1623, of SEQ ID NO:62;

[4021] one predicted glycosaminoglycan site (PS00002) located at about amino acids 1615 to 1618 of SEQ ID NO:62; and

[4022] three predicted tyrosine kinase phosphorylation sites (PS00007) located at about amino acids 401 to 408, 843 to 850, and 856 to 864, of SEQ ID NO:62.

[4023] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[4024] A plasmid containing the nucleotide sequence encoding human 33521 (clone “Fbh33521FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[4025] The 33521 protein has the domain organization of the TIAM1 family of proteins. The organization is as follows: a first PH domain, a RBD domain, a PDZ domain, a Rho GEF domain, and a second PH domain. The 33521 protein contains a significant number of structural characteristics in common with members of the Rho GEF family, the PH domain family, and the PDZ domain family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.
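
The domain organization described above can be reproduced from the approximate residue coordinates listed for SEQ ID NO:62. The sketch below is illustrative only: the `Domain` type and the helper names are assumptions introduced here, and the coordinates are the "about" positions recited in the feature list (using the 810 to 881 range for the RBD domain).

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    name: str
    start: int  # first residue of the domain (1-based, approximate)
    end: int    # last residue of the domain (approximate)


# Coordinates as recited for SEQ ID NO:62, deliberately listed out of order.
DOMAINS = [
    Domain("Rho GEF", 1103, 1292),
    Domain("PH", 507, 620),
    Domain("PDZ", 890, 975),
    Domain("PH", 1353, 1455),
    Domain("RBD", 810, 881),
]


def domain_order(domains: List[Domain]) -> List[str]:
    """Return domain names ordered from amino terminus to carboxy terminus."""
    return [d.name for d in sorted(domains, key=lambda d: d.start)]


def overlapping_pairs(domains: List[Domain]) -> List[Tuple[str, str]]:
    """Return adjacent domain pairs whose residue ranges overlap."""
    ordered = sorted(domains, key=lambda d: d.start)
    return [(a.name, b.name) for a, b in zip(ordered, ordered[1:]) if b.start <= a.end]


print(domain_order(DOMAINS))       # ['PH', 'RBD', 'PDZ', 'Rho GEF', 'PH']
print(overlapping_pairs(DOMAINS))  # []
```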

[4026] Rho GEF domains are also referred to as Dbl homology domains. The Rho GEF domain typically functions in conjunction with a carboxy-terminal PH domain. The crystal structure of the Rho GEF from the human Son of sevenless (Sos) protein and its adjacent carboxy-terminal PH domain has been determined (Soisson et al. (1998) Cell 95:259-68). The Rho GEF domain is entirely α-helical with the active site residues positioned near the interface with the PH domain. These two domains may function together in binding the effector region of Rho proteins to thereby trigger GDP release.

[4027] A 33521 polypeptide can include a “Rho GEF domain” or regions homologous with a “Rho GEF domain”.

[4028] As used herein, the term “Rho GEF domain” includes an amino acid sequence of about 170 to 190 amino acid residues in length and having a bit score for the alignment of the sequence to the Rho GEF domain (HMM) of at least 145. Preferably, a Rho GEF domain includes at least about 150 to 220 amino acids, more preferably about 160 to 200 amino acid residues, or about 170 to 190 amino acids and has a bit score for the alignment of the sequence to the Rho GEF domain (HMM) of at least 50, 100, preferably 180, or more preferably 200 or greater. The Rho GEF domain (HMM) has been assigned the PFAM Accession Number PF00621 (http://genome.wustl.edu/Pfam/.html). An alignment of the Rho GEF domain (amino acids 1103 to 1292 of SEQ ID NO:62) of human 33521 with a PFAM consensus amino acid sequence (SEQ ID NO:67) derived from a hidden Markov model is depicted in FIG. 31D. An alignment of the Rho GEF domain (amino acids 1103 to 1292 of SEQ ID NO:62) of human 33521 with a SMART consensus amino acid sequence (SEQ ID NO:72) derived from a hidden Markov model is depicted in FIG. 32D.

[4029] In a preferred embodiment, a 33521 polypeptide or protein has a “Rho GEF domain” or a region which includes at least about 150 to 220, more preferably about 160 to 200 or 170 to 190, amino acid residues and has at least about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “Rho GEF domain,” e.g., the Rho GEF domain of human 33521 (e.g., residues 1103 to 1292 of SEQ ID NO:62).

[4030] To identify the presence of a “Rho GEF” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “Rho GEF” domain in the amino acid sequence of human 33521 at about residues 1103 to 1292 of SEQ ID NO:62 (see Example 43).
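
The role of the threshold score can be illustrated with a short sketch that keeps only those hits whose bit score meets a chosen cutoff. This is not the HMMER interface: the `Hit` type, the model names, and all scores below are hypothetical placeholders chosen for illustration, and only the threshold values of 15 and 8 bits come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    model: str        # hypothetical profile name
    start: int        # first matched residue
    end: int          # last matched residue
    bit_score: float  # alignment score in bits


def filter_hits(hits: List[Hit], threshold: float = 15.0) -> List[Hit]:
    """Keep hits whose bit score is at or above the threshold."""
    return [h for h in hits if h.bit_score >= threshold]


# Placeholder values, not results from any actual search.
example_hits = [
    Hit("model_A", 10, 120, 42.0),
    Hit("model_B", 200, 260, 11.5),
    Hit("model_C", 300, 340, 3.2),
]

default_hits = filter_hits(example_hits)                 # default threshold of 15 bits
relaxed_hits = filter_hits(example_hits, threshold=8.0)  # lowered threshold of 8 bits
print([h.model for h in default_hits])  # ['model_A']
print([h.model for h in relaxed_hits])  # ['model_A', 'model_B']
```

Lowering the threshold admits weaker matches, which increases sensitivity at the cost of more candidate hits that need to be examined.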

[4031] PH domains are small domains found in a diverse class of proteins, including those involved in intracellular signaling and the cytoskeleton. The domain has been implicated in binding the βγ subunit of heterotrimeric G proteins, lipids (e.g., ascorbyl stearates, and phosphoinositols), and phosphoserine and phosphothreonine. The ligand specificity of PH domains can vary from domain to domain. The structures of multiple PH domains have been determined (for a review, see Riddihough (1994) Nat. Struct. Biol. 1:755-757). From analysis of such structures, it is evident that the PH domains represent a conserved fold consisting of two perpendicular β-sheets followed by an amphipathic α-helix despite the lack of absolutely conserved residues. However, the loop regions differ greatly in length and composition. Many Rho GEF domain proteins also contain a PH domain. Such proteins include vav, dbl, Sos, yeast CDC24, TIAM1, and Still-life (SIF) proteins.

[4032] A 33521 polypeptide can include a “PH domain” or regions homologous with a “PH domain”.

[4033] As used herein, the term “PH domain” includes an amino acid sequence of about 80 to 120 amino acid residues in length and having a bit score for the alignment of the sequence to the PH domain (HMM) of at least 60. Preferably, a PH domain includes at least about 80 to 130 amino acids, more preferably about 90 to 120 amino acid residues, or about 100 to 115 amino acids and has a bit score for the alignment of the sequence to the PH domain (HMM) of at least 5, 10, 20, 40, 60 or greater. The PH domain (HMM) has been assigned the PFAM Accession Number PF00169 (http://genome.wustl.edu/Pfam/.html). Alignments of the PH domains (amino acids 507 to 620 and 1353 to 1455 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:64) derived from a PFAM hidden Markov model are depicted in FIGS. 31A and 31E. Similar alignments with the PH domain consensus from SMART are depicted in FIGS. 32A and 32E.

[4034] In a preferred embodiment, a 33521 polypeptide or protein has a “PH domain” or a region which includes at least about 80 to 130, more preferably about 90 to 120 or 100 to 115, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “PH domain,” e.g., the first PH domain of human 33521 (e.g., residues 507 to 620 of SEQ ID NO:62) or the second PH domain of human 33521 (e.g., residues 1353 to 1455 of SEQ ID NO:62).

[4035] To identify the presence of a “PH” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs, e.g., as described above. A “PH” domain was located by such a search in the amino acid sequence of human 33521 at about residues 507 to 620 and 1353 to 1455 of SEQ ID NO:62 (see Example 43).

[4036] PDZ domains (also referred to as DHR or GLGF domains) are found in proteins from bacteria to humans. The canonical PDZ domain is a largely β-sheet structure, which has a ligand binding pocket for extended peptides. In eukaryotes, PDZ domains frequently bind to carboxy-terminal peptides, such as the carboxy-terminal tetrapeptide X-[ST]-X-V. The peptide can bind in a groove and form a β-strand which is antiparallel to a β-strand in the domain, thus completing an antiparallel β-sheet. The peptide binding groove can also contain a conserved loop which provides a binding pocket for the carboxy-terminus of the ligand. The carboxylate at the terminus hydrogen bonds to main chain amides of the loop, which is characterized by a conserved glycine and a conserved phenylalanine. PDZ domain-containing proteins can mediate protein-protein interactions in signaling complexes, as well as a host of other functions, such as the clustering of ion channels that terminate in PDZ-binding peptides.

[4037] A 33521 polypeptide can include a “PDZ domain” or regions homologous with a “PDZ domain”.

[4038] As used herein, the term “PDZ domain” includes an amino acid sequence of about 80 to 90 amino acid residues in length and having a bit score for the alignment of the sequence to the PDZ domain (HMM) of at least 30. Preferably, a PDZ domain includes at least about 70 to 120 amino acids, more preferably about 80 to 110 amino acid residues, or about 80 to 90 amino acids and has a bit score for the alignment of the sequence to the PDZ domain (HMM) of at least 5, 10, 20, 25, 30, 32 or greater. The PDZ domain (HMM) has been assigned the PFAM Accession Number PF00595 (http://genome.wustl.edu/Pfam/.html). Preferably, a PDZ domain further includes a conserved glycine-phenylalanine dipeptide in the carboxylate ligand binding loop. An alignment of the PDZ domain (amino acids 890 to 975 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:66) derived from a PFAM hidden Markov model is depicted in FIG. 31C. An alignment of the PDZ domain (amino acids 900 to 976 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:71) derived from a SMART hidden Markov model is depicted in FIG. 32C.

[4039] In a preferred embodiment, a 33521 polypeptide or protein has a “PDZ domain” or a region which includes at least about 70 to 120, more preferably about 80 to 110 or 80 to 90, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “PDZ domain,” e.g., the PDZ domain of human 33521 (e.g., residues 890 to 975 of SEQ ID NO:62). The 33521 polypeptide has the conserved “GF” dipeptide motif at about residues 903 to 904 of SEQ ID NO:62. To identify the presence of a “PDZ” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of Pfam HMMs, e.g., as described above. A “PDZ domain” was located by such a search in the amino acid sequence of human 33521 at about residues 890 to 975 of SEQ ID NO:62 (see Example 43).

[4040] An RBD domain, or “Ras-binding” domain, can be found on a variety of cell signalling molecules. The paradigm for this domain is the RBD of the Raf serine-threonine kinase, which is a key component of the Ras signalling pathway. Ras is a critical small GTPase which regulates cell proliferation. The central anti-parallel β-sheet of the RBD interacts with two β-strands from the Ras effector region by means of side chain and main chain interactions (Nassar et al. (1995) Nature 375:554-6).

[4041] A 33521 polypeptide can include an “RBD domain” or regions homologous with an “RBD domain”.

[4042] As used herein, the term “RBD domain” includes an amino acid sequence of about 65 to 80 amino acid residues in length and having a bit score for the alignment of the sequence to the RBD domain consensus from the SMART database of HMM of at least 65. Preferably, a RBD domain includes at least about 40 to 100 amino acids, more preferably about 50 to 90 amino acid residues, or about 65 to 80 amino acids and has a bit score for the alignment of the sequence to the RBD domain consensus derived from the SMART database (HMM) of at least 10, 20, 30, 40, 50, 60 or greater. An alignment of the RBD domain (amino acids 810 to 873 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:65) derived from a PFAM hidden Markov model is depicted in FIG. 31B. An alignment of the RBD domain (amino acids 810 to 881 of SEQ ID NO:62) of human 33521 with a consensus amino acid sequence (SEQ ID NO:70) derived from a SMART hidden Markov model is depicted in FIG. 32B.

[4043] In a preferred embodiment, a 33521 polypeptide or protein has an “RBD domain” or a region which includes at least about 40 to 100, more preferably about 50 to 90 or 65 to 80, amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with an “RBD domain,” e.g., the RBD domain of human 33521 (e.g., residues 810 to 881 of SEQ ID NO:62).

[4044] To identify the presence of a “RBD” domain in a 33521 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a SMART database (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de/) of HMMs as described in Schultz et al. (1998), Proc. Natl. Acad. Sci. USA 95:5857 and Schultz et al. (2000) Nucl. Acids Res 28:231. The database contains domains identified by profiling with the hidden Markov models of the HMMer2 search program (R. Durbin et al. (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press.; http://hmmer.wustl.edu/). The database also is extensively annotated and monitored by experts to enhance accuracy. A “RBD domain” was located by such a search in the amino acid sequence of human 33521 at about residues 810 to 880 of SEQ ID NO:62 (see Example 43).

[4045] A 33521 molecule can include: at least one and preferably two PH domains; an RBD domain; a PDZ domain; and a Rho GEF domain.

[4046] A 33521 molecule can further include at least one coiled-coil domain; at least one, two, four, six, eight, ten, twenty, twenty-five, or preferably thirty protein kinase C phosphorylation sites; at least one, two, four, six, eight, ten, 20, 25, 30, 35, 40, or preferably 45 casein kinase II phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, or preferably twelve N-glycosylation sites; at least one, two, three, four, five, or preferably six cAMP/cGMP-dependent protein kinase phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, fifteen, eighteen, or preferably twenty predicted N-myristylation sites; at least one predicted glycosaminoglycan site; and at least one, two, or preferably three tyrosine kinase phosphorylation sites.

[4047] As the 33521 polypeptides of the invention may modulate 33521-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 33521-mediated or related disorders, as described below.

[4048] As used herein, a “33521 activity”, “biological activity of 33521” or “functional activity of 33521”, refers to an activity exerted by a 33521 protein, polypeptide or nucleic acid molecule. For example, a 33521 activity can be an activity exerted by 33521 in a physiological milieu on, e.g., a 33521-responsive cell or on a 33521 substrate, e.g., a protein substrate. A 33521 activity can be determined in vivo or in vitro. In one embodiment, a 33521 activity is a direct activity, such as an association with a 33521 target molecule. A “target molecule” or “binding partner” is a molecule with which a 33521 protein binds or interacts in nature, e.g., a Rac guanine nucleotide exchange factor (GEF).

[4049] A 33521 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 33521 protein with a 33521 receptor. The features of the 33521 molecules of the present invention can provide similar biological activities as Rho GEF family members. For example, the 33521 proteins of the present invention can have one or more of the following activities: (1) modulation of cell morphology, e.g., regulation of focal adhesions, or stress fibers; (2) modulation of cell motility, e.g., regulation of lamellipodia, and leading edges; (3) modulation of cell adhesion, e.g., cadherin-mediated adhesion, or adherens junctions; (4) activation of small GTPases by stimulating guanine nucleotide exchange; (5) signaling in response to small lipophilic second messengers, e.g., phosphoinositols, or ascorbyl stearates; (6) signaling in response to phosphorylation events, e.g., phosphorylation of 33521 predicted tyrosine phosphorylation sites or predicted protein kinase C phosphorylation sites; (7) signaling in response to cell surface receptors, e.g., CD44; and (8) signaling in response to activation of Ras or Gβγ proteins.

[4050] Thus, the 33521 molecules can act as novel diagnostic targets and therapeutic agents for controlling cell proliferation and differentiation disorders, including metastatic disorders.

[4051] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[4052] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[4053] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[4054] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[4055] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[4056] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[4057] The 33521 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:62 thereof are collectively referred to as “polypeptides or proteins of the invention” or “33521 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “33521 nucleic acids.” 33521 molecules refer to 33521 nucleic acids, polypeptides, and antibodies.

[4058] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[4059] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[4060] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.

[4061] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:61 or SEQ ID NO:63, corresponds to a naturally-occurring nucleic acid molecule.

[4062] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 33521 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 33521 protein or derivative thereof.

[4063] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 33521 protein is at least 10% pure. In a preferred embodiment, the preparation of 33521 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-33521 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-33521 chemicals. When the 33521 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[4064] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 33521 without abolishing or substantially altering a 33521 activity. Preferably the alteration does not substantially alter the 33521 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 33521, results in abolishing a 33521 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 33521 are predicted to be particularly unamenable to alteration.

[4065] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 33521 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 33521 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 33521 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:61 or SEQ ID NO:63, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
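
The side chain families listed above lend themselves to a simple lookup. The sketch below encodes the families with one-letter amino acid codes and treats a substitution as conservative when both residues share at least one family; the names `SIDE_CHAIN_FAMILIES` and `is_conservative` are illustrative assumptions, and the rule is only one straightforward reading of the definition.

```python
# Side chain families from the definition above, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # tyrosine, phenylalanine, tryptophan, histidine
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two different residues share at least one side chain family."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in members and replacement in members
        for members in SIDE_CHAIN_FAMILIES.values()
    )


print(is_conservative("K", "R"))  # True: both basic
print(is_conservative("D", "K"))  # False: acidic versus basic
print(is_conservative("T", "V"))  # True: both beta-branched
```

Because some residues belong to more than one family (for example, threonine is both uncharged polar and beta-branched), membership is tested per family rather than by assigning each residue a single class.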

[4066] As used herein, a “biologically active portion” of a 33521 protein includes a fragment of a 33521 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 33521 molecule and a non-33521 molecule or between a first 33521 molecule and a second 33521 molecule (e.g., a dimerization interaction). Biologically active portions of a 33521 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 33521 protein, e.g., the amino acid sequence shown in SEQ ID NO:62, which include fewer amino acids than the full length 33521 proteins, and exhibit at least one activity of a 33521 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 33521 protein, e.g., guanine nucleotide exchange. A biologically active portion of a 33521 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 33521 protein can be used as targets for developing agents which modulate a 33521 mediated activity, e.g., guanine nucleotide exchange.

[4067] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[4068] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[4069] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
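
As an illustration only, the calculation described above can be sketched in Python for a pair of sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice of the full alignment length (gap columns included) as the denominator are assumptions made for this sketch; other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences (hypothetical helper).

    Every alignment column counts toward the denominator, so each gap
    position introduced for alignment lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is identical only when both residues match and neither is a gap.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Placeholder sequences, not taken from any SEQ ID NO.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

In this sketch a single gap column and a single mismatch column are penalized equally; a longer gap simply contributes more non-identical columns.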

[4070] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) are a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
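
For orientation, the following Python sketch shows the general shape of a Needleman and Wunsch style global alignment. It is not the GAP program and does not reproduce its parameters: it assumes a simple match/mismatch score in place of a BLOSUM 62 or PAM250 matrix, and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Global alignment by dynamic programming (simplified sketch).

    Returns (score, aligned_a, aligned_b). Scoring is an assumption:
    identity scoring and a linear gap penalty, not a substitution matrix.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Placeholder sequences, not taken from any SEQ ID NO.
    print(needleman_wunsch("ACGT", "AGT"))
```

A production comparison would substitute the scoring matrix and gap parameters named above; the dynamic programming structure stays the same.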

[4071] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[4072] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 33521 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 33521 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[4073] Particularly preferred 33521 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:62. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:62 are termed substantially identical.

[4074] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:61 or 63 are termed substantially identical.

[4075] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[4076] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[4077] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[4078] Various aspects of the invention are described in further detail below.

[4079] Isolated Nucleic Acid Molecules of 33521

[4080] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 33521 polypeptide described herein, e.g., a full-length 33521 protein or a fragment thereof, e.g., a biologically active portion of a 33521 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 33521 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[4081] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:61, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 33521 protein (i.e., “the coding region” of SEQ ID NO:61, as shown in SEQ ID NO:63), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:61 (e.g., SEQ ID NO:63) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 1103 to 1292 of SEQ ID NO:62. In still other embodiments, the nucleic acid molecule encodes a sequence corresponding to one of the following fragments of the 33521 protein: from about amino acid 1 to 507, 507 to 620, 620 to 810, 810 to 853, 853 to 890, 890 to 975, 975 to 1103, 1292 to 1353, and 1353 to 1455, of SEQ ID NO:62.

[4082] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:61 or 63, thereby forming a stable duplex.
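
By way of a non-limiting illustration, the complement of a DNA strand, read 5′ to 3′, can be derived as sketched below in Python. The function name is hypothetical and the sketch handles only the four unambiguous bases; it says nothing about whether a given pair of strands forms a stable duplex under a particular stringency condition.

```python
# Translation table for Watson-Crick base pairing (DNA, unambiguous bases only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3' (hypothetical helper)."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Placeholder sequence, not taken from SEQ ID NO:61 or SEQ ID NO:63.
    print(reverse_complement("ATGC"))
```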

[4083] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63, or a portion, preferably of the same length, of any of these nucleotide sequences.

[4084] 33521 Nucleic Acid Fragments

[4085] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:61 or 63. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 33521 protein, e.g., an immunogenic or biologically active portion of a 33521 protein. A fragment can comprise those nucleotides of SEQ ID NO:61, which encode a Rho GEF domain of human 33521. Other fragments include those encoding a PH domain, a PDZ domain, or a RBD domain of human 33521. The nucleotide sequence determined from the cloning of the 33521 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 33521 family members, or fragments thereof, as well as 33521 homologues, or fragments thereof, from other species.

[4086] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50 amino acids in length, more preferably 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1650, 1670, 1680, 1685, 1690, 1695, 1700, or 1701 amino acids in length.

[4087] Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4088] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 33521 nucleic acid fragment can include a sequence corresponding to a Rho GEF domain, a PH domain, a PDZ domain, or an RBD domain.

[4089] 33521 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:61 or SEQ ID NO:63, or of a naturally occurring allelic variant or mutant of SEQ ID NO:61 or SEQ ID NO:63. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[4090] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4091] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:62. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 1701 of SEQ ID NO:62. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
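
As a rough illustration of the annealing temperature criterion, the Python sketch below estimates melting temperatures with the Wallace rule (2° C. per A or T, 4° C. per G or C) and checks whether two primers fall within a chosen window. The Wallace rule is a coarse approximation intended for short oligonucleotides and is an assumption of this sketch, as are the function names and the placeholder primer sequences; the annealing temperature is approximated here by the estimated melting temperature.

```python
def wallace_tm(primer: str) -> float:
    """Estimate melting temperature in degrees C with the Wallace rule."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def primers_compatible(forward: str, reverse: str,
                       max_delta: float = 5.0) -> bool:
    """True when the estimated temperatures differ by no more than max_delta."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta


if __name__ == "__main__":
    # Hypothetical primers, not derived from SEQ ID NO:61.
    fwd, rev = "ATGCATGCATGCATGCAT", "ATGGATCCATGGATCCAT"
    print(wallace_tm(fwd), wallace_tm(rev), primers_compatible(fwd, rev))
```

Tightening `max_delta` to 4, 3, or 2 corresponds to the narrower windows recited above.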

[4092] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4093] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a Rho GEF domain, a PH domain, a PDZ domain, or a RBD domain of human 33521.

[4094] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 33521 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical, or differ by one base from a sequence disclosed herein or from a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a Rho GEF domain from about amino acid 1103 to 1292 of SEQ ID NO:62; a PH domain from about amino acid 507 to 622 and 1353 to 1457 of SEQ ID NO:62; an RBD domain from about amino acid 810 to 881 of SEQ ID NO:62; or a PDZ domain from about amino acid 890 to 975 of SEQ ID NO:62.
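
To relate the amino acid boundaries recited above to positions in a coding sequence, the following Python sketch converts a residue range into a 1-based nucleotide span. It assumes that numbering starts at the first base of the start codon, so that residue 1 occupies nucleotides 1 to 3; how that numbering maps onto SEQ ID NO:61 or SEQ ID NO:63 is not addressed here. The function name is hypothetical, and because the domain boundaries are stated as approximate, the resulting spans are approximate as well.

```python
from typing import Tuple


def coding_nucleotide_span(first_residue: int,
                           last_residue: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive residue range to a 1-based nucleotide span.

    Assumes residue k is encoded by nucleotides 3k-2 through 3k of a
    coding sequence that begins at the start codon.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("residue range must be 1-based and ordered")
    return 3 * (first_residue - 1) + 1, 3 * last_residue


if __name__ == "__main__":
    # Approximate domain boundaries as recited in the text.
    for name, (start, end) in {"Rho GEF": (1103, 1292),
                               "PDZ": (890, 975)}.items():
        print(name, coding_nucleotide_span(start, end))
```

A primer pair for a given domain would then be chosen to flank or lie within the computed span.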

[4095] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4096] A nucleic acid fragment encoding a “biologically active portion of a 33521 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:61 or 63, which encodes a polypeptide having a 33521 biological activity (e.g., the biological activities of the 33521 proteins are described herein), expressing the encoded portion of the 33521 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 33521 protein. For example, a nucleic acid fragment encoding a biologically active portion of 33521 includes a Rho GEF domain, e.g., amino acid residues about 1103 to 1292 of SEQ ID NO:62. A nucleic acid fragment encoding a biologically active portion of a 33521 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[4097] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:61 or SEQ ID NO:63.

[4098] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, or 4500 nucleotides from nucleotides 1-2142, 1-2273, 1-3415, 1-4822, 1-4937, 1-4949, 1-4958, or 1-4973 of SEQ ID NO:61.

[4099] In preferred embodiments, the fragment includes the nucleotide sequence of SEQ ID NO:63 and at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 150, 200, or 240 consecutive nucleotides of SEQ ID NO:61.

[4100] In preferred embodiments, the fragment includes at least one, and preferably at least 5, 10, 15, 25, 50, 75, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more nucleotides encoding a protein including at least 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, or 1700 consecutive amino acids of SEQ ID NO:62. In one embodiment, the encoded protein includes at least 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200, 250, 300, 400, 500, or 600 consecutive amino acids from residues 1-631 of SEQ ID NO:62.

[4101] In preferred embodiments, the nucleic acid fragment includes a nucleotide sequence that is other than, e.g., differs by at least one, two, three or more nucleotides from, a sequence in WO 00/40607 or in GenBank™ Accession numbers AI094945, AI126294, AW338968, AW026228, AI982584, AI589050, BE504999, AL122086, AF120323, AF120324, AF195656.

[4102] In preferred embodiments, the fragment comprises the coding region of 33521, e.g., the nucleotide sequence of SEQ ID NO:66.

[4103] 33521 Nucleic Acid Variants

[4104] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:61 or SEQ ID NO:63. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 33521 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:62. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4105] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
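
The proportion of codons altered between an original coding sequence and a codon-modified version can be checked with a short calculation, sketched below in Python. The function name is hypothetical, and the sketch assumes that both sequences are in frame, of equal length, and free of insertions or deletions; it does not verify that the alterations are synonymous or that they suit any particular expression system.

```python
def fraction_codons_altered(original: str, variant: str) -> float:
    """Fraction of codons that differ between two in-frame coding sequences."""
    if len(original) != len(variant):
        raise ValueError("sequences must have equal length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    total = len(original) // 3
    # Compare the sequences codon by codon, ignoring letter case.
    altered = sum(
        1
        for i in range(0, len(original), 3)
        if original[i:i + 3].upper() != variant[i:i + 3].upper()
    )
    return altered / total


if __name__ == "__main__":
    # Placeholder sequences; GCT and GCC both encode alanine.
    print(fraction_codons_altered("ATGGCTGAA", "ATGGCCGAA"))
```

A result of 0.10 or 0.20 or greater would correspond to the 10% and 20% thresholds mentioned above.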

[4106] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[4107] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:61 or 63, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4108] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:62 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:62 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 33521 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 33521 gene.

[4109] Preferred variants include those that are correlated with guanine nucleotide exchange activity.

[4110] Allelic variants of 33521, e.g., human 33521, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 33521 protein within a population that maintain the ability to bind a Rho family protein, e.g., Rac. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:62, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 33521, e.g., human 33521, protein within a population that do not have the ability to bind a Rho family protein, e.g., Rac. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:62, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[4111] Moreover, nucleic acid molecules encoding other 33521 family members and, thus, which have a nucleotide sequence which differs from the 33521 sequences of SEQ ID NO:61 or SEQ ID NO:63 are intended to be within the scope of the invention.

[4112] Antisense Nucleic Acid Molecules, Ribozymes and Modified 33521 Nucleic Acid Molecules

[4113] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 33521. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 33521 coding strand, or to only a portion thereof (e.g., the coding region of human 33521 corresponding to SEQ ID NO:63). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 33521 (e.g., the 5′ and 3′ untranslated regions).

[4114] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 33521 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 33521 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 33521 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
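
As a sketch of how an oligonucleotide complementary to the region surrounding the translation start site might be derived, the Python example below takes a coding-strand sequence and the position of the start codon, extracts the −10 to +10 window, and returns its reverse complement. Several assumptions are made for illustration: the sequence is written as DNA, the A of the start codon is position +1 with no position 0, and the function names and the example sequence are hypothetical.

```python
# Translation table for Watson-Crick base pairing (DNA, unambiguous bases only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def start_site_antisense(coding_strand: str, start_index: int,
                         flank: int = 10) -> str:
    """Antisense oligonucleotide spanning -flank to +flank around the start.

    start_index is the 0-based index of the A of the start codon. The
    window covers `flank` bases upstream and `flank` bases beginning at
    the start codon, and the result is written 5' to 3'.
    """
    if start_index < flank or start_index + flank > len(coding_strand):
        raise ValueError("window extends beyond the sequence")
    window = coding_strand[start_index - flank:start_index + flank]
    return window.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from SEQ ID NO:61.
    seq = "C" * 10 + "ATGGCCAAAT" + "GGGG"
    print(start_site_antisense(seq, seq.index("ATG")))
```

Oligonucleotides of the other lengths listed above can be obtained by changing `flank` or by extending the window asymmetrically.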

[4115] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[4116] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 33521 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[4117] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[4118] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 33521-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 33521 cDNA disclosed herein (i.e., SEQ ID NO:61 or SEQ ID NO:63), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 33521-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 33521 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[4119] 33521 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 33521 gene (e.g., the 33521 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 33521 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[4120] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[4121] A 33521 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[4122] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[4123] PNAs of 33521 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 33521 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[4124] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[4125] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 33521 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 33521 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[4126] Isolated 33521 Polypeptides

[4127] In another aspect, the invention features an isolated 33521 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-33521 antibodies. 33521 protein can be isolated from cells or tissue sources using standard protein purification techniques. 33521 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4128] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4129] In a preferred embodiment, a 33521 polypeptide has one or more of the following characteristics:

[4130] (i) it has the ability to stimulate guanine nucleotide exchange of a Rho protein, e.g., Rac;

[4131] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:62;

[4132] (iii) it has an overall sequence similarity of at least 50%, preferably at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO:62;

[4133] (iv) it has a Rho GEF domain which is preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 1103 to 1292 of SEQ ID NO:62;

[4134] (v) it has two PH domains which are preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 507 to 620, and 1353 to 1455, of SEQ ID NO:62;

[4135] (vi) it has a RBD domain which is preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 810 to 881 of SEQ ID NO:62; or

[4136] (vii) it has a PDZ domain which is preferably about 70%, 80%, 90% or 95% identical to amino acid residues about 890 to 975 of SEQ ID NO:62.
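
The domain boundaries recited in characteristics (iv) through (vii) can be collected into a simple lookup table. The following is a minimal sketch, assuming the approximate residue ranges given above for SEQ ID NO:62; the names DOMAINS_33521, domains_at and domain_length are hypothetical and chosen only for illustration.

```python
# Approximate domain boundaries (1-based, inclusive) as recited in the text
# for SEQ ID NO:62. The text qualifies every boundary with "about", so these
# numbers are nominal. All names here are illustrative assumptions.
DOMAINS_33521 = {
    "PH_1": (507, 620),
    "RBD": (810, 881),
    "PDZ": (890, 975),
    "Rho_GEF": (1103, 1292),
    "PH_2": (1353, 1455),
}


def domains_at(position, domains=DOMAINS_33521):
    """Return the names of all domains whose range contains the residue."""
    return [
        name
        for name, (start, end) in domains.items()
        if start <= position <= end
    ]


def domain_length(name, domains=DOMAINS_33521):
    """Return the number of residues spanned by the named domain."""
    start, end = domains[name]
    return end - start + 1
```

For example, residue 1200 falls within the Rho GEF range, and the Rho GEF range spans 190 residues under these nominal boundaries.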

[4137] In a preferred embodiment the 33521 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:62. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:62 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:62. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the Rho GEF domain. In another preferred embodiment one or more differences are in the Rho GEF domain.
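
The counting rule in the parenthetical above (mismatches and "looped" out residues are each considered differences) can be sketched as follows. This is an illustrative sketch only, assuming the two sequences have already been aligned for maximum homology by some external tool and that gaps are written as "-". The function names are hypothetical, and the choice to express the percentage relative to the residues of the query is one reading of the text, not a stated rule.

```python
def count_differences(aligned_ref, aligned_query, gap="-"):
    """Count differing columns in two pre-aligned, equal-length sequences.

    Mismatches and gap ("looped out") columns each count as one difference.
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == gap and query_res == gap:
            continue
        if ref_res != query_res:
            differences += 1
    return differences


def percent_different(aligned_ref, aligned_query, gap="-"):
    """Express the differences as a percentage of the query's residues.

    Assumption: the denominator is the number of non-gap residues in the
    query (the protein or fragment being compared to the reference).
    """
    query_length = sum(1 for res in aligned_query if res != gap)
    if query_length == 0:
        raise ValueError("query has no residues")
    differences = count_differences(aligned_ref, aligned_query, gap)
    return 100.0 * differences / query_length
```

With toy sequences, count_differences("MKTAYIAK", "MK-AYIAR") counts one looped-out residue and one mismatch.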

[4138] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 33521 proteins differ in amino acid sequence from SEQ ID NO:62, yet retain biological activity.

[4139] In one embodiment, the protein includes an amino acid sequence at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:62.

[4140] A 33521 protein or fragment is provided which varies from the sequence of SEQ ID NO:62 in regions defined by amino acids about 1 to 1102, and 1293 to 1701 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:62 in regions defined by amino acids about 1103 to 1292. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[4141] In one embodiment, a biologically active portion of a 33521 protein includes a Rho GEF domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 33521 protein.

[4142] In a preferred embodiment, the 33521 protein has an amino acid sequence shown in SEQ ID NO:62. In other embodiments, the 33521 protein is substantially identical to SEQ ID NO:62. In yet another embodiment, the 33521 protein is substantially identical to SEQ ID NO:62 and retains the functional activity of the protein of SEQ ID NO:62, as described in detail in the subsections above.

[4143] 33521 Chimeric or Fusion Proteins

[4144] In another aspect, the invention provides 33521 chimeric or fusion proteins. As used herein, a 33521 “chimeric protein” or “fusion protein” includes a 33521 polypeptide linked to a non-33521 polypeptide. A “non-33521 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 33521 protein, e.g., a protein which is different from the 33521 protein and which is derived from the same or a different organism. The 33521 polypeptide of the fusion protein can correspond to all or a portion (e.g., a fragment described herein) of a 33521 amino acid sequence. In a preferred embodiment, a 33521 fusion protein includes at least one (or two) biologically active portion of a 33521 protein. The non-33521 polypeptide can be fused to the N-terminus or C-terminus of the 33521 polypeptide.

[4145] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-33521 fusion protein in which the 33521 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 33521. Alternatively, the fusion protein can be a 33521 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 33521 can be increased through use of a heterologous signal sequence.

[4146] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[4147] The 33521 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 33521 fusion proteins can be used to affect the bioavailability of a 33521 substrate. 33521 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 33521 protein; (ii) mis-regulation of the 33521 gene; and (iii) aberrant post-translational modification of a 33521 protein.

[4148] Moreover, the 33521-fusion proteins of the invention can be used as immunogens to produce anti-33521 antibodies in a subject, to purify 33521 ligands and in screening assays to identify molecules which inhibit the interaction of 33521 with a 33521 substrate.

[4149] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 33521-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 33521 protein.

[4150] Variants of 33521 Proteins

[4151] In another aspect, the invention also features a variant of a 33521 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 33521 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 33521 protein. An agonist of the 33521 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 33521 protein. An antagonist of a 33521 protein can inhibit one or more of the activities of the naturally occurring form of the 33521 protein by, for example, competitively modulating a 33521-mediated activity of a 33521 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 33521 protein.

[4152] Variants of a 33521 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 33521 protein for agonist or antagonist activity.

[4153] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 33521 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 33521 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[4154] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 33521 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 33521 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[4155] Cell based assays can be exploited to analyze a variegated 33521 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 33521 in a substrate-dependent manner. The transfected cells are then contacted with 33521 and the effect of the expression of the mutant on signaling by the 33521 substrate can be detected, e.g., by measuring stimulation of guanine nucleotide exchange on a Rho protein. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 33521 substrate, and the individual clones further characterized.

[4156] In another aspect, the invention features a method of making a 33521 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 33521 polypeptide. The method includes: altering the sequence of a 33521 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[4157] In another aspect, the invention features a method of making a fragment or analog of a 33521 polypeptide having a biological activity of a naturally occurring 33521 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 33521 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[4158] Anti-33521 Antibodies

[4159] In another aspect, the invention provides an anti-33521 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[4160] The anti-33521 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[4161] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[4162] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 33521 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-33521 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[4163] The anti-33521 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[4164] Phage display and combinatorial methods for generating anti-33521 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[4165] In one embodiment, the anti-33521 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[4166] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[4167] An anti-33521 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[4168] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[4169] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 33521 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[4170] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
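
The definition of a consensus sequence given above can be illustrated with a short sketch. It assumes the family members have already been aligned to equal length; the function name consensus_sequence is hypothetical. Where two residues occur equally frequently the definition permits either, so the alphabetical tie-break used here is an arbitrary choice made only to keep the output reproducible.

```python
from collections import Counter


def consensus_sequence(aligned_sequences):
    """Build a consensus from equal-length, pre-aligned sequences.

    At each position the most frequent residue is chosen. On a tie, either
    residue would satisfy the definition; this sketch picks the one that
    sorts first alphabetically so the result is deterministic.
    """
    if not aligned_sequences:
        raise ValueError("need at least one sequence")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to equal length")
    consensus = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        top = max(counts.values())
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)
```

For the toy family "EVQLV", "EVKLV" and "QVQLL", each position is decided by a two-to-one majority.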

[4171] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 33521 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[4172] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[4173] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[4174] In preferred embodiments an antibody can be made by immunizing with purified 33521 antigen, or a fragment thereof, e.g., a fragment described herein.

[4175] A full-length 33521 protein or an antigenic peptide fragment of 33521 can be used as an immunogen or can be used to identify anti-33521 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 33521 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:62 and encompasses an epitope of 33521. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[4176] Fragments of 33521 which include residues from about amino acid 741 to 750, from about 756 to 762, and from about 1363 to 1372 of SEQ ID NO:62 can be used to make antibodies against hydrophilic regions of the 33521 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 33521 which include residues from about amino acid 722 to 730, from about 883 to 891, and from about 966 to 975 of SEQ ID NO:62 can be used to make an antibody against a hydrophobic region of the 33521 protein; a fragment of 33521 which includes residues about 507 to 620 and 1353 to 1455 of SEQ ID NO:62 can be used to make an antibody against the PH domains of the 33521 protein; a fragment of 33521 which includes residues about 810 to 881 of SEQ ID NO:62 can be used to make an antibody against an RBD domain of the 33521 protein; a fragment of 33521 which includes residues about 1103 to 1292 of SEQ ID NO:62 can be used to make an antibody against the Rho GEF region of the 33521 protein; and a fragment of 33521 which includes residues about 890 to 975 of SEQ ID NO:62 can be used to make an antibody against the PDZ region of the 33521 protein.

[4177] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[4178] Antibodies which bind only native 33521 protein, only denatured or otherwise non-native 33521 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 33521 protein.

[4179] Preferred epitopes encompassed by the antigenic peptide are regions of 33521 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 33521 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 33521 protein and are thus likely to constitute surface residues useful for targeting antibody production.
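
As a rough illustration of how hydrophilic stretches of a sequence might be located, the sketch below averages Kyte-Doolittle hydropathy values over a sliding window. This is not the Emini surface probability analysis named above, which uses a different scale and formula; Kyte-Doolittle is substituted here only because it is a simple, widely used scale. The window size, the function names and the toy sequence are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values
# (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, window=7):
    """Return (start, mean_hydropathy) per window; start is 1-based."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in sequence.upper()]
    return [
        (i + 1, sum(scores[i:i + window]) / window)
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return the 1-based start and mean score of the lowest-scoring window."""
    return min(window_hydropathy(sequence, window), key=lambda item: item[1])
```

On the toy sequence "LLLLLKDEKRDLLLLL" with a window of 5, the lowest-scoring window begins at residue 6, within the charged stretch. A window identified this way is only a candidate; whether it is actually exposed or antigenic would need to be established experimentally.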

[4180] In a preferred embodiment the antibody can bind to the Rho GEF portion of the 33521 protein. In another embodiment, the antibody binds a PH domain, a PDZ domain, or RBD domain portion of the 33521 protein.

[4181] The anti-33521 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 33521 protein.

[4182] In a preferred embodiment the antibody has effector function and/or can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[4183] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[4184] In a preferred embodiment, an anti-33521 antibody alters (e.g., increases or decreases) the guanine nucleotide exchange activity of a 33521 polypeptide.

[4185] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or active fragment thereof, or a radioactive nucleus, or imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[4186] An anti-33521 antibody (e.g., monoclonal antibody) can be used to isolate 33521 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-33521 antibody can be used to detect 33521 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-33521 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive materials include 125I, 131I, 35S or 3H.

[4187] The invention also includes a nucleic acid which encodes an anti-33521 antibody, e.g., an anti-33521 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[4188] The invention also includes cell lines, e.g., hybridomas, which make an anti-33521 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-33521 antibody.

[4189] 33521 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[4190] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[4191] A vector can include a 33521 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 33521 proteins, mutant forms of 33521 proteins, fusion proteins, and the like).

[4192] The recombinant expression vectors of the invention can be designed for expression of 33521 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[4193] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[4194] Purified fusion proteins can be used in 33521 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 33521 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[4195] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
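The codon-replacement strategy described above can be illustrated with a minimal sketch. The preferred-codon table below is a small illustrative subset chosen for demonstration and is an assumption, not data from this disclosure; the function names `recode_for_host` and `translate` are likewise hypothetical. In practice a complete codon-usage table for the intended host strain would be substituted.

```python
# Illustrative subset of a preferred-codon table (assumption for demonstration;
# replace with a measured codon-usage table for the intended E. coli strain).
PREFERRED_CODON = {
    "M": "ATG", "K": "AAA", "L": "CTG", "E": "GAA", "R": "CGT", "*": "TAA",
}

# Standard genetic code entries needed to read the example input below.
CODON_TO_AA = {
    "ATG": "M", "AAG": "K", "AAA": "K", "TTA": "L", "CTG": "L",
    "GAG": "E", "GAA": "E", "AGA": "R", "CGT": "R", "TGA": "*", "TAA": "*",
}


def translate(cds: str) -> str:
    """Translate a coding sequence using the demo codon table."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode_for_host(cds: str) -> str:
    """Swap every codon for the host-preferred synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TO_AA[cds[i:i + 3]]  # KeyError if outside demo table
        recoded.append(PREFERRED_CODON[amino_acid])
    return "".join(recoded)


original = "ATGAAGTTAGAGAGATGA"
optimized = recode_for_host(original)
# The encoded polypeptide is unchanged; only synonymous codons differ.
assert translate(optimized) == translate(original)
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is preserved, which is the property the final assertion checks.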

[4196] The 33521 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[4197] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[4198] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[4199] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[4200] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[4201] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 33521 nucleic acid molecule within a recombinant expression vector or a 33521 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[4202] A host cell can be any prokaryotic or eukaryotic cell. For example, a 33521 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[4203] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[4204] A host cell of the invention can be used to produce (i.e., express) a 33521 protein. Accordingly, the invention further provides methods for producing a 33521 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 33521 protein has been introduced) in a suitable medium such that a 33521 protein is produced. In another embodiment, the method further includes isolating a 33521 protein from the medium or the host cell.

[4205] In another aspect, the invention features a cell or purified preparation of cells which include a 33521 transgene, or which otherwise misexpress 33521. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 33521 transgene, e.g., a heterologous form of a 33521, e.g., a gene derived from humans (in the case of a non-human cell). The 33521 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 33521, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 33521 alleles or for use in drug screening.

[4206] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 33521 polypeptide.

[4207] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 33521 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 33521 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 33521 gene. For example, an endogenous 33521 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[4208] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 33521 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 33521 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 33521 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[4209] 33521 Transgenic Animals

[4210] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 33521 protein and for identifying and/or evaluating modulators of 33521 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 33521 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[4211] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 33521 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 33521 transgene in its genome and/or expression of 33521 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 33521 protein can further be bred to other transgenic animals carrying other transgenes.

[4212] 33521 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[4213] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[4214] Uses of 33521

[4215] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[4216] The isolated nucleic acid molecules of the invention can be used, for example, to express a 33521 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 33521 mRNA (e.g., in a biological sample) or a genetic alteration in a 33521 gene, and to modulate 33521 activity, as described further below. The 33521 proteins can be used to treat disorders characterized by insufficient or excessive production of a 33521 substrate or production of 33521 inhibitors. In addition, the 33521 proteins can be used to screen for naturally occurring 33521 substrates, to screen for drugs or compounds which modulate 33521 activity, as well as to treat disorders characterized by insufficient or excessive production of 33521 protein or production of 33521 protein forms which have decreased, aberrant or unwanted activity compared to 33521 wild type protein (e.g., aberrant cell proliferation and/or differentiation). Moreover, the anti-33521 antibodies of the invention can be used to detect and isolate 33521 proteins, regulate the bioavailability of 33521 proteins, and modulate 33521 activity.

[4217] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 33521 polypeptide is provided. The method includes: contacting the compound with the subject 33521 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 33521 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 33521 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 33521 polypeptide. Screening methods are discussed in more detail below.

[4218] 33521 Screening Assays

[4219] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 33521 proteins, have a stimulatory or inhibitory effect on, for example, 33521 expression or 33521 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 33521 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 33521 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[4220] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 33521 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 33521 protein or polypeptide or a biologically active portion thereof.

[4221] In one embodiment, an activity of a 33521 protein can be assayed, e.g., assayed for nucleotide exchange activity toward Rho proteins. An example of such an “exchange assay” can be found, for example, in Fleming et al., J. Biol. Chem. 274: 12753-12758 (1999), wherein the nucleotide exchange activity of TIAM1 toward Rac1 was assayed. Briefly, the activity of TIAM1 was observed by incubating [3H]GDP-Rac1 with TIAM1 for a specified time period, and then measuring the amount of [3H]GDP remaining bound to Rac1 following the incubation period.
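The readout of such an exchange assay, the amount of labeled GDP remaining bound after incubation, can be reduced to a simple fraction. The following is a minimal sketch under stated assumptions: the function names, the use of counts per minute (cpm) as the measured quantity, and the background-subtraction step are illustrative choices and are not taken from the cited reference.

```python
def fraction_bound(cpm_sample: float, cpm_start: float,
                   cpm_background: float = 0.0) -> float:
    """Fraction of labeled GDP still bound after incubation.

    cpm_sample     -- counts measured after the incubation period
    cpm_start      -- counts measured before the exchange factor is added
    cpm_background -- counts from a blank (assumed constant)
    """
    start = cpm_start - cpm_background
    if start <= 0:
        raise ValueError("starting signal must exceed background")
    frac = (cpm_sample - cpm_background) / start
    # Clamp to [0, 1] so counting noise cannot yield an impossible fraction.
    return min(max(frac, 0.0), 1.0)


def fraction_released(cpm_sample: float, cpm_start: float,
                      cpm_background: float = 0.0) -> float:
    """Fraction of labeled GDP released, i.e., exchanged, during incubation."""
    return 1.0 - fraction_bound(cpm_sample, cpm_start, cpm_background)
```

A larger released fraction in the presence of the candidate exchange factor than in its absence would be consistent with nucleotide exchange activity.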

[4222] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[4223] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[4224] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[4225] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 33521 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 33521 activity is determined. Determining the ability of the test compound to modulate 33521 activity can be accomplished by monitoring, for example, stimulation of guanine nucleotide exchange on a Rho protein. The cell, for example, can be of mammalian origin, e.g., human.

[4226] The ability of the test compound to modulate 33521 binding to a compound, e.g., a 33521 substrate, or to bind to 33521 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 33521 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 33521 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 33521 binding to a 33521 substrate in a complex. For example, compounds (e.g., 33521 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[4227] The ability of a compound (e.g., a 33521 substrate) to interact with 33521 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 33521 without the labeling of either the compound or the 33521. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 33521.

[4228] In yet another embodiment, a cell-free assay is provided in which a 33521 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 33521 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 33521 proteins to be used in assays of the present invention include fragments which participate in interactions with non-33521 molecules, e.g., fragments with high surface probability scores.

[4229] Soluble and/or membrane-bound forms of isolated proteins (e.g., 33521 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[4230] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[4231] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
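The dependence of transfer efficiency on donor-acceptor distance noted above is commonly modeled with the Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the distance at which transfer is 50% efficient. That relation is standard in the field but is not stated in this disclosure, so its use here is an assumption; the function name and the example distances are hypothetical.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy transfer efficiency under the Forster relation.

    distance_nm       -- donor-acceptor separation r
    forster_radius_nm -- R0, the separation giving 50% transfer efficiency
    """
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Efficiency falls steeply as the labels move apart, which is why acceptor
# emission is highest when the labeled molecules are bound to each other.
for r in (2.5, 5.0, 10.0):  # example separations in nm, with R0 assumed 5 nm
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, 5.0):.3f}")
```

The sixth-power dependence means that efficiency changes most sharply near R0, so label pairs are typically chosen with an R0 close to the expected separation in the bound complex.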

[4232] In another embodiment, determining the ability of the 33521 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[4233] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[4234] It may be desirable to immobilize either 33521, an anti-33521 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 33521 protein, or interaction of a 33521 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/33521 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 33521 protein, and the mixture is incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 33521 binding or activity determined using standard techniques.

[4235] Other techniques for immobilizing either a 33521 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 33521 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[4236] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[4237] In one embodiment, this assay is performed utilizing antibodies reactive with 33521 protein or target molecules but which do not interfere with binding of the 33521 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 33521 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 33521 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 33521 protein or target molecule.

[4238] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11:141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[4239] In a preferred embodiment, the assay includes contacting the 33521 protein or biologically active portion thereof with a known compound which binds 33521 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 33521 protein, wherein determining the ability of the test compound to interact with a 33521 protein includes determining the ability of the test compound to preferentially bind to 33521 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[4240] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 33521 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 33521 protein through modulation of the activity of a downstream effector of a 33521 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[4241] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.

[4242] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[4243] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[4244] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[4245] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[4246] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[4247] In yet another aspect, the 33521 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 33521 (“33521-binding proteins” or “33521-bp”) and are involved in 33521 activity. Such 33521-bps can be activators or inhibitors of signals by the 33521 proteins or 33521 targets as, for example, downstream elements of a 33521-mediated signaling pathway.

[4248] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 33521 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 33521 protein can be fused to the activation domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 33521-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 33521 protein.

[4249] In another embodiment, modulators of 33521 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 33521 mRNA or protein evaluated relative to the level of expression of 33521 mRNA or protein in the absence of the candidate compound. When expression of 33521 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 33521 mRNA or protein expression. Alternatively, when expression of 33521 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 33521 mRNA or protein expression. The level of 33521 mRNA or protein expression can be determined by methods described herein for detecting 33521 mRNA or protein.
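The comparison described in the preceding paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not a procedure taken from this application: the function names `permutation_p_value` and `classify_modulator`, the use of a two-sided permutation test on replicate expression measurements, and the 0.05 significance threshold are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def permutation_p_value(treated, untreated, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean expression."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(untreated))
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign measurements to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression levels.

    treated:   levels measured in the presence of the candidate compound
    untreated: levels measured in its absence
    alpha:     assumed significance threshold (hypothetical)
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


with_compound = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
without_compound = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8]
print(classify_modulator(with_compound, without_compound))
```

Any other test of statistical significance could be substituted; the permutation test is used here only because it needs no library beyond the Python standard library.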

[4250] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 33521 protein can be confirmed in vivo, e.g., in an animal such as an animal model for disorders of cell proliferation and differentiation, e.g., cancer and metastasis.

[4251] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 33521 modulating agent, an antisense 33521 nucleic acid molecule, a 33521-specific antibody, or a 33521-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[4252] 33521 Detection Assays

[4253] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 33521 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[4254] 33521 Chromosome Mapping

[4255] The 33521 nucleotide sequences or portions thereof can be used to map the location of the 33521 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 33521 sequences with genes associated with disease.

[4256] Briefly, 33521 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 33521 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 33521 sequences will yield an amplified fragment.

[4257] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
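The logic of assigning a gene to a chromosome from a hybrid panel can be sketched as follows. This is an illustrative Python sketch only: the function name `assign_chromosome`, the toy panel, and the requirement of perfect concordance between chromosome content and PCR result are assumptions made for the example, not details given in this application.

```python
def assign_chromosome(hybrid_panel, pcr_results):
    """Return the human chromosomes whose presence matches the PCR pattern.

    hybrid_panel: dict mapping hybrid name -> set of human chromosomes retained
    pcr_results:  dict mapping hybrid name -> True if an amplified fragment
                  was obtained with the gene-specific primers
    """
    all_chromosomes = set().union(*hybrid_panel.values())
    concordant = []
    for chrom in sorted(all_chromosomes):
        # A candidate chromosome must be present in every positive hybrid
        # and absent from every negative hybrid.
        if all((chrom in hybrid_panel[h]) == pcr_results[h] for h in hybrid_panel):
            concordant.append(chrom)
    return concordant


# Hypothetical panel: each hybrid retains a few human chromosomes.
panel = {
    "H1": {"1", "7"},
    "H2": {"7", "X"},
    "H3": {"1", "X"},
    "H4": {"12"},
}
results = {"H1": True, "H2": True, "H3": False, "H4": False}
print(assign_chromosome(panel, results))  # ['7']
```

If more than one chromosome is returned, the panel does not distinguish between them and additional hybrids would be needed.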

[4258] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 33521 to a chromosomal location.

[4259] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[4260] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[4261] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[4262] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 33521 gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[4263] 33521 Tissue Typing

[4264] 33521 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
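The RFLP comparison described above can be illustrated with a short Python sketch. It is illustrative only: the function name `digest`, the toy allele sequences, and the choice of EcoRI (recognition site GAATTC, cut after the first G) are assumptions made for the example and are not taken from this application. The sketch lists every fragment length from a linear digest; on a Southern blot, only fragments that hybridize to the probe would appear as bands.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at each site.

    site:       recognition sequence of the restriction enzyme
    cut_offset: position within the site, counted from its first base,
                after which the strand is cut
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Two hypothetical alleles; allele_b has lost the second EcoRI site
# through a single base change (GAATTC -> GAGTTC).
allele_a = "ATGCATGCATGC" + "GAATTC" + "CCGGTTAA" + "GAATTC" + "TTAACCGG"
allele_b = "ATGCATGCATGC" + "GAATTC" + "CCGGTTAA" + "GAGTTC" + "TTAACCGG"

print(digest(allele_a, "GAATTC", 1))  # [13, 14, 13]
print(digest(allele_b, "GAATTC", 1))  # [13, 27]
```

The differing fragment patterns for the two alleles are what allow the polymorphism to serve as an identification marker.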

[4265] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 33521 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[4266] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:61 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:63 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[4267] If a panel of reagents from 33521 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[4268] Use of Partial 33521 Sequences in Forensic Biology

[4269] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[4270] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:61 (e.g., fragments derived from the noncoding regions of SEQ ID NO:61 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[4271] The 33521 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 33521 probes can be used to identify tissue by species and/or by organ type.

[4272] In a similar fashion, these reagents, e.g., 33521 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[4273] Predictive Medicine of 33521

[4274] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[4275] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 33521.

[4276] Such disorders include, e.g., a disorder associated with the misexpression of 33521 gene; a disorder of cell proliferation, e.g., cancer, or a disorder of cell motility, e.g., metastatic cancer.

[4277] The method includes one or more of the following:

[4278] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 33521 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[4279] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 33521 gene;

[4280] detecting, in a tissue of the subject, the misexpression of the 33521 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[4281] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 33521 polypeptide.

[4282] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 33521 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[4283] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:61, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 33521 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[4284] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 33521 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 33521.

[4285] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[4286] In preferred embodiments the method includes determining the structure of a 33521 gene, an abnormal structure being indicative of risk for the disorder.

[4287] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 33521 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[4288] Diagnostic and Prognostic Assays of 33521

[4289] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 33521 molecules and for identifying variations and mutations in the sequence of 33521 molecules.

[4290] Expression Monitoring and Profiling:

[4291] The presence, level, or absence of 33521 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 33521 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 33521 protein such that the presence of 33521 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 33521 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 33521 genes; measuring the amount of protein encoded by the 33521 genes; or measuring the activity of the protein encoded by the 33521 genes.

[4292] The level of mRNA corresponding to the 33521 gene in a cell can be determined both by in situ and by in vitro formats.

[4293] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 33521 nucleic acid, such as the nucleic acid of SEQ ID NO:61, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 33521 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[4294] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 33521 genes.

[4295] The level of mRNA in a sample that is encoded by 33521 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
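The primer geometry stated above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked with a short Python sketch. The function names `reverse_complement` and `check_primer_pair`, the exact-match search on the plus strand, and the example sequences are assumptions made for illustration; the sketch does not model annealing conditions or primer specificity.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_primer_pair(template, forward, reverse,
                      primer_len=(10, 30), flank_len=(50, 200)):
    """Return the length of the flanked region, or None if the pair fails.

    template: plus-strand sequence
    forward:  primer identical to part of the plus strand
    reverse:  primer identical to part of the minus strand
    """
    for primer in (forward, reverse):
        if not primer_len[0] <= len(primer) <= primer_len[1]:
            return None
    fwd_start = template.find(forward)
    # The reverse primer anneals where its reverse complement occurs
    # on the plus strand.
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start == -1 or rev_start == -1:
        return None
    flanked = rev_start - (fwd_start + len(forward))
    if not flank_len[0] <= flanked <= flank_len[1]:
        return None
    return flanked


# Hypothetical template: 18-nt primer sites around an 80-nt region.
fwd_site = "ATGGCCTTAGCAGTCCAT"
rev_site = "GGATCCTTGACGAATCGA"
template = "TTTT" + fwd_site + "ACGT" * 20 + rev_site + "AAAA"
print(check_primer_pair(template, fwd_site, reverse_complement(rev_site)))  # 80
```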

[4296] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 33521 gene being analyzed.

[4297] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 33521 mRNA, or genomic DNA, and comparing the presence of 33521 mRNA or genomic DNA in the control sample with the presence of 33521 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 33521 transcript levels.

[4298] A variety of methods can be used to determine the level of protein encoded by 33521. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[4299] The detection methods can be used to detect 33521 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 33521 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 33521 protein include introducing into a subject a labeled anti-33521 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-33521 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[4300] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 33521 protein, and comparing the presence of 33521 protein in the control sample with the presence of 33521 protein in the test sample.

[4301] The invention also includes kits for detecting the presence of 33521 in a biological sample. For example, the kit can include a compound or agent capable of detecting 33521 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 33521 protein or nucleic acid.

[4302] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[4303] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[4304] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 33521 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[4305] In one embodiment, a disease or disorder associated with aberrant or unwanted 33521 expression or activity is identified. A test sample is obtained from a subject and 33521 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 33521 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 33521 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[4306] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 33521 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferation and differentiation disorder, e.g., cancer and metastasis.

[4307] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 33521 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 33521 (e.g., other genes associated with a 33521-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
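A data record of the kind described above could be laid out as a relational table along the following lines. This is a minimal sketch, not the structure used by the applicants: the table name `expression_record`, its column names, and the helper functions are hypothetical, and Python's built-in `sqlite3` module is swapped in for the Oracle or Sybase environments named in the text so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: one row per (sample, gene) expression value,
# with the sample descriptors stored alongside it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS expression_record (
    sample_id        TEXT NOT NULL,
    subject_id       TEXT,
    diagnosis        TEXT,
    treatment        TEXT,
    gene             TEXT NOT NULL,
    expression_level REAL NOT NULL,
    PRIMARY KEY (sample_id, gene)
)
"""


def open_store(path=":memory:"):
    """Open a database and make sure the record table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Store one expression value together with its sample descriptors."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def profile_for_sample(conn, sample_id):
    """Return {gene: expression_level} for one sample."""
    rows = conn.execute(
        "SELECT gene, expression_level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    )
    return dict(rows.fetchall())


conn = open_store()
add_record(conn, "S1", "33521", 2.5, subject_id="P1")
add_record(conn, "S1", "GENE_B", 1.0, subject_id="P1")
print(profile_for_sample(conn, "S1"))
```

Storing one row per gene lets the same table hold values for 33521 and for any other genes measured on the same sample.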

[4308] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 33521 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose a disorder in a subject wherein a change in 33521 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for a disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[4309] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 33521 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[4310] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 33521 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
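
By way of non-limiting illustration only, the distance-vector metric described above might be computed as in the following sketch. The function names, the gene name "geneA", and the numeric values are hypothetical; the sketch assumes both profiles cover the same genes and are placed in the same gene order before comparison.

```python
import math


def to_vector(profile, gene_order):
    """Represent a profile (gene -> level) as a vector with one dimension per gene."""
    return [profile[gene] for gene in gene_order]


def distance_vector(u, v):
    """Return the element-wise difference between two profile vectors."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of dimensions")
    return [x - y for x, y in zip(u, v)]


def vector_length(d):
    """Return the Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in d))


# Hypothetical example: two-gene profiles.
genes = ["33521", "geneA"]
subject = to_vector({"33521": 5.0, "geneA": 4.0}, genes)
reference = to_vector({"33521": 2.0, "geneA": 0.0}, genes)
distance = vector_length(distance_vector(subject, reference))
```

A smaller length indicates that the two profiles are more similar under this metric.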

[4311] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[4312] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile and the reference expression profiles each include a value representing the level of 33521 expression.
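
By way of non-limiting illustration only, the steps recited above might be sketched as follows. The sketch assumes that the database of reference profiles is available as an in-memory mapping and that the comparison score is a Pearson correlation coefficient, which is only one of many possible scores; all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length profiles (assumed score)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def match_reference(subject, reference_db):
    """Score the subject profile against every reference; return (best, scores)."""
    if not reference_db:
        raise ValueError("reference database is empty")
    scores = {name: pearson(subject, ref) for name, ref in reference_db.items()}
    best = max(scores, key=scores.get)  # highest correlation = most similar
    return best, scores


# Hypothetical reference database; the first value of each profile is 33521.
reference_db = {
    "tumor": [7.0, 1.5, 3.5],
    "normal": [2.0, 1.0, 3.0],
}
best_match, all_scores = match_reference([8.0, 1.0, 3.0], reference_db)
```

Returning both the best match and the full set of scores covers alternatives i) and ii) in a single call.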

[4313] 33521 Arrays and Uses Thereof

[4314] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 33521 molecule (e.g., a 33521 nucleic acid or a 33521 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4315] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 33521 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 33521. Each address of the subset can include a capture probe that hybridizes to a different region of a 33521 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 33521 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 33521 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 33521 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[4316] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[4317] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 33521 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 33521 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-33521 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[4318] In another aspect, the invention features a method of analyzing the expression of 33521. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 33521-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[4319] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 33521. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 33521. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
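
By way of non-limiting illustration only, hierarchical clustering of genes by their expression across samples might be sketched as follows. The sketch assumes single-linkage agglomerative clustering with a Euclidean distance and a fixed merge threshold, which is one simple choice among the clustering approaches mentioned above; gene names other than 33521, the threshold, and all values are hypothetical.

```python
import math


def single_linkage_clusters(profiles, threshold):
    """Merge genes into clusters while the closest pair of clusters is within threshold.

    profiles maps each gene to its expression values across the same samples.
    """
    clusters = [{gene} for gene in sorted(profiles)]

    def linkage(c1, c2):
        # Single linkage: distance between the closest members of two clusters.
        return min(math.dist(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        pairs = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        d, i, j = min(pairs)
        if d > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


def coclustered_with(profiles, gene, threshold):
    """Return the other genes that fall in the same cluster as the given gene."""
    for cluster in single_linkage_clusters(profiles, threshold):
        if gene in cluster:
            return sorted(cluster - {gene})
    raise KeyError(gene)


# Hypothetical expression levels in three tissues.
profiles = {
    "33521": [1.0, 5.0, 9.0],
    "geneA": [1.2, 5.1, 8.8],
    "geneB": [9.0, 5.0, 1.0],
    "geneC": [8.7, 5.2, 1.1],
}
partners = coclustered_with(profiles, "33521", threshold=1.0)
```

Genes returned by coclustered_with are candidates for co-regulation with 33521 only within the limits of the samples analyzed.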

[4320] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 33521 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[4321] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[4322] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 33521-associated disease or disorder; and processes, such as a cellular transformation associated with a 33521-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 33521-associated disease or disorder.

[4323] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 33521) that could serve as a molecular target for diagnosis or therapeutic intervention.

[4324] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 33521 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 33521 polypeptide or fragment thereof. For example, multiple variants of a 33521 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4325] The polypeptide array can be used to detect a 33521 binding compound, e.g., an antibody in a sample from a subject with specificity for a 33521 polypeptide or the presence of a 33521-binding protein or ligand.

[4326] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 33521 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[4327] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 33521 or from a cell or subject in which a 33521 mediated response has been elicited, e.g., by contact of the cell with 33521 nucleic acid or protein, or administration to the cell or subject of 33521 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 33521 (or does not express as highly as in the case of the 33521 positive plurality of capture probes) or from a cell or subject in which a 33521 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than a 33521 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[4328] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 33521 or from a cell or subject in which a 33521-mediated response has been elicited, e.g., by contact of the cell with 33521 nucleic acid or protein, or administration to the cell or subject of 33521 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 33521 (or does not express as highly as in the case of the 33521 positive plurality of capture probes) or from a cell or subject in which a 33521 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.
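
By way of non-limiting illustration only, the comparison of binding of the first sample with binding of the second sample might be sketched as follows. The sketch assumes that binding at each address has already been reduced to a positive signal value and that a two-fold difference is treated as notable; the threshold, the (row, column) addressing, and all values are hypothetical.

```python
import math


def differential_addresses(first, second, min_fold=2.0):
    """Return {address: log2(first/second)} for addresses differing by >= min_fold.

    Only addresses present on both arrays, with positive signal in both
    samples, are compared.
    """
    cutoff = math.log2(min_fold)
    result = {}
    for address in first.keys() & second.keys():
        a, b = first[address], second[address]
        if a <= 0 or b <= 0:
            continue  # a log ratio is undefined without positive signal
        log_ratio = math.log2(a / b)
        if abs(log_ratio) >= cutoff:
            result[address] = log_ratio
    return result


# Hypothetical signal values keyed by (row, column) address.
first_sample = {(0, 0): 800.0, (0, 1): 100.0, (1, 0): 50.0}
second_sample = {(0, 0): 100.0, (0, 1): 110.0, (1, 0): 200.0}
changed = differential_addresses(first_sample, second_sample)
```

Restricting the comparison to addresses shared by both inputs reflects the requirement that, when different arrays are used, the same plurality of addresses be present on both.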

[4329] In another aspect, the invention features a method of analyzing 33521, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 33521 nucleic acid or amino acid sequence; and comparing the 33521 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33521.
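
By way of non-limiting illustration only, comparing a query sequence with a collection of sequences might be sketched as follows. The sketch assumes a simple shared k-mer (Jaccard) score as a stand-in for the alignment-based comparisons typically used with sequence databases; the sequences shown are short hypothetical strings, not 33521 sequences.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_kmers(query, collection, k=3):
    """Rank collection entries by Jaccard similarity of their k-mer sets to the query."""
    query_kmers = kmers(query, k)
    scored = []
    for name, seq in collection.items():
        entry_kmers = kmers(seq, k)
        union = query_kmers | entry_kmers
        score = len(query_kmers & entry_kmers) / len(union) if union else 0.0
        scored.append((name, score))
    # Highest score first; ties broken by name for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical toy collection.
collection = {"seqA": "ATGGCCAAGT", "seqB": "TTTTTTTTTT"}
ranking = rank_by_shared_kmers("ATGGCCAAGT", collection)
```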

[4330] Detection of 33521 Variations or Mutations

[4331] The methods of the invention can also be used to detect genetic alterations in a 33521 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 33521 protein activity or nucleic acid expression, such as a cell proliferation and differentiation disorder, e.g., cancer and metastasis. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 33521-protein, or the mis-expression of the 33521 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 33521 gene; 2) an addition of one or more nucleotides to a 33521 gene; 3) a substitution of one or more nucleotides of a 33521 gene, 4) a chromosomal rearrangement of a 33521 gene; 5) an alteration in the level of a messenger RNA transcript of a 33521 gene, 6) aberrant modification of a 33521 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 33521 gene, 8) a non-wild type level of a 33521-protein, 9) allelic loss of a 33521 gene, and 10) inappropriate post-translational modification of a 33521-protein.

[4332] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 33521-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 33521 gene under conditions such that hybridization and amplification of the 33521-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[4333] In another embodiment, mutations in a 33521 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
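
By way of non-limiting illustration only, the comparison of restriction fragment lengths might be simulated as follows. The sketch assumes a linear DNA molecule and a single palindromic recognition site (here EcoRI, GAATTC, cut after the first G), so that scanning one strand suffices; the sequences are short hypothetical strings, not 33521 sequences.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the site to the cut.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def patterns_differ(sample, control, site, cut_offset):
    """True if sample and control give different sets of fragment lengths."""
    return sorted(digest(sample, site, cut_offset)) != sorted(
        digest(control, site, cut_offset)
    )


# Hypothetical control with two EcoRI sites; the sample has a substitution
# that abolishes the second site.
control_dna = "AAAAGAATTCTTTTTTGAATTCCC"
sample_dna = "AAAAGAATTCTTTTTTGAGTTCCC"
control_fragments = digest(control_dna, "GAATTC", 1)
sample_fragments = digest(sample_dna, "GAATTC", 1)
```

Loss of a site merges two control fragments into one longer sample fragment, which is the kind of difference that would be visible by gel electrophoresis.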

[4334] In other embodiments, genetic mutations in 33521 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 33521 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 33521 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 33521 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
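
By way of non-limiting illustration only, the use of sequential overlapping probes to locate a base change might be simulated in silico as follows. The sketch assumes that the sample differs from the control by a single substitution without insertions or deletions, and it compares same-strand strings directly rather than modeling hybridization of complementary probes; the sequences and probe length are hypothetical.

```python
def tile_probes(reference, probe_len, step=1):
    """Return (start, probe) pairs for sequential overlapping probes."""
    last_start = len(reference) - probe_len
    return [(i, reference[i:i + probe_len]) for i in range(0, last_start + 1, step)]


def mismatched_probe_starts(reference, sample, probe_len, step=1):
    """Start positions of probes that do not perfectly match the sample."""
    if len(reference) != len(sample):
        raise ValueError("this sketch assumes substitutions only (equal lengths)")
    return [
        start
        for start, probe in tile_probes(reference, probe_len, step)
        if sample[start:start + probe_len] != probe
    ]


def localize_change(starts, probe_len):
    """Return the (first, last) positions covered by every mismatched probe."""
    if not starts:
        return None
    return max(starts), min(starts) + probe_len - 1


# Hypothetical control and sample differing at position 7 (0-based).
control_seq = "ACGTACGTACGTACGT"
sample_seq = "ACGTACGAACGTACGT"
failing = mismatched_probe_starts(control_seq, sample_seq, probe_len=4)
region = localize_change(failing, probe_len=4)
```

The positions shared by all mismatched probes narrow the change to a small region, which a second, specialized probe set could then characterize.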

[4335] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 33521 gene and detect mutations by comparing the sequence of the sample 33521 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[4336] Other methods for detecting mutations in the 33521 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[4337] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 33521 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[4338] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 33521 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 33521 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[4339] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[4340] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci. USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[4341] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[4342] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 33521 nucleic acid.
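
By way of non-limiting illustration only, the percent complementarity of an oligonucleotide to a region of a target nucleic acid might be computed as follows. The sketch assumes DNA sequences written 5′ to 3′, antiparallel pairing, equal lengths, and no gaps; the sequences are short hypothetical strings, not 33521 sequences.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(oligo, target_region):
    """Percent of oligo positions that base-pair with the target region.

    Both sequences are written 5' to 3'; the target is reversed so that the
    two strands are compared in antiparallel orientation.
    """
    if not oligo or len(oligo) != len(target_region):
        raise ValueError("sequences must be non-empty and of equal length")
    target_reversed = target_region.upper()[::-1]
    paired = sum(
        1
        for o, t in zip(oligo.upper(), target_reversed)
        if COMPLEMENT.get(o) == t
    )
    return 100.0 * paired / len(oligo)


# Hypothetical four-base target region and two candidate oligonucleotides.
full = percent_complementary("GCAT", "ATGC")     # exact reverse complement
partial = percent_complementary("GCAA", "ATGC")  # one mismatched position
```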

[4343] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:61 or the complement of SEQ ID NO:61. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[4344] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 33521. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[4345] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
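
By way of non-limiting illustration only, a set of four oligonucleotides that differ only at the interrogation position might be generated as follows. The function name, template sequence, and position are hypothetical, and the template is not a 33521 sequence.

```python
def interrogation_set(template, position):
    """Return four oligonucleotides, one per base, differing only at position.

    The result maps the base placed at the interrogation position to the
    full oligonucleotide sequence.
    """
    template = template.upper()
    if not 0 <= position < len(template):
        raise IndexError("interrogation position lies outside the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


# Hypothetical six-base template with the interrogation position at index 2.
oligo_set = interrogation_set("GGATCC", 2)
```

Each member of such a set could then be assigned a distinguishable label or a separate array address, so that the member giving signal identifies the nucleotide at that position.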

[4346] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 33521 nucleic acid.

[4347] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 33521 gene.

[4348] Use of 33521 Molecules as Surrogate Markers

[4349] The 33521 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 33521 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 33521 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[4350] The 33521 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 33521 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-33521 antibodies may be employed in an immune-based detection system for a 33521 protein marker, or 33521-specific radiolabeled probes may be used to detect a 33521 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[4351] The 33521 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 33521 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 33521 DNA may correlate with 33521 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[4352] Pharmaceutical Compositions of 33521

[4353] The nucleic acid and polypeptides, fragments thereof, as well as anti-33521 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[4354] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[4355] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[4356] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[4357] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[4358] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[4359] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[4360] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[4361] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[4362] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[4363] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
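
The ratio described above can be illustrated with a minimal Python sketch. The function name and the numeric values are hypothetical and are given only to show the arithmetic; they are not data from this application.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, expressed as the ratio LD50/ED50.

    Both values must be in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values, for illustration only.
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
print(example_index)
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high indices are preferred.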

[4364] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
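
One possible way to estimate an IC50 from cell culture data is sketched below in Python. This is an assumption for illustration, not a method stated in this application: it interpolates linearly on log10(concentration) between the two measured points that bracket 50% inhibition. The function name and the data points are hypothetical.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concentrations: positive concentrations (any consistent unit).
    percent_inhibition: measured inhibition (0-100) at each concentration.
    """
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive")
    # Sort the paired measurements by increasing concentration.
    points = sorted(zip(concentrations, percent_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        # Find the adjacent pair whose responses bracket 50% inhibition.
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical dose-response data, for illustration only.
ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0], [10.0, 30.0, 70.0, 95.0])
print(round(ic50, 2))
```

In practice a fitted sigmoidal model is often used instead of interpolation; the sketch only shows how a half-maximal concentration follows from measured responses.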

[4365] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
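
The weight-based dosing described above can be sketched in Python as follows. The function names, the 70 kg body weight, and the 5 mg/kg dose are hypothetical choices for illustration; the range check uses the broadest range stated above (about 0.001 to 30 mg/kg) and treats its bounds as exact only for the purposes of this sketch.

```python
# Broadest range stated in the text, in mg per kg of body weight.
MIN_DOSE_MG_PER_KG = 0.001
MAX_DOSE_MG_PER_KG = 30.0


def dose_per_administration_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Return the amount (mg) of protein or polypeptide for one administration."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_DOSE_MG_PER_KG <= dose_mg_per_kg <= MAX_DOSE_MG_PER_KG:
        raise ValueError("dose is outside the stated 0.001 to 30 mg/kg range")
    return body_weight_kg * dose_mg_per_kg


def weekly_schedule_mg(body_weight_kg: float, dose_mg_per_kg: float, weeks: int) -> list:
    """Return one dose (mg) per week for a once-weekly regimen."""
    if weeks < 1:
        raise ValueError("weeks must be at least 1")
    single = dose_per_administration_mg(body_weight_kg, dose_mg_per_kg)
    return [single] * weeks


# Hypothetical subject and regimen, for illustration only.
schedule = weekly_schedule_mg(body_weight_kg=70.0, dose_mg_per_kg=5.0, weeks=4)
print(schedule, sum(schedule))
```

As the paragraph above notes, the actual dosage and timing depend on factors such as disease severity and the general health of the subject, so a fixed schedule like this one is only a starting point.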

[4366] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[4367] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[4368] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[4369] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to iodine, yttrium and praseodymium.

[4370] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[4371] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[4372] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[4373] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[4374] Methods of Treatment for 33521

[4375] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 33521 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[4376] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”.) Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 33521 molecules of the present invention or 33521 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[4377] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 33521 expression or activity, by administering to the subject a 33521 or an agent which modulates 33521 expression or at least one 33521 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 33521 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 33521 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 33521 aberrance, for example, a 33521, 33521 agonist or 33521 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[4378] It is possible that some 33521 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[4379] Aberrant expression and/or activity of 33521 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 33521 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 33521 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 33521 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[4380] The 33521 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as, atopic allergy.

[4381] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[4382] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[4383] Additionally, 33521 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 33521 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 33521 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[4384] Additionally, 33521 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[4385] As discussed, successful treatment of 33521 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 33521 disorders. Such molecules can include, but are not limited to peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[4386] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[4387] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[4388] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 33521 expression is through the use of aptamer molecules specific for 33521 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 33521 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[4389] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 33521 disorders. For a description of antibodies, see the Antibody section above.

[4390] In circumstances wherein injection of an animal or a human subject with a 33521 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 33521 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 33521 protein. Vaccines directed to a disease characterized by 33521 expression may also be generated in this fashion.

[4391] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[4392] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 33521 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[4393] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 33521 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 33521 can be readily monitored and used in calculations of IC50.
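As a rough illustration of how an IC50 might be read off cell culture data, the sketch below interpolates on a log-concentration scale between the two measured points that bracket 50% inhibition. The function name `estimate_ic50` and the data values are hypothetical and are not taken from this disclosure; a full analysis would typically fit a complete dose-response model rather than interpolate.

```python
import math

def estimate_ic50(concentrations, inhibition_percent):
    """Estimate IC50 by log-linear interpolation.

    concentrations: ascending, positive concentrations (any consistent unit).
    inhibition_percent: measured inhibition (0-100) at each concentration.
    Returns None if the data never cross 50% inhibition.
    """
    points = list(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            # Fraction of the way from the lower to the upper bracketing point.
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical example data (illustrative only, not measured values).
doses = [0.1, 1.0, 10.0, 100.0]
inhibition = [5.0, 25.0, 75.0, 95.0]
ic50 = estimate_ic50(doses, inhibition)
```

Interpolating on the logarithm of concentration reflects the usual practice of plotting dose-response curves on a log scale; it is an assumption of this sketch, not a requirement of the methods described above.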

[4394] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al. (1995) Analytical Chemistry 67:2142-2144.

[4395] Another aspect of the invention pertains to methods of modulating 33521 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 33521 molecule or an agent that modulates one or more of the activities of the 33521 protein associated with the cell. An agent that modulates 33521 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 33521 protein (e.g., a 33521 substrate or receptor), a 33521 antibody, a 33521 agonist or antagonist, a peptidomimetic of a 33521 agonist or antagonist, or other small molecule.

[4396] In one embodiment, the agent stimulates one or more 33521 activities. Examples of such stimulatory agents include active 33521 protein and a nucleic acid molecule encoding 33521. In another embodiment, the agent inhibits one or more 33521 activities. Examples of such inhibitory agents include antisense 33521 nucleic acid molecules, anti-33521 antibodies, and 33521 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 33521 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 33521 expression or activity. In another embodiment, the method involves administering a 33521 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 33521 expression or activity.

[4397] Stimulation of 33521 activity is desirable in situations in which 33521 is abnormally downregulated and/or in which increased 33521 activity is likely to have a beneficial effect. For example, stimulation of 33521 activity is desirable in situations in which a 33521 is downregulated and/or in which increased 33521 activity is likely to have a beneficial effect. Likewise, inhibition of 33521 activity is desirable in situations in which 33521 is abnormally upregulated and/or in which decreased 33521 activity is likely to have a beneficial effect.

[4398] 33521 Pharmacogenomics

[4399] The 33521 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 33521 activity (e.g., 33521 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 33521 associated disorders (e.g., cell proliferation and differentiation disorder, e.g., cancer and metastasis) associated with aberrant or unwanted 33521 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 33521 molecule or 33521 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 33521 molecule or 33521 modulator.

[4400] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[4401] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
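The grouping of individuals by SNP pattern described above can be sketched in a few lines. This is a minimal illustration only: the marker names, genotype calls, patient identifiers, and the function `group_by_snp_pattern` are all hypothetical, and grouping here requires an exact match at every listed marker, which is a simplifying assumption.

```python
from collections import defaultdict

def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose genotypes match at every listed marker.

    genotypes: dict mapping individual ID -> dict of marker ID -> genotype
               string (e.g. "AA", "AG", "GG" for a bi-allelic marker).
    marker_ids: the markers that define the pattern.
    Returns a dict mapping each observed pattern (tuple) -> sorted list of IDs.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[m] for m in marker_ids)
        groups[pattern].append(individual)
    return {pattern: sorted(ids) for pattern, ids in groups.items()}

# Hypothetical genotype calls at two made-up markers.
genotypes = {
    "patient1": {"snp_a": "AA", "snp_b": "CT"},
    "patient2": {"snp_a": "AG", "snp_b": "CC"},
    "patient3": {"snp_a": "AA", "snp_b": "CT"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Each resulting group could then be examined for a shared drug response; how the groups are compared statistically is outside the scope of this sketch.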

[4402] Alternatively, a method termed the “candidate gene approach,” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 33521 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[4403] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 33521 molecule or 33521 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[4404] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 33521 molecule or 33521 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[4405] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 33521 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 33521 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[4406] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 33521 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 33521 gene expression, protein levels, or upregulate 33521 activity, can be monitored in clinical trials of subjects exhibiting decreased 33521 gene expression, protein levels, or downregulated 33521 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 33521 gene expression, protein levels, or downregulate 33521 activity, can be monitored in clinical trials of subjects exhibiting increased 33521 gene expression, protein levels, or upregulated 33521 activity. In such clinical trials, the expression or activity of a 33521 gene, and preferably, other genes that have been implicated in, for example, a 33521-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[4407] 33521 Informatics

[4408] The sequence of a 33521 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 33521 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 33521 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[4409] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[4410] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[4411] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples for annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples for annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
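The two-table layout described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the disclosure: it uses SQLite through Python's standard `sqlite3` module in place of Sybase or Oracle, the table and column names are hypothetical, and the stored sequence is a placeholder rather than any sequence of the invention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sequences (
    seq_id   TEXT PRIMARY KEY,    -- identifier for the sequence
    residues TEXT NOT NULL        -- nucleic acid or amino acid sequence
);
CREATE TABLE annotations (
    seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
    descriptor TEXT NOT NULL,     -- annotation text
    start_pos  INTEGER NOT NULL,  -- initial position (1-based)
    end_pos    INTEGER NOT NULL   -- ultimate position (1-based, inclusive)
);
""")

# Placeholder sequence and annotation (hypothetical, for illustration only).
conn.execute("INSERT INTO sequences VALUES (?, ?)",
             ("example_1", "ATGGCCTTAGGCTAA"))
conn.execute("INSERT INTO annotations VALUES (?, ?, ?, ?)",
             ("example_1", "hypothetical splice junction", 4, 9))

def annotated_regions(connection, seq_id):
    """Return (descriptor, annotated subsequence) pairs for one sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()

regions = annotated_regions(conn, "example_1")
```

Keeping annotations in their own table, keyed by the sequence identifier, lets one sequence carry any number of annotations without changing the sequence table.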

[4412] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
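A search of stored sequences for a target motif can be sketched as below. This is a plain regular-expression scan, not a BLAST search, and it is offered only as an illustration; the function name `find_motif`, the stored sequences, and the motif pattern are hypothetical.

```python
import re

def find_motif(sequences, motif_pattern):
    """Return (sequence ID, 1-based start, matched text) for every match.

    sequences: dict mapping sequence ID -> sequence string.
    motif_pattern: regular expression describing the target motif.
    """
    # Wrapping the motif in a lookahead reports overlapping matches as well.
    pattern = re.compile(f"(?=({motif_pattern}))")
    hits = []
    for seq_id, seq in sequences.items():
        for match in pattern.finditer(seq):
            hits.append((seq_id, match.start() + 1, match.group(1)))
    return hits

# Placeholder amino acid sequences and an illustrative motif (G-x-S-x-G).
stored = {"example_1": "MKGHSAGLLV", "example_2": "MPPLLK"}
hits = find_motif(stored, "G.S.G")
```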

[4413] Thus, in one aspect, the invention features a method of analyzing 33521, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 33521 nucleic acid or amino acid sequence; and comparing the 33521 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 33521. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[4414] The method can include evaluating the sequence identity between a 33521 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
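One simple way to evaluate sequence identity between two already-aligned sequences is sketched below. The definition used here (identical residues divided by aligned columns in which neither sequence has a gap) is an assumption of this sketch; other conventions count gap columns in the denominator. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Placeholder aligned nucleotide sequences ("-" marks a gap).
identity = percent_identity("ATGC-GTA", "ATGCAGTT")
```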

[4415] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
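The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes that every residue in the database is independent and equally likely, which real sequences only approximate; the function name and the database size are hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size):
    """Expected number of exact chance matches of one target in a database.

    Assumes independent, equally likely residues (a simplification).
    """
    # Number of positions at which a match of this length could begin.
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length

# Nucleotide targets (alphabet of 4) against a hypothetical 1,000,000-base database.
six_mer = expected_random_matches(6, 1_000_000, 4)
thirty_mer = expected_random_matches(30, 1_000_000, 4)
```

Under these assumptions a six-nucleotide target is expected to occur by chance a few hundred times in such a database, whereas a thirty-nucleotide target is expected to occur essentially never.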

[4416] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting searches is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[4417] Thus, the invention features a method of making a computer readable record of a 33521 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
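A record that identifies an ORF, as listed above, might be built as in the following sketch. It scans only the forward strand, treats the first ATG in a frame as the start, and uses the standard stop codons TAA, TAG, and TGA; these choices, the `OrfRecord` fields, and the example sequence are assumptions made for illustration and do not describe any sequence of the disclosure.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}

@dataclass
class OrfRecord:
    seq_id: str
    start: int  # 1-based position of the A in the ATG start codon
    end: int    # 1-based position of the last base of the stop codon
    frame: int  # forward reading frame: 0, 1, or 2

def find_orfs(seq_id, dna, min_codons=2):
    """Find forward-strand ORFs (ATG ... stop) in each of three frames.

    min_codons counts the codons before the stop, including the ATG.
    """
    dna = dna.upper()
    records = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    records.append(OrfRecord(seq_id, start + 1, i + 3, frame))
                start = None
    return records

# Placeholder sequence (hypothetical).
orfs = find_orfs("example_1", "CCATGGCTTGATAA")
```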

[4418] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 33521 sequence, or record, in machine-readable form; comparing a second sequence to the 33521 sequence; thereby analyzing a sequence. Comparison can include comparing to sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 33521 sequence includes a sequence being compared. In a preferred embodiment the 33521 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 33521 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[4419] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, wherein the method comprises the steps of determining 33521 sequence information associated with the subject and based on the 33521 sequence information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[4420] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a disease associated with a 33521 wherein the method comprises the steps of determining 33521 sequence information associated with the subject, and based on the 33521 sequence information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 33521 sequence of the subject to the 33521 sequences in the database to thereby determine whether the subject has a 33521-associated disease or disorder, or a pre-disposition for such.

[4421] The present invention also provides in a network, a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, said method comprising the steps of receiving 33521 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 33521 and/or corresponding to a 33521-associated disease or disorder (e.g., cell proliferation and differentiation disorder, e.g., cancer and metastasis), and based on one or more of the phenotypic information, the 33521 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4422] The present invention also provides a method for determining whether a subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder, said method comprising the steps of receiving information related to 33521 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 33521 and/or related to a 33521-associated disease or disorder, and based on one or more of the phenotypic information, the 33521 information, and the acquired information, determining whether the subject has a 33521-associated disease or disorder or a pre-disposition to a 33521-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4423] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 23479, 48120, and 46689 Invention

[4424] Hydrolases are a large class of enzymes that use a water molecule to catalyze the cleavage of a chemical bond. Hydrolases play important roles in the synthesis and breakdown of almost all major metabolic intermediates, including polypeptides, nucleic acids, and lipids.

[4425] One family of hydrolases consists of the ubiquitin carboxy-terminal hydrolases. The ubiquitin pathway is one example of a post-translational mechanism used to regulate protein levels. Ubiquitin is a highly conserved polypeptide expressed in all eukaryotic cells that marks proteins for degradation. Ubiquitin is attached as a single molecule or as a conjugated form to lysine residue(s) of proteins via formation of an isopeptide bond at the C-terminal glycine residue. Most ubiquitinated proteins are subsequently targeted to the 26S proteasome, a multicatalytic protease, which cleaves the marked protein into peptide fragments.

[4426] Only the protein conjugated to ubiquitin is degraded via the proteasome; ubiquitin itself is recycled by ubiquitin carboxy-terminal hydrolases (UCH; sometimes abbreviated UCTH), which cleave the bond between ubiquitin and the protein targeted for degradation. These enzymes constitute a family of thiol proteases, and homologues have been found in, for example, yeast (Miller et al., Bio/Technology 7:698-704, 1989; Tobias and Varshavsky, J. Biol. Chem. 266:12021-12028, 1991; Baker et al., J. Biol. Chem. 267:23364-23375, 1992), bovine (Papa and Hochstrasser, Nature 366:313-319, 1993), avian (Woo et al., J. Biol. Chem. 270:18766-18773, 1995), Drosophila (Zhang et al., Dev. Biol. 17:214, 1993) and human (Wilkinson et al., Science 246:670, 1989) cells.

[4427] Another family of hydrolases consists of α/β hydrolases. The α/β hydrolase family of enzymes is a phylogenetically diverse group of enzymes that share a common fold, typically comprising an eight-stranded β-sheet surrounded by α-helices (Ollis, D. et al. (1992) Protein Eng 5:197-211; Nardini and Dijkstra (1999) Curr Opin Str Bio 9:732-737). Members of the α/β hydrolase family are found in nearly all organisms, from microbes to plants to humans. Enzymes possessing the α/β hydrolase fold diverged from a common ancestor, but have preserved a catalytic mechanism that utilizes a triad of residues consisting of a nucleophile, an acidic residue, and a histidine residue (Ollis, D. et al. (1992) Protein Eng. 5:197-211). Although only the histidine residue is invariant, the other two residues in the triad are limited in terms of the amino acid residues that are functionally acceptable. Thus, the nucleophile is usually a serine residue, but can also be an aspartate or a cysteine residue, while the acidic residue is either an aspartic or glutamic acid residue (Schrag, J. et al. (1997) Meth Enzymol 284:85-107). The relative order of the three catalytic residues in the amino acid chain is always the same: nucleophile, acid, histidine.

[4428] Members of the α/β hydrolase family of enzymes include enzymes that hydrolyze ester bonds (e.g., phosphatases, sulfatases, exonucleases, and endonucleases), glycosidases, enzymes that act on ether bonds, peptidases (e.g., exopeptidases and endopeptidases), as well as enzymes that hydrolyze carbon-nitrogen bonds, acid anhydrides, carbon-carbon bonds, halide bonds, phosphorous-nitrogen bonds, sulfur-nitrogen bonds, carbon-phosphorous bonds, and sulfur-sulfur bonds (E. C. Webb ed., Enzyme Nomenclature, pp. 306-450, © 1992 Academic Press, Inc. San Diego, Calif.). Some specific biological activities of these enzymes include lipase activity, e.g., fungal, bacterial and pancreatic lipases, acetylcholinesterase activity, serine carboxypeptidase activity, prolyl aminopeptidase activity, haloalkane dehalogenase activity, dienelactone hydrolase activity, A2 bromoperoxidase activity, and thioesterase activity (Schrag, J. et al., supra). Acetylcholinesterases, epoxide hydrolases, cholesterol esterases, and lipases have particular medical importance. Inhibitors of acetylcholinesterase are useful therapeutic agents for the treatment of Alzheimer's disease, myasthenia gravis, and glaucoma; epoxide hydrolases and dienelactone hydrolases detoxify harmful aromatic compounds in mammals; and the human hormone sensitive lipase catalyzes the rate-limiting reaction of fat hydrolysis in adipocytes.

[4429] Given the important role that hydrolases play in the synthesis and breakdown of metabolic intermediates, including polypeptides, nucleic acids, and lipids, it is not surprising that their activity significantly impacts the activity of the cell. For example, hydrolases contribute to the growth and differentiation of the cell, to cellular proliferation, adhesion, and motility, and to the interaction and communication that takes place between cells. In addition, hydrolases are important in the conversion of pro-proteins and pro-hormones to their active forms, the inactivation of peptides, the biotransformation of compounds (e.g., a toxin or carcinogen), antigen presentation, and the regulation of synaptic transmission.

Summary of the 23479, 48120, and 46689 Invention

[4430] The present invention is based, in part, on the discovery of novel hydrolase molecules, referred to herein as “23479, 48120, and 46689”. The nucleotide sequences of cDNAs encoding 23479, 48120, and 46689 are recited in SEQ ID NO:74, SEQ ID NO:77, and SEQ ID NO:80, respectively, and the amino acid sequences of 23479, 48120, and 46689 polypeptides are recited in SEQ ID NO:75, SEQ ID NO:78, and SEQ ID NO:81, respectively. In addition, the nucleotide sequences of the coding regions are recited in SEQ ID NO:76, SEQ ID NO:79, and SEQ ID NO:82, respectively.

[4431] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 23479, 48120, or 46689 protein or polypeptide, e.g., a biologically active portion of the 23479, 48120, or 46689 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. In other embodiments, the invention provides isolated 23479, 48120, or 46689 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein, wherein the nucleic acid encodes a full length 23479, 48120, or 46689 protein or an active fragment thereof.

[4432] In a related aspect, the invention further provides nucleic acid constructs that include a 23479, 48120, or 46689 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 23479, 48120, or 46689 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 23479, 48120, or 46689 nucleic acid molecules and polypeptides.

[4433] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 23479, 48120, or 46689-encoding nucleic acids.

[4434] In still another related aspect, isolated nucleic acid molecules that are antisense to a 23479, 48120, or 46689-encoding nucleic acid molecule are provided.

[4435] In another aspect, the invention features 23479, 48120, or 46689 polypeptides, and biologically active or antigenic fragments thereof that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 23479, 48120, or 46689-mediated or -related disorders. In another embodiment, the invention provides 23479, 48120, or 46689 polypeptides having a 23479, 48120, or 46689 activity. Preferred polypeptides are 23479, 48120, or 46689 proteins including at least one hydrolase domain, and, preferably, having a 23479, 48120, or 46689 activity, e.g., a 23479, 48120, or 46689 activity as described herein.

[4436] In other embodiments, the invention provides 23479, 48120, or 46689 polypeptides, e.g., a 23479, 48120, or 46689 polypeptide having the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81, or an amino acid sequence encoded by a cDNA insert of one of the plasmids deposited with ATCC Accession Number as described herein; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81, or an amino acid sequence encoded by a cDNA insert of one of the plasmids deposited with ATCC Accession Number as described herein; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, or the sequence of a DNA insert of the plasmids deposited with ATCC Accession Numbers as described herein, wherein the nucleic acid encodes a full length 23479, 48120, or 46689 protein or an active fragment thereof.

[4437] In a related aspect, the invention provides 23479, 48120, or 46689 polypeptides or fragments operatively linked to non-23479, 48120, or 46689 polypeptides to form fusion proteins.

[4438] In another aspect, the invention features antibodies and antigen-binding fragments thereof that react with or, more preferably, specifically bind 23479, 48120, or 46689 polypeptides or fragments thereof, e.g., a hydrolase domain of a 23479, 48120, or 46689 polypeptide.

[4439] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 23479, 48120, or 46689 polypeptides or nucleic acids.

[4440] In still another aspect, the invention provides a process for modulating 23479, 48120, or 46689 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 23479, 48120, or 46689 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation.

[4441] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of a 23479, 48120, or 46689-expressing cell, e.g., a hyper-proliferative 23479, 48120, or 46689-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 23479, 48120, or 46689 polypeptide or nucleic acid. In a preferred embodiment, the contacting step occurs in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[4442] In one embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or metastatic lesion of the lung, brain, ovary or breast.

[4443] In other embodiments, the cell is a neuron or a glial cell, e.g., a cortical or hypothalamic cell. In yet other embodiments, the cell is a cardiovascular cell, e.g., a heart- or blood vessel-associated cell.

[4444] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 23479, 48120, or 46689 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another embodiment, the agent, e.g., the compound, is an inhibitor of a 23479, 48120, or 46689 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[4445] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[4446] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of a 23479, 48120, or 46689-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 23479, 48120, or 46689 polypeptide or nucleic acid.

[4447] The disorder can be a cancerous or pre-cancerous condition, e.g., a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., a tumor or metastatic lesion of the lung, brain, ovary or breast. In other embodiments, the disorder is a neurological or a cardiovascular disorder.

[4448] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a cellular proliferative or differentiative disorder, a neurological or a cardiovascular disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 23479, 48120, or 46689 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 23479, 48120, or 46689 nucleic acid or polypeptide expression can be detected by any method described herein.

[4449] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[4450] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 23479, 48120, or 46689 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 23479, 48120, or 46689 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 23479, 48120, or 46689 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or a lung, brain, ovary, or breast tissue.

[4451] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 23479, 48120, or 46689 polypeptide or nucleic acid molecule, including for disease diagnosis.

[4452] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 23479, 48120, or 46689 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 23479, 48120, or 46689 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 23479, 48120, or 46689 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[4453] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 23479, 48120, and 46689

[4454] Human 23479

[4455] The human 23479 sequence (see SEQ ID NO:74, as recited in Example 48), which is approximately 3494 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2805 nucleotides, including the termination codon. The coding sequence encodes a 934 amino acid protein (see SEQ ID NO:75, as recited in Example 48).

[4456] Human 23479 contains the following regions or structural features:

[4457] a ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (FIG. 34A; PFAM Accession PF00442) located at about amino acid residues 296-327 of SEQ ID NO:75;

[4458] a ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (FIG. 34B; PFAM Accession PF00443) located at about amino acid residues 546-640 of SEQ ID NO:75;

[4459] two predicted N-glycosylation sites (PS00001) located at about amino acid residues 94-97 and 739-742 of SEQ ID NO:75;

[4460] one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 60-63 of SEQ ID NO:75;

[4461] twelve predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 228-230, 241-243, 326-328, 402-404, 432-434, 451-453, 490-492, 529-531, 611-613, 619-621, 706-708, and 932-934 of SEQ ID NO:75;

[4462] sixteen predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 23-26, 48-51, 85-88, 156-159, 228-231, 338-341, 390-393, 426-429, 446-449, 451-454, 617-620, 695-698, 808-811, 890-893, 905-908, and 930-933 of SEQ ID NO:75;

[4463] three predicted tyrosine kinase phosphorylation sites (PS00007) located at about amino acid residues 22-30, 543-550, and 674-681 of SEQ ID NO:75;

[4464] seven predicted N-myristoylation sites (PS00008) located at about amino acid residues 86-91, 256-261, 408-413, 560-565, 607-612, 798-803, and 814-819 of SEQ ID NO:75;

[4465] one predicted amidation site (PS00009) located at about amino acid residues 467-470 of SEQ ID NO:75;

[4466] one predicted ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973) located at about amino acid residues 550-567 of SEQ ID NO:75;

[4467] one peroxisomal targeting signal located at about amino acid residues 785-793 of SEQ ID NO:75; and

[4468] one predicted coiled coil domain located at about amino acid residues 884-911 of SEQ ID NO:75.
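The PS-prefixed site predictions in the list above are short consensus patterns, so their positions can be reproduced by a simple pattern scan. The following is a minimal sketch, assuming the PROSITE consensus patterns for PS00001, PS00004, PS00005, and PS00006 translated into regular expressions; the function name and the toy sequence are hypothetical and are not taken from SEQ ID NO:75.

```python
import re

# PROSITE consensus patterns rewritten as regular expressions.
# Assumption: these are the standard PROSITE definitions for each entry.
PROSITE_SITES = {
    "PS00001": r"N[^P][ST][^P]",   # N-glycosylation: N-{P}-[ST]-{P}
    "PS00004": r"[RK]{2}.[ST]",    # cAMP/cGMP-dependent kinase: [RK](2)-x-[ST]
    "PS00005": r"[ST].[RK]",       # Protein Kinase C: [ST]-x-[RK]
    "PS00006": r"[ST].{2}[DE]",    # Casein Kinase II: [ST]-x(2)-[DE]
}


def find_sites(sequence):
    """Return {entry: [(start, end), ...]} using 1-based inclusive residue numbers."""
    hits = {}
    for entry, pattern in PROSITE_SITES.items():
        # A lookahead lets overlapping sites be reported, as in the lists above
        # (e.g., sites that share residues).
        scanner = re.compile("(?=(" + pattern + "))")
        hits[entry] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in scanner.finditer(sequence)
        ]
    return hits


# Hypothetical toy sequence, used only to exercise the scan.
TOY_SEQUENCE = "MANGSAKKASLSARGTSAAEL"

for entry, positions in find_sites(TOY_SEQUENCE).items():
    print(entry, positions)
```

Residue ranges are reported in the same "start-end" convention as the feature list, so a three-residue Protein Kinase C site spans, for example, residues 5-7.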

[4469] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[4470] A plasmid containing the nucleotide sequence encoding human 23479 (clone “Fbh23479FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[4471] Human 48120

[4472] The human 48120 sequence (see SEQ ID NO:77, as recited in Example 48), which is approximately 4873 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 3420 nucleotides, including the termination codon. The coding sequence encodes a 1139 amino acid protein (see SEQ ID NO:78, as recited in Example 48).

[4473] Human 48120 contains the following regions or structural features:

[4474] a ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (FIG. 36A; PFAM Accession PF00442) located at about amino acid residues 162-193 of SEQ ID NO:78;

[4475] a ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (FIG. 36B; PFAM Accession PF00443) located at about amino acid residues 580-649 of SEQ ID NO:78;

[4476] a ubiquitin associated (UBA) domain (FIG. 36C; PFAM Accession PF00627) located at about amino acid residues 20-61 of SEQ ID NO:78;

[4477] a ubiquitin interaction motif (UIM) domain (FIG. 36D; PFAM Accession PF02809) located at about amino acid residues 96-113 of SEQ ID NO:78;

[4478] six predicted N-glycosylation sites (PS00001) located at about amino acid residues 282-285, 310-313, 373-376, 639-642, 711-714, and 916-919 of SEQ ID NO:78;

[4479] one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 958-961 of SEQ ID NO:78;

[4480] fifteen predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 113-115, 134-136, 137-139, 207-209, 228-230, 260-262, 279-281, 347-349, 453-455, 484-486, 517-519, 700-702, 753-755, 1110-1112, and 1137-1139 of SEQ ID NO:78;

[4481] thirty-three predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 47-50, 76-79, 109-112, 130-133, 205-208, 248-251, 368-371, 479-482, 484-487, 489-492, 494-497, 503-506, 520-523, 532-535, 550-553, 620-623, 624-627, 662-665, 668-671, 713-716, 719-722, 760-763, 808-811, 822-825, 881-884, 907-910, 918-921, 966-969, 1024-1027, 1028-1031, 1033-1036, 1058-1061, and 1115-1118 of SEQ ID NO:78;

[4482] one predicted tyrosine kinase phosphorylation site (PS00007) located at about amino acid residues 975-982 of SEQ ID NO:78;

[4483] thirteen predicted N-myristoylation sites (PS00008) located at about amino acid residues 12-17, 80-85, 244-249, 294-299, 300-305, 433-438, 594-599, 635-640, 761-766, 839-844, 855-860, 1001-1006, and 1077-1082 of SEQ ID NO:78;

[4484] one predicted carbamoyl-phosphate synthase subdomain signature 2 (PS00867) located at about amino acid residues 1015-1022 of SEQ ID NO:78;

[4485] one predicted ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973) located at about amino acid residues 584-601 of SEQ ID NO:78; and

[4486] one predicted coiled coil domain located at about amino acid residues 399-431 of SEQ ID NO:78.

[4487] A plasmid containing the nucleotide sequence encoding human 48120 (clone “Fbh48120FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112.

[4488] Human 46689

[4489] The human 46689 sequence (see SEQ ID NO:80, as recited in Example 48), which is approximately 2082 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1407 nucleotides, including the termination codon. The coding sequence encodes a 468 amino acid protein (see SEQ ID NO:81, as recited in Example 48). The human 46689 protein of SEQ ID NO:81 and FIG. 34 includes an amino-terminal hydrophobic amino acid sequence, consistent with a signal sequence, of about 26 amino acid residues (from amino acid 1 to about amino acid 26 of SEQ ID NO:81), which upon cleavage results in the production of a mature protein form. This mature protein form is approximately 442 amino acid residues in length (from about amino acid residues 27 to 468 of SEQ ID NO:81).
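The stated coding-sequence and protein lengths can be cross-checked with simple codon arithmetic. The sketch below assumes, as the text states, that each coding-sequence length includes the three-nucleotide termination codon; the function and variable names are illustrative only.

```python
def protein_length(cds_nucleotides, includes_stop=True):
    """Number of encoded residues for a coding sequence of the given length."""
    coding = cds_nucleotides - 3 if includes_stop else cds_nucleotides
    if coding % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return coding // 3


# (coding sequence length in nucleotides, stated protein length) from the text.
STATED = {
    "23479": (2805, 934),
    "48120": (3420, 1139),
    "46689": (1407, 468),
}

for gene, (cds, residues) in STATED.items():
    print(gene, protein_length(cds), residues)

# Mature 46689 form after removal of the predicted 26-residue signal sequence.
SIGNAL_PEPTIDE = 26
mature_46689 = STATED["46689"][1] - SIGNAL_PEPTIDE
print("46689 mature length:", mature_46689)
```

The three stated protein lengths agree with their coding-sequence lengths, and the mature 46689 form (residues 27 to 468) comes to 442 residues.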

[4490] Human 46689 contains the following regions or other structural features:

[4491] an α/β hydrolase domain (PFAM Accession Number PF00561) located at about amino acid residues 186 to 419 of SEQ ID NO:81;

[4492] a predicted catalytic acid residue, located at about amino acid residue 360 of SEQ ID NO:81;

[4493] a predicted catalytic histidine residue, located at about amino acid residue 391 of SEQ ID NO:81;

[4494] a predicted signal peptide located at about amino acid residues 1 to 26 of SEQ ID NO:81;

[4495] one predicted transmembrane domain located at about amino acid residues 150 to 167 of SEQ ID NO:81;

[4496] four predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acid residues 203 to 205, 305 to 307, 313 to 315, and 411 to 413 of SEQ ID NO:81;

[4497] four predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acid residues 297 to 300, 313 to 316, 357 to 360, and 454 to 457 of SEQ ID NO:81;

[4498] one predicted cAMP/cGMP-dependent protein kinase phosphorylation site (PS00004) located at about amino acid residues 148 to 151 of SEQ ID NO:81;

[4499] two predicted amidation sites (PS00009) located at about amino acid residues 146 to 149, and 437 to 440 of SEQ ID NO:81; and

[4500] five predicted N-myristoylation sites (PS00008) located at about amino acid residues 5 to 10, 52 to 57, 154 to 159, 237 to 242, and 389 to 394 of SEQ ID NO:81.

[4501] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[4502] A plasmid containing the nucleotide sequence encoding human 46689 (clone “Fbh46689FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

TABLE 18
Summary of Sequence Information for 23479, 48120, and 46689

Gene     cDNA            ORF             Polypeptide     Figure              ATCC Accession Number
23479    SEQ ID NO:74    SEQ ID NO:76    SEQ ID NO:75    FIG. 33, 34A-34B
48120    SEQ ID NO:77    SEQ ID NO:79    SEQ ID NO:78    FIG. 35, 36A-36D
46689    SEQ ID NO:80    SEQ ID NO:82    SEQ ID NO:81    FIG. 37, 38, 39

[4503] 23479 and 48120 Polypeptides

[4504] The 23479 and 48120 proteins contain a significant number of structural characteristics in common with members of the ubiquitin carboxyl-terminal hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[4505] Members of the ubiquitin carboxyl-terminal hydrolase family of proteins are characterized by a “ubiquitin carboxyl-terminal hydrolase domain.” The term “ubiquitin carboxyl-terminal hydrolase domain” refers to an amino acid sequence that participates in the removal of one or more ubiquitin molecules from a protein that has one or more molecules of ubiquitin attached to it. The term also includes amino acid sequences that cleave conjugated forms of ubiquitin (e.g., in a head to tail orientation linked via a peptide bond) whether or not the ubiquitin conjugate is attached to a protein. For example, a ubiquitin-ubiquitin conjugate (dimer) could be cleaved into monomers, a tri-ubiquitin conjugate could be cleaved into three monomers, or a dimer and a single monomer. In either of these particular examples, the monomer or dimer could remain attached to or be cleaved from the ubiquitinated protein.

[4506] Ubiquitin carboxyl-terminal hydrolases typically contain two conserved regions, a UCH-1 domain and a UCH-2 domain, each of which is thought to participate in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are respectively as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:89). 23479 and 48120 proteins preferably contain one or more sequences that conform to this signature pattern.
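Because the two signature patterns above are written in PROSITE syntax, conformance of a candidate sequence can be tested mechanically. The sketch below is one possible translation of that syntax into regular expressions; the helper names are hypothetical, and the four motif sequences are those recited in this description as SEQ ID NO:90 to SEQ ID NO:93.

```python
import re

UCH1_SIGNATURE = "G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q"
UCH2_SIGNATURE = "Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y"


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (literals, [classes], x, x(n), x(m,n))."""
    parts = []
    for element in pattern.split("-"):
        wildcard = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if wildcard is None:
            parts.append(element)                    # literal residue or [class]
        elif wildcard.group(1) is None:
            parts.append(".")                        # bare x
        elif wildcard.group(2) is None:
            parts.append(".{%s}" % wildcard.group(1))    # x(n)
        else:
            parts.append(".{%s,%s}" % wildcard.groups())  # x(m,n)
    return re.compile("".join(parts))


def conforms(sequence, pattern):
    """True if the pattern occurs anywhere in the sequence."""
    return prosite_to_regex(pattern).search(sequence) is not None


# Motif sequences as printed in this description.
MOTIFS = {
    "SEQ ID NO:90 (23479, UCH-1)": ("GLTNLGATCYLASTIQ", UCH1_SIGNATURE),
    "SEQ ID NO:91 (48120, UCH-1)": ("GLKNVGNTCWFSAVIQ", UCH1_SIGNATURE),
    "SEQ ID NO:92 (23479, UCH-2)": ("YDLIGVTVHTGTADGGHY", UCH2_SIGNATURE),
    "SEQ ID NO:93 (48120, UCH-2)": ("YRLHAVLVHEGQANAGHY", UCH2_SIGNATURE),
}

for label, (sequence, pattern) in MOTIFS.items():
    print(label, conforms(sequence, pattern))
```

Under this literal reading, and with the sequences as printed here, SEQ ID NO:91, SEQ ID NO:92, and SEQ ID NO:93 satisfy their patterns exactly, while SEQ ID NO:90 departs from the UCH-1 pattern at a single position (an alanine where the pattern lists [NST]). A strict pattern test of this kind is separate from the HMM bit-score criterion used for domain assignment.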

[4507] A 23479 or 48120 polypeptide can include a “UCH-1 domain” or regions homologous with a “UCH-1 domain.”

[4508] As used herein, the term “UCH-1 domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 25. 23479 or 48120 proteins preferably contain sequences that conform to the UCH-1 signature pattern described above. Preferably, a 23479 protein contains the sequence GLTNLGATCYLASTIQ (SEQ ID NO:90). Preferably, a 48120 protein contains the sequence GLKNVGNTCWFSAVIQ (SEQ ID NO:91). Preferably, a UCH-1 domain includes at least about 20 to 50 amino acids, more preferably about 25 to 40 amino acid residues, or about 30 to 35 amino acids and has a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 45 or greater. The UCH-1 domain has been assigned the PFAM Accession PF00442 (http://genome.wustl.edu/Pfam/html). Alignments of the UCH-1 domains of human 23479 and 48120 with consensus amino acid sequences derived from hidden Markov models are depicted in FIG. 34A (23479; amino acids 296 to 327 of SEQ ID NO:75) and FIG. 36A (48120; amino acids 162 to 193 of SEQ ID NO:78).

[4509] In a preferred embodiment, a 23479 or 48120 polypeptide or protein has a “UCH-1 domain” or a region which includes at least about 20 to 50, more preferably about 25 to 40 or 30 to 35, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-1 domain,” e.g., a UCH-1 domain of human 23479 or 48120, e.g., residues 296 to 327 of SEQ ID NO:75 or residues 162 to 193 of SEQ ID NO:78.
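A percentage threshold of this kind can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that "homology" is computed as percent identity over an ungapped alignment of two equal-length sequences; the definition of homology used elsewhere in this description governs, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions with identical residues (ungapped, equal length)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length for an ungapped comparison")
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold_percent=70.0):
    """True if the two sequences reach the chosen percentage."""
    return percent_identity(seq_a, seq_b) >= threshold_percent


# UCH-1 motifs recited in this description (SEQ ID NO:90 and SEQ ID NO:91).
MOTIF_23479 = "GLTNLGATCYLASTIQ"
MOTIF_48120 = "GLKNVGNTCWFSAVIQ"

print(percent_identity(MOTIF_23479, MOTIF_48120))
print(meets_threshold(MOTIF_23479, MOTIF_48120))
```

The two 16-residue UCH-1 motifs share 8 identical positions, i.e., 50% identity under this simplified measure, which would fall below a 70% threshold when the motifs are compared with each other rather than with their own parent domain.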

[4510] A 23479 or 48120 polypeptide can further include a “UCH-2 domain” or regions homologous with a “UCH-2 domain.”

[4511] As used herein, the term “UCH-2 domain” includes an amino acid sequence of about 10 to 150 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 50. 23479 or 48120 proteins preferably contain sequences that conform to the UCH-2 signature pattern described above. Preferably, a 23479 protein contains the sequence YDLIGVTVHTGTADGGHY (SEQ ID NO:92). Preferably, a 48120 protein contains the sequence YRLHAVLVHEGQANAGHY (SEQ ID NO:93). Preferably, a UCH-2 domain includes at least about 30 to 125 amino acids, more preferably about 50 to 110 amino acid residues, or about 60 to 100 amino acids and has a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 75 or greater. The UCH-2 domain has been assigned the PFAM Accession PF00443 (http://genome.wustl.edu/Pfam/.html). Alignments of the UCH-2 domains of human 23479 and 48120 with consensus amino acid sequences derived from hidden Markov models are depicted in FIG. 34B (23479; amino acids 546 to 640 of SEQ ID NO:75) and FIG. 36B (48120; amino acids 580 to 649 of SEQ ID NO:78).

[4512] In a preferred embodiment, a 23479 or 48120 polypeptide or protein has a “UCH-2 domain” or a region which includes at least about 30 to 125, more preferably about 50 to 110 or 60 to 100, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-2 domain,” e.g., a UCH-2 domain of human 23479 or 48120, e.g., residues 546 to 640 of SEQ ID NO:75 or residues 580 to 649 of SEQ ID NO:78.

[4513] A 48120 polypeptide can also include a “UBA domain” or regions homologous with a “UBA domain.” A “UBA domain” refers to a commonly occurring amino acid sequence found in several proteins having connections to ubiquitin and the ubiquitination pathway. The structure of the UBA domain consists of a compact three helix bundle.

[4514] As used herein, the term “UBA domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UBA domain (HMM) of at least 5. Preferably, a UBA domain includes at least about 20 to 80 amino acids, more preferably about 25 to 60 amino acid residues, or about 35 to 45 amino acids and has a bit score for the alignment of the sequence to the UBA domain (HMM) of at least 8 or greater. The UBA domain (HMM) has been assigned the PFAM Accession Number PF00627 (http://genome.wustl.edu/Pfam/html). An alignment of the UBA domain (amino acids 20-61 of SEQ ID NO:78) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 36C.

[4515] In a preferred embodiment, a 48120 polypeptide or protein has a “UBA domain” or a region which includes at least about 20 to 80, more preferably about 25 to 60 or 35 to 45, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UBA domain,” e.g., a UBA domain of human 48120, e.g., residues 20 to 61 of SEQ ID NO:78.

[4516] A 48120 polypeptide can also include a “ubiquitin interaction motif (UIM) domain” or regions homologous with a “UIM domain.”

[4517] As used herein, the term “UIM domain” includes an amino acid sequence of about 10 to 50 amino acid residues in length and having a bit score for the alignment of the sequence to the UIM domain (HMM) of at least 5. Preferably, a UIM domain includes at least about 10 to 40 amino acids, more preferably about 10 to 30 amino acid residues, or about 15 to 20 amino acids and has a bit score for the alignment of the sequence to the UIM domain (HMM) of at least 10 or greater. The UIM domain (HMM) has been assigned the PFAM Accession Number PF02809 (http://genome.wustl.edu/Pfam/html). An alignment of the UIM domain (amino acids 96-113 of SEQ ID NO:78) of human 48120 with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 36D.

[4518] In a preferred embodiment, a 48120 polypeptide or protein has a “UIM domain” or a region which includes at least about 10 to 40, more preferably about 10 to 30 or 15 to 20, amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UIM domain,” e.g., a UIM domain of human 48120, e.g., residues 96 to 113 of SEQ ID NO:78.

[4519] To identify the presence of a UCH-1, UCH-2, UBA, or UIM domain in a 23479 or 48120 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference.
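The length and bit-score criteria recited in the domain definitions above can be collected into a small check applied to the output of an HMM search. This is a minimal sketch: the data structure and function names are hypothetical, the residue coordinates are those given in this description, and the bit scores in the usage lines are invented placeholders, not results of any actual search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainCriteria:
    min_length: int   # minimum domain length, in residues
    max_length: int   # maximum domain length, in residues
    min_bits: float   # minimum HMM alignment bit score


# Broadest definitions recited in this description for each domain.
CRITERIA = {
    "UCH-1": DomainCriteria(10, 100, 25.0),
    "UCH-2": DomainCriteria(10, 150, 50.0),
    "UBA": DomainCriteria(10, 100, 5.0),
    "UIM": DomainCriteria(10, 50, 5.0),
}


def meets_definition(domain, start, end, bit_score):
    """True if a hit spanning residues start..end (inclusive) fits the definition."""
    criteria = CRITERIA[domain]
    length = end - start + 1
    return (criteria.min_length <= length <= criteria.max_length
            and bit_score >= criteria.min_bits)


# Coordinates from the text; bit scores are hypothetical placeholders.
print(meets_definition("UCH-1", 296, 327, bit_score=30.0))  # 32 residues
print(meets_definition("UCH-2", 580, 649, bit_score=40.0))  # 70 residues, score too low
print(meets_definition("UIM", 96, 113, bit_score=6.0))      # 18 residues
```

The same structure could hold the narrower preferred ranges (for example, 30 to 35 residues and a bit score of at least 45 for a UCH-1 domain) by adding further entries.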

[4520] A 23479 or 48120 family member can include a UCH-1 domain and a UCH-2 domain. A 48120 family member can also include a UBA domain. A 48120 family member can further include a UIM domain.

[4521] A 23479 family member can also include at least one and preferably two N-glycosylation sites (PS00001); at least one cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, and preferably 12 protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, and preferably 16 casein kinase II phosphorylation sites (PS00006); at least one, two, and preferably three tyrosine kinase phosphorylation sites (PS00007); at least one, two, three, four, five, six, and preferably seven N-myristoylation sites (PS00008); at least one amidation site (PS00009); at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973); at least one peroxisomal targeting signal; and at least one coiled coil domain.

[4522] A 48120 family member can also include at least one, two, three, four, five, and preferably six N-glycosylation sites (PS00001); at least one cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, and preferably 15 protein kinase C phosphorylation sites (PS00005); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, and preferably 33 casein kinase II phosphorylation sites (PS00006); at least one tyrosine kinase phosphorylation site (PS00007); at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, and preferably 13 N-myristoylation sites (PS00008); at least one carbamoyl-phosphate synthase subdomain signature 2 (PS00867); at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973); and at least one coiled coil domain.

[4523] As the 23479 or 48120 polypeptides of the invention may modulate 23479 or 48120-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 23479 or 48120-mediated or related disorders, as described below.

[4524] As used herein, a “23479 or 48120 activity”, “biological activity of 23479 or 48120” or “functional activity of 23479 or 48120”, refers to an activity exerted by a 23479 or 48120 protein, polypeptide or nucleic acid molecule. For example, a 23479 or 48120 activity can be an activity exerted by 23479 or 48120 in a physiological milieu on, e.g., a 23479 or 48120-responsive cell or on a 23479 or 48120 substrate, e.g., a protein substrate. A 23479 or 48120 activity can be determined in vivo or in vitro. In one embodiment, a 23479 or 48120 activity is a direct activity, such as an association with a 23479 or 48120 target molecule. A “target molecule” or “binding partner” is a molecule with which a 23479 or 48120 protein binds or interacts in nature, e.g., a complex of ubiquitin and a protein targeted for degradation.

[4525] A 23479 or 48120 activity can also be an indirect activity, e.g., an activity mediated by a protein that is a target for de-ubiquitination by 23479 or 48120. The features of the 23479 or 48120 molecules of the present invention can provide similar biological activities as ubiquitin carboxyl-terminal hydrolase family members. For example, the 23479 or 48120 proteins of the present invention can have one or more of the following activities: 1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; 2) participation in the processing of poly-ubiquitin precursors; 3) modulation of cellular proliferation and/or differentiation; 4) modulation of apoptosis; 5) modulation of transcription and/or cell-cycle progression; 6) modulation of signal-transduction; 7) modulation of antigen processing; 8) modulation of cell-cell adhesion; 9) modulation of receptor-mediated endocytosis; 10) modulation of organelle biogenesis and development; 11) participation in neuropathological conditions; and 12) participation in oncogenesis.

[4526] Based on the above-described sequence similarities, the 23479 or 48120 molecules of the present invention are predicted to have similar biological activities as ubiquitin carboxyl-terminal hydrolase family members. Ubiquitin carboxyl-terminal hydrolase domains regulate the de-ubiquitination of a substrate, e.g., a protein targeted for degradation. Thus, 23479 or 48120 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., ubiquitination related disorders. 23479 or 48120 molecules of the invention may be useful, for example, in inducing the de-ubiquitination of ubiquitinated proteins. These proteins can therefore modulate protein degradation and the recycling of ubiquitin, as well as participate in cell signaling pathways in which ubiquitination or de-ubiquitination of a protein can alter or modify the activity of the protein. Thus, 23479 or 48120 molecules may act as novel therapeutic agents for controlling disorders associated with excessive or insufficient ubiquitination (e.g., protein degradation), and as diagnostic markers useful for indicating the presence or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder.

[4527] Ubiquitination has been implicated in regulating numerous cellular processes including, for example, proliferation, differentiation, apoptosis (programmed cell death), transcription, signal-transduction, cell-cycle progression, receptor-mediated endocytosis, organelle biogenesis and others. The presence of abnormal amounts of ubiquitinated proteins in neuropathological conditions such as Alzheimer's and Pick's disease indicates that ubiquitination plays a role in various physiological disorders. Oncogenes (e.g., v-jun and v-fos) are often found to be resistant to ubiquitination in comparison to their normal cell counterparts, suggesting that a failure to degrade oncogene protein products accounts for some of their cell transformation capability.

[4528] As the 23479 and 48120 molecules of the invention are expressed in coronary and endothelial tissues, brain tissues, erythroid cells, and lung tumors (See Example 49), they can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal de-ubiquitination activity and disorders associated with abnormal protein degradation in such tissues. Thus, examples of disorders that can be treated and/or diagnosed with the molecules of the invention include cellular proliferative and/or differentiative disorders (e.g., in the lung), cardiovascular disorders, brain disorders, and hematopoietic disorders.

[4529] 46689 Polypeptides

[4530] The 46689 protein contains a significant number of structural characteristics in common with members of the α/β hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[4531] The α/β hydrolase family of proteins is characterized by a common fold. In its most typical form, the fold consists of an eight stranded β-sheet surrounded by α-helices which further includes conserved catalytic residues. This enzyme family includes lipases, esterases, and proteases. Despite the variety of reactions this family is capable of mediating, the chemistry of these reactions is generally similar. Three positions in particular form a catalytic triad, contributing nucleophilic, acidic, and histidine residues. The side chains of these residues are required for nucleophilic attack on one of the atoms (typically a carbon atom) involved in the chemical bond that is to be cleaved, and thus they function during one of the most critical steps in the catalytic process. The nucleophilic residue is typically serine, although a cysteine or an aspartate residue substitutes for serine in some members of the family. The nucleophilic residue is located in a motif that has been termed the “nucleophile elbow”, a loop that makes a sharp turn following the fifth β-strand of the canonical α/β hydrolase fold. Due to the constrained arrangement of the residues that form the nucleophile elbow, the sequence of the nucleophile elbow typically includes two or three glycine residues. Following the nucleophilic residue is the acidic residue, which is located on a loop following the seventh β-strand. This acidic residue can be either an aspartic acid or a glutamic acid residue. Finally, the third residue of the catalytic triad, the histidine residue, is absolutely conserved and is located in a loop that follows the eighth β-strand of the canonical α/β hydrolase domain. Importantly, the relative order of these three catalytic residues within a given α/β hydrolase peptide sequence, nucleophile-acid-histidine, is conserved in all members of the α/β hydrolase family. Another conserved feature of the α/β hydrolase fold is an oxyanion hole, located near the end of the third β-strand, which is believed to stabilize potential covalent intermediates formed during the nucleophilic attack step in the catalytic process. The covalent intermediate then proceeds to product by general base catalysis. A detailed description of the α/β hydrolase fold can be found in Ollis et al. (1992), Protein Eng 5(3):197-211, and Nardini and Dijkstra (1999), Curr Opin Struct Biol 9(6):732-7, the contents of which are incorporated herein by reference.

[4532] A 46689 polypeptide can include an “α/β hydrolase domain” or regions homologous with an “α/β hydrolase domain”.

[4533] As used herein, the term “α/β hydrolase domain” includes an amino acid sequence of about 100 to 350 amino acid residues in length which contains a conserved catalytic triad consisting of a nucleophilic amino acid residue, an acidic amino acid residue, and a histidine residue. Preferably, an α/β hydrolase domain includes at least about 150 to 300 amino acids, more preferably about 175 to 250 amino acid residues, or about 200 to 250 amino acids. Based on sequence alignments, the presence and extent of an α/β hydrolase domain in a test protein sequence can be determined. One description of an α/β hydrolase domain (HMM) has been assigned the PFAM Accession Number PF00561 (http://pfam.wustl.edu). An alignment of human 46689 (about amino acids 186 to 419 of SEQ ID NO:81) with the PFAM α/β hydrolase domain consensus amino acid sequence (SEQ ID NO:87) derived from a hidden Markov model is depicted in FIG. 38. The alignment demonstrates the presence of a catalytic acid residue, located at about amino acid residue 360 of SEQ ID NO:81, and a catalytic histidine residue, located at about amino acid residue 391 of SEQ ID NO:81. The alignment also suggests that the serine residue located at about amino acid residue 238 of SEQ ID NO:81 may be the catalytic nucleophile residue of human 46689.

[4534] A consensus sequence for α/β hydrolase domain-containing protein families is also provided, e.g., by ProDom family PD007763 (ProDomain Release 2001.1; http://www.toulouse.inra.fr/prodom.html). An alignment of a large portion of human 46689 (about amino acid residues 97 to 424 of SEQ ID NO:81) with the consensus amino acid sequence of an α/β hydrolase-containing family of proteins (SEQ ID NO:88), derived from recursive PSI-BLAST searches, is depicted in FIG. 39. This alignment also reveals the presence of the conserved catalytic acid and catalytic histidine residues of human 46689 polypeptides.

[4535] In a preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes a conserved catalytic triad consisting of a nucleophilic residue, an acid residue, and a histidine residue, wherein the nucleophilic residue is separated from the catalytic acid residue by about 110 to 145 amino acid residues, more preferably about 115 to 130 amino acid residues, or about 122 amino acid residues.

[4536] In another preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes a conserved catalytic triad consisting of a nucleophilic residue, an acid residue, and a histidine residue, wherein the catalytic acid residue is separated from the catalytic histidine residue by about 15 to 40 amino acid residues, more preferably about 25 to 35 amino acid residues, or about 31 amino acid residues.
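
The residue spacings recited in the two preceding embodiments can be checked arithmetically from the approximate positions given above for human 46689 (serine at about residue 238, catalytic acid at about residue 360, and catalytic histidine at about residue 391 of SEQ ID NO:81). The following Python sketch is offered for illustration only; the function name `triad_spacing` and the range constants are assumptions introduced here and are not part of any disclosed method.

```python
# Illustrative sketch only. Residue positions are the approximate values
# given in the text for SEQ ID NO:81; all names below are hypothetical.

def triad_spacing(nucleophile: int, acid: int, histidine: int) -> tuple[int, int]:
    """Return (nucleophile-to-acid, acid-to-histidine) separations in residues."""
    if not nucleophile < acid < histidine:
        raise ValueError("expected the order nucleophile < acid < histidine")
    return acid - nucleophile, histidine - acid


# Broadest "about" ranges recited in the two preferred embodiments.
NUC_TO_ACID_RANGE = (110, 145)
ACID_TO_HIS_RANGE = (15, 40)

nuc_acid, acid_his = triad_spacing(238, 360, 391)
in_range = (
    NUC_TO_ACID_RANGE[0] <= nuc_acid <= NUC_TO_ACID_RANGE[1]
    and ACID_TO_HIS_RANGE[0] <= acid_his <= ACID_TO_HIS_RANGE[1]
)
print(nuc_acid, acid_his, in_range)  # 122 31 True
```

The ordering check reflects the conserved nucleophile-acid-histidine order described for this family; a candidate sequence with a different order would be rejected before any spacing is computed.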

[4537] In yet another preferred embodiment, a 46689 polypeptide or protein has an “α/β hydrolase domain” or a region which includes at least about 100 to 350, more preferably about 175 to 300, or 200 to 250 amino acid residues and has at least about 70%, 80%, 90%, 95%, 98%, 99%, or 100% homology with an “α/β hydrolase domain,” e.g., the α/β hydrolase domain of human 46689 (e.g., residues 186 to 419 of SEQ ID NO:81).
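
By way of illustration, a percent homology figure of the kind recited above can be computed from a pairwise alignment. The Python sketch below assumes, as one common convention and not necessarily the one intended here, that percent homology is taken as percent identity over all alignment columns, with gap columns counted as mismatches. The function names and the short aligned strings are hypothetical and were invented for this example; they are not taken from SEQ ID NO:81.

```python
# Illustrative sketch only; names and example strings are hypothetical.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Inputs must already be aligned to equal length, with '-' marking gaps.
    Columns containing a gap count as mismatches (an assumed convention).
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def domain_length(first_residue: int, last_residue: int) -> int:
    """Length of an inclusive residue range, e.g. residues 186 to 419."""
    return last_residue - first_residue + 1


# Hypothetical 10-column alignment, invented for illustration only.
print(percent_identity("GHSMGG-ALA", "GHSLGGTALA"))  # 80.0
print(domain_length(186, 419))  # 234
```

The second call shows that the region at residues 186 to 419 spans 234 residues, which falls within the 200 to 250 residue range recited above.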

[4538] To identify the presence of an “α/β hydrolase” domain in a 46689 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the PFAM database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the PFAM database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of an “α/β hydrolase” domain in the amino acid sequence of human 46689 at about residues 186 to 419 of SEQ ID NO:81 (see FIG. 38).
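
The effect of the threshold score described above can be sketched as a simple comparison. The Python fragment below is illustrative only and does not call the HMMER package; the function name `is_domain_hit`, the use of a greater-than-or-equal comparison, and the example score of 10.0 are all assumptions introduced here, and the score is an invented value rather than a reported result.

```python
# Illustrative sketch only; this does not invoke HMMER. All names are
# hypothetical, and the example score is invented, not a reported result.

DEFAULT_THRESHOLD_BITS = 15.0  # default threshold stated in the text
LOWERED_THRESHOLD_BITS = 8.0   # example of a lowered threshold from the text


def is_domain_hit(bit_score: float, threshold: float = DEFAULT_THRESHOLD_BITS) -> bool:
    """True when the score meets the threshold (>= is an assumed convention)."""
    return bit_score >= threshold


hypothetical_score = 10.0
print(is_domain_hit(hypothetical_score))  # False
print(is_domain_hit(hypothetical_score, LOWERED_THRESHOLD_BITS))  # True
```

A match scoring between the two thresholds is therefore reported as a hit only when the lowered threshold is used.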

[4539] A 46689 molecule can further include an amino acid sequence homologous to “ProDom PD007763 domain.” Members of the family of proteins containing this domain are uncharacterized proteins that share conserved regions, including a hydrolase signature. Yeast protein YMR210W and human protein PHPS1-2 are members of this family.

[4540] As used herein, the term “ProDom PD007763 domain” includes an amino acid sequence of about 250 to 450 amino acid residues in length having a bit score for the alignment of the sequence with ProDom PD007763 of at least 50. Preferably, a ProDom PD007763 domain includes at least 250 to 450 amino acids, more preferably about 275 to 400 amino acid residues, or about 300 to 350 amino acids, and has a bit score for the alignment of the sequence with ProDom PD007763 of at least 75, 85, preferably 95 or more. An alignment of the ProDom PD007763 domain of human 46689 (amino acids 97 to 424 of SEQ ID NO:81) with a consensus amino acid sequence derived from a hidden Markov model is depicted in FIG. 39.

[4541] In a preferred embodiment, a 46689 polypeptide or protein has a “ProDom PD007763 domain” or a region which includes at least about 250 to 450, more preferably about 275 to 400, or 300 to 350 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “ProDom PD007763 domain”, e.g., the ProDom PD007763 domain of human 46689 (e.g., residues 97 to 424 of SEQ ID NO:81).

[4542] To identify the presence of a “ProDom PD007763 domain” in a 46689 protein sequence, and make the determination that a polypeptide or protein of interest contains such a domain, the amino acid sequence of the protein can be searched against the ProDom database (Corpet et al. (1999), Nucl. Acids Res. 27:263-267). The ProDom protein domain database consists of an automatic compilation of homologous domains. Current versions of ProDom are built using recursive PSI-BLAST searches (Altschul S F et al. (1997) Nucleic Acids Res. 25:3389-3402; Gouzy et al. (1999) Computers and Chemistry 23:333-340) of the SWISS-PROT 38 and TREMBL protein databases. The database automatically generates a consensus sequence for each domain. A BLAST search was performed against the ProDom database resulting in the identification of an “α/β hydrolase” domain within the amino acid sequence of human 46689 that includes residues 97 to 424 of SEQ ID NO:81 (FIG. 39).

[4543] A 46689 protein can further include at least one transmembrane domain. As used herein, the term “transmembrane domain” includes an amino acid sequence of at least about 10 amino acid residues in length that spans the plasma membrane. More preferably, a transmembrane domain includes at least about 15 amino acid residues and spans the plasma membrane. Transmembrane domains are rich in hydrophobic residues, and typically have an alpha-helical structure. In a preferred embodiment, at least 50%, 60%, 70%, 80%, 90%, 95% or more of the amino acids of a transmembrane domain are hydrophobic, e.g., leucines, isoleucines, tyrosines, or tryptophans. Transmembrane domains are described in, for example, Zagotta W. N. et al., (1996) Annual Rev. Neurosci. 19: 235-263, the contents of which are incorporated herein by reference. Amino acid residues 150 to 167 of the 46689 protein (SEQ ID NO:81) are predicted to be a transmembrane domain (see FIG. 37). Accordingly, 46689 proteins having at least 50-60% homology, preferably about 60-70%, more preferably about 70-80%, or about 80-90% homology with at least one transmembrane domain of human 46689 are within the scope of the invention.

[4544] In a preferred embodiment, a 46689 protein has a “transmembrane domain” or a region which includes at least about 10, more preferably at least about 15 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “transmembrane domain,” e.g., the transmembrane domain of 46689 protein (e.g., residues 150 to 167 of SEQ ID NO:81).
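
The length and hydrophobic-content criteria described above for a transmembrane domain can be illustrated with a short calculation. In the Python sketch below, the set of residues treated as hydrophobic (A, V, L, I, F, Y, W, P) is an assumption drawn from the examples of hydrophobic residues named in this description, and other definitions are possible. The function names and the 18-residue example segment are hypothetical; the segment was invented for illustration and is not residues 150 to 167 of SEQ ID NO:81.

```python
# Illustrative sketch only; names, residue set, and example are assumptions.

# Assumed hydrophobic set, drawn from examples named in the description.
HYDROPHOBIC = frozenset("AVLIFYWP")


def hydrophobic_fraction(segment: str) -> float:
    """Fraction of residues in the segment that are in the assumed hydrophobic set."""
    if not segment:
        raise ValueError("segment must not be empty")
    residues = segment.upper()
    return sum(r in HYDROPHOBIC for r in residues) / len(residues)


def meets_tm_criteria(segment: str, min_length: int = 15, min_fraction: float = 0.5) -> bool:
    """Check the 'at least about 15 residues' and 'at least 50% hydrophobic' criteria."""
    return len(segment) >= min_length and hydrophobic_fraction(segment) >= min_fraction


# Hypothetical 18-residue segment, NOT taken from SEQ ID NO:81.
example = "LLIVAWLLGIYLAVLFLI"
print(len(example), meets_tm_criteria(example))  # 18 True
```

The defaults correspond to the least stringent preferred values given above; stricter embodiments can be modeled by raising `min_fraction`.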

[4545] A 46689 protein can further include a signal sequence. As used herein, a “signal peptide” or “signal sequence” refers to a peptide of about 15 to 60, preferably about 20 to 40, more preferably, 27 amino acid residues in length which occurs at the N-terminus of secretory and integral membrane proteins and which contains a majority of hydrophobic amino acid residues. For example, a signal sequence contains at least about 15 to 60, preferably about 20 to 40, more preferably, 27 amino acid residues, and has at least about 40-70%, preferably about 50-65%, and more preferably about 55-60% hydrophobic amino acid residues (e.g., alanine, valine, leucine, isoleucine, phenylalanine, tyrosine, tryptophan, or proline). Such a “signal sequence”, also referred to in the art as a “signal peptide”, serves to direct a protein containing such a sequence to a lipid bilayer. For example, in one embodiment, a 46689 protein contains a signal sequence of about 26 amino acids. The “signal sequence” is cleaved during processing of the mature protein. The mature 46689 protein corresponds to amino acids 27 to 468 of SEQ ID NO:81.
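
The relationship between the signal sequence of about 26 amino acids and the mature protein at amino acids 27 to 468 can be shown as a simple coordinate calculation. The Python sketch below is illustrative only; the function name `split_signal_peptide` is hypothetical, and the placeholder string merely stands in for a 468-residue sequence and does not reproduce SEQ ID NO:81.

```python
# Illustrative sketch only; names are hypothetical and the placeholder
# string stands in for a 468-residue sequence (it is not SEQ ID NO:81).

def split_signal_peptide(sequence: str, signal_length: int = 26) -> tuple[str, str]:
    """Split a precursor sequence into (signal peptide, mature protein)."""
    if not 0 < signal_length < len(sequence):
        raise ValueError("signal_length must be between 1 and len(sequence) - 1")
    return sequence[:signal_length], sequence[signal_length:]


placeholder = "X" * 468
signal, mature = split_signal_peptide(placeholder)

# Residue numbering in the text is 1-based, so the mature protein begins
# at the residue immediately after the signal peptide.
first_mature_residue = len(signal) + 1
print(len(signal), first_mature_residue, len(mature))  # 26 27 442
```

Cleavage after residue 26 thus leaves a mature protein of 442 residues under these assumptions.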

[4546] A 46689 family member can include at least one α/β hydrolase domain and/or at least one ProDom PD007763 domain. Furthermore, a 46689 family member can include at least one catalytic acid residue; at least one catalytic histidine residue; at least one transmembrane domain; at least one signal peptide; at least one, two, three, preferably four predicted protein kinase C phosphorylation sites (PS00005); at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006); at least one predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PS00004); at least one, preferably two predicted amidation sites (PS00009); and at least one, two, three, four, preferably five predicted N-myristylation sites (PS00008).

[4547] As the 46689 polypeptides of the invention may modulate 46689-mediated activities, they may be useful for developing novel diagnostic and therapeutic agents for 46689-mediated or related disorders, as described below.

[4548] As used herein, a “46689 activity”, “biological activity of 46689” or “functional activity of 46689”, refers to an activity exerted by a 46689 protein, polypeptide or nucleic acid molecule. For example, a 46689 activity can be an activity exerted by 46689 in a physiological milieu on, e.g., a 46689-responsive cell or on a 46689 substrate, e.g., a protein, lipid, or small molecule substrate. A 46689 activity can be determined in vivo or in vitro. In one embodiment, a 46689 activity is a direct activity, such as an association with a 46689 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46689 protein binds or interacts in nature. In an exemplary embodiment, 46689 hydrolyzes a substrate, e.g., a protein, lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4549] A 46689 activity can also be an indirect activity, e.g., modulation of a cellular signaling activity mediated by a 46689 substrate or product. The features of the 46689 molecules of the present invention can provide similar biological activities as α/β hydrolase family members. For example, the 46689 proteins of the present invention can have one or more of the following activities: (1) hydrolysis of lipid substrates; (2) hydrolysis of cholesterol; (3) hydrolysis of epoxides and other toxic chemicals; (4) hydrolysis of acetylcholine and other neurotransmitters; (5) protease activity; (6) hydrolysis of carboxylesters; or (7) thioesterase activity. As a result, the 46689 protein may have a critical function in one or more of the following physiological processes: (1) metabolite regulation and degradation; (2) drug metabolism; (3) toxin or carcinogen removal and neutralization; (4) toxin or carcinogen production; (5) cellular proliferation or differentiation; and (6) neuronal function.

[4550] The 46689 polypeptide may be involved in disorders of metabolic imbalance, including obesity, anorexia nervosa, cachexia, lipid disorders, cholesterol imbalance, and diabetes. For example, many α/β hydrolase family members have lipase activity and cholesterol esterase activity, and the human hormone sensitive lipase is responsible for metabolizing fat stored in adipocytes. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities), a glycogen storage disease, or α1-antitrypsin deficiency; or from a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease).

[4551] The 46689 polypeptide may be involved in disorders of toxin (e.g., carcinogen) metabolism and/or removal. Examples include disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol.

[4552] 46689 molecules may also have a critical role in removing xenobiotic epoxides and other toxins from the body. Furthermore, they may contribute to the metabolism of drugs and other pharmaceuticals. The study of polymorphisms in the 46689 gene should provide a useful resource for pharmacogenomic (see below) analysis of drug responses. Additionally, variations in 46689 may contribute to population differences in sensitivity to environmental toxins.

[4553] In addition, as the 46689 molecules are expressed in bone marrow tissue, lung tissue, thymus tissue, glial cells, brain tissue, and kidney tissue, as well as in lung, brain, ovary, and breast tumors, they can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal cellular metabolism in those tissues.

[4554] Thus, examples of disorders that can be treated and/or diagnosed with the molecules of the invention include cellular proliferative and/or differentiative disorders (e.g., in the lung, brain, ovary, or breast), hematopoietic disorders, neural disorders (e.g., brain disorders), liver disorders, and cardiovascular disorders.

[4555] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[4556] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[4557] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as those affecting the lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[4558] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[4559] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[4560] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[4561] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[4562] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[4563] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[4564] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[4565] As used herein, heart disorders, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to, disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[4566] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[4567] Examples of hematopoietic disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[4568] Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicalla-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, 
including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal 
tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[4569] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[4570] The 23479, 48120, or 46689 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 thereof are collectively referred to as “polypeptides or proteins of the invention” or “23479, 48120, or 46689 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “23479, 48120, or 46689 nucleic acids.” 23479, 48120, or 46689 molecules refer to 23479, 48120, or 46689 nucleic acids, polypeptides, and antibodies.

[4571] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[4572] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[4573] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
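The four stringency conditions recited above can be held in a small lookup table, for example when annotating hybridization experiments. The following is a minimal sketch only; the names `Stringency`, `CONDITIONS`, and `get_condition` are hypothetical and are not part of the disclosure, and the wash temperature recorded for low stringency is the lower bound stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    """One hybridization/wash condition as recited in the text."""
    hybridization: str
    wash_buffer: str
    wash_temp_c: int  # wash temperature in degrees Celsius


# Hypothetical lookup table; values are transcribed from the paragraph above.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C", "0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "0.2x SSC, 1% SDS", 65),
}


def get_condition(name=None):
    """Return the named condition; default to very high stringency,
    which the text designates for use unless otherwise specified."""
    return CONDITIONS[name or "very_high"]
```

Calling `get_condition()` with no argument returns the very high stringency entry, mirroring the stated default.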

[4574] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 corresponds to a naturally-occurring nucleic acid molecule.

[4575] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein.

[4576] As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules that include at least an open reading frame encoding a 23479, 48120, or 46689 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 23479, 48120, or 46689 protein or derivative thereof.

[4577] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 23479, 48120, or 46689 protein is at least 10% pure. In a preferred embodiment, the preparation of 23479, 48120, or 46689 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-23479, 48120, or 46689 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-23479, 48120, or 46689 chemicals. When the 23479, 48120, or 46689 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[4578] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 23479, 48120, or 46689 without abolishing or substantially altering a 23479, 48120, or 46689 activity. Preferably the alteration does not substantially alter the 23479, 48120, or 46689 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 23479, 48120, or 46689, results in abolishing a 23479, 48120, or 46689 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 23479, 48120, or 46689 are predicted to be particularly unamenable to alteration.

[4579] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 23479, 48120, or 46689 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 23479, 48120, or 46689 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 23479, 48120, or 46689 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
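By way of illustration, the side-chain families listed above can be encoded so that a proposed substitution is checked for membership in a common family. This is a minimal sketch, assuming standard one-letter amino acid codes; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical. Because some residues appear in more than one family in the text (e.g., threonine, histidine), a substitution is treated here as conservative if the two residues share at least one family.

```python
# Side-chain families exactly as enumerated in the text, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of every family containing both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True if the residues differ and share at least one side-chain family."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return bool(shared_families(original, replacement))
```

For example, `is_conservative("K", "R")` is true (both basic), whereas `is_conservative("D", "K")` is false (acidic versus basic).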

[4580] As used herein, a “biologically active portion” of a 23479, 48120, or 46689 protein includes a fragment of a 23479, 48120, or 46689 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 23479, 48120, or 46689 molecule and a non-23479, 48120, or 46689 molecule or between a first 23479, 48120, or 46689 molecule and a second 23479, 48120, or 46689 molecule (e.g., a dimerization interaction). Biologically active portions of a 23479, 48120, or 46689 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 23479, 48120, or 46689 protein, e.g., the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, which include fewer amino acids than the full length 23479, 48120, or 46689 proteins, and exhibit at least one activity of a 23479, 48120, or 46689 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 23479, 48120, or 46689 protein, e.g., hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate. A biologically active portion of a 23479, 48120, or 46689 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. 
Biologically active portions of a 23479, 48120, or 46689 protein can be used as targets for developing agents which modulate a 23479, 48120, or 46689 mediated activity, e.g., hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4581] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[4582] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[4583] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
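As a simple illustration of the calculation described above, percent identity can be computed from two sequences that have already been aligned. The text does not give an explicit formula, so the following sketch assumes one common convention: identical positions divided by the full alignment length, so that every gap position counts against identity. The function name `percent_identity` and the gap symbol `-` are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical, non-gap positions divided by the total
    number of alignment columns, so introduced gaps lower the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

For instance, the alignment of `ACGT-A` with `ACGTTA` has five identical positions in six columns. Other conventions (for example, dividing by the length of the shorter sequence) give different values and would need to be stated explicitly.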

[4584] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
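The global alignment approach of Needleman and Wunsch can be sketched as a dynamic-programming routine. The sketch below is deliberately simplified and is not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (simplified sketch).

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Aligning `ACGT` with `AGT` under these defaults places a single gap opposite the `C`. A production implementation would substitute a scoring matrix lookup for `sub` and use affine gap penalties to reflect the gap weight and length weight parameters recited above.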

[4585] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[4586] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 23479, 48120, or 46689 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 23479, 48120, or 46689 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[4587] Particularly preferred 23479, 48120, or 46689 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 are termed substantially identical.

[4588] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 are termed substantially identical.

[4589] “Misexpression or aberrant expression”, as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[4590] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[4591] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[4592] Various aspects of the invention are described in further detail below.

[4593] Isolated Nucleic Acid Molecules of 23479, 48120, and 46689

[4594] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 23479, 48120, or 46689 polypeptide described herein, e.g., a full-length 23479, 48120, or 46689 protein or a fragment thereof, e.g., a biologically active portion of 23479, 48120, or 46689 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 23479, 48120, or 46689 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[4595] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 23479, 48120, or 46689 protein (i.e., “the coding region” of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80, as shown in SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80 (e.g., SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of a 46689 protein from about amino acid 27 to 468 of SEQ ID NO:75.

[4596] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, thereby forming a stable duplex.

[4597] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, or a portion, preferably of the same length, of any of these nucleotide sequences.

[4598] 23479 and 48120 Nucleic Acid Fragments

[4599] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 23479 or 48120 protein, e.g., an immunogenic or biologically active portion of a 23479 or 48120 protein. A fragment can comprise those nucleotides of SEQ ID NO:74 or SEQ ID NO:77 which encode a UCH-1, UCH-2, UBA, or UIM domain of human 23479 or 48120. The nucleotide sequence determined from the cloning of the 23479 or 48120 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 23479 or 48120 family members, or fragments thereof, as well as 23479 or 48120 homologues, or fragments thereof, from other species.

[4600] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 20 amino acids in length. Preferably, fragments are at least 50, 75, 100, 200, 300, 400, 500, 600, 700, 800, or 900 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4601] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 23479 or 48120 nucleic acid fragment can include a sequence corresponding to a UCH-1 domain and a UCH-2 domain. A 48120 nucleic acid fragment can further include a sequence corresponding to a UBA domain and a UIM domain.

[4602] 23479 or 48120 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79, or of a naturally occurring allelic variant or mutant of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[4603] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4604] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:75 or SEQ ID NO:78. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 934 of SEQ ID NO:75 or amino acid residue 1139 of SEQ ID NO:78. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
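The requirement that the forward and reverse primers have closely matched annealing temperatures can be checked computationally. The text does not specify how the temperature is estimated, so the sketch below assumes the Wallace rule for short oligonucleotides (2° C. per A or T and 4° C. per G or C) as a rough proxy; the function names and the example sequences used with them are hypothetical and are not taken from the sequence listing.

```python
def wallace_tm(primer):
    """Approximate melting temperature (degrees C) by the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C); a rough estimate intended only
    for short oligonucleotides.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primers_compatible(forward, reverse, max_delta_c=5):
    """True if the two estimated temperatures differ by at most max_delta_c."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta_c
```

Setting `max_delta_c` to 5, 4, 3, or 2 corresponds to the successive thresholds recited above. More accurate nearest-neighbor models exist and would generally be preferred for primer design in practice.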

[4605] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4606] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78; a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78; a UBA domain from about amino acid 20-61 of SEQ ID NO:78; or a UIM domain from about amino acid 96-113 of SEQ ID NO:78.

[4607] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 23479 or 48120 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78; a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78; a UBA domain from about amino acid 20-61 of SEQ ID NO:78; and a UIM domain from about amino acid 96-113 of SEQ ID NO:78.
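To locate the nucleotides that encode a given domain, an amino acid range can be converted into coordinates within the coding region, since residue k is encoded by nucleotides 3(k-1)+1 through 3k counted from the first base of the start codon. The sketch below assumes 1-based coordinates relative to a coding-region sequence that begins at the start codon; it does not account for untranslated sequence preceding the coding region in a full-length cDNA. The names `codon_span` and `DOMAINS` are hypothetical; the residue ranges are those recited above.

```python
def codon_span(first_residue, last_residue):
    """Map a 1-based amino acid range to 1-based, inclusive nucleotide
    coordinates within a coding region starting at the start codon."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    return 3 * (first_residue - 1) + 1, 3 * last_residue


# Domain boundaries (approximate, per the text) keyed by protein and domain.
DOMAINS = {
    ("SEQ ID NO:75", "UCH-1"): (296, 327),
    ("SEQ ID NO:75", "UCH-2"): (546, 640),
    ("SEQ ID NO:78", "UCH-1"): (162, 193),
    ("SEQ ID NO:78", "UCH-2"): (580, 649),
    ("SEQ ID NO:78", "UBA"): (20, 61),
    ("SEQ ID NO:78", "UIM"): (96, 113),
}

# Coding-region coordinates for each domain, e.g. for primer placement.
DOMAIN_NUCLEOTIDES = {key: codon_span(*rng) for key, rng in DOMAINS.items()}
```

Under these assumptions, the UCH-1 domain of SEQ ID NO:75 (residues 296-327) corresponds to coding-region nucleotides 886-981. Primers flanking a domain would be chosen just outside the computed span.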

[4608] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4609] A nucleic acid fragment encoding a “biologically active portion of a 23479 or 48120 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79, which encodes a polypeptide having a 23479 or 48120 biological activity (e.g., the biological activities of the 23479 or 48120 proteins are described herein), expressing the encoded portion of the 23479 or 48120 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 23479 or 48120 protein. For example, a nucleic acid fragment encoding a biologically active portion of 23479 or 48120 includes a UCH-1 domain from about amino acid 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78, a UCH-2 domain from about amino acid 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78, a UBA domain from about amino acid 20-61 of SEQ ID NO:78, or a UIM domain from about amino acid 96-113 of SEQ ID NO:78. A nucleic acid fragment encoding a biologically active portion of a 23479 or 48120 polypeptide may comprise a nucleotide sequence which is greater than 300 or more nucleotides in length.

[4610] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200, 4300, 4400, 4500, 4600, 4700, 4800, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, or SEQ ID NO:79.

[4611] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from a sequence described in WO 01/55301, WO 01/57058, or WO 01/38543, or Genbank™ accession numbers AB018272 or AK001193. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:74 or SEQ ID NO:76 located outside the region of nucleotides 2314-3363, 1788-2315, 1865-2831, 2660-3206, 1723-2649, 441-1046, 3012-3206, 2493-2726, or 1723-2372 of SEQ ID NO:74; include one or more nucleotides from SEQ ID NO:77 or SEQ ID NO:79 located outside the region of nucleotides 2722-4329, 2818-3713, 1366-2233, 2923-3535, 1366-1829, 2766-3169, 2766-3962, or 3958-4653 of SEQ ID NO:77; not include all of the nucleotides of a sequence of WO 01/55301 or WO 01/57058, or Genbank™ accession numbers AB018272 or AK001193, e.g., can be one or more nucleotides shorter (at one or both ends) than a sequence of WO 01/55301 or WO 01/57058, or Genbank™ accession numbers AB018272 or AK001193; or can differ by one or more nucleotides in the region of overlap.

[4612] 46689 Nucleic Acid Fragments

[4613] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:80 or 82. For example, such a nucleic acid molecule can include a fragment that can be used as a probe or primer or a fragment encoding a portion of a 46689 protein, e.g., an immunogenic or biologically active portion of a 46689 protein. A fragment can comprise those nucleotides of SEQ ID NO:80 which encode an α/β hydrolase domain of human 46689. The nucleotide sequence determined from the cloning of the 46689 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46689 family members, or fragments thereof, as well as 46689 homologues, or fragments thereof, from other species.

[4614] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment that includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50, 100, 124, 136, 150, 185, 190, 200, 238, or more amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[4615] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domains, regions, or functional sites described herein. Thus, for example, a 46689 nucleic acid fragment can include a sequence corresponding to an α/β hydrolase domain, e.g., about nucleotides 670 to 1371 of SEQ ID NO:80, a region that includes a transmembrane domain, e.g., about nucleotides 115 to 669 of SEQ ID NO:80, or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about nucleotides 562 to 1371 of SEQ ID NO:80.
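
The nucleotide ranges recited above line up with the amino acid ranges recited for SEQ ID NO:81 (residues 186 to 419, 1 to 185, and 150 to 419) if the first base of codon 1 is taken to be nucleotide 115 of SEQ ID NO:80. The following Python sketch illustrates that conversion. The function name and the start position of 115 are illustrative assumptions inferred from the recited ranges; they are not values stated as such in the specification.

```python
def residues_to_nucleotides(first_res, last_res, cds_start):
    """Map a 1-based, inclusive amino acid range to 1-based, inclusive
    nucleotide coordinates, given the position of the first base of codon 1."""
    if first_res < 1 or last_res < first_res:
        raise ValueError("residue range must be 1-based and non-empty")
    nt_first = cds_start + 3 * (first_res - 1)   # first base of the first codon
    nt_last = cds_start + 3 * last_res - 1       # third base of the last codon
    return nt_first, nt_last


# Assumed position of the first coding base in SEQ ID NO:80 (inferred, not recited).
CDS_START = 115

regions = {
    "alpha/beta hydrolase domain": (186, 419),
    "region including transmembrane domain": (1, 185),
    "both domains": (150, 419),
}

for name, (first, last) in regions.items():
    print(name, residues_to_nucleotides(first, last, CDS_START))
```

Under this assumption each residue range maps back to the corresponding nucleotide range given in the paragraph above.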

[4616] 46689 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:80 or SEQ ID NO:82, or of a naturally occurring allelic variant or mutant of SEQ ID NO:80 or SEQ ID NO:82. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[4617] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[4618] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:81. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 468 of SEQ ID NO:81. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
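
As a rough illustration of the temperature criterion above, the following Python sketch estimates a melting temperature for each primer with the Wallace rule (2° C. per A or T, 4° C. per G or C) and checks that the two estimates differ by no more than a chosen limit. Melting temperature is used here only as a proxy for annealing temperature, the Wallace rule is one of several possible estimators, and the primer sequences are hypothetical placeholders that are not taken from SEQ ID NO:80.

```python
def wallace_tm(primer):
    """Estimate melting temperature in degrees C with the Wallace rule:
    2 * (A + T) + 4 * (G + C). Suitable only for short oligonucleotides."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primers_compatible(forward, reverse, max_delta=5):
    """Return True if the estimated temperatures differ by at most max_delta."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta


# Hypothetical primer pair, for illustration only.
forward_primer = "ATGGCTGAGCTGAAGCTG"
reverse_primer = "TCAGGTCTTGCAGAACTC"

print(wallace_tm(forward_primer), wallace_tm(reverse_primer))
print(primers_compatible(forward_primer, reverse_primer, max_delta=5))
```

Tightening `max_delta` to 4, 3, or 2 corresponds to the narrower limits recited above.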

[4619] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4620] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: an α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81; a region that includes a transmembrane domain, e.g., about amino acid residues 1 to 185 or 27 to 185 of SEQ ID NO:81; or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about amino acid residues 150 to 419 of SEQ ID NO:81.

[4621] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46689 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: an α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81; a region that includes a transmembrane domain, e.g., about amino acid residues 1 to 185 or 27 to 185 of SEQ ID NO:81; or a region that includes both an α/β hydrolase domain and a transmembrane domain, e.g., about amino acid residues 150 to 419 of SEQ ID NO:81.

[4622] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[4623] A nucleic acid fragment encoding a “biologically active portion of a 46689 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:80 or 82, which encodes a polypeptide having a 46689 biological activity (e.g., the biological activities of the 46689 proteins are described herein), expressing the encoded portion of the 46689 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46689 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46689 includes an α/β hydrolase domain, e.g., amino acid residues about 186 to 419 of SEQ ID NO:81. A nucleic acid fragment encoding a biologically active portion of a 46689 polypeptide may comprise a nucleotide sequence which is 712 or more nucleotides in length.

[4624] In preferred embodiments, a nucleic acid includes a nucleotide sequence that is about 300, 372, 408, 500, 555, 561, 600, 700, 712, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2050, or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:80, or SEQ ID NO:82.

[4625] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:80 or SEQ ID NO:82 located outside the region of nucleotides 246 to 799, 808 to 1519, 743 to 1113, 1115 to 1521, or 1523 to 2082; not include all of the nucleotides of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165; or can differ by one or more nucleotides in the region of overlap.

[4626] 23479, 48120, or 46689 Nucleic Acid Variants

[4627] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 23479, 48120, or 46689 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100, amino acid residues from that shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[4628] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
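
The 10% and 20% thresholds above can be checked by comparing a starting coding sequence with its altered counterpart codon by codon. The following Python sketch computes the fraction of codons that were changed. The function names and the two short sequences are hypothetical, and the sketch does not verify that the encoded protein is unchanged or that the substituted codons are in fact preferred in any particular host.

```python
def split_codons(cds):
    """Split a coding sequence into consecutive three-base codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fraction_codons_altered(original, altered):
    """Return the fraction of codon positions at which the two sequences differ."""
    a = split_codons(original.upper())
    b = split_codons(altered.upper())
    if len(a) != len(b) or not a:
        raise ValueError("sequences must have the same non-zero number of codons")
    changed = sum(1 for x, y in zip(a, b) if x != y)
    return changed / len(a)


# Hypothetical five-codon example: Met-Leu-Arg-Gly-stop in both sequences.
original_cds = "ATGCTAAGAGGATAA"
altered_cds = "ATGCTGCGTGGTTAA"

fraction = fraction_codons_altered(original_cds, altered_cds)
print(f"{fraction:.0%} of codons altered; meets 10% threshold: {fraction >= 0.10}")
```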

[4629] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non-naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[4630] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.
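
The counting rule above, in which mismatches and looped-out positions are both treated as differences, can be sketched in Python for two sequences that have already been aligned. This is a minimal illustration under stated assumptions: the alignment is supplied by the caller, each gap position is counted as one difference, the first sequence is treated as the subject nucleic acid for the percentage, and the example sequences are hypothetical.

```python
def count_differences(aligned_subject, aligned_reference, gap="-"):
    """Count aligned columns that differ, treating a gap opposite a base
    (a looped-out position) as a difference just like a mismatch."""
    if len(aligned_subject) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for s, r in zip(aligned_subject.upper(), aligned_reference.upper()):
        if s == gap and r == gap:
            continue  # a column that is a gap in both rows carries no information
        if s != r:
            differences += 1
    return differences


def percent_different(aligned_subject, aligned_reference, gap="-"):
    """Express the differences as a percentage of the subject's nucleotides."""
    subject_length = sum(1 for s in aligned_subject if s != gap)
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * count_differences(aligned_subject, aligned_reference, gap) / subject_length


# Hypothetical pre-aligned sequences: one mismatch and two looped-out positions.
subject = "ATGGCC-TAGGA"
reference = "ATGACCTTA-GA"

print(count_differences(subject, reference))
print(round(percent_different(subject, reference), 1))
```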

[4631] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:75, SEQ ID NO:78, SEQ ID NO:81 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 23479, 48120, or 46689 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 23479, 48120, or 46689 gene.

[4632] Preferred variants include those that are correlated with hydrolase activity, e.g., the hydrolysis of a substrate molecule, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrate.

[4633] Allelic variants of 23479, 48120, or 46689, e.g., human 23479, 48120, or 46689, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 23479, 48120, or 46689 protein within a population that maintain the ability to bind and hydrolyze substrate molecules, e.g., protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 23479, 48120, or 46689, e.g., human 23479, 48120, or 46689, protein within a population that do not have the ability to bind and hydrolyze substrate molecules, e.g., protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[4634] Moreover, nucleic acid molecules encoding other 23479, 48120, or 46689 family members and, thus, which have a nucleotide sequence which differs from the 23479, 48120, or 46689 sequences of SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82 are intended to be within the scope of the invention.

[4635] Antisense Nucleic Acid Molecules, Ribozymes and Modified 23479, 48120, or 46689 Nucleic Acid Molecules

[4636] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 23479, 48120, or 46689. An “antisense” nucleic acid can include a nucleotide sequence that is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 23479, 48120, or 46689 coding strand, or to only a portion thereof (e.g., the coding region of human 23479, 48120, or 46689 corresponding to SEQ ID NO:76, SEQ ID NO:79, or SEQ ID NO:82, respectively). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 23479, 48120, or 46689 (e.g., the 5′ and 3′ untranslated regions).

[4637] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 23479, 48120, or 46689 mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of 23479, 48120, or 46689 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 23479, 48120, or 46689 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
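
To illustrate the start-site example above, the following Python sketch takes the window from −10 to +10 around a translation start codon and returns the complementary DNA oligonucleotide, written 5′ to 3′. It assumes the common numbering in which the A of the AUG is +1 and there is no position 0, so the window is 20 nucleotides long. The function name and the mRNA sequence are hypothetical placeholders, not sequences of the invention.

```python
# Map each RNA base of the target to the complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_to_start_region(mrna, start_index, upstream=10, downstream=10):
    """Return a DNA oligonucleotide (5' to 3') complementary to the mRNA window
    spanning -upstream to +downstream around the start codon.

    start_index is the 0-based index of the A of the AUG (position +1)."""
    begin = start_index - upstream
    end = start_index + downstream
    if begin < 0 or end > len(mrna):
        raise ValueError("window extends beyond the supplied mRNA sequence")
    window = mrna[begin:end].upper()
    # Complement each base, then reverse so the result reads 5' to 3'.
    return window.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]


# Hypothetical mRNA fragment with a single AUG start codon.
example_mrna = "GGCACGAGCUCCGCCAUGGCUGAGCUGAAGUAA"
aug_index = example_mrna.find("AUG")

oligo = antisense_to_start_region(example_mrna, aug_index)
print(len(oligo), oligo)
```

Longer or shorter oligonucleotides within the recited length range can be obtained by changing `upstream` and `downstream`.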

[4638] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[4639] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 23479, 48120, or 46689 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[4640] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[4641] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 23479, 48120, or 46689-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 23479, 48120, or 46689 cDNA disclosed herein (i.e., SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:80, or SEQ ID NO:82), and a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 23479, 48120, or 46689-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 23479, 48120, or 46689 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[4642] 23479, 48120, or 46689 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 23479, 48120, or 46689 gene (e.g., the 23479, 48120, or 46689 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 23479, 48120, or 46689 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[4643] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[4644] A 23479, 48120, or 46689 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[4645] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. 93: 14670-675.

[4646] PNAs of 23479, 48120, or 46689 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 23479, 48120, or 46689 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[4647] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[4648] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 23479, 48120, or 46689 nucleic acid of the invention, and two complementary regions, one having a fluorophore and one a quencher, such that the molecular beacon is useful for quantitating the presence of the 23479, 48120, or 46689 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[4649] Isolated 23479 or 48120 Polypeptides

[4650] In another aspect, the invention features an isolated 23479 or 48120 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-23479 or 48120 antibodies. 23479 or 48120 protein can be isolated from cells or tissue sources using standard protein purification techniques. 23479 or 48120 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4651] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4652] In a preferred embodiment, a 23479 or 48120 polypeptide has one or more of the following characteristics:

[4653] (i) it has the ability to cleave a bond between ubiquitin and a substrate, e.g., a protein targeted for degradation by ubiquitination;

[4654] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 23479 or 48120 polypeptide, e.g., a polypeptide of SEQ ID NO:75 or SEQ ID NO:78;

[4655] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:75 or SEQ ID NO:78;

[4656] (iv) it has a UCH-1 domain which preferably has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 296-327 of SEQ ID NO:75 or amino acid 162-193 of SEQ ID NO:78;

[4657] (v) it has a UCH-2 domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 546-640 of SEQ ID NO:75 or amino acid 580-649 of SEQ ID NO:78;

[4658] (vi) it has a UBA domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 20-61 of SEQ ID NO:78;

[4659] (vii) it has a UIM domain, or region which has an overall sequence similarity of at least about 70%, 80%, 90% or 95% with amino acid residues about 96-113 of SEQ ID NO:78;

[4660] (viii) it has at least one, two, three, four, five, or six predicted N-glycosylation sites (PS00001);

[4661] (ix) it has at least one predicted cAMP and cGMP-dependent protein kinase phosphorylation site (PS00004);

[4662] (x) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, or 15 predicted protein kinase C phosphorylation sites (PS00005);

[4663] (xi) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or even 33 predicted casein kinase II phosphorylation sites (PS00006);

[4664] (xii) it has at least one, two, or three predicted tyrosine kinase phosphorylation sites (PS00007);

[4665] (xiii) it has at least one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, or even 13 predicted N-myristoylation sites (PS00008);

[4666] (xiv) it has at least one amidation site (PS00009);

[4667] (xv) it has at least one carbamoyl-phosphate synthase subdomain signature 2 (PS00867);

[4668] (xvi) it has at least one ubiquitin carboxyl-terminal hydrolase family 2 signature 2 (PS00973);

[4669] (xvii) it has at least one predicted peroxisomal targeting signal; and

[4670] (xviii) it has at least one coiled coil domain.

[4671] In a preferred embodiment the 23479 or 48120 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:75 or SEQ ID NO:78. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the UCH-1, UCH-2, UBA, or UIM domains. In another preferred embodiment one or more differences are in the UCH-1, UCH-2, UBA, or UIM domains.

[4672] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 23479 or 48120 proteins differ in amino acid sequence from SEQ ID NO:75 or SEQ ID NO:78, yet retain biological activity.

[4673] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:75 or SEQ ID NO:78.

[4674] A 23479 protein or fragment is provided which varies from the sequence of SEQ ID NO:75 in regions defined by amino acids about 1-295, 328-545, or 641-934 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:75 in regions defined by amino acids about 296-327 or 546-640. A 48120 protein or fragment is provided which varies from the sequence of SEQ ID NO:78 in regions defined by amino acids about 1-19, 62-95, 114-161, 194-579, or 650-1139 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:78 in regions defined by amino acids about 20-61, 96-113, 162-193, or 580-649. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
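
The variable regions recited above are the stretches of each protein that lie outside the recited domains. The following Python sketch derives them from the domain boundaries and the protein length. The function name is hypothetical, and the lengths of 934 and 1139 residues are taken from the last residue recited for SEQ ID NO:75 and SEQ ID NO:78 in the paragraph above.

```python
def regions_outside_domains(domains, protein_length):
    """Return the 1-based, inclusive residue ranges not covered by any domain."""
    regions = []
    next_free = 1  # first residue not yet assigned to a domain or a region
    for start, end in sorted(domains):
        if start > next_free:
            regions.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= protein_length:
        regions.append((next_free, protein_length))
    return regions


# Domain boundaries as recited for SEQ ID NO:75 (UCH-1, UCH-2).
domains_23479 = [(296, 327), (546, 640)]
# Domain boundaries as recited for SEQ ID NO:78 (UBA, UIM, UCH-1, UCH-2).
domains_48120 = [(20, 61), (96, 113), (162, 193), (580, 649)]

print(regions_outside_domains(domains_23479, 934))
print(regions_outside_domains(domains_48120, 1139))
```

A difference between a variant and the reference sequence can then be classified by testing whether its residue position falls inside one of the returned ranges.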

[4675] In one embodiment, a biologically active portion of a 23479 or 48120 protein includes a UCH-1, UCH-2, UBA, or UIM domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 23479 or 48120 protein.

[4676] In a preferred embodiment, the 23479 or 48120 protein has an amino acid sequence shown in SEQ ID NO:75 or SEQ ID NO:78. In other embodiments, the 23479 or 48120 protein is substantially identical to SEQ ID NO:75 or SEQ ID NO:78. In yet another embodiment, the 23479 or 48120 protein is substantially identical to SEQ ID NO:75 or SEQ ID NO:78 and retains the functional activity of the protein of SEQ ID NO:75 or SEQ ID NO:78, as described in detail in the subsections above.

[4677] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues encoded by a sequence present in WO 01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:75 or SEQ ID NO:78 outside the region of amino acid residues 488-843, 638-934, 753-934, or 440-510 of SEQ ID NO:75 or 428-716, 880-1139, 912-1139, 428-582, or 1017-1081 of SEQ ID NO:78; not include all of the amino acid residues of a sequence present in WO 01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence present in WO 01/55301, WO 01/57058, WO 01/57272, or WO 01/38543, or Genbank™ accession number BAA34449; or can differ by one or more amino acid residues in the region of overlap.

[4678] Isolated 46689 Polypeptides

[4679] In another aspect, the invention features an isolated 46689 protein, or fragment, e.g., a biologically active portion, for use as an immunogen or antigen to raise or test (or more generally to bind) anti-46689 antibodies. 46689 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46689 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[4680] Polypeptides of the invention include those that arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[4681] In a preferred embodiment, a 46689 polypeptide has one or more of the following characteristics:

[4682] (i) it has the ability to bind to and hydrolyze substrate molecules, e.g., protein, lipid, or small molecule (e.g., metabolite, signaling molecule, toxin, or carcinogen) substrates;

[4683] (ii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of a 46689 polypeptide, e.g., a polypeptide of SEQ ID NO:81;

[4684] (iii) it has an overall sequence similarity of at least 60%, more preferably at least 70%, 80%, 90%, 95%, 98%, 99%, or more with a polypeptide of SEQ ID NO:81;

[4685] (iv) it can be found in tumors;

[4686] (v) it has an α/β hydrolase domain which is preferably about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 186 to 419 of SEQ ID NO:81;

[4687] (vi) it has a catalytic acid residue;

[4688] (vii) it has a catalytic histidine residue;

[4689] (viii) it has a transmembrane domain, or a region which is about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 150 to 167 of SEQ ID NO:81;

[4690] (ix) it has a signal peptide, or a region which is about 70%, 80%, 90%, 95%, 98%, 99%, or more identical with amino acid residues about 150 to 167 of SEQ ID NO:81;

[4691] (x) it has at least one, two, three, preferably four predicted protein kinase C phosphorylation sites (PS00005);

[4692] (xi) it has at least one, two, three, preferably four predicted casein kinase II phosphorylation sites (PS00006);

[4693] (xii) it has at least one predicted cAMP- and cGMP-dependent protein kinase phosphorylation site (PS00004);

[4694] (xiii) it has at least one, preferably two predicted amidation sites (PS00009); and

[4695] (xiv) it has at least one, two, three, four, preferably five predicted N-myristylation sites (PS00008).
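The PROSITE entries named in items (x) to (xiv) are short sequence patterns, so candidate sites in a polypeptide can be counted with a simple pattern scan. The following is a minimal Python sketch, assuming the published PROSITE consensus patterns for PS00004, PS00005, PS00006, PS00008 and PS00009 written as regular expressions; the function name find_sites and the toy sequence are hypothetical and are not taken from SEQ ID NO:81.

```python
import re

# PROSITE consensus patterns rewritten as Python regular expressions.
# A lookahead (?=...) is used so that overlapping sites are all reported.
PATTERNS = {
    "PS00004": r"(?=[RK]{2}.[ST])",                   # cAMP/cGMP-dependent kinase site
    "PS00005": r"(?=[ST].[RK])",                      # protein kinase C site
    "PS00006": r"(?=[ST].{2}[DE])",                   # casein kinase II site
    "PS00008": r"(?=G[^EDRKHPFYW].{2}[STAGCN][^P])",  # N-myristoylation site
    "PS00009": r"(?=.G[RK][RK])",                     # amidation site
}


def find_sites(sequence):
    """Return {PROSITE accession: [1-based start positions]} for a protein sequence."""
    seq = sequence.upper()
    return {
        accession: [m.start() + 1 for m in re.finditer(regex, seq)]
        for accession, regex in PATTERNS.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequence used only to exercise the scan.
    toy = "MGSKRRASVDGAKTLSPGRRLE"
    for accession, positions in find_sites(toy).items():
        print(accession, len(positions), positions)
```

Such short patterns match frequently by chance, so the counts produced by a scan of this kind are predictions only and do not by themselves show that a site is used.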

[4696] In a preferred embodiment the 46689 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:81. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:81 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:81. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81, or the transmembrane domain, e.g., about amino acid residues 150 to 167 of SEQ ID NO:81. In another preferred embodiment one or more differences are in the α/β hydrolase domain, e.g., about amino acid residues 186 to 419 of SEQ ID NO:81, or the transmembrane domain, e.g., about amino acid residues 150 to 167 of SEQ ID NO:81.

[4697] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46689 proteins differ in amino acid sequence from SEQ ID NO:81, yet retain biological activity.

[4698] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:81.

[4699] A 46689 protein or fragment is provided which varies from the sequence of SEQ ID NO:81 in regions defined by amino acid residues about 27 to 149, 168 to 185, 242 to 350, and 400 to 468 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:81 in regions defined by amino acids about 150 to 167, 186 to 241, and 351 to 399. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
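The comparison described above, in which mismatches and looped-out positions each count as a difference and certain regions must remain unchanged, can be illustrated as follows. This is a minimal Python sketch, assuming the two sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" marking gaps; the function names are hypothetical, and attributing an insertion to the preceding reference residue is an assumption made for illustration.

```python
# Regions of SEQ ID NO:81 in which no difference is permitted, as listed in
# the text (1-based, inclusive residue numbers).
CONSERVED = [(150, 167), (186, 241), (351, 399)]


def difference_positions(ref_aln, cand_aln):
    """List the reference positions (1-based) at which two aligned sequences differ.

    Mismatches and gapped ("looped out") columns each count as one difference.
    A gap in the reference is attributed to the most recent reference residue
    (an assumption of this sketch).
    """
    if len(ref_aln) != len(cand_aln):
        raise ValueError("aligned sequences must have equal length")
    positions = []
    ref_pos = 0
    for ref_res, cand_res in zip(ref_aln, cand_aln):
        if ref_res != "-":
            ref_pos += 1
        if ref_res != cand_res:
            positions.append(ref_pos)
    return positions


def satisfies_constraints(ref_aln, cand_aln, conserved=CONSERVED, max_diffs=15):
    """True if there is at least one but fewer than max_diffs differences,
    none of which falls in a conserved region."""
    diffs = difference_positions(ref_aln, cand_aln)
    in_conserved = [p for p in diffs if any(lo <= p <= hi for lo, hi in conserved)]
    return 1 <= len(diffs) < max_diffs and not in_conserved


if __name__ == "__main__":
    # Hypothetical toy alignment with a toy conserved region at residues 3 to 5.
    print(satisfies_constraints("ACDEFGHIK", "ACDEFGHLK", conserved=[(3, 5)]))
    print(satisfies_constraints("ACDEFGHIK", "ACNEFGHIK", conserved=[(3, 5)]))
```

Setting max_diffs to 10 or 5 gives the narrower limits recited in the text.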

[4700] In one embodiment, a biologically active portion of a 46689 protein includes an α/β hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46689 protein.

[4701] In a preferred embodiment, the 46689 protein has an amino acid sequence shown in SEQ ID NO:81. In other embodiments, the 46689 protein is substantially identical to SEQ ID NO:81. In yet another embodiment, the 46689 protein is substantially identical to SEQ ID NO:81 and retains the functional activity of the protein of SEQ ID NO:81, as described in detail in the subsections above.

[4702] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues encoded by a sequence present in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:81 outside the region encoded by nucleotides 246 to 799, 808 to 1519, 743 to 1113, 1115 to 1521, or 1523 to 2082 of SEQ ID NO:80; not include all of the amino acid residues encoded by a nucleotide sequence in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence encoded by a nucleotide sequence in SEQ ID NO: 7258 of WO 01/57188 or SEQ ID NO: 790 of WO 00/52165; or can differ by one or more amino acid residues in the region of overlap.

[4703] 23479, 48120, or 46689 Chimeric or Fusion Proteins

[4704] In another aspect, the invention provides 23479, 48120, or 46689 chimeric or fusion proteins. As used herein, a 23479, 48120, or 46689 “chimeric protein” or “fusion protein” includes a 23479, 48120, or 46689 polypeptide linked to a non-23479, 48120, or 46689 polypeptide. A “non-23479, 48120, or 46689 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 23479, 48120, or 46689 protein, e.g., a protein which is different from the 23479, 48120, or 46689 protein and which is derived from the same or a different organism. The 23479, 48120, or 46689 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of a 23479, 48120, or 46689 amino acid sequence. In a preferred embodiment, a 23479, 48120, or 46689 fusion protein includes at least one (or two) biologically active portion of a 23479, 48120, or 46689 protein. The non-23479, 48120, or 46689 polypeptide can be fused to the N-terminus or C-terminus of the 23479, 48120, or 46689 polypeptide.

[4705] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-23479, 48120, or 46689 fusion protein in which the 23479, 48120, or 46689 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 23479, 48120, or 46689. Alternatively, the fusion protein can be a 23479, 48120, or 46689 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 23479, 48120, or 46689 can be increased through use of a heterologous signal sequence.

[4706] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[4707] The 23479, 48120, or 46689 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 23479, 48120, or 46689 fusion proteins can be used to affect the bioavailability of a 23479, 48120, or 46689 substrate. 23479, 48120, or 46689 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 23479, 48120, or 46689 protein; (ii) mis-regulation of the 23479, 48120, or 46689 gene; and (iii) aberrant post-translational modification of a 23479, 48120, or 46689 protein.

[4708] Moreover, the 23479, 48120, or 46689-fusion proteins of the invention can be used as immunogens to produce anti-23479, 48120, or 46689 antibodies in a subject, to purify 23479, 48120, or 46689 ligands and in screening assays to identify molecules which inhibit the interaction of 23479, 48120, or 46689 with a 23479, 48120, or 46689 substrate.

[4709] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 23479, 48120, or 46689-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 23479, 48120, or 46689 protein.

[4710] Variants of 23479, 48120, or 46689 Proteins

[4711] In another aspect, the invention also features a variant of a 23479, 48120, or 46689 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 23479, 48120, or 46689 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 23479, 48120, or 46689 protein. An agonist of the 23479, 48120, or 46689 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 23479, 48120, or 46689 protein. An antagonist of a 23479, 48120, or 46689 protein can inhibit one or more of the activities of the naturally occurring form of the 23479, 48120, or 46689 protein by, for example, competitively modulating a 23479, 48120, or 46689-mediated activity of a 23479, 48120, or 46689 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 23479, 48120, or 46689 protein.

[4712] Variants of a 23479, 48120, or 46689 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 23479, 48120, or 46689 protein for agonist or antagonist activity.

[4713] Libraries of fragments, e.g., N terminal, C terminal, or internal fragments, of a 23479, 48120, or 46689 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 23479, 48120, or 46689 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
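The composition of such a fragment library can be illustrated in silico by enumerating the contiguous fragments of a sequence and classifying each by the terminus it retains. The following is a minimal Python sketch under stated assumptions: it operates on an amino acid string rather than on a coding sequence, the minimum length of 8 residues is an arbitrary illustrative default, and the function name truncation_fragments is hypothetical.

```python
def truncation_fragments(protein, min_len=8):
    """Group every contiguous fragment of at least min_len residues by type.

    N-terminal fragments keep the first residue, C-terminal fragments keep the
    last residue, and internal fragments keep neither. The full-length
    sequence is excluded because it is not a fragment.
    """
    n = len(protein)
    library = {"n_terminal": [], "c_terminal": [], "internal": []}
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if start == 0 and end == n:
                continue  # skip the full-length sequence
            fragment = protein[start:end]
            if start == 0:
                library["n_terminal"].append(fragment)
            elif end == n:
                library["c_terminal"].append(fragment)
            else:
                library["internal"].append(fragment)
    return library


if __name__ == "__main__":
    # Hypothetical toy sequence; a real library would start from SEQ ID NO:75, 78 or 81.
    for kind, fragments in truncation_fragments("MKTAYIAKQR", min_len=8).items():
        print(kind, fragments)
```

The number of fragments grows quadratically with sequence length, which is one reason a screening step is used to select among them.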

[4714] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 23479, 48120, or 46689 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 23479, 48120, or 46689 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[4715] Cell based assays can be exploited to analyze a variegated 23479, 48120, or 46689 library. For example, a library of expression vectors can be transfected into a cell line, e.g., a cell line which ordinarily responds to 23479, 48120, or 46689 in a substrate-dependent manner. The transfected cells are then contacted with 23479, 48120, or 46689 and the effect of the expression of the mutant on signaling by the 23479, 48120, or 46689 substrate can be detected, e.g., by measuring changes in cellular proliferation. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 23479, 48120, or 46689 substrate, and the individual clones further characterized.

[4716] In another aspect, the invention features a method of making a 23479, 48120, or 46689 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 23479, 48120, or 46689 polypeptide. The method includes: altering the sequence of a 23479, 48120, or 46689 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[4717] In another aspect, the invention features a method of making a fragment or analog of a 23479, 48120, or 46689 polypeptide having a biological activity of a naturally occurring 23479, 48120, or 46689 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 23479, 48120, or 46689 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[4718] Anti-23479, 48120, or 46689 Antibodies

[4719] In another aspect, the invention provides an anti-23479, 48120, or 46689 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[4720] The anti-23479, 48120, or 46689 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[4721] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[4722] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 23479, 48120, or 46689 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-23479, 48120, or 46689 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[4723] The anti-23479, 48120, or 46689 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[4724] Phage display and combinatorial methods for generating anti-23479, 48120, or 46689 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J. 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[4725] In one embodiment, the anti-23479, 48120, or 46689 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[4726] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggemann et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggemann et al. 1991 Eur J Immunol 21:1323-1326).

[4727] An anti-23479, 48120, or 46689 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[4728] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[4729] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 23479, 48120, or 46689 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[4730] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnaker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
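The definition of a consensus sequence given above amounts to a column-wise majority vote over aligned family members. The following is a minimal Python sketch, assuming the sequences are already aligned to equal length; the function name consensus and the toy framework sequences are hypothetical. Where two residues are equally frequent, the sketch keeps whichever appears first in the column, which is one of the choices the definition permits.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent residue at each column of an alignment.

    All sequences must have the same length. Ties are resolved in favor of
    the residue encountered first in the column.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to equal length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_sequences)
    )


if __name__ == "__main__":
    # Hypothetical toy family of three aligned framework segments.
    family = ["QVQLV", "EVQLV", "QVKLV"]
    print(consensus(family))
```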

[4731] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region that are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 23479, 48120, or 46689 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[4732] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[4733] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[4734] In preferred embodiments an antibody can be made by immunizing with purified 23479, 48120, or 46689 antigen, or a fragment thereof, e.g., a fragment described herein, membrane associated antigen, tissue, e.g., crude tissue preparations, whole cells, preferably living cells, lysed cells, or cell fractions, e.g., cytosol or membrane fractions.

[4735] A full-length 23479, 48120, or 46689 protein, or an antigenic peptide fragment of 23479, 48120, or 46689, can be used as an immunogen or can be used to identify anti-23479, 48120, or 46689 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 23479, 48120, or 46689 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:75, SEQ ID NO:78, or SEQ ID NO:81 and encompasses an epitope of 23479, 48120, or 46689. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[4736] Fragments of 23479 or 48120 can be used, e.g., to characterize the specificity of an antibody or to make immunogens. For example, fragments of 23479 or 48120 which include residues about 275 to 290, 530 to 550, or 640 to 650 of SEQ ID NO:75 or residues about 120 to 155, 680 to 700, or 770 to 800 of SEQ ID NO:78 can be used to make antibodies against hydrophilic regions of the 23479 or 48120 protein. Similarly, fragments of 23479 or 48120 which include residues about 100 to 110, 295 to 310, or 920 to 930 of SEQ ID NO:75 or residues about 1040 to 1055 of SEQ ID NO:78 can be used to make an antibody against a hydrophobic region of the 23479 or 48120 protein; a fragment of 23479 or 48120 which includes residues about 296-327 of SEQ ID NO:75 or 162-193 of SEQ ID NO:78 can be used to make an antibody against the UCH-1 region of the 23479 or 48120 protein; a fragment of 23479 or 48120 which includes residues about 546-640 of SEQ ID NO:75 or 580-649 of SEQ ID NO:78 can be used to make an antibody against the UCH-2 region of the 23479 or 48120 protein; a fragment of 48120 which includes residues about 20-61 of SEQ ID NO:78 can be used to make an antibody against the UBA region of the 48120 protein; and a fragment of 48120 which includes residues about 96-113 of SEQ ID NO:78 can be used to make an antibody against the UIM region of the 48120 protein.

[4737] Similarly, fragments of 46689 can be used, e.g., to characterize the specificity of an antibody or to make immunogens. For example, fragments of 46689 which include residues about 34 to 51, about 333 to 347, or about 438 to 449 of SEQ ID NO:81 can be used to make antibodies against hydrophilic regions of the 46689 protein. Similarly, fragments of 46689 which include residues about 1 to 23, about 133 to 145, or about 150 to 168 of SEQ ID NO:81 can be used to make an antibody against a hydrophobic region of the 46689 protein; fragments of 46689 which include residues about 27 to 149 or about 168 to 468 of SEQ ID NO:81 can be used to make an antibody against a non-transmembrane region of the 46689 protein; or fragments of 46689 which include residues about 186 to 241 or about 350 to 400 of SEQ ID NO:81 can be used to make an antibody against the α/β hydrolase domain of the 46689 protein.
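
As an illustration of how the residue ranges recited above for SEQ ID NO:81 might be tracked when selecting fragments for immunogen design, the following minimal Python sketch stores the approximate ranges in a lookup table and reports which regions a candidate fragment overlaps. The names REGIONS_46689 and regions_overlapping are hypothetical, and the ranges are the "about" values given above rather than exact boundaries.

```python
# Approximate residue ranges (1-based, inclusive) recited above for SEQ ID NO:81.
# The dictionary name and region labels are illustrative only.
REGIONS_46689 = {
    "hydrophilic": [(34, 51), (333, 347), (438, 449)],
    "hydrophobic": [(1, 23), (133, 145), (150, 168)],
    "non-transmembrane": [(27, 149), (168, 468)],
    "alpha/beta hydrolase domain": [(186, 241), (350, 400)],
}


def regions_overlapping(start, end, regions=REGIONS_46689):
    """Return the sorted names of regions sharing at least one residue with start..end."""
    if start < 1 or end < start:
        raise ValueError("expected 1-based coordinates with start <= end")
    hits = set()
    for name, ranges in regions.items():
        for lo, hi in ranges:
            # Two inclusive intervals overlap when each starts before the other ends.
            if start <= hi and end >= lo:
                hits.add(name)
    return sorted(hits)


if __name__ == "__main__":
    print(regions_overlapping(186, 241))
```

A fragment spanning residues 186 to 241, for example, falls within both the hydrolase domain range and the second non-transmembrane range, so both labels are returned.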

[4738] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[4739] Antibodies that bind only native 23479, 48120, or 46689 protein, only denatured or otherwise non-native 23479, 48120, or 46689 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies that bind to native, but not denatured, 23479, 48120, or 46689 protein.

[4740] Preferred epitopes encompassed by the antigenic peptide are regions of 23479, 48120, or 46689 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 23479, 48120, or 46689 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 23479, 48120, or 46689 protein and are thus likely to constitute surface residues useful for targeting antibody production.
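
The idea of scanning a sequence for hydrophilic, likely surface-exposed stretches can be sketched as follows. This is not an Emini surface probability analysis; as a simpler stand-in it averages the Kyte-Doolittle hydropathy scale over a sliding window, where lower averages indicate more hydrophilic segments. The window length of 7, the function names, and the toy sequence are assumptions made for illustration only.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=7):
    """Mean hydropathy of every window of the given length along seq."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(seq, window=7):
    """Return (start, end, mean_score) of the most hydrophilic window, 1-based inclusive."""
    profile = hydropathy_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    i = min(range(len(profile)), key=profile.__getitem__)
    return i + 1, i + window, profile[i]


if __name__ == "__main__":
    # Toy sequence, not taken from SEQ ID NO:75, 78, or 81.
    print(most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL"))
```

In practice the full-length sequence would be scanned and several low-scoring windows compared against the residue ranges recited above before a peptide is chosen.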

[4741] In a preferred embodiment the antibody can bind to the extracellular portion of the 46689 protein, e.g., it can bind to a whole cell which expresses the 46689 protein. In another embodiment, the antibody binds an intracellular portion of the 46689 protein.

[4742] In preferred embodiments, antibodies can bind one or more of purified antigen, membrane associated antigen, tissue, e.g., tissue sections, whole cells, preferably living cells, lysed cells, cell fractions, e.g., cytosol or membrane fractions.

[4743] The anti-23479, 48120, or 46689 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 23479, 48120, or 46689 protein.

[4744] In a preferred embodiment the antibody has effector function and/or can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[4745] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[4746] In a preferred embodiment, an anti-23479 or 48120 antibody alters (e.g., increases or decreases) the de-ubiquitination activity of a 23479 or 48120 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 550-567 of SEQ ID NO:75 or 584-601 of SEQ ID NO:78.

[4747] In another preferred embodiment, an anti-46689 antibody alters (e.g., increases or decreases) the hydrolase activity of a 46689 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 186 to 241 or 350 to 400 of SEQ ID NO:81.

[4748] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[4749] An anti-23479, 48120, or 46689 antibody (e.g., monoclonal antibody) can be used to isolate 23479, 48120, or 46689 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-23479, 48120, or 46689 antibody can be used to detect 23479, 48120, or 46689 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-23479, 48120, or 46689 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[4750] The invention also includes a nucleic acid which encodes an anti-23479, 48120, or 46689 antibody, e.g., an anti-23479, 48120, or 46689 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[4751] The invention also includes cell lines, e.g., hybridomas, which make an anti-23479, 48120, or 46689 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-23479, 48120, or 46689 antibody.

[4752] 23479, 48120, and 46689 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[4753] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[4754] A vector can include a 23479, 48120, or 46689 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 23479, 48120, or 46689 proteins, mutant forms of 23479, 48120, or 46689 proteins, fusion proteins, and the like).

[4755] The recombinant expression vectors of the invention can be designed for expression of 23479, 48120, or 46689 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[4756] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[4757] Purified fusion proteins can be used in 23479, 48120, or 46689 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 23479, 48120, or 46689 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells that are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[4758] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
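
The codon-alteration strategy can be sketched in a few lines of Python: each codon of a coding sequence is translated using the standard genetic code and then replaced with a single preferred synonymous codon, so that the encoded protein is unchanged. The PREFERRED_CODON table below is an illustrative assumption, not a table taken from Wada et al., and the function names are hypothetical; an actual design would rely on measured codon usage for the chosen host strain.

```python
from itertools import product

# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: one illustrative "preferred" codon per residue for E. coli.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGCTTAGGTGA"  # toy sequence encoding Met-Leu-Arg-stop
    optimized = recode(original)
    print(optimized, translate(optimized) == translate(original))
```

Because only synonymous substitutions are made, translating the recoded sequence returns the same amino acid sequence as the original.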

[4759] The 23479, 48120, or 46689 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[4760] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[4761] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1989) Human Gene Therapy 9:983).

[4762] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman (1989) Genes Dev. 3:537-546).

[4763] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[4764] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 23479, 48120, or 46689 nucleic acid molecule within a recombinant expression vector or a 23479, 48120, or 46689 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[4765] A host cell can be any prokaryotic or eukaryotic cell. For example, a 23479, 48120, or 46689 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[4766] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[4767] A host cell of the invention can be used to produce (i.e., express) a 23479, 48120, or 46689 protein. Accordingly, the invention further provides methods for producing a 23479, 48120, or 46689 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 23479, 48120, or 46689 protein has been introduced) in a suitable medium such that a 23479, 48120, or 46689 protein is produced. In another embodiment, the method further includes isolating a 23479, 48120, or 46689 protein from the medium or the host cell.

[4768] In another aspect, the invention features a cell or purified preparation of cells which include a 23479, 48120, or 46689 transgene, or which otherwise misexpress 23479, 48120, or 46689. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 23479, 48120, or 46689 transgene, e.g., a heterologous form of a 23479, 48120, or 46689, e.g., a gene derived from humans (in the case of a non-human cell). The 23479, 48120, or 46689 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 23479, 48120, or 46689, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 23479, 48120, or 46689 alleles or for use in drug screening.

[4769] In another aspect, the invention features a human cell, e.g., a hematopoietic or hepatic stem cell, transformed with nucleic acid which encodes a subject 23479, 48120, or 46689 polypeptide.

[4770] Also provided are cells, preferably human cells, e.g., human hematopoietic, hepatic, neural, lung, ovary, breast or fibroblast cells, in which an endogenous 23479, 48120, or 46689 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 23479, 48120, or 46689 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 23479, 48120, or 46689 gene. For example, an endogenous 23479, 48120, or 46689 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published May 16, 1991.

[4771] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 23479, 48120, or 46689 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 23479, 48120, or 46689 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 23479, 48120, or 46689 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[4772] 23479, 48120, and 46689 Transgenic Animals

[4773] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 23479, 48120, or 46689 protein and for identifying and/or evaluating modulators of 23479, 48120, or 46689 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 23479, 48120, or 46689 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[4774] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 23479, 48120, or 46689 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 23479, 48120, or 46689 transgene in its genome and/or expression of 23479, 48120, or 46689 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 23479, 48120, or 46689 protein can further be bred to other transgenic animals carrying other transgenes.

[4775] 23479, 48120, or 46689 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[4776] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[4777] Uses of 23479, 48120, and 46689

[4778] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[4779] The isolated nucleic acid molecules of the invention can be used, for example, to express a 23479, 48120, or 46689 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 23479, 48120, or 46689 mRNA (e.g., in a biological sample) or a genetic alteration in a 23479, 48120, or 46689 gene, and to modulate 23479, 48120, or 46689 activity, as described further below. The 23479, 48120, or 46689 proteins can be used to treat disorders characterized by insufficient or excessive production of a 23479, 48120, or 46689 substrate or production of 23479, 48120, or 46689 inhibitors. In addition, the 23479, 48120, or 46689 proteins can be used to screen for naturally occurring 23479, 48120, or 46689 substrates, to screen for drugs or compounds which modulate 23479, 48120, or 46689 activity, as well as to treat disorders characterized by insufficient or excessive production of 23479, 48120, or 46689 protein or production of 23479, 48120, or 46689 protein forms which have decreased, aberrant or unwanted activity compared to 23479, 48120, or 46689 wild type protein (e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast; a neural disorder; a hematopoietic disorder; a cardiovascular disorder; or a liver disorder). Moreover, the anti-23479, 48120, or 46689 antibodies of the invention can be used to detect and isolate 23479, 48120, or 46689 proteins, regulate the bioavailability of 23479, 48120, or 46689 proteins, and modulate 23479, 48120, or 46689 activity.

[4780] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 23479, 48120, or 46689 polypeptide is provided. The method includes: contacting the compound with the subject 23479, 48120, or 46689 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 23479, 48120, or 46689 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 23479, 48120, or 46689 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 23479, 48120, or 46689 polypeptide. Screening methods are discussed in more detail below.

[4781] 23479, 48120, and 46689 Screening Assays

[4782] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 23479, 48120, or 46689 proteins, have a stimulatory or inhibitory effect on, for example, 23479, 48120, or 46689 expression or 23479, 48120, or 46689 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 23479, 48120, or 46689 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 23479, 48120, or 46689 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[4783] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a 23479, 48120, or 46689 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 23479, 48120, or 46689 protein or polypeptide or a biologically active portion thereof.

[4784] In one embodiment, an activity of a 23479 or 48120 protein can be assayed by measuring or detecting 23479- or 48120-mediated de-ubiquitination. De-ubiquitination assays useful for detecting a ubiquitin carboxyl-terminal hydrolase activity are well known in the art and can be found, for example, in Zhu et al. (1997) Journal of Biological Chemistry 272:51-57, Mitch et al. (1999) American Journal of Physiology 276:C1132-C1138, Liu et al. (1999) Molecular and Cell Biology 19:3029-3038, as well as in those cited in various reviews, for example, Ciechanover et al. (1994) The FASEB Journal 8:182-192, Ciechanover (1994) Biol. Chem. Hoppe-Seyler 375:565-581, Hershko et al. (1998) Annual Review of Biochemistry 67:425-479, Swartz (1999) Annual Review of Medicine 50:57-74, Ciechanover (1998) EMBO Journal 17:7151-7160, and D'Andrea et al. (1998) Critical Reviews in Biochemistry and Molecular Biology 33:337-352. These assays include, but are not limited to, the disappearance of substrate, including a decrease in the amount of polyubiquitin or ubiquitinated substrate protein or protein remnant, appearance of intermediate and end products, such as appearance of free ubiquitin monomers, general protein turnover, specific protein turnover, ubiquitin binding, binding to ubiquitinated substrate protein, subunit interaction, interaction with ATP, interaction with cellular components such as trans-acting regulatory factors, stabilization of specific proteins, and the like.

[4785] In one embodiment, an activity of a 46689 protein can be assayed in vitro. First, 46689 protein can be expressed in a bacterial cell and then purified, e.g., by means of a covalently attached affinity tag, e.g., a His-6 tag. Purified 46689 can then be incubated in buffer, e.g., 100 mM Tris-HCl, pH 8.5, along with a substrate molecule, e.g., a protein, lipid or small molecule, e.g., steroid, signaling molecule, toxin, or carcinogen, substrate. By measuring the optical density of the solution at various time points, the loss of substrate or increase in product can be monitored, which is a direct measure of 46689 activity. The appropriate wavelength used to measure optical density will depend upon the light absorption spectra of the substrate and product molecules. An example of such an assay, used to monitor the activity of a 2-Hydroxymuconic Semialdehyde Dehydrogenase enzyme, is provided in Inoue et al. (1995), J of Bacteriology 177(5):1196-1201, the contents of which are incorporated herein by reference.
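
By way of illustration, optical density readings taken at several time points can be converted into a reaction rate by fitting a straight line to the early, approximately linear part of the time course and applying the Beer-Lambert relation A = εcl. The Python sketch below does this with an ordinary least-squares slope. The function name, the extinction coefficient, the 1 cm path length, and the readings are all hypothetical values chosen for the example and are not taken from the assay described above.

```python
def initial_rate(times_min, od_values, extinction_coeff, path_cm=1.0):
    """Estimate a reaction rate in mol/L per minute from an OD time course.

    times_min        -- time points in minutes
    od_values        -- optical density (absorbance) at each time point
    extinction_coeff -- molar extinction coefficient in L/(mol*cm) (assumed known)
    path_cm          -- optical path length in cm
    """
    n = len(times_min)
    if n < 2 or n != len(od_values):
        raise ValueError("need at least two paired time/OD readings")
    mean_t = sum(times_min) / n
    mean_a = sum(od_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, od_values))
    slope = sxy / sxx  # absorbance units per minute
    # Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    return slope / (extinction_coeff * path_cm)


if __name__ == "__main__":
    # Hypothetical readings for a product that absorbs at the chosen wavelength.
    rate = initial_rate([0, 1, 2, 3], [0.10, 0.20, 0.30, 0.40], 10000.0)
    print(f"{rate:.2e} mol/L per minute")
```

A positive slope corresponds to accumulation of an absorbing product, while a negative slope corresponds to loss of an absorbing substrate; only readings from the linear phase should be included in the fit.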

[4786] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[4787] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[4788] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[4789] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 23479, 48120, or 46689 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 23479, 48120, or 46689 activity is determined. Determining the ability of the test compound to modulate 23479, 48120, or 46689 activity can be accomplished by monitoring, for example, hydrolase activity, e.g., the hydrolysis of a substrate, e.g., a protein (e.g., a ubiquitinated protein or poly-ubiquitin), lipid, or small molecule, e.g., steroid, signaling molecule, toxin, or carcinogen, substrate. The cell, for example, can be of mammalian origin, e.g., human.

[4790] The ability of the test compound to modulate 23479, 48120, or 46689 binding to a compound, e.g., a 23479, 48120, or 46689 substrate, or to bind to 23479, 48120, or 46689 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 23479, 48120, or 46689 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 23479, 48120, or 46689 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 23479, 48120, or 46689 binding to a 23479, 48120, or 46689 substrate in a complex. For example, compounds (e.g., 23479, 48120, or 46689 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.
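
As a simple illustration of how counts from a labeled-substrate binding measurement might be compared, the following Python sketch expresses the signal obtained in the presence of a test compound as a percent change relative to a control without the compound. The function name is hypothetical, and the subtraction of a background count is an assumption added for the example rather than a step recited above.

```python
def percent_change_in_binding(control_cpm, test_cpm, background_cpm=0.0):
    """Percent change in bound label caused by a test compound.

    control_cpm    -- counts per minute for the complex without test compound
    test_cpm       -- counts per minute for the complex with test compound
    background_cpm -- counts measured with no complex present (ASSUMPTION)
    Negative values indicate reduced binding; positive values, increased binding.
    """
    control = control_cpm - background_cpm
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = test_cpm - background_cpm
    return 100.0 * (test - control) / control


if __name__ == "__main__":
    # Hypothetical counts: the test compound reduces bound label.
    print(percent_change_in_binding(1100.0, 600.0, background_cpm=100.0))
```

The same calculation applies whether the label is carried by the substrate or by the 23479, 48120, or 46689 protein, provided the control and test samples are otherwise treated identically.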

[4791] The ability of a compound (e.g., a 23479, 48120, or 46689 substrate) to interact with 23479, 48120, or 46689 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 23479, 48120, or 46689 without the labeling of either the compound or the 23479, 48120, or 46689. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 23479, 48120, or 46689.

[4792] In yet another embodiment, a cell-free assay is provided in which a 23479, 48120, or 46689 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 23479, 48120, or 46689 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 23479, 48120, or 46689 proteins to be used in assays of the present invention include fragments which participate in interactions with non-23479, 48120, or 46689 molecules, e.g., fragments with high surface probability scores.

[4793] Soluble and/or membrane-bound forms of isolated proteins (e.g., 23479, 48120, or 46689 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[4794] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[4795] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
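
The distance dependence described above is commonly modeled with the Förster relation, E = 1 / (1 + (r/R0)^6), where R0 is the donor-acceptor separation at which transfer is 50% efficient. That relation is not stated in this text; it is used here as an assumed model, and the function name and the example distances are hypothetical. A minimal sketch:

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency under the Forster relation (assumed model).

    distance_nm       -- donor-acceptor separation r
    forster_radius_nm -- R0, the separation giving 50% transfer efficiency
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    ratio = distance_nm / forster_radius_nm
    # Efficiency falls off with the sixth power of the separation.
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    # Hypothetical separations around an assumed R0 of 5 nm.
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, 5.0):.3f}")
```

Under this model, acceptor emission is strongest when the labeled molecules are bound and therefore close together, and it drops steeply once the separation exceeds R0.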

[4796] In another embodiment, determining the ability of the 23479, 48120, or 46689 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[4797] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[4798] It may be desirable to immobilize either 23479, 48120, or 46689, an anti-23479, 48120, or 46689 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 23479, 48120, or 46689 protein, or interaction of a 23479, 48120, or 46689 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/23479, 48120, or 46689 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 23479, 48120, or 46689 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and the complex is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 23479, 48120, or 46689 binding or activity determined using standard techniques.

[4799] Other techniques for immobilizing either a 23479, 48120, or 46689 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 23479, 48120, or 46689 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[4800] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[4801] In one embodiment, this assay is performed utilizing antibodies reactive with 23479, 48120, or 46689 protein or target molecules but which do not interfere with binding of the 23479, 48120, or 46689 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 23479, 48120, or 46689 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 23479, 48120, or 46689 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 23479, 48120, or 46689 protein or target molecule.

[4802] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[4803] In a preferred embodiment, the assay includes contacting the 23479, 48120, or 46689 protein or biologically active portion thereof with a known compound which binds 23479, 48120, or 46689 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 23479, 48120, or 46689 protein, wherein determining the ability of the test compound to interact with a 23479, 48120, or 46689 protein includes determining the ability of the test compound to preferentially bind to 23479, 48120, or 46689 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[4804] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 23479, 48120, or 46689 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 23479, 48120, or 46689 protein through modulation of the activity of a downstream effector of a 23479, 48120, or 46689 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[4805] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
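
The comparison of control and test reaction mixtures described above can be expressed numerically. The following is a minimal sketch, assuming that complex formation is read out as a single signal per reaction (for example, counts of bound label) and that an optional background reading is subtracted from both. The function name, the background correction, and the example values are illustrative assumptions and do not come from this text.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal caused by a test compound.

    control_signal -- signal from the reaction without test compound
    test_signal    -- signal from the reaction with test compound
    background     -- assumed blank reading subtracted from both signals
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = test_signal - background
    # 0% means no effect on complex formation; 100% means none was detected.
    return 100.0 * (1.0 - test / control)


if __name__ == "__main__":
    # Hypothetical readings for the same compound against normal and
    # mutant target gene product.
    normal = percent_inhibition(control_signal=1000.0, test_signal=900.0)
    mutant = percent_inhibition(control_signal=1000.0, test_signal=250.0)
    print(f"normal: {normal:.1f}%  mutant: {mutant:.1f}%")
```

Running the same calculation for normal and mutant target gene product, as in the usage lines, gives a direct way to look for compounds that preferentially disrupt the mutant interaction.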

[4806] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[4807] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[4808] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[4809] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[4810] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[4811] In yet another aspect, the 23479, 48120, or 46689 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 23479, 48120, or 46689 (“23479, 48120, or 46689-binding proteins” or “23479, 48120, or 46689-bp”) and are involved in 23479, 48120, or 46689 activity. Such 23479, 48120, or 46689-bps can be activators or inhibitors of signals by the 23479, 48120, or 46689 proteins or 23479, 48120, or 46689 targets as, for example, downstream elements of a 23479, 48120, or 46689-mediated signaling pathway.

[4812] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 23479, 48120, or 46689 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 23479, 48120, or 46689 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 23479, 48120, or 46689-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 23479, 48120, or 46689 protein.

[4813] In another embodiment, modulators of 23479, 48120, or 46689 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 23479, 48120, or 46689 mRNA or protein evaluated relative to the level of expression of 23479, 48120, or 46689 mRNA or protein in the absence of the candidate compound. When expression of 23479, 48120, or 46689 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 23479, 48120, or 46689 mRNA or protein expression. Alternatively, when expression of 23479, 48120, or 46689 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 23479, 48120, or 46689 mRNA or protein expression. The level of 23479, 48120, or 46689 mRNA or protein expression can be determined by methods described herein for detecting 23479, 48120, or 46689 mRNA or protein.
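
The decision rule above can be sketched in code. This is only an illustration under stated assumptions: expression is taken to be measured as replicate numeric values with and without the candidate compound, significance is assessed with an exact two-sided permutation test on the difference of means, and the 0.05 threshold is applied to stimulators as well as inhibitors. None of these choices, nor the function names, is specified in this text.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_other = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling n_treated measurements as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_other)
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            hits += 1
        count += 1
    return hits / count


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate expression measurements."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical relative mRNA levels with and without the compound.
    print(classify_modulator([2.1, 2.3, 2.2, 2.4], [1.0, 1.1, 0.9, 1.2]))
```

The exact enumeration is practical only for small replicate counts; a standard statistical test from an established library would normally replace it.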

[4814] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 23479, 48120, or 46689 protein can be confirmed in vivo, e.g., in an animal such as an animal model for a cellular proliferative or differentiative disorder.

[4815] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 23479, 48120, or 46689 modulating agent, an antisense 23479, 48120, or 46689 nucleic acid molecule, a 23479, 48120, or 46689-specific antibody, or a 23479, 48120, or 46689-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[4816] 23479, 48120, and 46689 Detection Assays

[4817] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome e.g., to locate gene regions associated with genetic disease or to associate 23479, 48120, or 46689 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[4818] 23479, 48120, and 46689 Chromosome Mapping

[4819] The 23479, 48120, or 46689 nucleotide sequences or portions thereof can be used to map the location of the 23479, 48120, or 46689 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 23479, 48120, or 46689 sequences with genes associated with disease.

[4820] Briefly, 23479, 48120, or 46689 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 23479, 48120, or 46689 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 23479, 48120, or 46689 sequences will yield an amplified fragment.

[4821] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
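
The panel-based assignment can be sketched as a concordance check: a gene is assigned to the human chromosome whose presence or absence across the hybrid cell lines matches the PCR result for every line. The data layout, function name, and example panel below are hypothetical and are given only to illustrate the logic.

```python
def map_to_chromosome(panel, pcr_positive):
    """Return the chromosomes fully concordant with the PCR results.

    panel        -- dict: hybrid cell line name -> set of human chromosomes
                    retained by that line
    pcr_positive -- dict: hybrid cell line name -> True if the primers
                    yielded an amplified fragment from that line
    """
    candidates = set().union(*panel.values())
    concordant = []
    for chrom in candidates:
        # The chromosome must be present in every positive line and
        # absent from every negative line.
        if all((chrom in panel[line]) == pcr_positive[line] for line in panel):
            concordant.append(chrom)
    return sorted(concordant)


if __name__ == "__main__":
    # Hypothetical three-line panel.
    panel = {
        "H1": {"1", "7"},
        "H2": {"7", "X"},
        "H3": {"1", "X"},
    }
    results = {"H1": True, "H2": True, "H3": False}
    print(map_to_chromosome(panel, results))
```

If the function returns more than one chromosome, the panel does not resolve the location and further hybrids would be needed.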

[4822] Other mapping strategies e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 23479, 48120, or 46689 to a chromosomal location.

[4823] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[4824] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[4825] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[4826] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 23479, 48120, or 46689 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[4827] 23479, 48120, and 46689 Tissue Typing

[4828] 23479, 48120, or 46689 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
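
The idea behind RFLP can be illustrated with an in-silico digest: two alleles that differ at a restriction site yield different fragment lengths, which would appear as different bands. The sketch below is a simplification that assumes a linear, single-stranded representation and a single enzyme. EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example, and the two allele sequences are invented.

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every site.

    sequence   -- DNA sequence (top strand only)
    site       -- recognition sequence, e.g. "GAATTC"
    cut_offset -- number of bases into the site at which the cut falls
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Adjacent boundaries delimit the fragments; sort to mimic a band pattern.
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


if __name__ == "__main__":
    # Invented alleles: the second carries a point change in one site.
    allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
    allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
    print(digest_fragment_lengths(allele_a, "GAATTC", 1))
    print(digest_fragment_lengths(allele_b, "GAATTC", 1))
```

The loss of one site in the second allele merges two fragments into one longer fragment, which is the kind of length difference a Southern blot would reveal.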

[4829] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 23479, 48120, or 46689 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[4830] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:74 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:76 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[4831] If a panel of reagents from 23479, 48120, or 46689 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[4832] Use of Partial 23479, 48120, or 46689 Sequences in Forensic Biology

[4833] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[4834] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:74 (e.g., fragments derived from the noncoding regions of SEQ ID NO:74 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[4835] The 23479, 48120, or 46689 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 23479, 48120, or 46689 probes can be used to identify tissue by species and/or by organ type.

[4836] In a similar fashion, these reagents, e.g., 23479, 48120, or 46689 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[4837] Predictive Medicine of 23479, 48120, and 46689

[4838] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[4839] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 23479, 48120, or 46689.

[4840] Such disorders can include, e.g., a disorder associated with the misexpression of a 23479 or 48120 gene; a disorder associated with abnormal de-ubiquitination activity; and a disorder associated with abnormal protein degradation activity. Particularly preferred disorders include cellular proliferative and/or differentiative disorders, cardiovascular disorders, and brain disorders. For example, preferred disorders include atherosclerosis, disorders associated with oxidative damage, cellular oxidative stress-related glucocorticoid responsiveness, and disorders characterized by unwanted free radicals, e.g., ischaemia reperfusion injury. Alternatively, the disorders can include, e.g., a disorder associated with the misexpression of a 46689 gene; a disorder associated with abnormal hydrolase activity, e.g., a metabolic disorder; a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast; a neural disorder; a hematopoietic disorder; or a liver disorder.

[4841] The method includes one or more of the following:

[4842] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 23479, 48120, or 46689 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[4843] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 23479, 48120, or 46689 gene;

[4844] detecting, in a tissue of the subject, the misexpression of the 23479, 48120, or 46689 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[4845] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 23479, 48120, or 46689 polypeptide.

[4846] In preferred embodiments the method includes ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 23479, 48120, or 46689 gene; an insertion of one or more nucleotides into the gene; a point mutation, e.g., a substitution of one or more nucleotides of the gene; or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[4847] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:74, or naturally occurring mutants thereof, or 5′ or 3′ flanking sequences naturally associated with the 23479, 48120, or 46689 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[4848] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 23479, 48120, or 46689 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 23479, 48120, or 46689.

[4849] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[4850] In preferred embodiments the method includes determining the structure of a 23479, 48120, or 46689 gene, an abnormal structure being indicative of risk for the disorder.

[4851] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 23479, 48120, or 46689 protein or with a nucleic acid which hybridizes specifically with the gene. These and other embodiments are discussed below.

[4852] Diagnostic and Prognostic Assays of 23479, 48120, and 46689

[4853] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 23479, 48120, or 46689 molecules and for identifying variations and mutations in the sequence of 23479, 48120, or 46689 molecules.

[4854] Expression Monitoring and Profiling:

[4855] The presence, level, or absence of 23479, 48120, or 46689 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 23479, 48120, or 46689 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 23479, 48120, or 46689 protein such that the presence of 23479, 48120, or 46689 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 23479, 48120, or 46689 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 23479, 48120, or 46689 genes; measuring the amount of protein encoded by the 23479, 48120, or 46689 genes; or measuring the activity of the protein encoded by the 23479, 48120, or 46689 genes.

[4856] The level of mRNA corresponding to the 23479, 48120, or 46689 gene in a cell can be determined both by in situ and by in vitro formats.

[4857] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 23479, 48120, or 46689 nucleic acid, such as the nucleic acid of SEQ ID NO:74, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 23479, 48120, or 46689 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[4858] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 23479, 48120, or 46689 genes.

[4859] The level of mRNA in a sample that is encoded by one of 23479, 48120, or 46689 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
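
The primer dimensions given above (primers of about 10 to 30 nucleotides flanking a region of about 50 to 200 nucleotides) can be checked programmatically. The following is a minimal sketch only: the function names, the exact-match model of annealing, and the hard cutoffs standing in for "about" are assumptions made for illustration, and the sequences in the usage lines are synthetic placeholders rather than any 23479, 48120, or 46689 sequence.

```python
# Hypothetical helper names; annealing is modeled as an exact string match.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def flanked_region_length(template, forward, reverse):
    """Length of the plus-strand region lying between the two primer sites.

    The forward primer must match the plus strand directly; the reverse
    primer must match the minus strand, i.e. its reverse complement must
    occur downstream on the plus strand. Returns None if either site is absent.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    inner_start = fwd_start + len(forward)
    rev_start = template.find(reverse_complement(reverse), inner_start)
    if rev_start == -1:
        return None
    return rev_start - inner_start


def primer_pair_ok(template, forward, reverse,
                   primer_len=(10, 30), region_len=(50, 200)):
    """Check primer lengths and flanked-region length against the stated ranges."""
    for primer in (forward, reverse):
        if not primer_len[0] <= len(primer) <= primer_len[1]:
            return False
    flanked = flanked_region_length(template, forward, reverse)
    return flanked is not None and region_len[0] <= flanked <= region_len[1]


# Synthetic placeholder sequences, for illustration only.
forward_primer = "ACGTACGTAC"
reverse_site = "TTGACCATGA"
reverse_primer = reverse_complement(reverse_site)
template = forward_primer + "G" * 60 + reverse_site
print(primer_pair_ok(template, forward_primer, reverse_primer))
```

In practice primer design also weighs melting temperature, specificity, and secondary structure, none of which this sketch attempts to model.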

[4860] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 23479, 48120, or 46689 gene being analyzed.

[4861] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 23479, 48120, or 46689 mRNA, or genomic DNA, and comparing the presence of 23479, 48120, or 46689 mRNA or genomic DNA in the control sample with the presence of 23479, 48120, or 46689 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 23479, 48120, or 46689 transcript levels.

[4862] A variety of methods can be used to determine the level of protein encoded by 23479, 48120, or 46689. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[4863] The detection methods can be used to detect 23479, 48120, or 46689 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 23479, 48120, or 46689 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 23479, 48120, or 46689 protein include introducing into a subject a labeled anti-23479, 48120, or 46689 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-23479, 48120, or 46689 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[4864] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 23479, 48120, or 46689 protein, and comparing the presence of 23479, 48120, or 46689 protein in the control sample with the presence of 23479, 48120, or 46689 protein in the test sample.

[4865] The invention also includes kits for detecting the presence of 23479, 48120, or 46689 in a biological sample. For example, the kit can include a compound or agent capable of detecting 23479, 48120, or 46689 protein or mRNA in a biological sample; and a standard.

[4866] The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 23479, 48120, or 46689 protein or nucleic acid.

[4867] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[4868] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[4869] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, as well as a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder.

[4870] In one embodiment, a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity is identified. A test sample is obtained from a subject and 23479, 48120, or 46689 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 23479, 48120, or 46689 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[4871] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for, e.g., a cellular proliferative or differentiative disorder.

[4872] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 23479, 48120, or 46689 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 23479, 48120, or 46689 (e.g., other genes associated with a 23479, 48120, or 46689-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
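
One way such data records could be laid out as a relational table is sketched below. This is an illustration under stated assumptions, not the schema of the invention: SQLite (from the Python standard library) stands in for the Oracle or Sybase environments named above, and the table name, column names, and helper functions are hypothetical. The values in the usage lines are invented.

```python
import sqlite3


def create_store():
    """Create an in-memory table with one row per (sample, gene) expression value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id  TEXT NOT NULL,   -- identifier of the sample
            subject_id TEXT,            -- subject from which the sample was derived
            diagnosis  TEXT,            -- optional descriptor
            treatment  TEXT,            -- optional descriptor
            gene       TEXT NOT NULL,   -- e.g. '23479', or another gene on the array
            level      REAL NOT NULL,   -- value representing the level of expression
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one expression value together with its sample descriptors."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def profile_for_sample(conn, sample_id):
    """Return the stored expression profile of a sample as {gene: level}."""
    cursor = conn.execute(
        "SELECT gene, level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    )
    return dict(cursor.fetchall())


# Invented values, for illustration only.
store = create_store()
add_record(store, "sample-1", "23479", 2.5, subject_id="subject-A")
add_record(store, "sample-1", "other_gene", 0.7, subject_id="subject-A")
print(profile_for_sample(store, "sample-1"))
```

Storing one row per sample and gene lets the same table hold values for genes other than 23479, 48120, or 46689 without a schema change.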

[4873] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 23479, 48120, or 46689 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose, e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, in a subject wherein an increase in 23479, 48120, or 46689 expression is an indication that the subject has or is disposed to having such a disorder. The method can be used to monitor a treatment for a disorder, e.g., a cellular proliferative or differentiative disorder, in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[4874] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 23479, 48120, or 46689 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[4875] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; and b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 23479, 48120, or 46689 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
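
The distance metric and the selection of a most similar reference profile can be sketched as follows. The sketch assumes that each profile is held as a mapping from gene identifier to expression value and that all profiles cover the same genes; the function names and the numeric values are hypothetical and serve only to illustrate the computation.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two profiles (Euclidean distance)."""
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must cover the same genes")
    return math.sqrt(
        sum((profile_a[gene] - profile_b[gene]) ** 2 for gene in profile_a)
    )


def most_similar_reference(subject, references):
    """Return (name, distance) for the reference profile closest to the subject."""
    name = min(references,
               key=lambda ref: profile_distance(subject, references[ref]))
    return name, profile_distance(subject, references[name])


# Invented values, for illustration only.
subject_profile = {"23479": 3.0, "48120": 4.0, "46689": 1.0}
reference_profiles = {
    "reference_A": {"23479": 0.0, "48120": 0.0, "46689": 1.0},
    "reference_B": {"23479": 3.0, "48120": 3.0, "46689": 1.0},
}
print(most_similar_reference(subject_profile, reference_profiles))
```

Other routine measures, such as a correlation coefficient, could be substituted for the distance function without changing the selection step.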

[4876] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[4877] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 23479, 48120, or 46689 expression.

[4878] 23479, 48120, and 46689 Arrays and Uses Thereof

[4879] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 23479, 48120, or 46689 molecule (e.g., a 23479, 48120, or 46689 nucleic acid or a 23479, 48120, or 46689 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4880] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 23479, 48120, or 46689 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 23479, 48120, or 46689. Each address of the subset can include a capture probe that hybridizes to a different region of a 23479, 48120, or 46689 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 23479, 48120, or 46689 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 23479, 48120, or 46689 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 23479, 48120, or 46689 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[4881] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[4882] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 23479, 48120, or 46689 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 23479, 48120, or 46689 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-23479, 48120, or 46689 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[4883] In another aspect, the invention features a method of analyzing the expression of 23479, 48120, or 46689. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 23479, 48120, or 46689-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[4884] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 23479, 48120, or 46689. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 23479, 48120, or 46689. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
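
As an illustration of how clustering can group genes by their expression across tissues, the following sketch implements a simple single-linkage form of hierarchical clustering. The choice of single linkage, the Euclidean distance, the function names, and the expression values are all assumptions made for the example; "other_gene_A" and the like are placeholders, not genes identified in this disclosure.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    `profiles` maps a gene name to its expression values across samples.
    The distance between two clusters is the smallest distance between
    any pair of their members (single linkage).
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[gene] for gene in sorted(profiles)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


# Invented expression values across four tissues, for illustration only.
tissue_profiles = {
    "23479": [1.0, 5.0, 1.0, 5.0],
    "other_gene_A": [1.1, 4.9, 0.9, 5.2],
    "other_gene_B": [5.0, 1.0, 5.0, 1.0],
    "other_gene_C": [4.8, 1.2, 5.1, 0.9],
}
print(single_linkage_clusters(tissue_profiles, 2))
```

Genes that fall in the same cluster as 23479, 48120, or 46689 under such an analysis would be candidates for co-regulation, subject to confirmation with a sufficient number of diverse samples.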

[4885] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 23479, 48120, or 46689 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[4886] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[4887] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 23479, 48120, or 46689-associated disease or disorder; and processes, such as a cellular transformation associated with a 23479, 48120, or 46689-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 23479, 48120, or 46689-associated disease or disorder.

[4888] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 23479, 48120, or 46689) that could serve as a molecular target for diagnosis or therapeutic intervention.

[4889] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 23479, 48120, or 46689 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 23479, 48120, or 46689 polypeptide or fragment thereof. For example, multiple variants of a 23479, 48120, or 46689 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[4890] The polypeptide array can be used to detect a 23479, 48120, or 46689 binding compound, e.g., an antibody in a sample from a subject with specificity for a 23479, 48120, or 46689 polypeptide or the presence of a 23479, 48120, or 46689-binding protein or ligand.

[4891] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 23479, 48120, or 46689 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[4892] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 23479, 48120, or 46689 or from a cell or subject in which a 23479, 48120, or 46689 mediated response has been elicited, e.g., by contact of the cell with 23479, 48120, or 46689 nucleic acid or protein, or administration to the cell or subject of 23479, 48120, or 46689 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 23479, 48120, or 46689 (or does not express as highly as in the case of the 23479, 48120, or 46689 positive plurality of capture probes) or from a cell or subject in which a 23479, 48120, or 46689 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 23479, 48120, or 46689 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[4893] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 23479, 48120, or 46689 or from a cell or subject in which a 23479, 48120, or 46689-mediated response has been elicited, e.g., by contact of the cell with 23479, 48120, or 46689 nucleic acid or protein, or administration to the cell or subject of 23479, 48120, or 46689 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 23479, 48120, or 46689 (or does not express as highly as in the case of the 23479, 48120, or 46689 positive plurality of capture probes) or from a cell or subject in which a 23479, 48120, or 46689 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[4894] In another aspect, the invention features a method of analyzing 23479, 48120, or 46689, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 23479, 48120, or 46689 nucleic acid or amino acid sequence; and comparing the 23479, 48120, or 46689 sequence with one or more, preferably a plurality, of sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 23479, 48120, or 46689.

[4895] Detection of 23479, 48120, and 46689 Variations or Mutations

[4896] The methods of the invention can also be used to detect genetic alterations in a 23479, 48120, or 46689 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 23479, 48120, or 46689 protein activity or nucleic acid expression, such as a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 23479, 48120, or 46689-protein, or the mis-expression of the 23479, 48120, or 46689 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 23479, 48120, or 46689 gene; 2) an addition of one or more nucleotides to a 23479, 48120, or 46689 gene; 3) a substitution of one or more nucleotides of a 23479, 48120, or 46689 gene, 4) a chromosomal rearrangement of a 23479, 48120, or 46689 gene; 5) an alteration in the level of a messenger RNA transcript of a 23479, 48120, or 46689 gene, 6) aberrant modification of a 23479, 48120, or 46689 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 23479, 48120, or 46689 gene, 8) a non-wild type level of a 23479, 48120, or 46689-protein, 9) allelic loss of a 23479, 48120, or 46689 gene, and 10) inappropriate post-translational modification of a 23479, 48120, or 46689-protein.

[4897] An alteration can be detected with a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 23479, 48120, or 46689 gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 23479, 48120, or 46689 gene under conditions such that hybridization and amplification of the 23479, 48120, or 46689 gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[4898] In another embodiment, mutations in a 23479, 48120, or 46689 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
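The fragment-length comparison can be sketched in silico as follows. This is a simplified illustration under stated assumptions: the cut is placed at the start of each recognition site (real enzymes cut at enzyme-specific offsets), the recognition site GAATTC is used only as a familiar example, and the control and sample sequences are invented.

```python
def digest(seq, site):
    """Cut seq at every occurrence of site and return the sorted fragment lengths.

    Simplification: the cut is placed at the first base of the recognition site.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


SITE = "GAATTC"
# Hypothetical control DNA containing two recognition sites.
control = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
# Hypothetical sample DNA: a single substitution destroys the second site.
sample = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

control_fragments = digest(control, SITE)
sample_fragments = digest(sample, SITE)
print(control_fragments, sample_fragments)
print("pattern differs:", control_fragments != sample_fragments)
```

A difference between the two fragment lists corresponds to the altered cleavage pattern that would be seen on a gel.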

[4899] In other embodiments, genetic mutations in 23479, 48120, or 46689 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 23479, 48120, or 46689 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 23479, 48120, or 46689 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 23479, 48120, or 46689 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
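The idea of a linear array of sequential overlapping probes can be sketched as below. This is an idealized model, not the array chemistry of the cited references: hybridization is treated as an exact, position-aware match, and the probe length, reference sequence, and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tiling_probes(reference, length=8, step=1):
    """Return (address, probe) pairs; each probe is complementary to one window."""
    return [
        (i, reverse_complement(reference[i:i + length]))
        for i in range(0, len(reference) - length + 1, step)
    ]


def failing_addresses(probes, sample, length=8):
    """Addresses whose probe is not perfectly complementary to the sample window."""
    return [
        address
        for address, probe in probes
        if reverse_complement(probe) != sample[address:address + length].upper()
    ]


# Invented reference and a sample carrying one substitution at index 10 (T to C).
reference = "ATGGCCATTGTAATGGGCCGC"
sample = reference[:10] + "C" + reference[11:]

probes = tiling_probes(reference)
failed = failing_addresses(probes, sample)
print(failed)
```

Every probe whose window covers the altered base fails to match, so the run of failing addresses brackets the position of the base change.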

[4900] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 23479, 48120, or 46689 gene and detect mutations by comparing the sequence of the sample 23479, 48120, or 46689 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[4901] Other methods for detecting mutations in the 23479, 48120, or 46689 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[4902] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 23479, 48120, or 46689 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[4903] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 23479, 48120, or 46689 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 23479, 48120, or 46689 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[4904] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[4905] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[4906] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[4907] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 23479, 48120, or 46689 nucleic acid.

[4908] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:74, SEQ ID NO:77, SEQ ID NO:80, or the complement of SEQ ID NO:74, SEQ ID NO:77, or SEQ ID NO:80. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[4909] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 23479, 48120, or 46689. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[4910] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
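A set of four oligonucleotides differing only at the interrogation position can be sketched as follows. The sketch rests on simplifying assumptions: hybridization is modeled as an exact string match on the same strand (no complementing, no melting-temperature effects), and the template sequence and function names are hypothetical.

```python
def interrogation_set(template, position):
    """Return four oligos identical to template except for the base at position."""
    template = template.upper()
    if not 0 <= position < len(template):
        raise ValueError("interrogation position lies outside the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


def call_allele(oligos, target):
    """Return the bases whose oligo matches target exactly (idealized hybridization)."""
    target = target.upper()
    return [base for base, oligo in oligos.items() if oligo == target]


# Invented 9-mer template with the interrogation position at index 4.
oligos = interrogation_set("ACGTACGTA", 4)
# A hypothetical sample allele carrying G at the interrogation position.
called = call_allele(oligos, "ACGTGCGTA")
print(oligos)
print(called)
```

With differential labels, the one oligonucleotide that matches would give the signal that identifies the allele; here the returned base plays that role.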

[4911] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 23479, 48120, or 46689 nucleic acid.

[4912] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 23479, 48120, or 46689 gene.

[4913] Use of 23479, 48120, or 46689 Molecules as Surrogate Markers

[4914] The 23479, 48120, or 46689 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 23479, 48120, or 46689 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 23479, 48120, or 46689 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[4915] The 23479, 48120, or 46689 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 23479, 48120, or 46689 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-23479, 48120, or 46689 antibodies may be employed in an immune-based detection system for a 23479, 48120, or 46689 protein marker, or 23479, 48120, or 46689-specific radiolabeled probes may be used to detect a 23479, 48120, or 46689 mRNA marker. 
Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[4916] The 23479, 48120, or 46689 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker that correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 23479, 48120, or 46689 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 23479, 48120, or 46689 DNA may correlate with 23479, 48120, or 46689 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[4917] Pharmaceutical Compositions of 23479, 48120, and 46689

[4918] The nucleic acid and polypeptides, fragments thereof, as well as anti-23479, 48120, or 46689 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[4919] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[4920] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[4921] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[4922] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[4923] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[4924] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[4925] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[4926] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[4927] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[4928] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
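The therapeutic index described above is a simple ratio, and a short worked example may make the preference for high indices concrete. The compound names and dose values below are hypothetical, chosen only to illustrate the arithmetic; they are not data from this application.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50 / ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_x": (200.0, 10.0),
    "compound_y": (50.0, 25.0),
}
indices = {
    name: therapeutic_index(ld50, ed50)
    for name, (ld50, ed50) in candidates.items()
}
# The compound with the highest index has the widest margin between
# the therapeutically effective dose and the lethal dose.
preferred = max(indices, key=indices.get)
print(indices, preferred)
```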

[4929] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.

[4930] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
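The per-kilogram ranges above translate into absolute amounts once a body weight is fixed. The following sketch shows only that unit arithmetic; the 70 kg body weight and the function name are assumptions made for illustration, and nothing here is dosing guidance.

```python
# Per-kilogram ranges (mg/kg) quoted in the paragraph above, broadest first.
DOSE_RANGES_MG_PER_KG = [(0.001, 30.0), (0.01, 25.0), (0.1, 20.0), (1.0, 10.0)]


def absolute_dose_range(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Convert a mg/kg range into an absolute range in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("range bounds are out of order")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


weight_kg = 70.0  # hypothetical subject weight
for low, high in DOSE_RANGES_MG_PER_KG:
    low_mg, high_mg = absolute_dose_range(weight_kg, low, high)
    print(f"{low} to {high} mg/kg -> {low_mg:g} to {high_mg:g} mg")
```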

[4931] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[4932] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[4933] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[4934] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[4935] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[4936] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[4937] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[4938] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[4939] Methods of Treatment for 23479, 48120, and 46689

[4940] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[4941] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype” or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 23479, 48120, or 46689 molecules of the present invention or 23479, 48120, or 46689 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[4942] In one aspect, the invention provides a method for preventing in a subject a disease or condition associated with aberrant or unwanted 23479, 48120, or 46689 expression or activity, by administering to the subject a 23479, 48120, or 46689 molecule or an agent which modulates 23479, 48120, or 46689 expression or at least one 23479, 48120, or 46689 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 23479, 48120, or 46689 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 23479, 48120, or 46689 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 23479, 48120, or 46689 aberrance, for example, a 23479, 48120, or 46689 molecule, a 23479, 48120, or 46689 agonist, or a 23479, 48120, or 46689 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[4943] It is possible that some 23479, 48120, or 46689 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[4944] The 23479, 48120, and 46689 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of cellular proliferative or differentiative disorders, e.g., in the lung, brain, ovary, or breast, neural disorders, hematopoietic disorders, cardiovascular disorders, or liver disorders, as discussed above. The 23479, 48120, and 46689 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more disorders associated with bone metabolism, viral diseases, and pain or metabolic disorders.

[4945] Aberrant expression and/or activity of 23479, 48120, or 46689 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes effects mediated by 23479, 48120, or 46689 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 23479, 48120, or 46689 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 23479, 48120, or 46689 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[4946] The 23479, 48120, or 46689 molecules of the invention may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 23479, 48120, or 46689 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 23479, 48120, or 46689 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[4947] Additionally, 23479, 48120, or 46689 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[4948] As discussed, successful treatment of 23479, 48120, or 46689 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 23479, 48120, or 46689 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFv molecules, and epitope-binding fragments thereof).

[4949] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[4950] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[4951] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 23479, 48120, or 46689 expression is through the use of aptamer molecules specific for 23479, 48120, or 46689 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 23479, 48120, or 46689 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[4952] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 23479, 48120, or 46689 disorders. For a description of antibodies, see the Antibody section above.

[4953] In circumstances wherein injection of an animal or a human subject with a 23479, 48120, or 46689 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 23479, 48120, or 46689 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 23479, 48120, or 46689 protein. Vaccines directed to a disease characterized by 23479, 48120, or 46689 expression may also be generated in this fashion.

[4954] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[4955] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 23479, 48120, or 46689 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[4956] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 23479, 48120, or 46689 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 23479, 48120, or 46689 can be readily monitored and used in calculations of IC50.
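The IC50 calculation mentioned above can be illustrated with a minimal Python sketch. It is not the method of any cited reference; it assumes a hypothetical dose-response series in which each response is the fraction of the uninhibited signal that remains, and it estimates the IC50 by linear interpolation on the logarithm of concentration. The function name and the data layout are assumptions made for illustration only.

```python
import math


def estimate_ic50(concentrations, responses):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concentrations: ascending, positive concentrations (any consistent unit).
    responses: fraction of the uninhibited response remaining at each
        concentration (1.0 = no inhibition, 0.0 = full inhibition).
    Returns the concentration at which the response crosses 0.5, or None
    if the measured series never crosses half-maximal inhibition.
    """
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        # Look for the first pair of neighbouring points that brackets 0.5.
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None
```

In practice a fitted sigmoidal model is often preferred over interpolation; the sketch only shows how measured "free" concentrations and responses combine into a single IC50 figure.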

[4957] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[4958] Another aspect of the invention pertains to methods of modulating 23479, 48120, or 46689 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 23479, 48120, or 46689 molecule or an agent that modulates one or more of the activities of 23479, 48120, or 46689 protein associated with the cell. An agent that modulates 23479, 48120, or 46689 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 23479, 48120, or 46689 protein (e.g., a 23479, 48120, or 46689 substrate or receptor), a 23479, 48120, or 46689 antibody, a 23479, 48120, or 46689 agonist or antagonist, a peptidomimetic of a 23479, 48120, or 46689 agonist or antagonist, or other small molecule.

[4959] In one embodiment, the agent stimulates one or more 23479, 48120, or 46689 activities. Examples of such stimulatory agents include active 23479, 48120, or 46689 protein and a nucleic acid molecule encoding 23479, 48120, or 46689. In another embodiment, the agent inhibits one or more 23479, 48120, or 46689 activities. Examples of such inhibitory agents include antisense 23479, 48120, or 46689 nucleic acid molecules, anti-23479, 48120, or 46689 antibodies, and 23479, 48120, or 46689 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 23479, 48120, or 46689 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 23479, 48120, or 46689 expression or activity. In another embodiment, the method involves administering a 23479, 48120, or 46689 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 23479, 48120, or 46689 expression or activity.

[4960] Stimulation of 23479, 48120, or 46689 activity is desirable in situations in which 23479, 48120, or 46689 is abnormally downregulated and/or in which increased 23479, 48120, or 46689 activity is likely to have a beneficial effect. Likewise, inhibition of 23479, 48120, or 46689 activity is desirable in situations in which 23479, 48120, or 46689 is abnormally upregulated and/or in which decreased 23479, 48120, or 46689 activity is likely to have a beneficial effect.

[4961] 23479, 48120, and 46689 Pharmacogenomics

[4962] The 23479, 48120, or 46689 molecules of the present invention, as well as agents or modulators which have a stimulatory or inhibitory effect on 23479, 48120, or 46689 activity (e.g., 23479, 48120, or 46689 gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) 23479, 48120, or 46689 associated disorders (e.g., cellular proliferative or differentiative disorders, e.g., in the lung, brain, ovary, or breast, neural disorders, hematopoietic disorders, cardiovascular disorders, or liver disorders) associated with aberrant or unwanted 23479, 48120, or 46689 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator.

[4963] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[4964] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association”, relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten-million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
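The grouping of individuals into genetic categories by SNP pattern can be sketched in Python as follows. This is a minimal illustration only: the marker identifiers, the genotype encoding (one string per marker), and the function name are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals that share the same calls at the chosen markers.

    genotypes: dict mapping individual ID -> dict of marker ID -> call,
        e.g. {"subject_1": {"snp_a": "A/G", "snp_b": "C/C"}}.
    marker_ids: ordered list of marker IDs that define the pattern.
    Returns a dict mapping each observed pattern (a tuple of calls)
    to the list of individuals carrying it.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[m] for m in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)
```

Exact pattern matching is the simplest possible grouping rule; with tens of thousands of markers a similarity-based clustering would likely be needed instead.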

[4965] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 23479, 48120, or 46689 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
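As a minimal Python sketch of the comparison described in the candidate gene approach, the function below tabulates how often a drug response is observed for each version of a gene. The record layout and names are hypothetical, and the sketch stops at descriptive response rates; a real study would add an appropriate statistical test before concluding that a variant is associated with the response.

```python
from collections import Counter


def response_rate_by_variant(records):
    """Compute the fraction of responders for each gene variant.

    records: iterable of (variant, responded) pairs, where `variant` is
        any hashable label for a version of the gene and `responded` is
        True if the drug response of interest was observed.
    Returns a dict mapping variant -> responder fraction in [0, 1].
    """
    totals, responders = Counter(), Counter()
    for variant, responded in records:
        totals[variant] += 1
        if responded:
            responders[variant] += 1
    return {v: responders[v] / totals[v] for v in totals}
```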

[4966] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[4967] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 23479, 48120, or 46689 molecule or 23479, 48120, or 46689 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[4968] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 23479, 48120, or 46689 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 23479, 48120, or 46689 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[4969] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 23479, 48120, or 46689 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 23479, 48120, or 46689 gene expression, protein levels, or upregulate 23479, 48120, or 46689 activity, can be monitored in clinical trials of subjects exhibiting decreased 23479, 48120, or 46689 gene expression, protein levels, or downregulated 23479, 48120, or 46689 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 23479, 48120, or 46689 gene expression, protein levels, or downregulate 23479, 48120, or 46689 activity, can be monitored in clinical trials of subjects exhibiting increased 23479, 48120, or 46689 gene expression, protein levels, or upregulated 23479, 48120, or 46689 activity. In such clinical trials, the expression or activity of a 23479, 48120, or 46689 gene, and preferably, other genes that have been implicated in, for example, a 23479, 48120, or 46689-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[4970] 23479, 48120, or 46689 Informatics

[4971] The sequence of a 23479, 48120, or 46689 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 23479, 48120, or 46689 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 23479, 48120, or 46689 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[4972] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[4973] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[4974] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
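The two-table layout described above can be sketched in Python as follows. The sketch uses the standard-library sqlite3 module as a stand-in for the named database products, and the table names, column names, and helper functions are hypothetical choices made for illustration; positions are assumed to be 1-based and inclusive.

```python
import sqlite3


def build_sequence_db():
    """Create an in-memory database with a sequence table and an annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sequences (
            sequence TEXT NOT NULL,      -- first field: the sequence itself
            seq_id   TEXT PRIMARY KEY    -- second field: the sequence identifier
        );
        CREATE TABLE annotations (
            seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
            descriptor TEXT NOT NULL,    -- annotation text
            start_pos  INTEGER NOT NULL, -- initial position (1-based)
            end_pos    INTEGER NOT NULL  -- ultimate position (inclusive)
        );
        """
    )
    return conn


def annotated_regions(conn, seq_id):
    """Return (descriptor, subsequence) pairs for one sequence, ordered by start."""
    rows = conn.execute(
        "SELECT a.descriptor, "
        "       substr(s.sequence, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos",
        (seq_id,),
    )
    return rows.fetchall()
```

Keeping annotations in their own table lets one sequence carry any number of annotated regions without changing the sequence row.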

[4975] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
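A search of stored sequences for a target sequence can be sketched in Python as follows. This is a naive exact-match scan, not a BLAST search, and is offered only to illustrate the idea of locating matching regions; the function name and the dictionary layout of the stored sequences are assumptions.

```python
def find_target(sequences, target):
    """Locate every exact occurrence of `target` in the stored sequences.

    sequences: dict mapping sequence ID -> sequence string.
    target: the target sequence or motif to look for (non-empty).
    Returns a list of (sequence ID, 1-based start position) tuples,
    including overlapping occurrences.
    """
    if not target:
        raise ValueError("target must be a non-empty sequence")
    hits = []
    for seq_id, seq in sequences.items():
        start = seq.find(target)
        while start != -1:
            hits.append((seq_id, start + 1))
            # Resume one residue later so overlapping matches are reported.
            start = seq.find(target, start + 1)
    return hits
```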

[4976] Thus, in one aspect, the invention features a method of analyzing 23479, 48120, or 46689, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 23479, 48120, or 46689 nucleic acid or amino acid sequence; and comparing the 23479, 48120, or 46689 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 23479, 48120, or 46689. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[4977] The method can include evaluating the sequence identity between a 23479, 48120, or 46689 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.

[4978] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
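The relationship between target length and chance occurrence can be illustrated with a rough estimate. The sketch below assumes, purely for illustration, that residues are independent and uniformly distributed, which real sequence databases are not; the function name and parameters are hypothetical.

```python
def expected_chance_hits(target_length: int, alphabet_size: int,
                         database_positions: int) -> float:
    """Expected number of exact chance matches of one target sequence.

    Assumes independent, uniformly distributed residues (a simplification):
    each database position matches with probability alphabet_size ** -target_length.
    """
    return database_positions * alphabet_size ** (-target_length)


# A 6-nucleotide target in a database of 4**6 = 4096 positions is expected
# to occur about once by chance; a 30-nucleotide target is far rarer.
print(expected_chance_hits(6, 4, 4 ** 6))
print(expected_chance_hits(30, 4, 4 ** 6))
```

Under these assumptions every added residue divides the expected number of chance hits by the alphabet size (4 for nucleotides, 20 for amino acids).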

[4979] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[4980] Thus, the invention features a method of making a computer readable record of a 23479, 48120, or 46689 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[4981] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 23479, 48120, or 46689 sequence, or record, in machine-readable form; comparing a second sequence to the 23479, 48120, or 46689 sequence; thereby analyzing a sequence. Comparison can include comparing two sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 23479, 48120, or 46689 sequence includes a sequence being compared. In a preferred embodiment the 23479, 48120, or 46689 or second sequence is stored on a first computer, e.g., at a first site and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 23479, 48120, or 46689 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
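The two kinds of comparison named above, sequence identity and inclusion of one sequence within another, can be sketched minimally as follows. The sketch assumes the two sequences are already aligned and of equal length, so it is not a substitute for BLAST or another alignment method; the function names and the second sequence are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions with identical residues in two pre-aligned,
    equal-length sequences (no gaps handled; a simplifying assumption)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def includes(longer: str, shorter: str) -> bool:
    """True if `shorter` occurs as an exact contiguous run within `longer`."""
    return shorter in longer


reference = "LSNLGNTCFMNSSIQ"   # fragment given as SEQ ID NO:99
candidate = "LSNLGNTCYMNSSLQ"   # hypothetical second sequence for illustration

print(round(percent_identity(reference, candidate), 1))
print(includes(reference, "GNTC"))
```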

[4982] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, wherein the method comprises the steps of determining 23479, 48120, or 46689 sequence information associated with the subject and based on the 23479, 48120, or 46689 sequence information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[4983] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a disease associated with a 23479, 48120, or 46689 wherein the method comprises the steps of determining 23479, 48120, or 46689 sequence information associated with the subject, and based on the 23479, 48120, or 46689 sequence information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 23479, 48120, or 46689 sequence of the subject to the 23479, 48120, or 46689 sequences in the database to thereby determine whether the subject has a 23479, 48120, or 46689-associated disease or disorder, or a pre-disposition for such.

[4984] The present invention also provides in a network, a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, said method comprising the steps of receiving 23479, 48120, or 46689 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 23479, 48120, or 46689 and/or corresponding to a 23479, 48120, or 46689-associated disease or disorder (e.g., a cellular proliferative or differentiative disorder, e.g., in the lung, brain, ovary, or breast, a neural disorder, a hematopoietic disorder, a cardiovascular disorder, or a liver disorder), and based on one or more of the phenotypic information, the 23479, 48120, or 46689 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4985] The present invention also provides a method for determining whether a subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder, said method comprising the steps of receiving information related to 23479, 48120, or 46689 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 23479, 48120, or 46689 and/or related to a 23479, 48120, or 46689-associated disease or disorder, and based on one or more of the phenotypic information, the 23479, 48120, or 46689 information, and the acquired information, determining whether the subject has a 23479, 48120, or 46689-associated disease or disorder or a pre-disposition to a 23479, 48120, or 46689-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[4986] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 80091 Invention

[4987] Living cells are capable of modulating the levels of proteins that they express. A variety of different mechanisms exist through which protein levels can be modulated. The ubiquitin pathway is one example of a post-translational mechanism used to regulate protein levels. Ubiquitin is a highly conserved polypeptide (8.6 kDa) expressed in all eukaryotic cells that marks proteins for degradation. Ubiquitin is attached as a single molecule or as a conjugated form to lysine residue(s) of proteins via formation of an isopeptide bond at the C-terminal glycine residue. Most ubiquitinated proteins are subsequently targeted to the 26S proteasome, a multicatalytic protease, which cleaves the marked protein into peptide fragments.

[4988] Only the protein conjugated to ubiquitin is degraded via the proteasome; ubiquitin itself is recycled by ubiquitin carboxy-terminal hydrolases (UCH; sometimes abbreviated UCTH), which cleave the bond between ubiquitin and the protein targeted for degradation. These enzymes constitute a family of thiol proteases, and homologues have been found in, for example, yeast (Miller et al., (1989) Bio/Technology 7: 698-704; Tobias and Varshavsky (1991) J. Biol. Chem. 266: 12021-12028; Baker et al., (1992) J. Biol. Chem. 267: 23364-23375), bovine (Papa and Hochstrasser (1993) Nature 366: 313-319), avian (Woo et al., (1995) J. Biol. Chem. 270: 18766-18773), Drosophila (Zhang et al., (1993) Dev. Biol. 17: 214) and human (Wilkinson et al., (1989) Science 246:670) cells.

[4989] Ubiquitination has been implicated in regulating numerous cellular processes including, for example, proliferation, differentiation, apoptosis (programmed cell death), transcription, signal-transduction, cell-cycle progression, receptor-mediated endocytosis, organelle biogenesis and others. The presence of abnormal amounts of ubiquitinated proteins in neuropathological conditions such as Alzheimer's and Pick's disease indicates that ubiquitination plays a role in various physiological disorders. Oncogenes (e.g., v-jun and v-fos) are often found to be resistant to ubiquitination in comparison to their normal cell counterparts, suggesting that a failure to degrade oncogene protein products accounts for some of their cell transformation capability. Combined with the observation that not all ubiquitinated proteins are degraded by the proteasome, these findings indicate that the processes of ubiquitination and de-ubiquitination of particular substrates have important functional roles apart from recycling ubiquitin.

[4990] There are two distinct families of UCH. The second class consists of large proteins (800 to 2000 residues) and these proteins only share two domains of similarity (UCH-1 and UCH-2). The UCH-1 domain contains a conserved cysteine that is probably implicated in the catalytic mechanism. The UCH-2 domain contains two conserved histidine residues, one of which is also probably implicated in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are respectively as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q, wherein C is the putative active site residue; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:98), wherein Hs are two putative active site residues.

[4991] Summary of the 80091 Invention

[4992] The present invention is based, in part, on the discovery of a novel ubiquitin carboxy-terminal hydrolase family member, referred to herein as “80091”. The nucleotide sequence of a cDNA encoding 80091 is shown in SEQ ID NO:94, and the amino acid sequence of an 80091 polypeptide is shown in SEQ ID NO:95. In addition, the nucleotide sequences of the coding region are depicted in SEQ ID NO:94.

[4993] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes an 80091 protein or polypeptide, e.g., a biologically active portion of the 80091 protein. In a preferred embodiment the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:95. In other embodiments, the invention provides isolated 80091 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:94, a full complement of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 80091 protein or an active fragment thereof.

[4994] In a related aspect, the invention further provides nucleic acid constructs that include an 80091 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included, are vectors and host cells containing the 80091 nucleic acid molecules of the invention e.g., vectors and host cells suitable for producing 80091 nucleic acid molecules and polypeptides.

[4995] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 80091-encoding nucleic acids.

[4996] In still another related aspect, isolated nucleic acid molecules that are antisense to an 80091 encoding nucleic acid molecule are provided.

[4997] In another aspect, the invention features 80091 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 80091-mediated or -related disorders. In another embodiment, the invention provides 80091 polypeptides having an 80091 activity. Preferred polypeptides are 80091 proteins including at least one ubiquitin carboxyl-terminal hydrolase-1 domain and one ubiquitin carboxyl-terminal hydrolase-2 domain, and, preferably, having an 80091 activity, e.g., an 80091 activity as described herein.

[4998] In other embodiments, the invention provides 80091 polypeptides, e.g., an 80091 polypeptide having the amino acid sequence shown in SEQ ID NO:95 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:95 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited with ATCC Accession Number ______; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:94, or the sequence of the DNA insert of the plasmid deposited with ATCC Accession Number ______, wherein the nucleic acid encodes a full length 80091 protein or an active fragment thereof.

[4999] In a related aspect, the invention further provides nucleic acid constructs which include an 80091 nucleic acid molecule described herein.

[5000] In a related aspect, the invention provides 80091 polypeptides or fragments operatively linked to non-80091 polypeptides to form fusion proteins.

[5001] In another aspect, the invention features antibodies, and antigen-binding fragments thereof, that react with, or more preferably specifically bind, 80091 polypeptides or fragments thereof, e.g., a ubiquitin carboxyl-terminal hydrolase-1 domain or a ubiquitin carboxyl-terminal hydrolase-2 domain.

[5002] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 80091 polypeptides or nucleic acids.

[5003] In still another aspect, the invention provides a process for modulating 80091 polypeptide or nucleic acid expression or activity, e.g., using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 80091 polypeptides or nucleic acids, such as conditions involving aberrant or deficient cellular proliferation or differentiation of an 80091-expressing cell, e.g., a hematopoietic cell (e.g., an erythroid cell (e.g., an erythrocyte or an erythroblast)).

[5004] The invention also provides assays for determining the activity of or the presence or absence of 80091 polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

[5005] In yet another aspect, the invention provides methods for inhibiting the proliferation, or inducing the killing, of an 80091-expressing cell, e.g., a hyper-proliferative 80091-expressing cell. The method includes contacting the cell with a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 80091 polypeptide or nucleic acid. In a preferred embodiment, the contacting step is effected in vitro or ex vivo.

[5006] In a preferred embodiment, the contacting step is effected in vivo in a subject, e.g., as part of a therapeutic or prophylactic protocol. Preferably, the subject is a human, e.g., a patient with a hematopoietic disorder such as an erythroid-associated disorder. For example, the subject can be a patient with an anemia, e.g., a drug-induced anemia (e.g., a chemotherapy-induced anemia), hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemia of chronic disease such as chronic renal failure; endocrine deficiency disease; and/or erythrocytosis (e.g., polycythemia). Preferably, the erythroid-associated disorder is a drug-induced anemia (e.g., a chemotherapy-induced anemia). Alternatively, the subject can be a cancer patient, e.g., a patient with leukemic cancer, e.g., an erythroid leukemia. In other embodiments, the subject is a non-human animal, e.g., an experimental animal.

[5007] In a preferred embodiment, the method further includes contacting the erythroid cell with a protein, e.g., a hormone. The protein can be a member of the following non-limiting group: G-CSF, GM-CSF, stem cell factor, interleukin-3 (IL-3), IL-4, Flt-3 ligand, thrombopoietin, and erythropoietin. More preferably, the protein is erythropoietin. The protein contacting step can occur before, at the same time, or after the agent is contacted. The protein contacting step can be effected in vitro or ex vivo. For example, the cell, e.g., the erythroid cell can be obtained from a subject, e.g., a patient, and contacted with the protein ex vivo. The treated cell can be re-introduced into the subject. Alternatively, the protein contacting step can occur in vivo. The contacting step(s) can be repeated.

[5008] In a preferred embodiment, the agent increases the number of hematopoietic cells, e.g., erythroid cells, by e.g., increasing the proliferation, survival, and/or stimulating the differentiation, of hematopoietic (e.g., erythroid) progenitor cells, in the subject. Such agents can be used to treat an anemia, e.g., a drug- (e.g., chemotherapy-) induced anemia, hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemia of chronic diseases such as chronic renal failure; endocrine deficiency disease; and/or erythrocytosis (e.g., polycythemias).

[5009] In a preferred embodiment, the cell, e.g., the 80091-expressing cell, is an erythroid cell, a human umbilical vein endothelial cell (HUVEC), or a brain cell.

[5010] In a preferred embodiment, the agent increases the number of erythroid cells, by e.g., increasing the proliferation, survival, and/or stimulating the differentiation, of granulocytic and monocytic progenitor cells, e.g., CFU-GM, CFU-G (colony forming unit—granulocyte), myeloblast, promyelocyte, myelocyte, a metamyelocyte, or a band cell. Such compounds can be used to treat or prevent neutropenia and granulocytopenia, e.g., conditions caused by cytotoxic chemotherapy, AIDS, congenital and cyclic neutropenia, myelodysplastic syndromes, or aplastic anemia.

[5011] In a preferred embodiment, the compound is an inhibitor of an 80091 polypeptide. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the compound is an inhibitor of an 80091 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[5012] In a preferred embodiment, the compound is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[5013] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant cellular proliferation or differentiation of an 80091-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 80091 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[5014] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of an 80091 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of an 80091 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 80091 nucleic acid or polypeptide expression can be detected by any method described herein.

[5015] In a preferred embodiment, the disorder is a hematopoietic disorder, e.g., an erythroid-associated disorder. Examples of erythroid-associated disorders include an anemia, e.g., a drug- (e.g., chemotherapy-) induced anemia, a hemolytic anemia, aberrant erythropoiesis, secondary anemia in non-hematologic disorders, anemias of chronic disease such as chronic renal failure; endocrine deficiency diseases; and/or erythrocytosis (e.g., polycythemia).

[5016] In a preferred embodiment, the disorder is a cancer, e.g., leukemic cancer, e.g., an erythroid leukemia, or a carcinoma, e.g., a renal carcinoma.

[5017] In a preferred embodiment, the subject is a human.

[5018] In a preferred embodiment, the subject is an experimental animal, e.g., an animal model for a hematopoietic- (e.g., an erythroid-) associated disorder.

[5019] In a preferred embodiment, the method can further include treating the subject with a protein, e.g., a cytokine or a hormone. Exemplary proteins include, but are not limited to, G-CSF, GM-CSF, stem cell factor, interleukin-3 (IL-3), IL-4, Flt-3 ligand, thrombopoietin, and erythropoietin. Preferably, the protein is erythropoietin.

[5020] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject, before and after treatment, and comparing the level of expression of an 80091 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.

[5021] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 80091 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 80091 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 80091 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue or an erythroid cell tissue.

[5022] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in an 80091 polypeptide or nucleic acid molecule, including for disease diagnosis.

[5023] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes an 80091 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to an 80091 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 80091 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.

[5024] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 80091

[5025] The human 80091 sequence (see SEQ ID NO:94, as recited in Example 53), which is approximately 3954 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 3954 nucleotides, including the termination codon. The coding sequence encodes a 1317 amino acid protein (see SEQ ID NO:95, as recited in Example 53).
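The stated coding-sequence length can be checked against the stated protein length with simple codon arithmetic. The helper below is a hypothetical illustration of that check and is not part of the specification.

```python
def coding_length_nt(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein: three per amino acid,
    plus three for the termination codon when it is counted."""
    codons = protein_length_aa + (1 if include_stop else 0)
    return 3 * codons


# 1317 amino acids plus one termination codon: (1317 + 1) * 3 nucleotides.
print(coding_length_nt(1317))  # 3954
```

The result agrees with the coding sequence of about 3954 nucleotides, including the termination codon, recited above.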

[5026] Human 80091 contains the following regions or other structural features:

[5027] one predicted ubiquitin carboxyl-terminal hydrolase-1 (UCH-1) domain (PFAM Accession PF00442) located at about amino acid 447 to about 478 of SEQ ID NO:95, which includes one predicted ubiquitin carboxyl-terminal hydrolases family 2 signature 1 from about amino acids 449 to 463 of SEQ ID NO:95;

[5028] one predicted ubiquitin carboxyl-terminal hydrolase-2 (UCH-2) domain (PFAM Accession PF00443) located at about amino acid 1219 to about 1279 of SEQ ID NO:95, which includes one predicted ubiquitin carboxyl-terminal hydrolases family 2 signature 2 from about amino acids 1223 to 1240 of SEQ ID NO:95;

[5029] ten predicted N-glycosylation sites (PS00001) located at about amino acids 168 to 171, 177 to 180, 209 to 212, 431 to 434, 459 to 462, 485 to 488, 588 to 591, 767 to 770, 1203 to 1206, and 1255 to 1258 of SEQ ID NO:95;

[5030] twenty-two predicted Protein Kinase C phosphorylation sites (PS00005) located at about amino acids 23 to 25, 56 to 58, 360 to 362, 369 to 371, 432 to 434, 442 to 444, 477 to 479, 511 to 513, 613 to 615, 764 to 766, 806 to 808, 826 to 828, 869 to 871, 938 to 940, 979 to 981, 1008 to 1010, 1085 to 1087, 1092 to 1094, 1102 to 1104, 1111 to 1113, 1135 to 1137, and 1258 to 1260 of SEQ ID NO:95;

[5031] twenty-four predicted Casein Kinase II phosphorylation sites (PS00006) located at about amino acids 10 to 13, 48 to 51, 83 to 86, 169 to 172, 180 to 183, 301 to 304, 360 to 363, 391 to 394, 421 to 424, 433 to 436, 723 to 726, 732 to 735, 741 to 744, 779 to 782, 923 to 926, 956 to 959, 1063 to 1066, 1135 to 1138, 1167 to 1170, 1173 to 1176, 1205 to 1208, 1209 to 1212, 1258 to 1261, and 1300 to 1303 of SEQ ID NO:95; two predicted Tyrosine kinase phosphorylation sites (PS00007) located at about amino acids 556 to 562, and 652 to 659 of SEQ ID NO:95;

[5032] twenty-one predicted N-myristylation sites (PS00008) located at about amino acids 41 to 46, 78 to 83, 102 to 107, 141 to 146, 161 to 166, 199 to 204, 220 to 225, 235 to 240, 288 to 293, 510 to 515, 598 to 603, 650 to 655, 754 to 759, 763 to 768, 771 to 776, 877 to 882, 1088 to 1093, 1193 to 1198, 1199 to 1204, 1233 to 1238, and 1281 to 1286 of SEQ ID NO:95;

[5033] one predicted amidation site (PS00009) located at about amino acids 1292 to 1295 of SEQ ID NO:95; and

[5034] one predicted Prenyl group binding site (PS00266) located at about amino acids 1314 to 1317 of SEQ ID NO:95.
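Site predictions of the kind listed above are typically produced by scanning a sequence for short motifs. The sketch below scans for the N-glycosylation motif, assuming the PROSITE PS00001 definition N-{P}-[ST]-{P}, which is not recited in this description. Because the full SEQ ID NO:95 is not reproduced here, the scan runs only on the 15-residue fragment LSNLGNTCFMNSSIQ (SEQ ID NO:99), which the description places at amino acids 449 to 463; the function name and coordinate handling are hypothetical.

```python
import re

# Assumed PROSITE PS00001 motif N-{P}-[ST]-{P}; the lookahead lets
# overlapping sites be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(fragment: str, first_residue: int) -> list[tuple[int, int]]:
    """Return (start, end) positions of motif hits, 1-based and inclusive,
    numbered relative to the full-length sequence."""
    return [
        (first_residue + m.start(), first_residue + m.start() + 3)
        for m in N_GLYCOSYLATION.finditer(fragment)
    ]


# Fragment of SEQ ID NO:95 given as SEQ ID NO:99, starting at residue 449.
print(find_sites("LSNLGNTCFMNSSIQ", 449))  # [(459, 462)]
```

The single hit on this fragment coincides with the predicted site at about amino acids 459 to 462 in the list above.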

[5035] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[5036] A plasmid containing the nucleotide sequence encoding human 80091 (clone “Fbh80091 FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[5037] The 80091 protein contains a significant number of structural characteristics in common with members of the ubiquitin carboxy-terminal hydrolase (“UCH”) family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[5038] The UCH family consists of large proteins (about 800 to 2000 residues) that share two conserved regions, a UCH-1 domain and a UCH-2 domain, each of which is thought to participate in the catalytic mechanism. The conserved signature patterns of UCH-1 and UCH-2 are respectively as follows: (1) G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMFC]-[NST]-[SACV]-x-[LIVMS]-Q; and (2) Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y (SEQ ID NO:98). An 80091 protein typically contains one or more sequences that conform to each of the signature patterns. For example, an 80091 protein can contain the sequence LSNLGNTCFMNSSIQ (SEQ ID NO:99), e.g., located at amino acids 449 to 463 of SEQ ID NO:95, which corresponds to the UCH-1 signature pattern. An 80091 protein can also include the sequence YNLYAISCHSGILGGGHY (SEQ ID NO:100), e.g., located at amino acids 1223 to 1240 of SEQ ID NO:95, which corresponds to the UCH-2 signature pattern.
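Signature patterns written in this PROSITE-style notation can be translated mechanically into regular expressions. The sketch below is one minimal way to do so under stated assumptions: it handles only the elements that appear in the two patterns above (single residues, bracketed alternatives, `x`, and repeat counts such as `x(2)` or `x(4,5)`) plus `{...}` exclusions, and the function name is hypothetical. It is applied here to the UCH-2 pattern and the sequence of SEQ ID NO:100.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        # Split an element such as "x(4,5)" into its core and repeat counts.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            token = core                      # single residue or [alternatives]
        if low and high:
            token += "{" + low + "," + high + "}"
        elif low:
            token += "{" + low + "}"
        parts.append(token)
    return "".join(parts)


UCH2_PATTERN = "Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y"
uch2_regex = prosite_to_regex(UCH2_PATTERN)
print(uch2_regex)

# The sequence given as SEQ ID NO:100 conforms to the UCH-2 pattern.
print(bool(re.fullmatch(uch2_regex, "YNLYAISCHSGILGGGHY")))
```

The same function can be applied to the UCH-1 pattern; scanning a full-length sequence would use `re.search` or `re.finditer` rather than `re.fullmatch`.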

[5039] An 80091 polypeptide can include at least one, and preferably two ubiquitin carboxy-terminal hydrolase domains (UCH domains) or regions homologous to a UCH domain. As used herein, the term “ubiquitin carboxyl-terminal hydrolase domain” refers to an amino acid sequence that participates in the removal of one or more ubiquitin molecules from a protein that has one or more molecules of ubiquitin attached to it. The definition also includes cleavage of conjugated forms of ubiquitin (e.g., in a head to tail orientation linked via a peptide bond) whether or not the ubiquitin conjugate is attached to a protein. For example, a ubiquitin-ubiquitin conjugate (dimer) could be cleaved into monomers, and a tri-ubiquitin conjugate could be cleaved into three monomers, or into a dimer and a single monomer. In either of these particular examples, the monomer or dimer could remain attached to or be cleaved from the ubiquitinated protein. An 80091 polypeptide can include at least one and preferably two UCH domains, referred to individually herein as “UCH-1 domain” and “UCH-2 domain.”

[5040] As used herein, the term “UCH-1 domain” includes an amino acid sequence of about 10 to 100 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 25. Preferably, a UCH-1 domain includes at least about 20 to 50 amino acids, more preferably about 25 to 40 amino acid residues, or about 30 to 35 amino acids and has a bit score for the alignment of the sequence to the UCH-1 domain (HMM) of at least 35, 40, 55 or greater. The UCH-1 domain (HMM) has been assigned the PFAM Accession Number PF00442 (http://genome.wustl.edu/Pfam/.html). In one embodiment, a UCH-1 domain includes the amino acid sequence LSNLGNTCFMNSSIQ (SEQ ID NO:99), wherein C is the putative active site residue. An alignment of the UCH-1 domain (amino acids 447 to 478 of SEQ ID NO:95) of human 80091 with a consensus amino acid sequence (SEQ ID NO:96) derived from a hidden Markov model is depicted in FIG. 41A.

[5041] In a preferred embodiment, an 80091 polypeptide or protein has a “UCH-1 domain” or a region which includes at least about 20 to 50, more preferably about 25 to 40 or 30 to 35 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-1 domain,” e.g., a UCH-1 domain of human 80091, e.g., residues 447 to 478 of SEQ ID NO:95.

[5042] An 80091 polypeptide can include a “UCH-2 domain” or regions homologous with a “UCH-2 domain.”

[5043] As used herein, the term “UCH-2 domain” includes an amino acid sequence of about 10 to 150 amino acid residues in length and having a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 50. Preferably, a UCH-2 domain includes at least about 40 to 120 amino acids, more preferably about 50 to 100 amino acid residues, or about 60 to 70 amino acids and has a bit score for the alignment of the sequence to the UCH-2 domain (HMM) of at least 60, 70, 80, 90 or greater. The UCH-2 domain (HMM) has been assigned the PFAM Accession Number PF00443 (http://genome.wustl.edu/Pfam/.html). In one embodiment, a UCH-2 domain includes the amino acid sequence YNLYAISCHSGILGGGHY (SEQ ID NO:100), wherein the histidine residues are two putative active site residues. An alignment of the UCH-2 domain (amino acids 1219 to 1279 of SEQ ID NO:95) of human 80091 with a consensus amino acid sequence (SEQ ID NO:97) derived from a hidden Markov model is depicted in FIG. 41B.

[5044] In a preferred embodiment, an 80091 polypeptide or protein has a “UCH-2 domain” or a region which includes at least about 40 to 120, more preferably about 50 to 100 or 60 to 70 amino acid residues and has at least about 70%, 80%, 90%, 95%, 99%, or 100% homology with a “UCH-2 domain,” e.g., a UCH-2 domain of human 80091, e.g., residues 1219 to 1279 of SEQ ID NO:95.

[5045] To identify the presence of a “UCH (ubiquitin carboxyl-terminal hydrolase)” domain, e.g., a UCH-1 or a UCH-2 domain, in an 80091 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183: 146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84: 4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2: 305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “ubiquitin carboxy-terminal hydrolase” domain in the amino acid sequence of human 80091 at about residues 447 to 478 of SEQ ID NO:95 (UCH-1) and 1219 to 1279 of SEQ ID NO:95 (UCH-2) (see FIGS. 41A and 41B).
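The length and bit score criteria recited in the UCH-1 and UCH-2 domain definitions can be expressed as a simple check on a reported domain hit. The following is a minimal sketch, assuming that “about 10 to 100” and “about 10 to 150” residues are treated as hard bounds and that residue ranges are inclusive. The names DomainDefinition, span_length and meets_definition are hypothetical, and no bit scores for 80091 are given here, so any score passed to the check is a placeholder supplied by the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainDefinition:
    """Broadest length and bit score criteria recited for a domain."""
    name: str
    min_length: int
    max_length: int
    min_bit_score: float


# Broadest ranges from the definitions of the UCH-1 and UCH-2 domains.
UCH1 = DomainDefinition("UCH-1", 10, 100, 25.0)
UCH2 = DomainDefinition("UCH-2", 10, 150, 50.0)


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def meets_definition(defn: DomainDefinition, start: int, end: int,
                     bit_score: float) -> bool:
    """Check a domain hit against the length and bit score criteria."""
    length = span_length(start, end)
    return (defn.min_length <= length <= defn.max_length
            and bit_score >= defn.min_bit_score)


if __name__ == "__main__":
    # Residue ranges reported for human 80091 (SEQ ID NO:95).
    print(span_length(447, 478))    # UCH-1 span
    print(span_length(1219, 1279))  # UCH-2 span
```

With the residue ranges reported for human 80091, the UCH-1 span is 32 residues and the UCH-2 span is 61 residues, which fall within the preferred ranges of about 30 to 35 and about 60 to 70 amino acids, respectively.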

[5046] An 80091 family member can include at least one UCH-1 domain and at least one UCH-2 domain.

[5047] An 80091 polypeptide can further include at least one, two, three, four, five, six, seven, eight, nine, preferably ten N-glycosylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, preferably twenty-two protein kinase C phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, preferably twenty-four casein kinase II phosphorylation sites; at least one, preferably two tyrosine kinase phosphorylation sites; at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, preferably twenty-one N-myristylation sites; and at least one amidation site.

[5048] Members of the ubiquitin carboxyl-terminal hydrolase family of proteins are characterized by UCH domains, described above, and can be tested for de-ubiquitination activity using assays known in the art. De-ubiquitination assays useful for detecting a ubiquitin carboxyl-terminal hydrolase activity can be found, for example, in Zhu et al. (1997) Journal of Biological Chemistry 272: 51-57, Mitch et al. (1999) American Journal of Physiology 276: C1132-C1138, Liu et al. (1999) Molecular and Cell Biology 19: 3029-3038, and in those cited in various reviews, for example, Ciechanover et al. (1994) The FASEB Journal 8: 182-192, Ciechanover (1994) Biol. Chem. Hoppe-Seyler 375: 565-581, Hershko et al. (1998) Annual Review of Biochemistry 67: 425-479, Swartz (1999) Annual Review of Medicine 50: 57-74, Ciechanover (1998) EMBO Journal 17: 7151-7160, and D'Andrea et al. (1998) Critical Reviews in Biochemistry and Molecular Biology 33: 337-352. These assays include, but are not limited to, the disappearance of substrate, including a decrease in the amount of polyubiquitin or ubiquitinated substrate protein or protein remnant, appearance of intermediate and end products, such as appearance of free ubiquitin monomers, general protein turnover, specific protein turnover, ubiquitin binding, binding to ubiquitinated substrate protein, subunit interaction, interaction with ATP, interaction with cellular components such as trans-acting regulatory factors, stabilization of specific proteins, and the like. As the 80091 polypeptides of the invention may modulate 80091-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 80091-mediated or related disorders, as described below.

[5049] As used herein, an “80091 activity”, “biological activity of 80091” or “functional activity of 80091”, refers to an activity exerted by an 80091 protein, polypeptide or nucleic acid molecule. For example, an 80091 activity can be an activity exerted by 80091 in a physiological milieu on, e.g., an 80091-responsive cell or on an 80091 substrate, e.g., a protein substrate. An 80091 activity can be determined in vivo or in vitro. In one embodiment, an 80091 activity is a direct activity, such as an association with an 80091 target molecule. A “target molecule” or “binding partner” is a molecule with which an 80091 protein binds or interacts in nature. In an exemplary embodiment, 80091 is an enzyme that mediates a de-ubiquitination reaction.

[5050] An 80091 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 80091 protein with an 80091 receptor. The features of the 80091 molecules of the present invention can provide similar biological activities as ubiquitin carboxy-terminal hydrolase family members. For example, the 80091 proteins of the present invention can have one or more of the following activities: (1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participation in the processing of poly-ubiquitin precursors; (3) modulation of cellular proliferation and/or differentiation; (4) modulation (e.g., stimulation) of cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulation of hematopoiesis, e.g., erythropoiesis; (6) modulation of cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulation (increase or decrease) of apoptosis, e.g., apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); (8) modulation of transcription and/or cell-cycle progression; (9) modulation of signal transduction; (10) modulation of antigen processing; (11) modulation of cell-cell adhesion; (12) modulation of receptor-mediated endocytosis; (13) modulation of organelle biogenesis and development; (14) participation in neural development and/or maintenance; (15) participation in neuropathological conditions; (16) participation in oncogenesis; (17) modulation of immune function; (18) modulation of metabolism; or (19) regulation of gamete function.

[5051] Based upon the above-described sequence similarities and the detected expression patterns of 80091 described in Table 28 of Example 53 (e.g., erythroid cells, neural tissues, and HUVEC), the 80091 molecules of the present invention are predicted to have similar biological activities as ubiquitin carboxy-terminal hydrolase family members. Ubiquitin carboxy-terminal hydrolase domains regulate the de-ubiquitination of a substrate, e.g., a protein targeted for degradation. Thus, 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling, e.g., ubiquitination related disorders. 80091 molecules of the invention may be useful, for example, in inducing the de-ubiquitination of ubiquitinated proteins. These proteins can therefore modulate protein degradation and the recycling of ubiquitin, as well as participate in cell signaling pathways in which ubiquitination or de-ubiquitination of a protein can alter or modify the activity of the protein. Thus, 80091 molecules may act as novel therapeutic agents for controlling disorders associated with excessive or insufficient ubiquitination (e.g., protein degradation), and as diagnostic markers useful for indicating the presence or predisposition towards developing such disorders, or monitoring the progression or regression of a disorder.

[5052] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders associated with abnormal de-ubiquitination activity and disorders associated with abnormal protein degradation. Additional examples of disorders that can be treated and/or diagnosed with the molecules of the invention include hematopoietic disorders such as erythroid cell-associated disorders, cellular proliferative and/or differentiative disorders, neurological or brain disorders, metabolic disorders, angiogenic disorders, and endothelial cell disorders.

[5053] As used herein, the term “erythroid cell-associated disorders” includes disorders involving aberrant (increased or deficient) erythroblast proliferation, e.g., an erythroleukemia, and aberrant (increased or deficient) erythroblast differentiation, e.g., an anemia. Erythrocyte-associated disorders include anemias such as, for example, drug-(chemotherapy-) induced anemias, hemolytic anemias due to hereditary cell membrane abnormalities, such as hereditary spherocytosis, hereditary elliptocytosis, and hereditary pyropoikilocytosis; hemolytic anemias due to acquired cell membrane defects, such as paroxysmal nocturnal hemoglobinuria and spur cell anemia; hemolytic anemias caused by antibody reactions, for example to the RBC antigens, or antigens of the ABO system, Lewis system, Ii system, Rh system, Kidd system, Duffy system, and Kell system; methemoglobinemia; a failure of erythropoiesis, for example, as a result of aplastic anemia, pure red cell aplasia, myelodysplastic syndromes, sideroblastic anemias, and congenital dyserythropoietic anemia; secondary anemia in non-hematologic disorders, for example, as a result of chemotherapy, alcoholism, or liver disease; anemia of chronic disease, such as chronic renal failure; and endocrine deficiency diseases.

[5054] Agents that modulate 80091 polypeptide or nucleic acid activity or expression can be used to treat anemias, in particular, drug-induced anemias or anemias associated with cancer chemotherapy, chronic renal failure, malignancies, adult and juvenile rheumatoid arthritis, disorders of hemoglobin synthesis, prematurity, and zidovudine treatment of HIV infection. A subject receiving the treatment can be additionally treated with a second agent, e.g., erythropoietin, to further ameliorate the condition.

[5055] As used herein, the term “erythropoietin” or “EPO” refers to a glycoprotein produced in the kidney, which is the principal hormone responsible for stimulating red blood cell production (erythrogenesis). EPO stimulates the division and differentiation of committed erythroid progenitors in the bone marrow. Normal plasma erythropoietin levels range from 0.01 to 0.03 Units/mL, and can increase up to 100 to 1,000-fold during hypoxia or anemia. Graber and Krantz, Ann. Rev. Med. 29: 51 (1978); Eschbach and Adamson, Kidney Intl. 28:1 (1985). Recombinant human erythropoietin (rHuEpo or epoietin alpha) is commercially available as EPOGEN® (epoietin alpha, recombinant human erythropoietin) (Amgen Inc., Thousand Oaks, Calif.) and as PROCRIT® (epoietin alpha, recombinant human erythropoietin) (Ortho Biotech Inc., Raritan, N.J.).

[5056] Another example of an erythroid cell-associated disorder is erythrocytosis. Erythrocytosis, a disorder of red blood cell overproduction caused by excessive and/or ectopic erythropoietin production, can be caused by cancers, e.g., a renal cell cancer, a hepatocarcinoma, and a central nervous system cancer. Diseases associated with erythrocytosis include polycythemias, e.g., polycythemia vera, secondary polycythemia, and relative polycythemia.

[5057] Aberrant expression or activity of the 80091 molecules may be involved in neoplastic disorders. Accordingly, treatment, prevention and diagnosis of cancer or neoplastic disorders related to hematopoietic cells and, in particular, cells of the erythroid lineage are also included in the present invention. Such neoplastic disorders are exemplified by erythroid leukemias, or leukemias of erythroid precursor cells, e.g., poorly differentiated acute leukemias such as erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11: 267-97). In particular, AML can include the uncontrolled proliferation of CD34+ cells such as AML subtypes M1 and M2, myeloblastic leukemias with and without maturation, and AML subtype M6, erythroleukemia (Di Guglielmo's disease). Additional neoplastic disorders include a myelodysplastic syndrome or preleukemic disorder, e.g., oligoblastic leukemia, smoldering leukemia. Additional cancers of the erythroid lineage include erythroblastosis, and other relevant diseases of the bone marrow.

[5058] The term “leukemia” or “leukemic cancer” is intended to have its clinical meaning, namely, a neoplastic disease in which white corpuscle maturation is arrested at a primitive stage of cell development. The disease is characterized by an increased number of leukemic blast cells in the bone marrow, and by varying degrees of failure to produce normal hematopoietic cells. The condition may be either acute or chronic. Leukemias are further typically categorized as being either lymphocytic, i.e., being characterized by cells which have properties in common with normal lymphocytes, or myelocytic (or myelogenous), i.e., characterized by cells having some characteristics of normal granulocytic cells. Acute lymphocytic leukemia (“ALL”) arises in lymphoid tissue, and ordinarily first manifests its presence in bone marrow. Acute myelocytic leukemia (“AML”) arises from bone marrow hematopoietic stem cells or their progeny. The term acute myelocytic leukemia subsumes several subtypes of leukemia: myeloblastic leukemia, promyelocytic leukemia, and myelomonocytic leukemia. In addition, leukemias with erythroid or megakaryocytic properties are considered myelogenous leukemias as well.

[5059] The molecules of the invention may also modulate the activity of neoplastic, non-hematopoietic tissues in which they are expressed, e.g., kidney, lung, liver, skeletal muscle. For example, increased expression of 80091 molecules is detected in lung tumors compared to the normal lung. Accordingly, the 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more cellular proliferative and/or differentiative disorders. Examples of such cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, or metastatic disorders. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of lung, prostate, colon, breast, and liver origin.

[5060] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[5061] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[5062] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[5063] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[5064] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[5065] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin, e.g., arising from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL) which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HCL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGL), Hodgkin's disease and Reed-Sternberg disease.

[5066] The 80091 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of neurological or brain disorders. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states—global cerebral ischemia and focal cerebral ischemia—infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[5067] Additionally, 80091 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[5068] An “angiogenic disorder” refers to a disorder characterized by aberrant, unregulated, or unwanted vascularization. Angiogenic disorders include, but are not limited to, hemangiomas, Kaposi's sarcoma, von Hippel-Lindau disease; psoriasis; diabetic retinopathy; endometriosis; Graves' disease; chronic inflammatory diseases (e.g., rheumatoid arthritis); aberrant or excess angiogenesis in diseases such as Castleman's disease or fibrodysplasia ossificans progressiva; aberrant or deficient angiogenesis associated with aging, complications of healing certain wounds and complications of diseases such as diabetes and rheumatoid arthritis; or aberrant or deficient angiogenesis associated with hereditary hemorrhagic telangiectasia, autosomal dominant polycystic kidney disease, myelodysplastic syndrome or Klippel-Trenaunay-Weber syndrome.

[5069] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, vasculitides, such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma, described above, and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

[5070] The 80091 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO:95 are collectively referred to as “polypeptides or proteins of the invention” or “80091 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “80091 nucleic acids.” 80091 molecules refer to 80091 nucleic acids, polypeptides, and antibodies.

[5071] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[5072] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[5073] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
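The four sets of conditions recited above can be held in a small lookup table, with the very high stringency set as the default. The following Python sketch is illustrative only: the class name, field names, and helper function are hypothetical and are not part of the disclosure; the values are transcribed from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    """One set of hybridization and wash conditions (field names are illustrative)."""
    hybridization: str
    wash: str
    wash_temp_c: int


# Values transcribed from the paragraph above; keys are hypothetical labels.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C",
                      "two washes in 0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C",
                         "one or more washes in 0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C",
                       "one or more washes in 0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "one or more washes in 0.2x SSC, 1% SDS", 65),
}

# Very high stringency is used unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def get_conditions(level: Optional[str] = None) -> Stringency:
    """Return the conditions for a level, falling back to the default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `get_conditions()` with no argument returns the very high stringency entry, which mirrors the "unless otherwise specified" rule in the text.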

[5074] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:94, corresponds to a naturally-occurring nucleic acid molecule.

[5075] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding an 80091 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 80091 protein or derivative thereof.

[5076] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 80091 protein is at least 10% pure. In a preferred embodiment, the preparation of 80091 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-80091 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-80091 chemicals. When the 80091 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[5077] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 80091 without abolishing or substantially altering an 80091 activity. Preferably the alteration does not substantially alter the 80091 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 80091, results in abolishing an 80091 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 80091 are predicted to be particularly unamenable to alteration.
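The 20% activity threshold that separates essential from non-essential residues can be expressed as a simple classification. This Python sketch assumes that the activities of the altered and wild-type proteins have been measured in the same assay and units; the function name and return labels are hypothetical.

```python
def classify_residue(mutant_activity: float,
                     wild_type_activity: float,
                     threshold: float = 0.20) -> str:
    """Classify an altered residue by the fraction of wild-type activity retained.

    Below the threshold (20% by default, per the text above) the residue is
    treated as essential; otherwise it is treated as non-essential.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    fraction = mutant_activity / wild_type_activity
    return "essential" if fraction < threshold else "non-essential"
```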

[5078] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in an 80091 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of an 80091 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 80091 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO:94, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
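The side chain families listed above can be used to test whether a given substitution is conservative. The Python sketch below is illustrative: the dictionary keys and function name are hypothetical, and the family memberships are taken directly from the list in the preceding paragraph using one-letter amino acid codes.

```python
# Side chain families from the paragraph above, in one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues share at least one side chain family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

Because some residues belong to more than one family (for example, threonine is both uncharged polar and beta-branched), the check passes if any single family contains both residues.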

[5079] As used herein, a “biologically active portion” of an 80091 protein includes a fragment of an 80091 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between an 80091 molecule and a non-80091 molecule or between a first 80091 molecule and a second 80091 molecule (e.g., a dimerization interaction). Biologically active portions of an 80091 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 80091 protein, e.g., the amino acid sequence shown in SEQ ID NO:95, which include fewer amino acids than the full length 80091 proteins, and exhibit at least one activity of an 80091 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 80091 protein, e.g., (1) modulation of de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participation in the processing of poly-ubiquitin precursors; (3) modulation of cellular proliferation and/or differentiation; (4) modulation (e.g., stimulation) of cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulation of hematopoiesis, e.g., erythropoiesis; (6) modulation of cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulation of apoptosis of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell; (8) modulation of apoptosis; (9) modulation of transcription and/or cell-cycle progression; or (10) modulation of signal transduction. A biologically active portion of an 80091 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of an 80091 protein can be used as targets for developing agents which modulate an 80091 mediated activity, e.g., modulation of de-ubiquitination of a substrate.

[5080] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[5081] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[5082] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
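As an illustration of the calculation, the following Python sketch computes percent identity from two sequences that have already been aligned and padded with gap characters. It assumes one common convention, in which the denominator is the total number of alignment columns, so that each gap position lowers the result; the text above does not fix a particular formula, and the function name and gap symbol are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns (gap columns count as non-identical)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column is identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("AC-T", "ACGT")` counts three identical columns out of four.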

[5083] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48: 444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
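The global alignment approach of Needleman and Wunsch can be sketched in a few lines of Python. The sketch below is a simplified illustration and is not the GAP program: it substitutes a plain match/mismatch score for the BLOSUM 62 or PAM250 matrix and a single linear gap penalty for the separate gap and gap extend penalties recited above. All parameter names and default values are assumptions chosen for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Simplified substitution score standing in for a scoring matrix.
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

The two aligned strings returned by this function have equal length and can be compared column by column to count identical positions.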

[5084] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4: 11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[5085] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215: 403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 80091 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 80091 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25: 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[5086] Particularly preferred 80091 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO:95. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:95 are termed substantially identical.

[5087] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:94 are termed substantially identical.

[5088] “Misexpression or aberrant expression,” as used herein, refers to a non-wildtype pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[5089] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[5090] A “purified preparation of cells,” as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[5091] Various aspects of the invention are described in further detail below.

[5092] Isolated Nucleic Acid Molecules of 80091

[5093] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes an 80091 polypeptide described herein, e.g., a full-length 80091 protein or a fragment thereof, e.g., a biologically active portion of 80091 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 80091 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[5094] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:94, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 80091 protein (i.e., “the coding region” of SEQ ID NO:94), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:94 (e.g., SEQ ID NO:94) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acids 447 to 478 and 1219 to 1279 of SEQ ID NO:95.

[5095] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:94, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:94, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:94, thereby forming a stable duplex.

[5096] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:94, or a portion, preferably of the same length, of any of these nucleotide sequences.

[5097] 80091 Nucleic Acid Fragments

[5098] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO:94. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of an 80091 protein, e.g., an immunogenic or biologically active portion of an 80091 protein. A fragment can comprise those nucleotides of SEQ ID NO:94, which encode an ubiquitin carboxy-terminal hydrolase domain of human 80091. The nucleotide sequence determined from the cloning of the 80091 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 80091 family members, or fragments thereof, as well as 80091 homologues, or fragments thereof, from other species.

[5099] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 50 amino acids in length, e.g., residues from 447 to 478, 1010 to 1018, and 1219 to 1279. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[5100] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, an 80091 nucleic acid fragment can include a sequence corresponding to an ubiquitin carboxy-terminal hydrolase domain.

[5101] 80091 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO:94, or of a naturally occurring allelic variant or mutant of SEQ ID NO:94. Preferably, an oligonucleotide is less than about 200, 150, 120, or 100 nucleotides in length.

[5102] In one embodiment, the probe or primer is attached to a solid support, e.g., a solid support described herein.

[5103] One exemplary kit of primers includes a forward primer that anneals to the coding strand and a reverse primer that anneals to the non-coding strand. The forward primer can anneal to the start codon, e.g., the nucleic acid sequence encoding amino acid residue 1 of SEQ ID NO:95. The reverse primer can anneal to the ultimate codon, e.g., the codon immediately before the stop codon, e.g., the codon encoding amino acid residue 1317 of SEQ ID NO:95. In a preferred embodiment, the annealing temperatures of the forward and reverse primers differ by no more than 5, 4, 3, or 2° C.
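Whether a forward and reverse primer meet the stated temperature difference can be checked with a simple estimate. The Python sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), as a rough stand-in for annealing temperature; that rule is an approximation intended for short oligonucleotides, and its use here is an assumption, since the text does not specify how annealing temperature is determined. The function names and the example sequences are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature (degrees C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is a rough approximation that is
    generally applied to short oligonucleotides only.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primers_compatible(forward: str, reverse: str, max_delta_c: int = 5) -> bool:
    """Return True if the estimated temperatures differ by at most max_delta_c."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta_c
```

Setting `max_delta_c` to 5, 4, 3, or 2 corresponds to the alternative limits recited in the paragraph above.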

[5104] In a preferred embodiment the nucleic acid is a probe which is at least 10, 12, 15, 18, 20 and less than 200, more preferably less than 100, or less than 50, nucleotides in length. It should be identical, or differ by 1, or 2, or less than 5 or 10 nucleotides, from a sequence disclosed herein. If alignment is needed for this comparison the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5105] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes: a UCH-1 domain or a UCH-2 domain.

[5106] In another embodiment a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of an 80091 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a UCH-1 domain and a UCH-2 domain from about amino acids 447 to 478 of SEQ ID NO:95; and amino acids 1219 to 1279 of SEQ ID NO:95, respectively.
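When designing primers for a domain defined by amino acid residues, the residue range must first be converted to nucleotide coordinates. The Python sketch below shows that arithmetic for 1-based, inclusive coordinates. The `cds_start` parameter, the position of the first base of the start codon within the nucleotide sequence, is a hypothetical input: its value for SEQ ID NO:94 is not stated in this section, so the default of 1 is a placeholder only.

```python
def residues_to_nucleotides(first_res: int, last_res: int,
                            cds_start: int = 1) -> tuple:
    """Map an inclusive, 1-based residue range to 1-based nucleotide coordinates.

    cds_start is the nucleotide position of the first base of the start
    codon; the default of 1 is a placeholder, not a value from the source.
    """
    if first_res < 1 or last_res < first_res or cds_start < 1:
        raise ValueError("invalid residue range or CDS start")
    start = cds_start + 3 * (first_res - 1)   # first base of the first codon
    end = cds_start + 3 * last_res - 1        # last base of the last codon
    return start, end
```

With the placeholder start of 1, residues 447 to 478 span 96 nucleotides and residues 1219 to 1279 span 183 nucleotides; the absolute coordinates shift by the actual length of the 5′ untranslated region.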

[5107] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[5108] A nucleic acid fragment encoding a “biologically active portion of an 80091 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:94, which encodes a polypeptide having an 80091 biological activity (e.g., the biological activities of the 80091 proteins are described herein), expressing the encoded portion of the 80091 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 80091 protein. For example, a nucleic acid fragment encoding a biologically active portion of 80091 includes a UCH domain, e.g., amino acid residues about 447 to 478 and 1219 to 1279 of SEQ ID NO:95. A nucleic acid fragment encoding a biologically active portion of an 80091 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[5109] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900 or more nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:94.

[5110] In a preferred embodiment, a nucleic acid fragment differs by at least 1, 2, 3, 10, 20, or more nucleotides from the sequence of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318. Differences can include differing in length or sequence identity. For example, a nucleic acid fragment can: include one or more nucleotides from SEQ ID NO:94 located outside the region of nucleotides 1-743 or 3228-3954 of SEQ ID NO:94; not include all of the nucleotides of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318, e.g., can be one or more nucleotides shorter (at one or both ends) than the sequence of Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318; or can differ by one or more nucleotides in the region of overlap.

[5111] 80091 Nucleic Acid Variants

[5112] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:94. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 80091 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from that shown in SEQ ID NO:95. If alignment is needed for this comparison the sequences should be aligned for maximum homology. The encoded protein can differ by no more than 5, 4, 3, 2, or 1 amino acid. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5113] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
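The proportion of codons altered between an original coding sequence and a codon-optimized version can be computed directly. The Python sketch below assumes that both sequences are in frame, begin at the same codon, and have the same length; it does not verify that the two sequences encode the same protein. The function name and example sequences are hypothetical.

```python
def fraction_codons_altered(original: str, optimized: str) -> float:
    """Return the fraction of codons that differ between two in-frame sequences."""
    if len(original) != len(optimized):
        raise ValueError("sequences must have equal length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    total = len(original) // 3
    # Compare the sequences codon by codon, ignoring letter case.
    changed = sum(
        original[i:i + 3].upper() != optimized[i:i + 3].upper()
        for i in range(0, len(original), 3)
    )
    return changed / total
```

A result of 0.10 or 0.20 or more corresponds to the 10% and 20% levels mentioned in the paragraph above.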

[5114] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[5115] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:94, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. The nucleic acid can differ by no more than 5, 4, 3, 2, or 1 nucleotide. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5116] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the nucleotide sequence shown in SEQ ID NO:95 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO:95 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 80091 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 80091 gene.

[5117] Preferred variants include those that are correlated with modulation of de-ubiquitination of a substrate.

[5118] Allelic variants of 80091, e.g., human 80091, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 80091 protein within a population that maintain the ability to modulate de-ubiquitination of a substrate. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:95, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 80091, e.g., human 80091, protein within a population that do not have the ability to: (1) modulate de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation; (2) participate in the processing of poly-ubiquitin precursors; (3) modulate cellular proliferation and/or differentiation; (4) modulate (e.g., stimulate) cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors)); (5) modulate hematopoiesis, e.g., erythropoiesis; (6) modulate cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells); (7) modulate apoptosis of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell, (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell; (8) modulate apoptosis; (9) modulate transcription and/or cell-cycle progression; or (10) modulate signal transduction. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO:95, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[5119] Moreover, nucleic acid molecules encoding other 80091 family members and, thus, which have a nucleotide sequence which differs from the 80091 sequences of SEQ ID NO:94 are intended to be within the scope of the invention.

[5120] Antisense Nucleic Acid Molecules, Ribozymes and Modified 80091 Nucleic Acid Molecules

[5121] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 80091. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 80091 coding strand, or to only a portion thereof (e.g., the coding region of human 80091 corresponding to SEQ ID NO:94). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 80091 (e.g., the 5′ and 3′ untranslated regions).

[5122] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 80091 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 80091 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 80091 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
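An antisense oligonucleotide complementary to the region around the translation start site can be derived by taking the reverse complement of that region. The Python sketch below assumes that position +1 is the first base of the start codon, that there is no position 0, and that the −10 to +10 region therefore spans 20 nucleotides; it returns a DNA oligonucleotide. The function names, the `start_index` parameter, and the example sequence in the test are hypothetical and are not taken from SEQ ID NO:94.

```python
# Complement table for DNA and RNA input; output is always DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA sequence as DNA."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_around_start(mrna: str, start_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Return an antisense oligo covering the region around the start codon.

    start_index is the 0-based index of the first base of the start codon.
    The target spans `upstream` bases before it and `downstream` bases
    beginning at it.
    """
    if start_index - upstream < 0 or start_index + downstream > len(mrna):
        raise ValueError("requested region extends beyond the sequence")
    target = mrna[start_index - upstream:start_index + downstream]
    return reverse_complement(target)
```

Adjusting `upstream` and `downstream` yields oligonucleotides of the other lengths recited above.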

[5123] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[5124] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding an 80091 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[5125] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[5126] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for an 80091-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of an 80091 cDNA disclosed herein (i.e., SEQ ID NO:94), and a sequence having a known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haseloff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in an 80091-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 80091 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[5127] 80091 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 80091 gene (e.g., the 80091 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 80091 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[5128] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[5129] An 80091 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19: 17 and Faria et al. (2001) Nature Biotech. 19: 40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[5130] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[5131] PNAs of 80091 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 80091 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[5132] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86: 6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84: 648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) BioTechniques 6: 958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[5133] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to an 80091 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 80091 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[5134] Isolated 80091 Polypeptides

[5135] In another aspect, the invention features an isolated 80091 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-80091 antibodies. 80091 protein can be isolated from cells or tissue sources using standard protein purification techniques. 80091 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[5136] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[5137] In a preferred embodiment, an 80091 polypeptide has one or more of the following characteristics:

[5138] (i) it has the ability to modulate de-ubiquitination of a substrate, e.g., a ubiquitinated protein targeted for degradation;

[5139] (ii) it has the ability to participate in the processing of poly-ubiquitin precursors;

[5140] (iii) it has the ability to modulate cellular proliferation and/or differentiation;

[5141] (iv) it has the ability to modulate (e.g., stimulate) cell differentiation, e.g., differentiation of hematopoietic cells (e.g., differentiation of blood cells (e.g., erythroid progenitor cells, such as CD34+ erythroid progenitors));

[5142] (v) it has the ability to modulate hematopoiesis, e.g., erythropoiesis;

[5143] (vi) it has the ability to modulate cell proliferation, e.g., proliferation of hematopoietic cells (e.g., erythroid progenitor cells);

[5144] (vii) it has the ability to modulate apoptosis of a cell, e.g., increase apoptosis of a cancer cell, e.g., a leukemic cell (e.g., an erythroleukemia cell); or suppress apoptosis of a blood or erythroid cell;

[5145] (viii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO:95;

[5146] (ix) it has an overall sequence similarity of at least 65%, preferably at least 70%, more preferably at least 80, 90, or 95%, with a polypeptide of SEQ ID NO:95;

[5147] (x) it has a UCH-1 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 447 to 478 of SEQ ID NO:95; and

[5148] (xi) it has a UCH-2 domain which is preferably about 70%, 80%, 90% or 95% identical with amino acid residues about 1219 to 1279 of SEQ ID NO:95.

[5149] In a preferred embodiment the 80091 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:95. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:95 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO:95. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non-essential residue or a conservative substitution. In a preferred embodiment the differences are not in the UCH domains. In another preferred embodiment one or more differences are in the UCH domains (amino acids 447 to 478 and 1219 to 1279 of SEQ ID NO:95).
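The difference count described above can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned for maximum homology by some external method and are supplied as equal-length strings in which “-” marks a looped-out (gap) position; both gaps and mismatches are counted as differences. The function names and example strings are hypothetical, and taking the percentage relative to the number of residues in the query protein is one reading of the text rather than a definition given in it.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length input strings.
GAP = "-"


def count_differences(aligned_ref: str, aligned_query: str) -> int:
    """Count aligned columns where the two sequences differ.

    A column counts as a difference if the residues mismatch or if exactly
    one of the two sequences has a gap (a "looped out" residue).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(aligned_ref, aligned_query) if a != b)


def percent_different(aligned_ref: str, aligned_query: str) -> float:
    """Differences as a percentage of the residues in the query sequence.

    Assumption: the denominator is the number of non-gap residues in the query.
    """
    query_len = sum(1 for b in aligned_query if b != GAP)
    if query_len == 0:
        raise ValueError("query contains no residues")
    return 100.0 * count_differences(aligned_ref, aligned_query) / query_len


if __name__ == "__main__":
    ref = "MKT-AYL"    # hypothetical aligned reference
    query = "MKTGAYV"  # hypothetical aligned variant
    print(count_differences(ref, query), round(percent_different(ref, query), 2))
```

A variant could then be checked against a threshold such as “fewer than 15 differences” or “less than 20% different” by comparing these two values with the chosen limits.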

[5150] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 80091 proteins differ in amino acid sequence from SEQ ID NO:95, yet retain biological activity.

[5151] In one embodiment, the protein includes an amino acid sequence at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:95.

[5152] An 80091 protein or fragment is provided which varies from the sequence of SEQ ID NO:95 in regions defined by amino acids about 1 to 446 and 479 to 1009 by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO:95 in regions defined by amino acids about 447 to 478 and 1219 to 1279. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.

[5153] In one embodiment, a biologically active portion of an 80091 protein includes a UCH-1 domain and a UCH-2 domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 80091 protein.

[5154] In a preferred embodiment, the 80091 protein has an amino acid sequence shown in SEQ ID NO:95. In other embodiments, the 80091 protein is substantially identical to SEQ ID NO:95. In yet another embodiment, the 80091 protein is substantially identical to SEQ ID NO:95 and retains the functional activity of the protein of SEQ ID NO:95, as described in detail in the subsections above.

[5155] In a preferred embodiment, a fragment differs by at least 1, 2, 3, 10, 20, or more amino acid residues encoded by a nucleotide sequence present in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318. Differences can include differing in length or sequence identity. For example, a fragment can: include one or more amino acid residues from SEQ ID NO:95 outside the region encoded by nucleotides 1-743 or 3228-3954 of SEQ ID NO:94; not include all of the amino acid residues encoded by a nucleotide sequence in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318, e.g., can be one or more amino acid residues shorter (at one or both ends) than a sequence encoded by the nucleotide sequence in Genbank™ accession number AF155116, A83858, 176205, X63547, X63546, AJ012755, AC022596, AU131748, AU120381, AU117329, BE910371, AL043344, AV703768, AV706483, P35125, or a sequence disclosed in WO 01/70978, WO 01/75067, or WO 01/55318; or can differ by one or more amino acid residues in the region of overlap.

[5156] 80091 Chimeric or Fusion Proteins

[5157] In another aspect, the invention provides 80091 chimeric or fusion proteins. As used herein, an 80091 “chimeric protein” or “fusion protein” includes an 80091 polypeptide linked to a non-80091 polypeptide. A “non-80091 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 80091 protein, e.g., a protein which is different from the 80091 protein and which is derived from the same or a different organism. The 80091 polypeptide of the fusion protein can correspond to all or a portion, e.g., a fragment described herein, of an 80091 amino acid sequence. In a preferred embodiment, an 80091 fusion protein includes at least one (or two) biologically active portion of an 80091 protein. The non-80091 polypeptide can be fused to the N-terminus or C-terminus of the 80091 polypeptide.

[5158] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-80091 fusion protein in which the 80091 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 80091. Alternatively, the fusion protein can be an 80091 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 80091 can be increased through use of a heterologous signal sequence.

[5159] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[5160] The 80091 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 80091 fusion proteins can be used to affect the bioavailability of an 80091 substrate. 80091 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding an 80091 protein; (ii) mis-regulation of the 80091 gene; and (iii) aberrant post-translational modification of an 80091 protein.

[5161] Moreover, the 80091-fusion proteins of the invention can be used as immunogens to produce anti-80091 antibodies in a subject, to purify 80091 ligands and in screening assays to identify molecules which inhibit the interaction of 80091 with an 80091 substrate.

[5162] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). An 80091-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 80091 protein.

[5163] Variants of 80091 Proteins

[5164] In another aspect, the invention also features a variant of an 80091 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 80091 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of an 80091 protein. An agonist of the 80091 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of an 80091 protein. An antagonist of an 80091 protein can inhibit one or more of the activities of the naturally occurring form of the 80091 protein by, for example, competitively modulating an 80091-mediated activity of an 80091 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 80091 protein.

[5165] Variants of an 80091 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of an 80091 protein for agonist or antagonist activity.

[5166] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of an 80091 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of an 80091 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.
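By way of illustration only, the three fragment classes named above (N-terminal, C-terminal, and internal) can be enumerated in silico with a short Python sketch. The sketch works at the amino acid level for simplicity, whereas an actual library would be built from the coding sequence; the function name, the `min_len` parameter, and the six-letter example string are hypothetical.

```python
# Illustrative sketch only: enumerates truncation fragments of a sequence string.
from typing import Dict, List


def fragment_library(seq: str, min_len: int) -> Dict[str, List[str]]:
    """Enumerate proper fragments of seq that are at least min_len long.

    n_terminal -- fragments that keep the first residue and lose the C-terminus
    c_terminal -- fragments that keep the last residue and lose the N-terminus
    internal   -- fragments that lack both the first and the last residue
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    n = len(seq)
    n_terminal = [seq[:k] for k in range(min_len, n)]
    c_terminal = [seq[-k:] for k in range(min_len, n)]
    internal = [
        seq[i:j]
        for i in range(1, n - 1)
        for j in range(i + min_len, n)  # j < n, so the last residue is excluded
    ]
    return {"n_terminal": n_terminal, "c_terminal": c_terminal, "internal": internal}


if __name__ == "__main__":
    library = fragment_library("ABCDEF", min_len=4)  # placeholder sequence
    for kind, fragments in library.items():
        print(kind, fragments)
```

The number of internal fragments grows quadratically with sequence length, so for a protein of the size of SEQ ID NO:95 a practical library would sample fragments rather than enumerate all of them.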

[5167] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 80091 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 80091 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[5168] Cell-based assays can be exploited to analyze a variegated 80091 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 80091 in a substrate-dependent manner. The transfected cells are then contacted with 80091 and the effect of the expression of the mutant on signaling by the 80091 substrate can be detected, e.g., by measuring de-ubiquitination of a substrate. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 80091 substrate, and the individual clones further characterized.

[5169] In another aspect, the invention features a method of making an 80091 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 80091 polypeptide. The method includes: altering the sequence of an 80091 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[5170] In another aspect, the invention features a method of making a fragment or analog of an 80091 polypeptide having a biological activity of a naturally occurring 80091 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of an 80091 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[5171] Anti-80091 Antibodies

[5172] In another aspect, the invention provides an anti-80091 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196: 901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[5173] The anti-80091 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[5174] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus. Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids) are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[5175] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 80091 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-80091 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341: 544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242: 423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85: 5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[5176] The anti-80091 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[5177] Phage display and combinatorial methods for generating anti-80091 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9: 1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3: 81-85; Huse et al. (1989) Science 246: 1275-1281; Griffiths et al. (1993) EMBO J. 12: 725-734; Hawkins et al. (1992) J Mol Biol 226: 889-896; Clackson et al. (1991) Nature 352: 624-628; Gram et al. (1992) PNAS 89: 3576-3580; Garrard et al. (1991) Bio/Technology 9: 1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19: 4133-4137; and Barbas et al. (1991) PNAS 88: 7978-7982, the contents of all of which are incorporated by reference herein).

[5178] In one embodiment, the anti-80091 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[5179] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906; Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368: 856-859; Green, L. L. et al. 1994 Nature Genet. 7: 13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81: 6851-6855; Bruggeman et al. 1993 Year Immunol 7: 33-40; Tuaillon et al. 1993 PNAS 90: 3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[5180] An anti-80091 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[5181] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240: 1041-1043; Liu et al. (1987) PNAS 84: 3439-3443; Liu et al., 1987, J. Immunol. 139: 3521-3526; Sun et al. (1987) PNAS 84: 214-218; Nishimura et al., 1987, Canc. Res. 47: 999-1005; Wood et al. (1985) Nature 314: 446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80: 1553-1559).

[5182] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to an 80091 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is a non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[5183] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
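The consensus rule just defined can be sketched in a few lines of Python. The sketch assumes the family members have already been aligned to equal length; the function name and the example sequences are hypothetical. Because the definition allows either residue when two occur equally often, the sketch breaks ties by taking the alphabetically first residue, which is an arbitrary choice made only so that the output is reproducible.

```python
# Illustrative sketch only: per-column majority vote over pre-aligned sequences.
from collections import Counter
from typing import Sequence


def consensus_sequence(aligned: Sequence[str]) -> str:
    """Return the consensus of equal-length aligned sequences.

    Each position takes the most frequent residue in that column.
    Ties are broken alphabetically (an arbitrary, deterministic choice).
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must have the same aligned length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        result.append(min(res for res, n in counts.items() if n == top))
    return "".join(result)


if __name__ == "__main__":
    family = ["ACDE", "ACDF", "GCDE"]  # hypothetical aligned family
    print(consensus_sequence(family))
```

Applied to the framework positions of an aligned set of immunoglobulin variable regions, the same function would yield a consensus framework in the sense used above.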

[5184] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229: 1202-1207, by Oi et al., 1986, BioTechniques 4: 214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against an 80091 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[5185] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., Jones et al. 1986 Nature 321: 522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141: 4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.

[5186] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[5187] A full-length 80091 protein or an antigenic peptide fragment of 80091 can be used as an immunogen or can be used to identify anti-80091 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 80091 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO:95 and encompass an epitope of 80091. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.

[5188] Fragments of 80091 which include residues about 315 to 339, about 530 to 539, about 680 to 695, or about 1185 to 1220 can be used to make antibodies against hydrophilic regions of the 80091 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, fragments of 80091 which include residues about 188 to 205, about 540 to 550, about 817 to 825, or about 980 to 995 can be used to make an antibody against a hydrophobic region of the 80091 protein; a fragment of 80091 which includes residues about 447 to 478, or about 1219 to 1279 can be used to make an antibody against the ubiquitin carboxy-terminal hydrolase region of the 80091 protein.

[5189] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[5190] Antibodies which bind only native 80091 protein, only denatured or otherwise non-native 80091 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 80091 protein.

[5191] Preferred epitopes encompassed by the antigenic peptide are regions of 80091 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 80091 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 80091 protein and are thus likely to constitute surface residues useful for targeting antibody production.
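By way of illustration only, hydrophilic stretches of a protein sequence can be located with a sliding-window average. The sketch below does not implement the Emini surface probability analysis; it substitutes the widely used Kyte-Doolittle hydropathy scale, on which more negative values indicate more hydrophilic residues. The function names, the window length of 7, and the example sequence are assumptions made for this sketch and do not correspond to SEQ ID NO:95.

```python
# Kyte-Doolittle hydropathy index (negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=7):
    """Mean hydropathy of every window of the given length along seq."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(seq, window=7):
    """Return (start, end, mean) of the lowest-scoring window.

    start and end are 1-based inclusive residue numbers.
    """
    means = window_hydropathy(seq, window)
    start = min(range(len(means)), key=means.__getitem__)
    return start + 1, start + window, means[start]


# Made-up sequence for illustration; not an 80091 sequence.
example = "AAAAAAAKKKKKKKLLLLLLL"
print(most_hydrophilic_window(example))
```

A window flagged in this way is only a candidate; surface exposure and antigenicity would still need to be assessed by the analyses described above or by experiment.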

[5192] The anti-80091 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 80091 protein.

[5193] In a preferred embodiment, the antibody has effector function and/or can fix complement. In other embodiments, the antibody does not recruit effector cells or fix complement.

[5194] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[5195] In a preferred embodiment, an anti-80091 antibody alters (e.g., increases or decreases) the de-ubiquitination of an 80091 polypeptide.

[5196] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[5197] An anti-80091 antibody (e.g., monoclonal antibody) can be used to isolate 80091 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-80091 antibody can be used to detect 80091 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-80091 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[5198] The invention also includes a nucleic acid which encodes an anti-80091 antibody, e.g., an anti-80091 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g. CHO or lymphatic cells.

[5199] The invention also includes cell lines, e.g., hybridomas, which make an anti-80091 antibody, e.g., an antibody described herein, and methods of using said cells to make an 80091 antibody.

[5200] 80091 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[5201] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[5202] A vector can include an 80091 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 80091 proteins, mutant forms of 80091 proteins, fusion proteins, and the like).

[5203] The recombinant expression vectors of the invention can be designed for expression of 80091 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[5204] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[5205] Purified fusion proteins can be used in 80091 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 80091 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[5206] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
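The codon-replacement strategy described above can be sketched as follows. The sketch recodes a coding sequence codon by codon using the standard genetic code, leaving the encoded protein unchanged. The `PREFERRED` table is a small illustrative placeholder covering only a few residues and is an assumption of this sketch; in practice it would be populated from a published E. coli codon usage table such as that of the Wada et al. reference. The function names and the example coding sequence are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Placeholder preferred-codon table (assumption; incomplete on purpose).
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "E": "GAA", "G": "GGT", "*": "TAA"}


def _codons(cds):
    """Split a coding sequence into codons, checking the reading frame."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TO_AA[codon] for codon in _codons(cds))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed.

    Codons whose amino acid is absent from `preferred` are kept unchanged.
    """
    return "".join(preferred.get(CODON_TO_AA[codon], codon)
                   for codon in _codons(cds))


# Made-up coding sequence for illustration only.
original = "ATGAAGTTAGAGGGATAG"
optimized = recode(original, PREFERRED)
print(optimized, translate(optimized))
```

Because every substitution is synonymous, `translate(original)` and `translate(optimized)` are identical; only the nucleotide sequence to be synthesized changes.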

[5207] The 80091 expression vector can be a yeast expression vector, a vector for expression in insect cells, e.g., a baculovirus expression vector or a vector suitable for expression in mammalian cells.

[5208] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[5209] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[5210] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1: 268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43: 235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8: 729-733) and immunoglobulins (Banerji et al. (1983) Cell 33: 729-740; Queen and Baltimore (1983) Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86: 5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230: 912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249: 374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3: 537-546).

[5211] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[5212] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., an 80091 nucleic acid molecule within a recombinant expression vector or an 80091 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[5213] A host cell can be any prokaryotic or eukaryotic cell. For example, an 80091 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23: 175-182)). Other suitable host cells are known to those skilled in the art.

[5214] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[5215] A host cell of the invention can be used to produce (i.e., express) an 80091 protein. Accordingly, the invention further provides methods for producing an 80091 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding an 80091 protein has been introduced) in a suitable medium such that an 80091 protein is produced. In another embodiment, the method further includes isolating an 80091 protein from the medium or the host cell.

[5216] In another aspect, the invention features a cell or purified preparation of cells which include an 80091 transgene, or which otherwise misexpress 80091. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include an 80091 transgene, e.g., a heterologous form of an 80091, e.g., a gene derived from humans (in the case of a non-human cell). The 80091 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 80091, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 80091 alleles or for use in drug screening.

[5217] In another aspect, the invention features a human cell transformed with a nucleic acid which encodes a subject 80091 polypeptide.

[5218] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 80091 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 80091 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 80091 gene. For example, an endogenous 80091 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[5219] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding an 80091 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 80091 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for an 80091 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[5220] 80091 Transgenic Animals

[5221] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of an 80091 protein and for identifying and/or evaluating modulators of 80091 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 80091 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[5222] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of an 80091 protein to particular cells. A transgenic founder animal can be identified based upon the presence of an 80091 transgene in its genome and/or expression of 80091 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding an 80091 protein can further be bred to other transgenic animals carrying other transgenes.

[5223] 80091 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments, the nucleic acid is placed under the control of a tissue-specific promoter, e.g., a milk- or egg-specific promoter, and the protein or polypeptide is recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[5224] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[5225] Uses of 80091

[5226] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[5227] The isolated nucleic acid molecules of the invention can be used, for example, to express an 80091 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect an 80091 mRNA (e.g., in a biological sample) or a genetic alteration in an 80091 gene, and to modulate 80091 activity, as described further below. The 80091 proteins can be used to treat disorders characterized by insufficient or excessive production of an 80091 substrate or production of 80091 inhibitors. In addition, the 80091 proteins can be used to screen for naturally occurring 80091 substrates, to screen for drugs or compounds which modulate 80091 activity, as well as to treat disorders characterized by insufficient or excessive production of 80091 protein or production of 80091 protein forms which have decreased, aberrant or unwanted activity compared to 80091 wild type protein (e.g., an erythroid-associated disorder). Moreover, the anti-80091 antibodies of the invention can be used to detect and isolate 80091 proteins, regulate the bioavailability of 80091 proteins, and modulate 80091 activity.

[5228] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 80091 polypeptide is provided. The method includes: contacting the compound with the subject 80091 polypeptide; and evaluating ability of the compound to interact with, e.g., to bind or form a complex with the subject 80091 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with subject 80091 polypeptide. It can also be used to find natural or synthetic inhibitors of subject 80091 polypeptide. Screening methods are discussed in more detail below.

[5229] 80091 Screening Assays

[5230] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 80091 proteins, have a stimulatory or inhibitory effect on, for example, 80091 expression or 80091 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of an 80091 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 80091 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[5231] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of an 80091 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of an 80091 protein or polypeptide or a biologically active portion thereof.

[5232] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[5233] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91: 11422; Zuckermann et al. (1994) J. Med. Chem. 37: 2678; Cho et al. (1993) Science 261: 1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33: 2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[5234] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364: 555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249: 386-390; Devlin (1990) Science 249: 404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87: 6378-6382; Felici (1991) J. Mol. Biol. 222: 301-310; Ladner supra.).

[5235] In one embodiment, an assay is a cell-based assay in which a cell which expresses an 80091 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 80091 activity is determined. Determining the ability of the test compound to modulate 80091 activity can be accomplished by monitoring, for example, the de-ubiquitination. The cell, for example, can be of mammalian origin, e.g., human.

[5236] The ability of the test compound to modulate 80091 binding to a compound, e.g., an 80091 substrate, or to bind to 80091 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 80091 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 80091 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 80091 binding to an 80091 substrate in a complex. For example, compounds (e.g., 80091 substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[5237] The ability of a compound (e.g., an 80091 substrate) to interact with 80091 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 80091 without the labeling of either the compound or the 80091. McConnell, H. M. et al. (1992) Science 257:1906-1912. As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 80091.

[5238] In yet another embodiment, a cell-free assay is provided in which an 80091 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 80091 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 80091 proteins to be used in assays of the present invention include fragments which participate in interactions with non-80091 molecules, e.g., fragments with high surface probability scores.

[5239] Soluble and/or membrane-bound forms of isolated proteins (e.g., 80091 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[5240] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[5241] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
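The dependence of energy-transfer efficiency on the distance separating the labels can be made concrete with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius (the distance at which transfer is 50% efficient). The relation is not recited in the text above and is supplied here as background; the function name and the R0 value of 5 nm used in the example are assumptions chosen for illustration and do not describe any particular label pair.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Energy-transfer efficiency for a donor-acceptor pair.

    Uses the Forster relation E = 1 / (1 + (r / R0)**6).
    """
    if distance_nm <= 0 or forster_radius_nm <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Assumed Forster radius of 5 nm (illustrative value only).
R0 = 5.0
for r in (3.0, 5.0, 8.0):
    print(f"r = {r:.1f} nm -> E = {fret_efficiency(r, R0):.3f}")
```

The steep sixth-power falloff is why acceptor emission is strong only when the two labeled molecules are bound in close proximity, and weak when they are free in solution.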

[5242] In another embodiment, determining the ability of the 80091 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63: 2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5: 699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[5243] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound, (which is not anchored), can be labeled, either directly or indirectly, with detectable labels discussed herein.

[5244] It may be desirable to immobilize either 80091, an anti-80091 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to an 80091 protein, or interaction of an 80091 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/80091 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 80091 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 80091 binding or activity determined using standard techniques.

[5245] Other techniques for immobilizing either an 80091 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 80091 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[5246] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[5247] In one embodiment, this assay is performed utilizing antibodies reactive with 80091 protein or target molecules but which do not interfere with binding of the 80091 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 80091 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 80091 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 80091 protein or target molecule.

[5248] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components, by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: New York.); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[5249] In a preferred embodiment, the assay includes contacting the 80091 protein or biologically active portion thereof with a known compound which binds 80091 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with an 80091 protein, wherein determining the ability of the test compound to interact with an 80091 protein includes determining the ability of the test compound to preferentially bind to 80091 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[5250] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 80091 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of an 80091 protein through modulation of the activity of a downstream effector of an 80091 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[5251] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.

[5252] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[5253] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[5254] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[5255] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[5256] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[5257] In yet another aspect, the 80091 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 80091 (“80091-binding proteins” or “80091-bp”) and are involved in 80091 activity. Such 80091-bps can be activators or inhibitors of signals by the 80091 proteins or 80091 targets as, for example, downstream elements of an 80091-mediated signaling pathway.

[5258] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for an 80091 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 80091 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming an 80091-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 80091 protein.

[5259] In another embodiment, modulators of 80091 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 80091 mRNA or protein evaluated relative to the level of expression of 80091 mRNA or protein in the absence of the candidate compound. When expression of 80091 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 80091 mRNA or protein expression. Alternatively, when expression of 80091 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 80091 mRNA or protein expression. The level of 80091 mRNA or protein expression can be determined by methods described herein for detecting 80091 mRNA or protein.
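The comparison described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the replicate values in the usage note, the 0.05 significance threshold, and the choice of an exact two-sided permutation test on the difference of means are all assumptions, since the text does not specify a statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for chosen in combinations(indices, n_treated):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_candidate(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"
```

For example, with hypothetical normalized expression levels, `classify_candidate([2.1, 2.4, 2.2, 2.6], [1.0, 1.1, 0.9, 1.2])` labels the compound a stimulator, because all treated replicates exceed all control replicates and only 2 of the 70 possible relabelings are as extreme.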

[5260] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of an 80091 protein can be confirmed in vivo, e.g., in an animal such as an animal model for an erythroid-associated disorder.

[5261] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., an 80091 modulating agent, an antisense 80091 nucleic acid molecule, an 80091-specific antibody, or an 80091-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[5262] 80091 Detection Assays

[5263] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome e.g., to locate gene regions associated with genetic disease or to associate 80091 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[5264] 80091 Chromosome Mapping

[5265] The 80091 nucleotide sequences or portions thereof can be used to map the location of the 80091 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 80091 sequences with genes associated with disease.

[5266] Briefly, 80091 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 80091 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 80091 sequences will yield an amplified fragment.

[5267] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220: 919-924).
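The logic of assigning a gene to a chromosome from such a panel can be sketched as a concordance check: the candidate chromosome is the one present in every hybrid that yields an amplified fragment and absent from every hybrid that does not. The Python sketch below is illustrative only; the function name, hybrid names, and chromosome contents are hypothetical and do not describe any real panel.

```python
def concordant_chromosomes(hybrid_panel, pcr_results):
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    hybrid_panel: dict mapping hybrid name -> set of human chromosomes retained.
    pcr_results: dict mapping hybrid name -> True if a fragment was amplified.
    """
    all_chromosomes = set().union(*hybrid_panel.values())
    matches = []
    for chromosome in sorted(all_chromosomes):
        # Perfect concordance: present exactly where PCR is positive.
        if all(
            (chromosome in hybrid_panel[hybrid]) == pcr_results[hybrid]
            for hybrid in hybrid_panel
        ):
            matches.append(chromosome)
    return matches


if __name__ == "__main__":
    # Hypothetical panel: each hybrid retains a few human chromosomes.
    panel = {
        "H1": {"1", "7"},
        "H2": {"7", "X"},
        "H3": {"1", "X"},
        "H4": {"2"},
    }
    results = {"H1": True, "H2": True, "H3": False, "H4": False}
    print(concordant_chromosomes(panel, results))
```

In this hypothetical panel only chromosome 7 is present in both positive hybrids and absent from both negative ones, so it is the sole concordant assignment.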

[5268] Other mapping strategies e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87: 6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries can be used to map 80091 to a chromosomal location.

[5269] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[5270] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[5271] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325: 783-787.

[5272] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 80091 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[5273] 80091 Tissue Typing

[5274] 80091 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
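The principle behind RFLP can be illustrated with an in silico digest: a single base change that destroys a restriction site changes the pattern of fragment lengths. The Python sketch below is illustrative only. The two allele sequences are invented for the example and are not derived from SEQ ID NO:94; the function name is hypothetical; and only the top strand of a linear molecule is considered. The EcoRI recognition site GAATTC, cut after the first G, is used as a familiar example.

```python
def digest_fragment_lengths(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of `site`; return fragment lengths.

    cut_offset: number of bases into the site at which the top strand is cut.
    """
    sequence = sequence.upper()
    site = site.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


if __name__ == "__main__":
    # Invented alleles differing by one base inside the restriction site.
    allele_a = "ATGC" * 5 + "GAATTC" + "ATGC" * 5
    allele_b = "ATGC" * 5 + "GAGTTC" + "ATGC" * 5
    print(digest_fragment_lengths(allele_a, "GAATTC", 1))
    print(digest_fragment_lengths(allele_b, "GAATTC", 1))
```

The first allele is cut once and yields two fragments, while the second, lacking the site, remains a single fragment; on a blot these would appear as different band patterns.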

[5275] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 80091 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[5276] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:94 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:94 are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[5277] If a panel of reagents from 80091 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[5278] Use of Partial 80091 Sequences in Forensic Biology

[5279] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[5280] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e. another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:94 (e.g., fragments derived from the noncoding regions of SEQ ID NO:94 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[5281] The 80091 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 80091 probes can be used to identify tissue by species and/or by organ type.

[5282] In a similar fashion, these reagents, e.g., 80091 primers or probes can be used to screen tissue culture for contamination (i.e. screen for the presence of a mixture of different types of cells in a culture).

[5283] Predictive Medicine of 80091

[5284] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[5285] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 80091.

[5286] Such disorders include, e.g., a disorder associated with the misexpression of 80091 gene; a disorder of the erythroid system.

[5287] The method includes one or more of the following:

[5288] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 80091 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[5289] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 80091 gene;

[5290] detecting, in a tissue of the subject, the misexpression of the 80091 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[5291] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of an 80091 polypeptide.

[5292] In preferred embodiments the method includes: ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 80091 gene; an insertion of one or more nucleotides into the gene, a point mutation, e.g., a substitution of one or more nucleotides of the gene, a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[5293] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:94, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 80091 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[5294] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 80091 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 80091.

[5295] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[5296] In preferred embodiments the method includes determining the structure of an 80091 gene, an abnormal structure being indicative of risk for the disorder.

[5297] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 80091 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[5298] Diagnostic and Prognostic Assays of 80091

[5299] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 80091 molecules and for identifying variations and mutations in the sequence of 80091 molecules.

[5300] Expression Monitoring and Profiling:

[5301] The presence, level, or absence of 80091 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 80091 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 80091 protein such that the presence of 80091 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 80091 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 80091 genes; measuring the amount of protein encoded by the 80091 genes; or measuring the activity of the protein encoded by the 80091 genes.

[5302] The level of mRNA corresponding to the 80091 gene in a cell can be determined both by in situ and by in vitro formats.

[5303] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 80091 nucleic acid, such as the nucleic acid of SEQ ID NO:94, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 80091 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[5304] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 80091 genes.

[5305] The level of mRNA in a sample that is encoded by 80091 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88: 189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86: 1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6: 1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
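The primer geometry described above can be checked in silico. The Python sketch below is illustrative only: it locates exact matches for a forward primer and for the reverse complement of a reverse primer on a plus-strand template, returns the predicted amplicon, and tests whether the primers and flanked region fall in the approximate ranges stated. The function names are hypothetical, any sequences supplied to them are the user's own, and exact string matching is a simplifying assumption that ignores mismatches and annealing conditions.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def predicted_amplicon(template, forward_primer, reverse_primer):
    """Return the predicted plus-strand amplicon, or None if a primer has no exact match."""
    template = template.upper()
    forward = forward_primer.upper()
    # The reverse primer anneals to the plus strand at its reverse complement.
    reverse_site = reverse_complement(reverse_primer)
    start = template.find(forward)
    if start == -1:
        return None
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def meets_stated_ranges(template, forward_primer, reverse_primer):
    """Check primers are 10-30 nt and flank a region of 50-200 nt."""
    amplicon = predicted_amplicon(template, forward_primer, reverse_primer)
    if amplicon is None:
        return False
    flanked = len(amplicon) - len(forward_primer) - len(reverse_primer)
    primers_ok = all(10 <= len(p) <= 30 for p in (forward_primer, reverse_primer))
    return primers_ok and 50 <= flanked <= 200
```

A primer pair passes this check only if both primers match the template exactly, the reverse site lies downstream of the forward primer, and the lengths fall within the stated bounds.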

[5306] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 80091 gene being analyzed.

[5307] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 80091 mRNA, or genomic DNA, and comparing the presence of 80091 mRNA or genomic DNA in the control sample with the presence of 80091 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 80091 transcript levels.

[5308] A variety of methods can be used to determine the level of protein encoded by 80091. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled,” with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[5309] The detection methods can be used to detect 80091 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 80091 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 80091 protein include introducing into a subject a labeled anti-80091 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-80091 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[5310] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 80091 protein, and comparing the presence of 80091 protein in the control sample with the presence of 80091 protein in the test sample.

[5311] The invention also includes kits for detecting the presence of 80091 in a biological sample. For example, the kit can include a compound or agent capable of detecting 80091 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 80091 protein or nucleic acid.

[5312] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[5313] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[5314] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 80091 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as an erythroid-associated disorder or deregulated cell proliferation.

[5315] In one embodiment, a disease or disorder associated with aberrant or unwanted 80091 expression or activity is identified. A test sample is obtained from a subject and 80091 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 80091 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 80091 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[5316] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 80091 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a disordered erythroid cell.

[5317] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 80091 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 80091 (e.g., other genes associated with an 80091-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
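
By way of illustration only, data records of the kind described above can be held in a relational table. The following minimal sketch uses an in-memory SQLite database in place of the Oracle or Sybase environments named in the text; the table name, column names, and the example values are hypothetical placeholders, not measured data.

```python
import sqlite3


def build_records(rows):
    """Create an in-memory table of expression records and load the given rows.

    Each row is (sample_id, descriptor, level_80091). The schema is an
    assumption made for this sketch only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id   TEXT PRIMARY KEY,  -- identifier of the sample
            descriptor  TEXT NOT NULL,     -- e.g., a diagnosis or a treatment
            level_80091 REAL NOT NULL      -- value representing 80091 expression
        )
        """
    )
    conn.executemany("INSERT INTO expression_record VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def level_for(conn, sample_id):
    """Return the stored 80091 expression value for one sample, or None if absent."""
    row = conn.execute(
        "SELECT level_80091 FROM expression_record WHERE sample_id = ?",
        (sample_id,),
    ).fetchone()
    return None if row is None else row[0]


# Placeholder values for illustration only.
conn = build_records([("S1", "control", 1.0), ("S2", "test", 2.5)])
```

Additional columns could hold values for genes other than 80091, consistent with the preferred embodiment in which the data record includes such values.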

[5318] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 80091 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose an erythroid-associated disorder in a subject wherein an increase in 80091 expression is an indication that the subject has or is disposed to having an erythroid-associated disorder. The method can be used to monitor a treatment for an erythroid-associated disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[5319] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 80091 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.
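
By way of illustration only, the favorable-evaluation criterion above can be expressed as a comparison of two distances. The text does not prescribe a similarity measure for this step; the sketch below assumes Euclidean distance, and the function name and example values are hypothetical.

```python
import math


def evaluated_favorably(contacted, uncontacted, target):
    """Return True if the contacted-cell profile is closer to the target profile
    than the uncontacted-cell profile is.

    Each profile is a sequence of expression values in the same gene order.
    Euclidean distance is an assumed similarity measure for this sketch.
    """
    if not (len(contacted) == len(uncontacted) == len(target)):
        raise ValueError("profiles must contain the same number of values")
    return math.dist(contacted, target) < math.dist(uncontacted, target)
```

With placeholder profiles `target = [1.0, 1.0]`, `contacted = [1.2, 0.9]`, and `uncontacted = [3.0, 0.2]`, the contacted profile lies closer to the target, so the compound would be evaluated favorably under this assumed measure.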

[5320] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 80091 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profiles is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
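
By way of illustration only, the metric described above (the length of the vector that is the difference between two profiles) and the selection of the most similar reference profile in step d) can be sketched as follows. The function names and the dictionary of named reference profiles are assumptions of the example.

```python
import math


def profile_distance(a, b):
    """Length of the difference vector between two profiles of equal dimension."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar_reference(subject, references):
    """Return the name of the reference profile with the smallest distance to the subject.

    `references` maps a hypothetical reference name to its profile vector.
    """
    if not references:
        raise ValueError("at least one reference profile is required")
    return min(references, key=lambda name: profile_distance(subject, references[name]))
```

Other routine statistical measures, such as a correlation coefficient, could be substituted for `profile_distance` without changing the selection step.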

[5321] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[5322] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 80091 expression.

[5323] 80091 Arrays and Uses Thereof

[5324] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to an 80091 molecule (e.g., an 80091 nucleic acid or an 80091 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm2, and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5325] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to an 80091 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 80091. Each address of the subset can include a capture probe that hybridizes to a different region of an 80091 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for an 80091 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 80091 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 80091 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[5326] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[5327] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to an 80091 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 80091 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-80091 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[5328] In another aspect, the invention features a method of analyzing the expression of 80091. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of an 80091-molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[5329] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 80091. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 80091. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
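
By way of illustration only, the idea of grouping genes by their expression across samples can be sketched with a simple correlation screen. This is not one of the clustering methods named above (hierarchical, k-means, or Bayesian clustering); it is a simpler stand-in that ranks genes by Pearson correlation with 80091. The gene labels other than 80091, the threshold, and the example values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length value lists (0.0 if either is constant)."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def candidate_coregulated(profiles, target="80091", threshold=0.9):
    """Return, sorted by name, the genes whose expression across samples correlates
    with the target gene at or above the (assumed) threshold.

    `profiles` maps a gene label to its expression values across the same samples.
    """
    reference = profiles[target]
    return sorted(
        gene for gene, values in profiles.items()
        if gene != target and pearson(reference, values) >= threshold
    )
```

With a sufficient number of diverse samples, the same matrix of values could instead be passed to a clustering routine of the kind the text contemplates.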

[5330] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 80091 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[5331] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.

[5332] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of an 80091-associated disease or disorder; and processes, such as a cellular transformation associated with an 80091-associated disease or disorder. The method can also evaluate the treatment and/or progression of an 80091-associated disease or disorder.

[5333] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 80091) that could serve as a molecular target for diagnosis or therapeutic intervention.

[5334] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon an 80091 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to an 80091 polypeptide or fragment thereof. For example, multiple variants of an 80091 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5335] The polypeptide array can be used to detect an 80091 binding compound, e.g., an antibody in a sample from a subject with specificity for an 80091 polypeptide or the presence of an 80091-binding protein or ligand.

[5336] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 80091 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[5337] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 80091 or from a cell or subject in which an 80091 mediated response has been elicited, e.g., by contact of the cell with 80091 nucleic acid or protein, or administration to the cell or subject of 80091 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 80091 (or does not express as highly as in the case of the 80091 positive plurality of capture probes) or from a cell or subject in which an 80091 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which is preferably other than an 80091 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[5338] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, contacting the array with a first sample from a cell or subject which expresses or mis-expresses 80091 or from a cell or subject in which an 80091-mediated response has been elicited, e.g., by contact of the cell with 80091 nucleic acid or protein, or administration to the cell or subject of 80091 nucleic acid or protein; providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, and contacting the array with a second sample from a cell or subject which does not express 80091 (or does not express as highly as in the case of the 80091 positive plurality of capture probes) or from a cell or subject in which an 80091 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[5339] In another aspect, the invention features a method of analyzing 80091, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing an 80091 nucleic acid or amino acid sequence; and comparing the 80091 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 80091.

[5340] Detection of 80091 Variations or Mutations

[5341] The methods of the invention can also be used to detect genetic alterations in an 80091 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 80091 protein activity or nucleic acid expression, such as an erythroid-associated disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding an 80091-protein, or the mis-expression of the 80091 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from an 80091 gene; 2) an addition of one or more nucleotides to an 80091 gene; 3) a substitution of one or more nucleotides of an 80091 gene, 4) a chromosomal rearrangement of an 80091 gene; 5) an alteration in the level of a messenger RNA transcript of an 80091 gene, 6) aberrant modification of an 80091 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of an 80091 gene, 8) a non-wild type level of an 80091-protein, 9) allelic loss of an 80091 gene, and 10) inappropriate post-translational modification of an 80091-protein.

[5342] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 80091-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to an 80091 gene under conditions such that hybridization and amplification of the 80091-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[5343] In another embodiment, mutations in an 80091 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA are isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment length sizes are determined, e.g., by gel electrophoresis, and compared. Differences in fragment length sizes between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
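
By way of illustration only, the comparison of restriction fragment lengths between sample and control DNA can be simulated in silico. The sketch below models a linear, single-stranded sequence string and a single enzyme; the function names and the short example sequences are hypothetical placeholders and are not 80091 sequences. EcoRI (recognition site GAATTC, cut after the first G) is used as a familiar example enzyme.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Return the lengths of fragments produced by cutting `sequence` at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site.
    """
    lengths = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        if cut > start:
            lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    if len(sequence) > start:
        lengths.append(len(sequence) - start)
    return lengths


def patterns_differ(sample, control, site="GAATTC", cut_offset=1):
    """True if the sorted fragment lengths of sample and control are not identical."""
    return (sorted(fragment_lengths(sample, site, cut_offset))
            != sorted(fragment_lengths(control, site, cut_offset)))
```

In this simplified model, a substitution that destroys the recognition site in the sample yields a single uncut fragment where the control yields two, which corresponds to a difference in fragment length sizes of the kind described above.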

[5344] In other embodiments, genetic mutations in 80091 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of an 80091 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of an 80091 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 80091 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
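
By way of illustration only, the use of sequential overlapping probes to locate a base change can be sketched as follows. The sketch idealizes hybridization as an exact string match between a control-derived probe and the sample at the same offset, and assumes that sample and control have equal length and differ by a single substitution; these simplifications, the function name, and the example sequences are assumptions of the example.

```python
def localize_change(control, sample, probe_length):
    """Return the positions (0-based) consistent with a single base change.

    Overlapping probes of `probe_length` are tiled along the control at every
    offset. A probe "fails" when the sample differs from it at that offset. The
    changed base must lie within every failing probe, so the result is the
    intersection of the failing probes' spans (empty if no probe fails, or if
    the failures are too far apart to share a single position).
    """
    if len(control) != len(sample):
        raise ValueError("control and sample must have the same length")
    failing = [
        offset
        for offset in range(len(control) - probe_length + 1)
        if control[offset:offset + probe_length] != sample[offset:offset + probe_length]
    ]
    if not failing:
        return []
    first_shared = max(failing)                   # start of the last failing probe
    last_shared = min(failing) + probe_length - 1  # end of the first failing probe
    return list(range(first_shared, last_shared + 1))
```

With the placeholder sequences `control = "ACGTACGTAC"` and `sample = "ACGTTCGTAC"` and four-base probes, the probes at offsets 1 through 4 fail, and the only position common to all of them is position 4.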

[5345] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 80091 gene and detect mutations by comparing the sequence of the sample 80091 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.

[5346] Other methods for detecting mutations in the 80091 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85: 4397; Saleeba et al. (1992) Methods Enzymol. 217: 286-295).

[5347] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 80091 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[5348] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 80091 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766, see also Cotton (1993) Mutat. Res. 285: 125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9: 73-79). Single-stranded DNA fragments of sample and control 80091 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7: 5).

[5349] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[5350] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[5351] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88: 189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[5352] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 65%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to an 80091 nucleic acid.
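As a rough illustration of how a percent-complementarity figure such as those listed above might be computed, the following sketch counts the positions at which an oligonucleotide pairs with a target site of equal length. The function name `percent_complementarity` and the example sequences are hypothetical and are not taken from any sequence of the invention; gaps and non-Watson-Crick pairing are ignored in this simplified model.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(oligo: str, target_site: str) -> float:
    """Percent of positions where `oligo` pairs with `target_site`.

    Both sequences are written 5'->3'. The oligo anneals antiparallel to the
    target, so it is compared against the reversed target site. The two
    sequences must have the same, non-zero length (an assumption of this sketch).
    """
    oligo, target_site = oligo.upper(), target_site.upper()
    if not oligo or len(oligo) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for o, t in zip(oligo, reversed(target_site)) if COMPLEMENT.get(o) == t
    )
    return 100.0 * paired / len(oligo)


# Hypothetical 10-mer target site and two candidate oligonucleotides.
target = "ATGCCGTAAC"
perfect = "GTTACGGCAT"  # exact reverse complement of the target site
one_off = "GTTACGGCAA"  # differs from `perfect` at the final base

print(percent_complementarity(perfect, target))  # 100.0
print(percent_complementarity(one_off, target))  # 90.0
```

Under this simplified measure, the second oligonucleotide would satisfy a 90% threshold but not a 92% threshold.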

[5353] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:94 or the complement of SEQ ID NO:94. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[5354] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 80091. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[5355] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
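The four-oligonucleotide set described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `interrogation_set`, the 0-based position convention, and the template sequence are hypothetical and do not correspond to any sequence of the invention.

```python
from typing import Dict


def interrogation_set(template: str, position: int) -> Dict[str, str]:
    """Return four oligos identical to `template` except at `position`.

    `position` is a 0-based index of the interrogation position. The result
    maps each base (A, C, G, T) to the oligo carrying that base there.
    """
    if not 0 <= position < len(template):
        raise ValueError("position is outside the template")
    return {
        base: template[:position] + base + template[position + 1:]
        for base in "ACGT"
    }


# Hypothetical 14-mer with the interrogation position at index 7.
probes = interrogation_set("GATTACAGATTACA", 7)
for base, sequence in probes.items():
    print(base, sequence)
```

Each oligonucleotide in the resulting set could then be given a distinguishable label or a separate array address, as the paragraph above describes.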

[5356] In a preferred embodiment the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, an 80091 nucleic acid.

[5357] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving an 80091 gene.

[5358] Use of 80091 Molecules as Surrogate Markers

[5359] The 80091 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 80091 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 80091 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[5360] The 80091 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., an 80091 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-80091 antibodies may be employed in an immune-based detection system for an 80091 protein marker, or 80091-specific radiolabeled probes may be used to detect an 80091 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. 
Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[5361] The 80091 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35: 1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA, or protein (e.g., 80091 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 80091 DNA may correlate with 80091 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[5362] Pharmaceutical Compositions of 80091

[5363] The nucleic acid and polypeptides, fragments thereof, as well as anti-80091 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[5364] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[5365] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[5366] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[5367] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[5368] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[5369] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[5370] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[5371] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[5372] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[5373] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
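The ratio LD50/ED50 defined above can be illustrated with a short calculation. This is a minimal sketch; the function name `therapeutic_index`, the compound names, and the dose values are hypothetical placeholders and do not represent measured data.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both in the same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
compounds = {
    "compound_A": (500.0, 5.0),
    "compound_B": (120.0, 40.0),
}

# Rank compounds from highest to lowest therapeutic index.
ranked = sorted(
    compounds,
    key=lambda name: therapeutic_index(*compounds[name]),
    reverse=True,
)

for name in ranked:
    print(name, therapeutic_index(*compounds[name]))
```

In this illustration the first compound, with the higher index, would be the preferred one under the criterion stated above.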

[5374] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
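One simple way an IC50 might be estimated from cell culture data is interpolation between the two tested concentrations that bracket 50% inhibition. The sketch below is an assumption-laden illustration, not the method of the invention: it assumes that maximal inhibition is 100%, that concentrations are positive and increasing, and that inhibition does not decrease with concentration. The function name `estimate_ic50` and the data points are hypothetical.

```python
import math
from typing import Sequence


def estimate_ic50(
    concentrations: Sequence[float], inhibition_pct: Sequence[float]
) -> float:
    """Estimate IC50 by linear interpolation on a log10 concentration axis.

    Finds the first pair of adjacent data points whose percent inhibition
    brackets 50% and interpolates between them.
    """
    if len(concentrations) != len(inhibition_pct):
        raise ValueError("inputs must have the same length")
    points = list(zip(concentrations, inhibition_pct))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical dose-response data (concentration in micromolar).
conc = [0.1, 1.0, 10.0, 100.0]
inhibition = [10.0, 30.0, 70.0, 90.0]

ic50 = estimate_ic50(conc, inhibition)
print(f"estimated IC50: {ic50:.2f} micromolar")
```

In practice a fitted dose-response model would usually be preferred over two-point interpolation; the interpolation is used here only because it keeps the arithmetic visible.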

[5375] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
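The body-weight-based ranges above translate into total amounts by simple multiplication, as the following sketch shows. It is an arithmetic illustration only: the tier labels, the function name `dose_range_mg`, and the 70 kg body weight are hypothetical, and only four of the ranges recited above are included.

```python
from typing import Dict, Tuple

# Dosage ranges in mg per kg body weight, taken from the paragraph above.
# The tier labels are hypothetical shorthand for this sketch.
DOSE_RANGES_MG_PER_KG: Dict[str, Tuple[float, float]] = {
    "broad": (0.001, 30.0),
    "preferred": (0.01, 25.0),
    "more_preferred": (0.1, 20.0),
    "even_more_preferred": (1.0, 10.0),
}


def dose_range_mg(body_weight_kg: float, tier: str) -> Tuple[float, float]:
    """Return the (low, high) total amount in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[tier]
    return (low * body_weight_kg, high * body_weight_kg)


# Example: a hypothetical 70 kg subject.
low_mg, high_mg = dose_range_mg(70.0, "even_more_preferred")
print(f"{low_mg} mg to {high_mg} mg per administration")
```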

[5376] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14: 193).

[5377] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[5378] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[5379] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[5380] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[5381] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[5382] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[5383] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[5384] Methods of Treatment for 80091

[5385] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 80091 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[5386] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 80091 molecules of the present invention or 80091 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[5387] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 80091 expression or activity, by administering to the subject an 80091 or an agent which modulates 80091 expression or at least one 80091 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 80091 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 80091 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 80091 aberrance, for example, an 80091, 80091 agonist or 80091 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[5388] It is possible that some 80091 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[5389] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of hematopoietic disorders such as erythroid cell-associated disorders, cellular proliferative and/or differentiative disorders, neurological or brain disorders, metabolic disorders, angiogenic disorders, and endothelial cell disorders, as described above, as well as one or more of disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, or pain disorders.

[5390] The 80091 molecules can act as novel diagnostic targets and therapeutic agents for controlling one or more of disorders associated with bone metabolism, immune disorders, cardiovascular disorders, liver disorders, viral diseases, pain or metabolic disorders.

[5391] Aberrant expression and/or activity of 80091 molecules may mediate disorders associated with bone metabolism. “Bone metabolism” refers to direct or indirect effects in the formation or degeneration of bone structures, e.g., bone formation, bone resorption, etc., which may ultimately affect the concentrations in serum of calcium and phosphate. This term also includes activities mediated by the effects of 80091 molecules in bone cells, e.g., osteoclasts and osteoblasts, that may in turn result in bone formation and degeneration. For example, 80091 molecules may support different activities of bone resorbing osteoclasts such as the stimulation of differentiation of monocytes and mononuclear phagocytes into osteoclasts. Accordingly, 80091 molecules that modulate the production of bone cells can influence bone formation and degeneration, and thus may be used to treat bone disorders. Examples of such disorders include, but are not limited to, osteoporosis, osteodystrophy, osteomalacia, rickets, osteitis fibrosa cystica, renal osteodystrophy, osteosclerosis, anti-convulsant treatment, osteopenia, fibrogenesis-imperfecta ossium, secondary hyperparathyroidism, hypoparathyroidism, hyperparathyroidism, cirrhosis, obstructive jaundice, drug induced metabolism, medullary carcinoma, chronic renal disease, rickets, sarcoidosis, glucocorticoid antagonism, malabsorption syndrome, steatorrhea, tropical sprue, idiopathic hypercalcemia and milk fever.

[5392] The 80091 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy, such as atopic allergy.

[5393] Examples of disorders involving the heart or “cardiovascular disorder” include, but are not limited to, a disease, disorder, or state involving the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. Examples of such disorders include hypertension, atherosclerosis, coronary artery spasm, congestive heart failure, coronary artery disease, valvular disease, arrhythmias, and cardiomyopathies.

[5394] Disorders which may be treated or diagnosed by methods described herein include, but are not limited to, disorders associated with an accumulation in the liver of fibrous tissue, such as that resulting from an imbalance between production and degradation of the extracellular matrix accompanied by the collapse and condensation of preexisting fibers. The methods described herein can be used to diagnose or treat hepatocellular necrosis or injury induced by a wide variety of agents including processes which disturb homeostasis, such as an inflammatory process, tissue damage resulting from toxic injury or altered hepatic blood flow, and infections (e.g., bacterial, viral and parasitic). For example, the methods can be used for the early detection of hepatic injury, such as portal hypertension or hepatic fibrosis. In addition, the methods can be employed to detect liver fibrosis attributed to inborn errors of metabolism, for example, fibrosis resulting from a storage disorder such as Gaucher's disease (lipid abnormalities) or a glycogen storage disease, A1-antitrypsin deficiency; a disorder mediating the accumulation (e.g., storage) of an exogenous substance, for example, hemochromatosis (iron-overload syndrome) and copper storage diseases (Wilson's disease), disorders resulting in the accumulation of a toxic metabolite (e.g., tyrosinemia, fructosemia and galactosemia) and peroxisomal disorders (e.g., Zellweger syndrome). 
Additionally, the methods described herein may be useful for the early detection and treatment of liver injury associated with the administration of various chemicals or drugs, such as, for example, methotrexate, isoniazid, oxyphenisatin, methyldopa, chlorpromazine, tolbutamide or alcohol, or which represents a hepatic manifestation of a vascular disorder such as obstruction of either the intrahepatic or extrahepatic bile flow or an alteration in hepatic circulation resulting, for example, from chronic heart failure, veno-occlusive disease, portal vein thrombosis or Budd-Chiari syndrome.

[5395] Additionally, 80091 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 80091 activity could be used to control viral diseases. The modulators can be used in the treatment and/or diagnosis of viral infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 80091 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[5396] Additionally, 80091 may play an important role in the regulation of metabolism or pain disorders. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes. Examples of pain disorders include, but are not limited to, pain response elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia, usually referred to as hyperalgesia (described in, for example, Fields, H. L. (1987) Pain, New York: McGraw-Hill); pain associated with musculoskeletal disorders, e.g., joint pain; tooth pain; headaches; pain associated with surgery; pain related to irritable bowel syndrome; or chest pain.

[5397] As discussed, successful treatment of 80091 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity can be used in accordance with the invention to prevent and/or ameliorate symptoms of 80091 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[5398] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[5399] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[5400] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 80091 expression is through the use of aptamer molecules specific for 80091 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1: 32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 80091 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[5401] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 80091 disorders. For a description of antibodies, see the Antibody section above.

[5402] In circumstances wherein injection of an animal or a human subject with an 80091 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 80091 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 80091 protein. Vaccines directed to a disease characterized by 80091 expression may also be generated in this fashion.

[5403] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[5404] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 80091 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[5405] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography. Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 80091 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7: 89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2: 166-173. 
Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 80091 can be readily monitored and used in calculations of IC50.
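
The IC50 calculation referred to above can be illustrated with a minimal sketch. It assumes that fractional inhibition has been measured at a few ascending concentrations and interpolates linearly on a log10 concentration axis between the two points that bracket 50% inhibition. The function name `estimate_ic50` and the data values are hypothetical and are not taken from this disclosure; a fitted dose-response curve would normally be preferred where enough data points are available.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by linear interpolation on a log10 concentration axis.

    concentrations: ascending, positive concentrations (any consistent unit).
    inhibitions: fractional inhibition (0.0 to 1.0) measured at each concentration.
    Returns None if the data never cross 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        # Look for the first adjacent pair of measurements that brackets 50%.
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data, for illustration only.
conc = [0.1, 1.0, 10.0, 100.0]
inhib = [0.05, 0.25, 0.75, 0.95]
ic50 = estimate_ic50(conc, inhib)
print(f"Estimated IC50: {ic50:.3g}")
```

Interpolating on the log axis reflects the usual practice of spacing test concentrations logarithmically; the estimate is only as good as the two bracketing measurements.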

[5406] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67: 2142-2144.

[5407] Another aspect of the invention pertains to methods of modulating 80091 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with an 80091 molecule or an agent that modulates one or more of the activities of 80091 protein associated with the cell. An agent that modulates 80091 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of an 80091 protein (e.g., an 80091 substrate or receptor), an 80091 antibody, an 80091 agonist or antagonist, a peptidomimetic of an 80091 agonist or antagonist, or other small molecule.

[5408] In one embodiment, the agent stimulates one or more 80091 activities. Examples of such stimulatory agents include active 80091 protein and a nucleic acid molecule encoding 80091. In another embodiment, the agent inhibits one or more 80091 activities. Examples of such inhibitory agents include antisense 80091 nucleic acid molecules, anti-80091 antibodies, and 80091 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of an 80091 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 80091 expression or activity. In another embodiment, the method involves administering an 80091 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 80091 expression or activity.

[5409] Stimulation of 80091 activity is desirable in situations in which 80091 is abnormally downregulated and/or in which increased 80091 activity is likely to have a beneficial effect. Likewise, inhibition of 80091 activity is desirable in situations in which 80091 is abnormally upregulated and/or in which decreased 80091 activity is likely to have a beneficial effect.

[5410] 80091 Pharmacogenomics

[5411] The 80091 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 80091 activity (e.g., 80091 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 80091 associated disorders (e.g., an erythroid-associated disorder) associated with aberrant or unwanted 80091 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer an 80091 molecule or 80091 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with an 80091 molecule or 80091 modulator.

[5412] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as single factors altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[5413] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
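
As an illustration of grouping individuals into genetic categories by SNP pattern, the following is a minimal sketch. It assumes genotype calls are already available per subject and per marker; the subject names, marker names, and the function `group_by_snp_pattern` are hypothetical and serve only to show the grouping step.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group subjects that share the same genotype calls across the given markers.

    genotypes: dict mapping subject -> dict mapping marker id -> genotype call.
    marker_ids: ordered list of marker ids that define the pattern.
    Returns a dict mapping pattern (tuple of calls) -> list of subjects.
    """
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(subject)
    return dict(groups)


# Hypothetical genotype calls, for illustration only.
genotypes = {
    "subject_1": {"snp_a": "AA", "snp_b": "CT"},
    "subject_2": {"snp_a": "AG", "snp_b": "CT"},
    "subject_3": {"snp_a": "AA", "snp_b": "CT"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, members in groups.items():
    print(pattern, members)
```

Exact-pattern grouping is the simplest possible choice; with many markers, a similarity-based clustering would likely be needed instead, since few individuals share an identical pattern.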

[5414] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., an 80091 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.
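
The comparison at the heart of the candidate gene approach can be sketched as a tally of drug response by gene variant. The variant labels, the records, and the function `response_rate_by_variant` below are hypothetical. The sketch only computes per-variant response rates; a real study would additionally apply a statistical test of association before drawing any conclusion.

```python
def response_rate_by_variant(records):
    """Compute the fraction of responders for each gene variant.

    records: iterable of (variant, responded) pairs, where responded is a bool.
    Returns a dict mapping variant -> response rate between 0.0 and 1.0.
    """
    counts = {}
    for variant, responded in records:
        total, hits = counts.get(variant, (0, 0))
        counts[variant] = (total + 1, hits + (1 if responded else 0))
    return {variant: hits / total for variant, (total, hits) in counts.items()}


# Hypothetical observations, for illustration only.
records = [
    ("variant_1", True), ("variant_1", True), ("variant_1", False),
    ("variant_2", False), ("variant_2", True), ("variant_2", False), ("variant_2", False),
]
rates = response_rate_by_variant(records)
print(rates)
```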

[5415] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., an 80091 molecule or 80091 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[5416] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with an 80091 molecule or 80091 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[5417] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 80091 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 80091 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[5418] Monitoring the influence of agents (e.g., drugs) on the expression or activity of an 80091 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 80091 gene expression, protein levels, or upregulate 80091 activity, can be monitored in clinical trials of subjects exhibiting decreased 80091 gene expression, protein levels, or downregulated 80091 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 80091 gene expression, protein levels, or downregulate 80091 activity, can be monitored in clinical trials of subjects exhibiting increased 80091 gene expression, protein levels, or upregulated 80091 activity. In such clinical trials, the expression or activity of an 80091 gene, and preferably, other genes that have been implicated in, for example, an 80091-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[5419] 80091 Informatics

[5420] The sequence of an 80091 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains an 80091 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 80091 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequences, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[5421] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[5422] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[5423] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
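
The two-table layout described above can be sketched as follows. The sketch uses SQLite through the Python standard library rather than Sybase or Oracle, purely so that it is self-contained. The table names, column names, and the short placeholder sequence are assumptions made for illustration and are not an actual 80091 sequence.

```python
import sqlite3

# In-memory database; a production system would use a persistent server database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequences (
        seq_id   TEXT PRIMARY KEY,   -- identifier for the sequence
        residues TEXT NOT NULL       -- nucleic acid or amino acid sequence
    );
    CREATE TABLE annotations (
        seq_id     TEXT NOT NULL REFERENCES sequences(seq_id),
        descriptor TEXT NOT NULL,    -- annotation text
        start_pos  INTEGER NOT NULL, -- initial position (1-based)
        end_pos    INTEGER NOT NULL  -- ultimate position (1-based, inclusive)
    );
    """
)

# Placeholder data, for illustration only.
conn.execute("INSERT INTO sequences VALUES (?, ?)", ("demo_seq", "ATGGCCAAGTAA"))
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?, ?)",
    [
        ("demo_seq", "example coding region", 1, 12),
        ("demo_seq", "start codon", 1, 3),
    ],
)


def annotated_regions(connection, seq_id):
    """Return (descriptor, subsequence) for each annotation on one sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotations a JOIN sequences s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ? ORDER BY a.start_pos, a.end_pos",
        (seq_id,),
    )
    return rows.fetchall()


regions = annotated_regions(conn, "demo_seq")
print(regions)
```

Keeping annotations in a separate table keyed by the sequence identifier lets any number of annotations refer to the same sequence without duplicating the sequence text.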

[5424] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.
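
A search for a target motif within stored sequence information can be sketched with a regular expression. This is a simple pattern scan and not a BLAST search; the function `find_motif`, the peptide string, and the simplified N-x-S/T pattern are assumptions chosen for illustration.

```python
import re


def find_motif(sequence, motif_regex):
    """Return (position, matched_text) for every match, including overlapping ones.

    Positions are 1-based, matching the usual convention for sequence coordinates.
    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = re.compile(f"(?=({motif_regex}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical peptide and a simplified N-x-S/T pattern (x is any residue except P).
hits = find_motif("MKNGSAANVTQ", "N[^P][ST]")
print(hits)
```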

[5425] Thus, in one aspect, the invention features a method of analyzing 80091, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing an 80091 nucleic acid or amino acid sequence; comparing the 80091 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 80091. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[5426] The method can include evaluating the sequence identity between an 80091 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
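
Evaluating sequence identity can be sketched for two sequences that have already been aligned. The sketch assumes gaps are written as "-" and excludes gapped columns from the denominator; other conventions for the denominator exist, so this is one possible definition rather than the only one. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned, non-gap columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned sequences, for illustration only.
identity = percent_identity("ACGT-ACGT", "ACGTTACCT")
print(f"{identity:.1f}% identity")
```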

[5427] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
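
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch assumes residues are independent and uniformly distributed and scans a single strand only; these are simplifications, and the database size used is an arbitrary illustrative figure.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance exact matches of one target in a random database.

    Assumes independent, uniformly distributed residues (a simplification).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * (1.0 / alphabet_size) ** target_length


# Nucleotide targets (alphabet of 4) in a hypothetical database of one million bases.
for length in (6, 12, 30):
    print(length, expected_random_hits(length, 1_000_000, 4))
```

Under these assumptions a six-nucleotide target is expected to occur by chance a few hundred times in a million bases, whereas a thirty-nucleotide target is essentially never expected to occur by chance.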

[5428] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[5429] Thus, the invention features a method of making a computer readable record of an 80091 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment, the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.
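
One element of such a record, identification of an ORF, can be sketched as a scan of the forward strand for an ATG start codon followed by an in-frame stop codon. The function `find_orfs`, the minimum-length parameter, the record layout, and the short placeholder sequence are assumptions for illustration; the sketch ignores the reverse strand and alternative start codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Find forward-strand ORFs as (start, end) 1-based inclusive coordinates.

    An ORF here runs from the first ATG in a frame through the next in-frame
    stop codon; min_codons counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3))
                start = None
    return sorted(orfs)


# Placeholder sequence, for illustration only; not an actual 80091 sequence.
sequence = "CCATGGCCAAGTAAGG"
record = {"sequence": sequence, "orfs": find_orfs(sequence)}
print(record)
```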

[5430] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing an 80091 sequence, or record, in machine-readable form; comparing a second sequence to the 80091 sequence; thereby analyzing a sequence. Comparison can include comparing the sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 80091 sequence includes a sequence being compared. In a preferred embodiment, the 80091 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 80091 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment, the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; the 5′ end of the translated region.

[5431] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, wherein the method comprises the steps of determining 80091 sequence information associated with the subject and based on the 80091 sequence information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[5432] The invention further provides in an electronic system and/or in a network, a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to a disease associated with an 80091, wherein the method comprises the steps of determining 80091 sequence information associated with the subject, and based on the 80091 sequence information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, comparing the 80091 sequence of the subject to the 80091 sequences in the database to thereby determine whether the subject has an 80091-associated disease or disorder, or a pre-disposition for such.

[5433] The present invention also provides in a network, a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, said method comprising the steps of receiving 80091 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 80091 and/or corresponding to an 80091-associated disease or disorder (e.g., an erythroid-associated disorder), and based on one or more of the phenotypic information, the 80091 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5434] The present invention also provides a method for determining whether a subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder, said method comprising the steps of receiving information related to 80091 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 80091 and/or related to an 80091-associated disease or disorder, and based on one or more of the phenotypic information, the 80091 information, and the acquired information, determining whether the subject has an 80091-associated disease or disorder or a pre-disposition to an 80091-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5435] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

Background of the 46508 Invention

[5436] Peptidyl-tRNA hydrolases are important cellular enzymes that cleave peptidyl-tRNA molecules that are prematurely released from ribosomes as a result of abortive translation events. The esterase activity of peptidyl-tRNA hydrolases cleaves the covalent bond between the nascent peptide and the tRNA. Such cleavage results in the recycling of tRNA. The peptidyl-tRNA hydrolase from E. coli is well characterized, and homologous proteins are found in many eubacterial species. In E. coli, the gene encoding peptidyl-tRNA hydrolase is essential. Further, the level of peptidyl-tRNA hydrolase activity required for viability is increased under conditions that increase premature translational termination, such as exposure to antibiotics (Menninger and Coleman (1993) Antimicrob. Agents Chemother. 37:2027-2029), and reduced when tRNAs particularly prone to dissociate from the ribosome are supplied in excess (Heurgue-Hamard et al. (1996) EMBO J. 15:2826-2833).

[5437] The x-ray crystal structure of E. coli peptidyl-tRNA hydrolase was determined at high resolution (Schmitt et al. (1997) EMBO J. 16:4760-4769). The monomeric protein contains a single α/β globular domain of seven β-strands and six α-helices. The peptidyl-tRNA hydrolase enzyme structure has structural similarity to an aminopeptidase from Aeromonas proteolytica (GenPept:640150) (Chevrier, B. et al. (1994) Structure 2:283-291) and, to a lesser extent, to bovine purine nucleoside phosphorylase (GenPept:2624420) (Koellner, G. et al. (1997) J. Mol. Biol. 265:202-216). Genetic data and structural analysis indicate that three residues, asparagine 10, histidine 20, and aspartic acid 93 in the E. coli enzyme, are critical residues for catalysis. In addition, asparagine 68 and asparagine 114 of the E. coli enzyme are poised to make favorable electrostatic contacts with the peptide region of the peptidyl-tRNA substrate, whereas arginine 133 of the E. coli sequence may contact the tRNA portion of the substrate. The amino acid identities of these positions are conserved in alignments of eubacterial peptidyl-tRNA hydrolases.

Summary of the 46508 Invention

[5438] The present invention is based, in part, on the discovery of a novel peptidyl-tRNA hydrolase, referred to herein as “46508”. The nucleotide sequence of a cDNA encoding 46508 is shown in SEQ ID NO:101, and the amino acid sequence of a 46508 polypeptide is shown in SEQ ID NO:102. In addition, the nucleotide sequence of the coding region is depicted in SEQ ID NO:103.

[5439] Accordingly, in one aspect, the invention features a nucleic acid molecule that encodes a 46508 protein or polypeptide, e.g., a biologically active portion of the 46508 protein. In a preferred embodiment, the isolated nucleic acid molecule encodes a polypeptide having the amino acid sequence of SEQ ID NO:102. In other embodiments, the invention provides isolated 46508 nucleic acid molecules having the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. In still other embodiments, the invention provides nucleic acid molecules that are substantially identical (e.g., naturally occurring allelic variants) to the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. In other embodiments, the invention provides a nucleic acid molecule which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:101 or SEQ ID NO:103, wherein the nucleic acid encodes a full length 46508 protein or an active fragment thereof.

[5440] In a related aspect, the invention further provides nucleic acid constructs that include a 46508 nucleic acid molecule described herein. In certain embodiments, the nucleic acid molecules of the invention are operatively linked to native or heterologous regulatory sequences. Also included are vectors and host cells containing the 46508 nucleic acid molecules of the invention, e.g., vectors and host cells suitable for producing 46508 nucleic acid molecules and polypeptides.

[5441] In another related aspect, the invention provides nucleic acid fragments suitable as primers or hybridization probes for the detection of 46508-encoding nucleic acids.

[5442] In still another related aspect, isolated nucleic acid molecules that are antisense to a 46508 encoding nucleic acid molecule are provided.

[5443] In another aspect, the invention features 46508 polypeptides, and biologically active or antigenic fragments thereof, that are useful, e.g., as reagents or targets in assays applicable to treatment and diagnosis of 46508-mediated or -related disorders. In another embodiment, the invention provides 46508 polypeptides having a 46508 activity. Preferred polypeptides are 46508 proteins including at least one peptidyl-tRNA hydrolase domain, and preferably, having a 46508 activity, e.g., a 46508 activity as described herein.

[5444] In other embodiments, the invention provides 46508 polypeptides, e.g., a 46508 polypeptide having the amino acid sequence shown in SEQ ID NO:102; an amino acid sequence that is substantially identical to the amino acid sequence shown in SEQ ID NO:102; or an amino acid sequence encoded by a nucleic acid molecule having a nucleotide sequence which hybridizes under a stringency condition described herein to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:101 or SEQ ID NO:103, wherein the nucleic acid encodes a full length 46508 protein or an active fragment thereof.

[5445] In a related aspect, the invention further provides nucleic acid constructs which include a 46508 nucleic acid molecule described herein.

[5446] In a related aspect, the invention provides 46508 polypeptides or fragments operatively linked to non-46508 polypeptides to form fusion proteins.

[5447] In another aspect, the invention features antibodies and antigen-binding fragments thereof that react with, or more preferably specifically bind, 46508 polypeptides or fragments thereof (e.g., a peptidyl-tRNA hydrolase domain). In one embodiment, the antibodies or antigen-binding fragments thereof competitively inhibit the binding of a second antibody to a 46508 polypeptide or a fragment thereof.

[5448] In another aspect, the invention provides methods of screening for compounds that modulate the expression or activity of the 46508 polypeptides or nucleic acids.

[5449] In still another aspect, the invention provides a process for modulating 46508 polypeptide or nucleic acid expression or activity, e.g. using the screened compounds. In certain embodiments, the methods involve treatment of conditions related to aberrant activity or expression of the 46508 polypeptides or nucleic acids, such as conditions, e.g., disorders or diseases, involving aberrant or deficient cellular proliferation or differentiation, e.g., ovarian, colon, lung primary or metastatic tumors; cardiovascular disorders, e.g., endothelial cell, or smooth muscle cell, disorders; and neural, e.g., brain, disorders.

[5450] In yet another aspect, the invention provides methods for inhibiting the aberrant activity of a 46508-expressing cell, e.g., a hyper-proliferative 46508-expressing cell. The method includes contacting the cell with an agent, e.g., a compound (e.g., a compound identified using the methods described herein), that modulates the activity, or expression, of the 46508 polypeptide or nucleic acid.

[5451] In a preferred embodiment, the contacting step is effected in vitro or ex vivo. In other embodiments, the contacting step is effected in vivo, e.g., in a subject (e.g., a mammal, e.g., a human), as part of a therapeutic or prophylactic protocol.

[5452] In a preferred embodiment, the cell is a hyperproliferative cell, e.g., a cell found in a solid tumor, a soft tissue tumor, or a metastatic lesion, e.g., an ovarian, colon, lung primary or metastatic tumor. In other embodiments, the cell is a cardiovascular cell, e.g., an endothelial cell, or smooth muscle cell; or a neural, e.g., brain, cell.

[5453] In a preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 46508 polypeptide. For example, the agent inhibits the proliferation, or induces the killing of a 46508-expressing cell. Preferably, the inhibitor is chosen from a peptide, a phosphopeptide, a small organic molecule, a small inorganic molecule and an antibody (e.g., an antibody conjugated to a therapeutic moiety selected from a cytotoxin, a cytotoxic agent and a radioactive metal ion). In another preferred embodiment, the agent, e.g., the compound, is an inhibitor of a 46508 nucleic acid, e.g., an antisense, a ribozyme, or a triple helix molecule.

[5454] In a preferred embodiment, the agent, e.g., the compound, is administered in combination with a cytotoxic agent. Examples of cytotoxic agents include an anti-microtubule agent, a topoisomerase I inhibitor, a topoisomerase II inhibitor, an anti-metabolite, a mitotic inhibitor, an alkylating agent, an intercalating agent, an agent capable of interfering with a signal transduction pathway, an agent that promotes apoptosis or necrosis, and radiation.

[5455] In another aspect, the invention features methods for treating or preventing a disorder characterized by aberrant activity, e.g., cellular proliferation or differentiation, of a 46508-expressing cell, in a subject. Preferably, the method includes administering to the subject (e.g., a mammal, e.g., a human) an effective amount of a compound (e.g., a compound identified using the methods described herein) that modulates the activity, or expression, of the 46508 polypeptide or nucleic acid. In a preferred embodiment, the disorder is a cancerous or pre-cancerous condition.

[5456] In a further aspect, the invention provides methods for evaluating the efficacy of a treatment of a disorder, e.g., a proliferative, cardiovascular, or a neural disorder. The method includes: treating a subject, e.g., a patient or an animal, with a protocol under evaluation (e.g., treating a subject with one or more of: chemotherapy, radiation, and/or a compound identified using the methods described herein); and evaluating the expression of a 46508 nucleic acid or polypeptide before and after treatment. A change, e.g., a decrease or increase, in the level of a 46508 nucleic acid (e.g., mRNA) or polypeptide after treatment, relative to the level of expression before treatment, is indicative of the efficacy of the treatment of the disorder. The level of 46508 nucleic acid or polypeptide expression can be detected by any method described herein.

[5457] In a preferred embodiment, the evaluating step includes obtaining a sample (e.g., a tissue sample, e.g., a biopsy, or a fluid sample) from the subject before and after treatment, and comparing the level of expression of a 46508 nucleic acid (e.g., mRNA) or polypeptide before and after treatment.
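The before-and-after comparison described above can be sketched as a small calculation. The following Python sketch is illustrative only: the function name, the use of a log2 ratio, and the requirement that both levels be positive values in the same units are assumptions, not part of the disclosure, which states only that a change (a decrease or an increase) in level after treatment relative to the level before treatment is indicative of efficacy.

```python
import math


def expression_change(before: float, after: float) -> dict:
    """Compare a 46508 expression level measured before and after treatment.

    Both levels are assumed to be positive values in the same arbitrary
    units (e.g., a normalized mRNA signal). The log2 ratio is an
    illustrative choice of summary statistic, not a prescribed one.
    """
    if before <= 0 or after <= 0:
        raise ValueError("expression levels must be positive")
    log2_ratio = math.log2(after / before)
    if log2_ratio > 0:
        direction = "increase"
    elif log2_ratio < 0:
        direction = "decrease"
    else:
        direction = "no change"
    return {"log2_ratio": log2_ratio, "direction": direction}
```

Whether a given change is large enough to be meaningful would depend on the assay and on replicate variability, which this sketch does not model.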

[5458] In another aspect, the invention provides methods for evaluating the efficacy of a therapeutic or prophylactic agent (e.g., an anti-neoplastic agent). The method includes: contacting a sample with an agent (e.g., a compound identified using the methods described herein, a cytotoxic agent) and, evaluating the expression of 46508 nucleic acid or polypeptide in the sample before and after the contacting step. A change, e.g., a decrease or increase, in the level of 46508 nucleic acid (e.g., mRNA) or polypeptide in the sample obtained after the contacting step, relative to the level of expression in the sample before the contacting step, is indicative of the efficacy of the agent. The level of 46508 nucleic acid or polypeptide expression can be detected by any method described herein. In a preferred embodiment, the sample includes cells obtained from a cancerous tissue.

[5459] In a further aspect, the invention provides assays for determining the presence or absence of a genetic alteration in a 46508 polypeptide or nucleic acid molecule, including for disease diagnosis.

[5460] In another aspect, the invention features a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence. At least one address of the plurality has a capture probe that recognizes a 46508 molecule. In one embodiment, the capture probe is a nucleic acid, e.g., a probe complementary to a 46508 nucleic acid sequence. In another embodiment, the capture probe is a polypeptide, e.g., an antibody specific for 46508 polypeptides. Also featured is a method of analyzing a sample by contacting the sample to the aforementioned array and detecting binding of the sample to the array.
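The array described above can be modeled as a simple data structure. The sketch below is a hypothetical illustration: the class name, function names, and probe labels are all invented here. It represents each address as a (row, column) pair, so that addresses are positionally distinguishable, and it enforces that no capture probe is used at more than one address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureProbe:
    name: str    # hypothetical probe label
    kind: str    # "nucleic acid" or "polypeptide"
    target: str  # molecule the probe is intended to recognize


def build_array(probes_by_address):
    """Return a dict mapping (row, column) addresses to capture probes.

    Raises ValueError if the same probe appears at two addresses, since
    each address is described as having a unique capture probe.
    """
    seen = set()
    array = {}
    for address, probe in probes_by_address.items():
        if probe in seen:
            raise ValueError(f"probe {probe.name!r} is not unique")
        seen.add(probe)
        array[address] = probe
    return array


def addresses_recognizing(array, target):
    """List, in sorted order, the addresses whose probe recognizes target."""
    return sorted(a for a, p in array.items() if p.target == target)


# Hypothetical 2 x 2 layout with two 46508-directed probes and one control.
example = build_array({
    (0, 0): CaptureProbe("probe_A", "nucleic acid", "46508"),
    (0, 1): CaptureProbe("probe_B", "polypeptide", "control"),
    (1, 0): CaptureProbe("probe_C", "polypeptide", "46508"),
})
```

Calling `addresses_recognizing(example, "46508")` returns the two addresses that carry 46508-directed probes.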

[5461] Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

Detailed Description of 46508

[5462] The human 46508 sequence (see SEQ ID NO:101, as recited in Example 58), which is approximately 1182 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 684 nucleotides, including the termination codon. The coding sequence encodes a 227 amino acid protein (see SEQ ID NO:102, as recited in Example 58). Human 46508 contains the following regions or other structural features:

[5463] a peptidyl-tRNA hydrolase domain (PFAM Accession PF01195) located at about amino acid residues 44 to 221 of SEQ ID NO:102;

[5464] two Protein Kinase C sites (PS00005) located at about amino acids 13 to 15, and 150 to 152 of SEQ ID NO:102;

[5465] two Casein Kinase II sites (PS00006) located at about amino acids 125 to 128, and 194 to 197 of SEQ ID NO:102;

[5466] seven N-myristoylation sites (PS00008) located at about amino acids 4 to 9, 17 to 22, 23 to 28, 53 to 58, 74 to 79, 149 to 154, and 156 to 161 of SEQ ID NO:102;

[5467] one amidation site (PS00009) located at about amino acids 40 to 43 of SEQ ID NO:102; and

[5468] one glycosaminoglycan attachment site (PS00002) located at about amino acids 3 to 6 of SEQ ID NO:102.
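The lengths and coordinates recited above can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only; the variable and function names are assumptions, and the "about" qualifiers in the text are treated as exact residue numbers purely for the purpose of the check.

```python
PROTEIN_LENGTH = 227  # amino acids, SEQ ID NO:102
CODING_LENGTH = 684   # nucleotides, including the termination codon

# (feature, start, end) with 1-based inclusive residue numbers from the text
FEATURES = [
    ("peptidyl-tRNA hydrolase domain (PF01195)", 44, 221),
    ("Protein Kinase C site (PS00005)", 13, 15),
    ("Protein Kinase C site (PS00005)", 150, 152),
    ("Casein Kinase II site (PS00006)", 125, 128),
    ("Casein Kinase II site (PS00006)", 194, 197),
    ("N-myristoylation site (PS00008)", 4, 9),
    ("N-myristoylation site (PS00008)", 17, 22),
    ("N-myristoylation site (PS00008)", 23, 28),
    ("N-myristoylation site (PS00008)", 53, 58),
    ("N-myristoylation site (PS00008)", 74, 79),
    ("N-myristoylation site (PS00008)", 149, 154),
    ("N-myristoylation site (PS00008)", 156, 161),
    ("amidation site (PS00009)", 40, 43),
    ("glycosaminoglycan attachment site (PS00002)", 3, 6),
]


def coding_length(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally plus a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


def features_out_of_range(features, protein_length):
    """Return features whose coordinates do not fit within the protein."""
    return [f for f in features if not (1 <= f[1] <= f[2] <= protein_length)]


def count_features(features, prefix):
    """Count features whose name starts with the given prefix."""
    return sum(1 for name, _, _ in features if name.startswith(prefix))
```

Under these assumptions, `coding_length(PROTEIN_LENGTH)` equals `CODING_LENGTH` (227 codons plus one termination codon), and every listed feature falls within residues 1 to 227.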

[5469] For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and http://www.psc.edu/general/software/packages/pfam/pfam.html.

[5470] A plasmid containing the nucleotide sequence encoding human 46508 (clone “Fbh46508FL”) was deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on ______ and assigned Accession Number ______. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

[5471] The 46508 protein contains a significant number of structural characteristics in common with members of the peptidyl-tRNA hydrolase family. The term “family” when referring to the protein and nucleic acid molecules of the invention means two or more proteins or nucleic acid molecules having a common structural domain or motif and having sufficient amino acid or nucleotide sequence homology as defined herein. Such family members can be naturally or non-naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of human origin as well as other distinct proteins of human origin, or alternatively, can contain homologues of non-human origin, e.g., rat or mouse proteins. Members of a family can also have common functional characteristics.

[5472] Peptidyl-tRNA hydrolases are a family of enzymes which hydrolyze ester linkages between the peptide moiety and the tRNA of peptidyl-tRNAs (Kössel, H (1969) Biochim. Biophys. Acta 204:191-202; Garcia-Villegas, M. R. (1991) EMBO J. 10:3549-3555). Such substrates are the products of premature release of nascent polypeptides from the ribosome. Cleavage of the bond between the peptide and tRNA allows the tRNA to return into the tRNA pool for subsequent charging and reuse. A conserved asparagine, a conserved histidine, and a conserved aspartic acid, residues 10, 20, and 93 of the E. coli peptidyl-tRNA hydrolase, respectively, are required to catalyze this cleavage reaction. The 46508 polypeptide (SEQ ID NO:102) has these conserved residues, respectively: an asparagine at position 51, a histidine at position 59, and an aspartic acid at position 134 of SEQ ID NO:102. In addition, two conserved asparagines and a conserved arginine, residues 68, 114, and 133 of the E. coli peptidyl-tRNA hydrolase, respectively, contribute to the specificity of substrate recognition. The 46508 polypeptide (SEQ ID NO:102) also has these conserved residues, as shown in the alignment in FIG. 43, namely, an asparagine at position 109, an asparagine at position 155, and an arginine at position 173 of SEQ ID NO:102.
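The correspondence between the E. coli residues and the 46508 residues recited above can be expressed as a lookup table and used to check a candidate sequence. The Python sketch below is illustrative only: the function names are assumptions, and the placeholder sequence it builds is an artificial all-alanine string with the six residues placed at the recited positions. It is not SEQ ID NO:102.

```python
# E. coli position -> (one-letter residue, 46508 position), from the text
CONSERVED = {
    10: ("N", 51),    # catalytic asparagine
    20: ("H", 59),    # catalytic histidine
    93: ("D", 134),   # catalytic aspartic acid
    68: ("N", 109),   # substrate recognition
    114: ("N", 155),  # substrate recognition
    133: ("R", 173),  # substrate recognition
}


def check_conserved(sequence, conserved=CONSERVED):
    """Return {position: (expected, found)} for each position that differs.

    Positions are 1-based. A position beyond the end of the sequence is
    reported with found=None.
    """
    mismatches = {}
    for expected, pos in conserved.values():
        found = sequence[pos - 1] if pos <= len(sequence) else None
        if found != expected:
            mismatches[pos] = (expected, found)
    return mismatches


def placeholder_sequence(length=227, conserved=CONSERVED):
    """Build an artificial all-alanine sequence carrying the six residues.

    This placeholder exists only to exercise check_conserved.
    """
    residues = ["A"] * length
    for expected, pos in conserved.values():
        residues[pos - 1] = expected
    return "".join(residues)
```

An empty result from `check_conserved` means all six positions carry the expected residue; applying it to an actual sequence would require that sequence's numbering to match SEQ ID NO:102.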

[5473] Cells which utilize the translation machinery more intensely than quiescent cells, e.g., rapidly growing cells, environmentally stressed cells, and virally infected cells, are likely to produce more peptidyl-tRNA substrates. Further, because of the increased translation activity, such cells also require larger tRNA pools than quiescent cells either in their cytoplasm or mitochondria, or both. Accordingly, the activity of peptidyl-tRNA hydrolase enzymes may be required by such cells. Thus, inhibition of 46508 activity might be a successful route to treatment of a variety of disorders, including, but not limited to, disorders of cell proliferation, cell differentiation, viral infection, and metabolism.

[5474] A 46508 polypeptide can include a “peptidyl-tRNA hydrolase domain” or regions homologous with a “peptidyl-tRNA hydrolase domain”. A 46508 polypeptide can optionally further include at least one glycosaminoglycan attachment site; at least one, preferably two, protein kinase C phosphorylation sites; at least one, preferably two, casein kinase II phosphorylation sites; at least one, two, three, four, five, six, preferably seven, N-myristoylation sites; and at least one amidation site.

[5475] As used herein, the term “peptidyl-tRNA hydrolase domain” includes an amino acid sequence of about 160 to 240 amino acid residues in length and having a bit score for the alignment of the sequence to the peptidyl-tRNA hydrolase domain profile (Pfam HMM) of at least 80. Preferably, the peptidyl-tRNA hydrolase domain has an amino acid sequence of about 170 to about 200 amino acids, more preferably about 170 to 190 amino acids, or about 177 amino acids, and has a bit score for the alignment of the sequence to the peptidyl-tRNA hydrolase domain (HMM) of at least 100, preferably of at least 120, more preferably of at least 130 or greater. Preferably, the peptidyl-tRNA hydrolase domain further includes the following highly conserved residues: one, preferably two, more preferably three asparagine residues, a histidine residue, an aspartic acid, and an arginine corresponding respectively to asparagine 51, asparagine 109, asparagine 155, histidine 59, aspartic acid 134, and arginine 173 of SEQ ID NO:102. The peptidyl-tRNA hydrolase domain (HMM) has been assigned the PFAM Accession (PF01195) (http://genome.wustl.edu/Pfam/html). An alignment of the peptidyl-tRNA hydrolase domain (from about amino acids 44 to about 221 of SEQ ID NO:102) of human 46508 with a consensus amino acid sequence derived from a hidden Markov model (PFAM) is depicted in FIG. 43.
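The nested length ranges and bit-score thresholds in this definition can be written out as a small lookup. The Python sketch below is illustrative only: the function names are assumptions, the "about" qualifiers are treated as hard cut-offs for simplicity, and no bit score for 46508 itself is implied, since the text does not recite one.

```python
# Bit-score thresholds recited in the definition, lowest to highest
SCORE_THRESHOLDS = (80, 100, 120, 130)

# Length ranges recited in the definition, broadest to narrowest (inclusive)
LENGTH_RANGES = ((160, 240), (170, 200), (170, 190))


def highest_score_threshold(bit_score):
    """Return the highest recited threshold met by bit_score, or None."""
    met = [t for t in SCORE_THRESHOLDS if bit_score >= t]
    return max(met) if met else None


def narrowest_length_range(length):
    """Return the narrowest recited range containing length, or None.

    The ranges are nested, so the last matching entry is the narrowest.
    """
    matching = [r for r in LENGTH_RANGES if r[0] <= length <= r[1]]
    return matching[-1] if matching else None


def meets_definition(length, bit_score):
    """True if both the broadest length range and the lowest threshold are met."""
    return (narrowest_length_range(length) is not None
            and highest_score_threshold(bit_score) is not None)
```

For example, the domain boundaries recited for human 46508 (residues 44 to 221) span 178 residues counted inclusively, which falls within the narrowest recited range of 170 to 190.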

[5476] In a preferred embodiment, a 46508 polypeptide or protein has a “peptidyl-tRNA hydrolase domain” or a region which includes at least about 120 to about 200 amino acids, more preferably about 160 to 190, 170 to 180, or about 177 amino acid residues, and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% homology with a “peptidyl-tRNA hydrolase domain,” e.g., the peptidyl-tRNA hydrolase domain of human 46508 (e.g., residues 44 to 221 of SEQ ID NO:102).

[5477] To identify the presence of a “peptidyl-tRNA hydrolase” domain in a 46508 protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against the Pfam database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (http://www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference. A search was performed against the HMM database resulting in the identification of a “peptidyl-tRNA hydrolase” domain in the amino acid sequence of human 46508 at about residues 44 to about 221 of SEQ ID NO:102 (see FIG. 43).
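The effect of the default threshold score (15) and of a lowered threshold (8 bits) can be illustrated by filtering a list of candidate hits. The Python sketch below does not call HMMER or reproduce its output format; it operates on plain tuples. The function name is an assumption, treating a score equal to the threshold as a hit is an assumption, and every score and family name in the example list other than the PF01195 accession and its 44 to 221 boundaries is invented for illustration.

```python
DEFAULT_THRESHOLD = 15.0  # default threshold score recited in the text
LOWERED_THRESHOLD = 8.0   # example of a lowered threshold recited in the text


def filter_hits(hits, threshold=DEFAULT_THRESHOLD):
    """Keep (family, start, end, score) tuples scoring at or above threshold.

    The result is ordered from the highest score to the lowest.
    """
    kept = [h for h in hits if h[3] >= threshold]
    return sorted(kept, key=lambda h: h[3], reverse=True)


# Hypothetical search results. The scores below are invented placeholders
# and are NOT results reported in this document.
hypothetical_hits = [
    ("PF01195", 44, 221, 120.0),
    ("hypothetical_family_X", 10, 40, 9.5),
    ("hypothetical_family_Y", 100, 130, 3.0),
]
```

With these placeholder values, the default threshold retains only the PF01195 entry, while the lowered threshold also admits `hypothetical_family_X`, showing how lowering the cut-off trades specificity for sensitivity.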

[5478] A 46508 family member can include at least one peptidyl-tRNA hydrolase domain or regions homologous with a peptidyl-tRNA hydrolase domain. Furthermore, a 46508 family member can include at least one, preferably two, protein kinase C phosphorylation sites (PS00005); at least one, preferably two, casein kinase II phosphorylation sites (PS00006); at least one, two, three, four, five, six, preferably seven, N-myristoylation sites; and at least one amidation site.

[5479] As the 46508 polypeptides of the invention may modulate 46508-mediated activities, they may be useful as, or for developing, novel diagnostic and therapeutic agents for 46508-mediated or related disorders, as described below.

[5480] As used herein, a “46508 activity”, “biological activity of 46508” or “functional activity of 46508”, refers to an activity exerted by a 46508 protein, polypeptide or nucleic acid molecule. For example, a 46508 activity can be an activity exerted by 46508 in a physiological milieu on, e.g., a 46508-responsive cell or on a 46508 substrate, e.g., a protein substrate. A 46508 activity can be determined in vivo or in vitro. In one embodiment, a 46508 activity is a direct activity, such as an association with a 46508 target molecule. A “target molecule” or “binding partner” is a molecule with which a 46508 protein binds or interacts in nature. In another embodiment, a 46508 activity can also be an indirect activity, e.g., a cellular signaling activity mediated by interaction of the 46508 protein with a second protein or with a nucleic acid.

[5481] The features of the 46508 molecules of the present invention can provide biological activities similar to those of peptidyl-tRNA hydrolase family members. For example, the 46508 proteins of the present invention can have one or more of the following activities: (1) ability to bind tRNA; (2) ability to bind peptide fragments; (3) ability to bind peptidyl-tRNAs; (4) ability to hydrolyze the covalent bond between peptide and tRNA within peptidyl-tRNAs; or (5) ability to modulate translational efficiency. The 46508 polypeptide may exhibit one or more of these activities in the milieu of the cell cytoplasm and/or of the cell mitochondria.

[5482] As shown in the Examples below, increased 46508 mRNA expression is detected in a variety of malignant and non-malignant tissues, including cardiovascular tissues (e.g., endothelial cells, coronary smooth muscle cells), pancreas, neural tissues (e.g., brain, hypothalamus, DRG), skin, immune, e.g., erythroid cells, as well as a number of primary and metastatic tumors, e.g., ovarian, breast, and lung tumors. Thus, the 46508 molecules can act as novel diagnostic targets and therapeutic agents for controlling disorders involving aberrant activity of those cells, e.g., cell proliferative disorders (e.g., cancer), cardiovascular disorders, neurological disorders, pancreatic disorders, skin and immune, e.g., erythroid, disorders.

[5483] High transcriptional expression of 46508 was observed in tumor samples compared to normal organ control samples. For example, high expression was observed in 5/5 primary ovarian tumor samples, 4/4 primary colon tumor samples, 2/2 colon to liver metastases, and 3/6 primary lung tumor samples. Additionally, high expression was observed in proliferating HMVEC cells when compared to arrested HMVEC cells. Therefore, 46508 may mediate or be involved in cellular proliferative and/or differentiative disorders.

[5484] Examples of cellular proliferative and/or differentiative disorders include cancer, e.g., carcinoma, sarcoma, metastatic disorders or hematopoietic neoplastic disorders, e.g., leukemias. A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast and liver origin.

[5485] As used herein, the terms “cancer”, “hyperproliferative” and “neoplastic” refer to cells having the capacity for autonomous growth. Examples of such cells include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative and neoplastic disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. “Pathologic hyperproliferative” cells occur in disease states characterized by malignant tumor growth. Examples of non-pathologic hyperproliferative cells include proliferation of cells associated with wound repair.

[5486] The terms “cancer” or “neoplasms” include malignancies of the various organ systems, such as affecting lung, breast, thyroid, lymphoid, gastrointestinal, and genito-urinary tract, as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine and cancer of the esophagus.

[5487] The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues including respiratory system carcinomas, gastrointestinal system carcinomas, genitourinary system carcinomas, testicular carcinomas, breast carcinomas, prostatic carcinomas, endocrine system carcinomas, and melanomas. Exemplary carcinomas include those forming from tissue of the cervix, lung, prostate, breast, head and neck, colon and ovary. The term also includes carcinosarcomas, e.g., which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

[5488] The term “sarcoma” is art recognized and refers to malignant tumors of mesenchymal derivation.

[5489] Examples of cellular proliferative and/or differentiative disorders of the breast include, but are not limited to, proliferative breast disease including, e.g., epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors, e.g., stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[5490] Examples of cellular proliferative and/or differentiative disorders of the lung include, but are not limited to, bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[5491] Examples of cellular proliferative and/or differentiative disorders of the colon include, but are not limited to, non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[5492] Examples of cellular proliferative and/or differentiative disorders of the liver include, but are not limited to, nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

[5493] Examples of cellular proliferative and/or differentiative disorders of the ovary include, but are not limited to, ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, adenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[5494] Examples of cancers or neoplastic conditions, in addition to the ones described above, include, but are not limited to, a fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastric cancer, esophageal cancer, rectal cancer, pancreatic cancer, ovarian cancer, prostate cancer, uterine cancer, cancer of the head and neck, skin cancer, brain cancer, squamous cell carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinoma, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, leukemia, lymphoma, or Kaposi sarcoma.

[5495] Additional examples of proliferative disorders include hematopoietic neoplastic disorders. As used herein, the term “hematopoietic neoplastic disorders” includes diseases involving hyperplastic/neoplastic cells of hematopoietic origin. A hematopoietic neoplastic disorder can arise from myeloid, lymphoid or erythroid lineages, or precursor cells thereof. Preferably, the diseases arise from poorly differentiated acute leukemias, e.g., erythroblastic leukemia and acute megakaryoblastic leukemia. Additional exemplary myeloid disorders include, but are not limited to, acute promyeloid leukemia (APML), acute myelogenous leukemia (AML) and chronic myelogenous leukemia (CML) (reviewed in Vaickus, L. (1991) Crit. Rev. in Oncol./Hematol. 11:267-97); lymphoid malignancies include, but are not limited to, acute lymphoblastic leukemia (ALL), which includes B-lineage ALL and T-lineage ALL, chronic lymphocytic leukemia (CLL), prolymphocytic leukemia (PLL), hairy cell leukemia (HLL) and Waldenstrom's macroglobulinemia (WM). Additional forms of malignant lymphomas include, but are not limited to, non-Hodgkin lymphoma and variants thereof, peripheral T cell lymphomas, adult T cell leukemia/lymphoma (ATL), cutaneous T-cell lymphoma (CTCL), large granular lymphocytic leukemia (LGF), Hodgkin's disease and Reed-Sternberg disease.

[5496] Moderate expression of 46508 was observed in normal heart tissue samples, and thus 46508 may mediate disorders involving the heart, e.g., cardiovascular disorders. As used herein, disorders involving the heart, or “cardiovascular disease” or a “cardiovascular disorder” includes a disease or disorder which affects the cardiovascular system, e.g., the heart, the blood vessels, and/or the blood. A cardiovascular disorder can be caused by an imbalance in arterial pressure, a malfunction of the heart, or an occlusion of a blood vessel, e.g., by a thrombus. A cardiovascular disorder includes, but is not limited to disorders such as arteriosclerosis, atherosclerosis, cardiac hypertrophy, ischemia reperfusion injury, restenosis, arterial inflammation, vascular wall remodeling, ventricular remodeling, rapid ventricular pacing, coronary microembolism, tachycardia, bradycardia, pressure overload, aortic bending, coronary artery ligation, vascular heart disease, valvular disease, including but not limited to, valvular degeneration caused by calcification, rheumatic heart disease, endocarditis, or complications of artificial valves; atrial fibrillation, long-QT syndrome, congestive heart failure, sinus node dysfunction, angina, heart failure, hypertension, atrial flutter, pericardial disease, including but not limited to, pericardial effusion and pericarditis; cardiomyopathies, e.g., dilated cardiomyopathy or idiopathic cardiomyopathy, myocardial infarction, coronary artery disease, coronary artery spasm, ischemic disease, arrhythmia, sudden cardiac death, and cardiovascular developmental disorders (e.g., arteriovenous malformations, arteriovenous fistulae, Raynaud's syndrome, neurogenic thoracic outlet syndrome, causalgia/reflex sympathetic dystrophy, hemangioma, aneurysm, cavernous angioma, aortic valve stenosis, atrial septal defects, atrioventricular canal, coarctation of the aorta, Ebstein's anomaly, hypoplastic left heart syndrome, interruption of the aortic arch, mitral valve prolapse, ductus arteriosus, patent foramen ovale, partial anomalous pulmonary venous return, pulmonary atresia with ventricular septal defect, pulmonary atresia without ventricular septal defect, persistence of the fetal circulation, pulmonary valve stenosis, single ventricle, total anomalous pulmonary venous return, transposition of the great vessels, tricuspid atresia, truncus arteriosus, ventricular septal defects). A cardiovascular disease or disorder also can include an endothelial cell disorder.

[5497] As used herein, an “endothelial cell disorder” includes a disorder characterized by aberrant, unregulated, or unwanted endothelial cell activity, e.g., proliferation, migration, angiogenesis, or vascularization; or aberrant expression of cell surface adhesion molecules or genes associated with angiogenesis, e.g., TIE-2, FLT and FLK. Endothelial cell disorders include tumorigenesis, tumor metastasis, psoriasis, diabetic retinopathy, endometriosis, Graves' disease, ischemic disease (e.g., atherosclerosis), and chronic inflammatory diseases (e.g., rheumatoid arthritis).

[5498] Additionally, 46508 molecules may play an important role in the etiology of certain viral diseases, including but not limited to Hepatitis B, Hepatitis C and Herpes Simplex Virus (HSV). Modulators of 46508 activity could be used to control viral diseases, particularly because viral pathogens subvert the host cell translation machinery in order to translate viral genes. The modulators can be used in the treatment and/or diagnosis of virally infected tissue or virus-associated tissue fibrosis, especially liver and liver fibrosis. Also, 46508 modulators can be used in the treatment and/or diagnosis of virus-associated carcinoma, especially hepatocellular cancer.

[5499] High to moderate expression of 46508 was observed in normal pancreas tissue, and thus, 46508 may mediate disorders involving the pancreas, e.g. pancreatic disorders. Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as, diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

[5500] Moderate to high expression of 46508 mRNA was observed in the normal brain cortex and hypothalamus. Therefore, 46508 may mediate disorders involving these tissues, e.g., disorders of the brain. Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states, global cerebral ischemia and focal cerebral ischemia, infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute 
sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B1) deficiency and vitamin B12 deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and 
brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

[5501] 46508 mRNA expression was also detected in the dorsal root ganglia (DRG). Therefore, 46508-associated disorders can detrimentally affect regulation and modulation of the pain response; and vasoconstriction, inflammatory response and pain therefrom. Examples of such disorders in which the 46508 molecules of the invention may be directly or indirectly involved include pain, pain syndromes, and inflammatory disorders, including inflammatory pain. Examples of pain conditions include, but are not limited to, pain elicited during various forms of tissue injury, e.g., inflammation, infection, and ischemia; pain associated with musculoskeletal disorders, e.g., joint pain, or arthritis; tooth pain; headaches, e.g., migraine; pain associated with surgery; pain related to inflammation, e.g., irritable bowel syndrome; chest pain; or hyperalgesia, e.g., excessive sensitivity to pain (described in, for example, Fields (1987) Pain, New York: McGraw-Hill). Other examples of pain disorders or pain syndromes include, but are not limited to, complex regional pain syndrome (CRPS), reflex sympathetic dystrophy (RSD), causalgia, neuralgia, central pain and dysesthesia syndrome, carotidynia, neurogenic pain, refractory cervicobrachial pain syndrome, myofascial pain syndrome, craniomandibular pain dysfunction syndrome, chronic idiopathic pain syndrome, Costen's pain-dysfunction, acute chest pain syndrome, nonulcer dyspepsia, interstitial cystitis, gynecologic pain syndrome, patellofemoral pain syndrome, anterior knee pain syndrome, recurrent abdominal pain in children, colic, low back pain syndrome, neuropathic pain, phantom pain from amputation, phantom tooth pain, or pain asymbolia (the inability to feel pain). Other examples of pain conditions include pain induced by parturition, or post partum pain.

[5502] Moderate to high expression of 46508 mRNA was also detected in the skin. Diseases of the skin include, but are not limited to, disorders of pigmentation and melanocytes, including but not limited to, vitiligo, freckle, melasma, lentigo, nevocellular nevus, dysplastic nevi, and malignant melanoma; benign epithelial tumors, including but not limited to, seborrheic keratoses, acanthosis nigricans, fibroepithelial polyp, epithelial cyst, keratoacanthoma, and adnexal (appendage) tumors; premalignant and malignant epidermal tumors, including but not limited to, actinic keratosis, squamous cell carcinoma, basal cell carcinoma, and Merkel cell carcinoma; tumors of the dermis, including but not limited to, benign fibrous histiocytoma, dermatofibrosarcoma protuberans, xanthomas, and dermal vascular tumors; tumors of cellular immigrants to the skin, including but not limited to, histiocytosis X, mycosis fungoides (cutaneous T-cell lymphoma), and mastocytosis; disorders of epidermal maturation, including but not limited to, ichthyosis; acute inflammatory dermatoses, including but not limited to, urticaria, acute eczematous dermatitis, and erythema multiforme; chronic inflammatory dermatoses, including but not limited to, psoriasis, lichen planus, and lupus erythematosus; blistering (bullous) diseases, including but not limited to, pemphigus, bullous pemphigoid, dermatitis herpetiformis, and noninflammatory blistering diseases: epidermolysis bullosa and porphyria; disorders of epidermal appendages, including but not limited to, acne vulgaris; panniculitis, including but not limited to, erythema nodosum and erythema induratum; and infection and infestation, such as verrucae, molluscum contagiosum, impetigo, superficial fungal infections, and arthropod bites, stings, and infestations.

[5503] Moderate expression of 46508 mRNA was detected in both normal and tumor breast tissue samples, and thus 46508 may mediate disorders of the breast. Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms.

[5504] Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

[5505] Normal and tumorous samples of ovarian tissue also showed expression of 46508 mRNA. Various data indicate that 46508 is highly expressed in several ovarian cell lines, including SKOV3/Var, A2780, MDA 2774 and ES-2. Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

[5506] Moderate expression of 46508 mRNA was also noted in normal colon tissue, and high expression was noted in colon tumor samples. Thus 46508 may mediate diseases involving the colon. Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

[5507] Moderate 46508 mRNA expression was also noted in normal and tumor prostate samples. Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma. As used herein, “a prostate disorder” refers to an abnormal condition occurring in the male pelvic region characterized by, e.g., male sexual dysfunction and/or urinary symptoms. This disorder may be manifested in the form of genitourinary inflammation (e.g., inflammation of smooth muscle cells) as in several common diseases of the prostate including prostatitis, benign prostatic hyperplasia and cancer, e.g., adenocarcinoma or carcinoma, of the prostate.

[5508] Moderate expression of 46508 mRNA was observed in fibrotic liver tissue samples. Thus, 46508 may be involved in disorders of the liver. Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α1-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and 
metastatic tumors.

[5509] Moderate 46508 expression was also noted in both normal lung and lung tumor samples, and thus 46508 may mediate disorders of the lung, e.g., lung disorders. Examples of disorders of the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

[5510] The 46508 protein, fragments thereof, and derivatives and other variants of the sequence in SEQ ID NO: 102 are collectively referred to as “polypeptides or proteins of the invention” or “46508 polypeptides or proteins”. Nucleic acid molecules encoding such polypeptides or proteins are collectively referred to as “nucleic acids of the invention” or “46508 nucleic acids.” 46508 molecules refer to 46508 nucleic acids, polypeptides, and antibodies.

[5511] As used herein, the term “nucleic acid molecule” includes DNA molecules (e.g., a cDNA or genomic DNA), RNA molecules (e.g., an mRNA) and analogs of the DNA or RNA. A DNA or RNA analog can be synthesized from nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

[5512] The term “isolated nucleic acid molecule” or “purified nucleic acid molecule” includes nucleic acid molecules that are separated from other nucleic acid molecules present in the natural source of the nucleic acid. For example, with regard to genomic DNA, the term “isolated” includes nucleic acid molecules which are separated from the chromosome with which the genomic DNA is naturally associated. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and/or 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of 5′ and/or 3′ nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[5513] As used herein, the term “hybridizes under low stringency, medium stringency, high stringency, or very high stringency conditions” describes conditions for hybridization and washing. Guidance for performing hybridization reactions can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, which is incorporated by reference. Aqueous and nonaqueous methods are described in that reference and either can be used. Specific hybridization conditions referred to herein are as follows: 1) low stringency hybridization conditions in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by two washes in 0.2× SSC, 0.1% SDS at least at 50° C. (the temperature of the washes can be increased to 55° C. for low stringency conditions); 2) medium stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 60° C.; 3) high stringency hybridization conditions in 6× SSC at about 45° C., followed by one or more washes in 0.2× SSC, 0.1% SDS at 65° C.; and preferably 4) very high stringency hybridization conditions are 0.5M sodium phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2× SSC, 1% SDS at 65° C. Very high stringency conditions (4) are the preferred conditions and the ones that should be used unless otherwise specified.
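
The four numbered hybridization conditions above can be summarized as a small lookup table. The following is a minimal sketch only; the names `Stringency`, `CONDITIONS`, and `conditions_for` are hypothetical and introduced for illustration, and the values are transcribed from the preceding paragraph, with very high stringency used as the default because the text states it should be used unless otherwise specified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stringency:
    """One hybridization/wash condition as listed in the text (illustrative)."""
    hybridization: str
    wash: str
    wash_temp_c: int  # wash temperature in degrees Celsius


# Values transcribed from the paragraph above; the keys are hypothetical labels.
CONDITIONS = {
    "low": Stringency("6x SSC at about 45 C",
                      "two washes in 0.2x SSC, 0.1% SDS", 50),
    "medium": Stringency("6x SSC at about 45 C",
                         "one or more washes in 0.2x SSC, 0.1% SDS", 60),
    "high": Stringency("6x SSC at about 45 C",
                       "one or more washes in 0.2x SSC, 0.1% SDS", 65),
    "very_high": Stringency("0.5 M sodium phosphate, 7% SDS at 65 C",
                            "one or more washes in 0.2x SSC, 1% SDS", 65),
}

# The text names condition 4 as the one to use unless otherwise specified.
DEFAULT_LEVEL = "very_high"


def conditions_for(level: Optional[str] = None) -> Stringency:
    """Return the condition for a level, falling back to the stated default."""
    return CONDITIONS[level or DEFAULT_LEVEL]
```

Calling `conditions_for()` with no argument returns the very high stringency entry; the text notes that the low stringency wash temperature can be raised to 55° C., which this sketch does not model.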

[5514] Preferably, an isolated nucleic acid molecule of the invention that hybridizes under a stringency condition described herein to the sequence of SEQ ID NO:101 or SEQ ID NO:103, corresponds to a naturally-occurring nucleic acid molecule.

[5515] As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature. For example a naturally occurring nucleic acid molecule can encode a natural protein. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules which include at least an open reading frame encoding a 46508 protein. The gene can optionally further include non-coding sequences, e.g., regulatory sequences and introns. Preferably, a gene encodes a mammalian 46508 protein or derivative thereof.

[5516] An “isolated” or “purified” polypeptide or protein is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. “Substantially free” means that a preparation of 46508 protein is at least 10% pure. In a preferred embodiment, the preparation of 46508 protein has less than about 30%, 20%, 10% and more preferably 5% (by dry weight), of non-46508 protein (also referred to herein as a “contaminating protein”), or of chemical precursors or non-46508 chemicals. When the 46508 protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume of the protein preparation. The invention includes isolated or purified preparations of at least 0.01, 0.1, 1.0, and 10 milligrams in dry weight.

[5517] A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of 46508 without abolishing or substantially altering a 46508 activity. Preferably the alteration does not substantially alter the 46508 activity, e.g., the activity is at least 20%, 40%, 60%, 70% or 80% of wild-type. An “essential” amino acid residue is a residue that, when altered from the wild-type sequence of 46508, results in abolishing a 46508 activity such that less than 20% of the wild-type activity is present. For example, conserved amino acid residues in 46508 are predicted to be particularly unamenable to alteration.
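
The distinction drawn above can be expressed as a simple threshold on retained activity. The sketch below is illustrative only: the function name `classify_residue` and its arguments are hypothetical, and it assumes the 20% of wild-type activity figure stated in the text as the dividing line.

```python
def classify_residue(mutant_activity: float, wild_type_activity: float) -> str:
    """Classify an altered residue by the fraction of wild-type activity retained.

    Assumption: the 20% threshold is taken from the paragraph above; activity
    units are arbitrary but must be the same for both measurements.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    fraction = mutant_activity / wild_type_activity
    # Less than 20% of wild-type activity remaining marks the residue as essential.
    return "essential" if fraction < 0.20 else "non-essential"
```

For example, a variant retaining 5 of 100 activity units would be classified as altered at an essential residue, while one retaining 60 of 100 would not.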

[5518] A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential amino acid residue in a 46508 protein is preferably replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a 46508 coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for 46508 biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NO: 101 or SEQ ID NO: 103, the encoded protein can be expressed recombinantly and the activity of the protein can be determined.
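
The side chain families listed above lend themselves to a direct membership test. The following minimal sketch encodes those families with one-letter amino acid codes; the names `SIDE_CHAIN_FAMILIES` and `is_conservative` are hypothetical, and the rule that two different residues sharing any listed family count as a conservative substitution is an interpretation of the text, not a stated algorithm.

```python
# Families transcribed from the paragraph above, using one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta_branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine ... histidine
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if two different residues share at least one family."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in family and b in family
               for family in SIDE_CHAIN_FAMILIES.values())
```

Because the families overlap (for example, threonine is both uncharged polar and beta-branched), a residue can have conservative replacements drawn from more than one family.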

[5519] As used herein, a “biologically active portion” of a 46508 protein includes a fragment of a 46508 protein which participates in an interaction, e.g., an intramolecular or an inter-molecular interaction. An inter-molecular interaction can be a specific binding interaction or an enzymatic interaction (e.g., the interaction can be transient and a covalent bond is formed or broken). An inter-molecular interaction can be between a 46508 molecule and a non-46508 molecule or between a first 46508 molecule and a second 46508 molecule (e.g., a dimerization interaction). Biologically active portions of a 46508 protein include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequence of the 46508 protein, e.g., the amino acid sequence shown in SEQ ID NO: 102, which include fewer amino acids than the full length 46508 proteins, and exhibit at least one activity of a 46508 protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the 46508 protein, e.g., hydrolase activity towards peptidyl-tRNA substrates, binding activity specific for ribonucleic acids, and binding activity specific for peptides. A biologically active portion of a 46508 protein can be a polypeptide which is, for example, 10, 25, 50, 100, 200 or more amino acids in length. Biologically active portions of a 46508 protein can be used as targets for developing agents which modulate a 46508-mediated activity, e.g., peptidyl-tRNA hydrolase activity, RNA binding activity, and peptide binding activity.

[5520] Calculations of homology or sequence identity between sequences (the terms are used interchangeably herein) are performed as follows.

[5521] To determine the percent identity of two amino acid sequences, or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 70%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”).

[5522] The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
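By way of non-limiting illustration, the calculation described above can be sketched in Python for two sequences that have already been aligned. The function name `percent_identity`, the use of `-` as the gap character, and the choice to divide by the number of aligned columns (gaps included) are assumptions made for this sketch; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Assumed convention: identical positions divided by the number of
    aligned columns, so every gap position lowers the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both rows hold a gap character.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as identical only if both residues match and
    # neither is a gap.
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Four identical columns out of six aligned columns.
    print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```

Under this convention a longer gap reduces the percent identity more than a shorter one, which is one way of taking the number and length of gaps into account.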

[5523] The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch ((1970) J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM 62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
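By way of non-limiting illustration, the global alignment approach of Needleman and Wunsch can be sketched as follows. This is a simplified sketch and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty (simplified sketch).

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Substituting a residue substitution matrix for `pair` and an affine gap model for `gap` would bring the sketch closer to the parameter sets recited above.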

[5524] The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller ((1989) CABIOS, 4:11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[5525] The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to 46508 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to 46508 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.
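By way of non-limiting illustration, the role of the wordlength parameter can be sketched with a simple exact word-matching step. This is not the NBLAST or XBLAST program: it only locates exact shared words of a chosen length between a query and a subject sequence, whereas the actual programs also extend and score hits (and, for proteins, consider similar as well as identical words). The function name `word_hits` is hypothetical.

```python
from collections import defaultdict


def word_hits(query: str, subject: str, wordlength: int):
    """Return (query_pos, subject_pos) pairs, 0-based, where an exact
    word of the given length occurs in both sequences."""
    # Index every word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - wordlength + 1):
        index[query[i:i + wordlength]].append(i)
    # Scan the subject and look each word up in the query index.
    hits = []
    for j in range(len(subject) - wordlength + 1):
        for i in index.get(subject[j:j + wordlength], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    print(word_hits("ACGTACGT", "TTACGTAA", 4))
```

A larger wordlength yields fewer, more specific seed matches; a smaller wordlength yields more candidate matches at greater computational cost.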

[5526] Particularly preferred 46508 polypeptides of the present invention have an amino acid sequence substantially identical to the amino acid sequence of SEQ ID NO: 102. In the context of an amino acid sequence, the term “substantially identical” is used herein to refer to a first amino acid sequence that contains a sufficient or minimum number of amino acid residues that are i) identical to, or ii) conservative substitutions of aligned amino acid residues in a second amino acid sequence such that the first and second amino acid sequences can have a common structural domain and/or common functional activity. For example, amino acid sequences that contain a common structural domain having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 102 are termed substantially identical.

[5527] In the context of a nucleotide sequence, the term “substantially identical” is used herein to refer to a first nucleic acid sequence that contains a sufficient or minimum number of nucleotides that are identical to aligned nucleotides in a second nucleic acid sequence such that the first and second nucleotide sequences encode a polypeptide having common functional activity, or encode a common structural polypeptide domain or a common functional polypeptide activity. For example, nucleotide sequences having at least about 60%, or 65% identity, likely 75% identity, more likely 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO:101 or 103 are termed substantially identical.

[5528] “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over- or under-expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of altered, e.g., increased or decreased, expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, translated amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

[5529] “Subject,” as used herein, refers to human and non-human animals. The term “non-human animals” of the invention includes all vertebrates, e.g., mammals, such as non-human primates (particularly higher primates), sheep, dog, rodent (e.g., mouse or rat), guinea pig, goat, pig, cat, rabbits, cow, and non-mammals, such as chickens, amphibians, reptiles, etc. In a preferred embodiment, the subject is a human. In another embodiment, the subject is an experimental animal or animal suitable as a disease model.

[5530] A “purified preparation of cells”, as used herein, refers to an in vitro preparation of cells. In the case of cells from multicellular organisms (e.g., plants and animals), a purified preparation of cells is a subset of cells obtained from the organism, not the entire intact organism. In the case of unicellular microorganisms (e.g., cultured cells and microbial cells), it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

[5531] Various aspects of the invention are described in further detail below.

[5532] Isolated Nucleic Acid Molecules of 46508

[5533] In one aspect, the invention provides an isolated or purified nucleic acid molecule that encodes a 46508 polypeptide described herein, e.g., a full-length 46508 protein or a fragment thereof, e.g., a biologically active portion of 46508 protein. Also included is a nucleic acid fragment suitable for use as a hybridization probe, which can be used, e.g., to identify a nucleic acid molecule encoding a polypeptide of the invention, 46508 mRNA, and fragments suitable for use as primers, e.g., PCR primers for the amplification or mutation of nucleic acid molecules.

[5534] In one embodiment, an isolated nucleic acid molecule of the invention includes the nucleotide sequence shown in SEQ ID NO:101, or a portion of any of these nucleotide sequences. In one embodiment, the nucleic acid molecule includes sequences encoding the human 46508 protein (i.e., “the coding region” of SEQ ID NO:101, as shown in SEQ ID NO: 103), as well as 5′ untranslated sequences. Alternatively, the nucleic acid molecule can include only the coding region of SEQ ID NO:101 (e.g., SEQ ID NO:103) and, e.g., no flanking sequences which normally accompany the subject sequence. In another embodiment, the nucleic acid molecule encodes a sequence corresponding to a fragment of the protein from about amino acid 44 to 221 of SEQ ID NO:102.

[5535] In another embodiment, an isolated nucleic acid molecule of the invention includes a nucleic acid molecule which is a complement of the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, or a portion of any of these nucleotide sequences. In other embodiments, the nucleic acid molecule of the invention is sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, such that it can hybridize (e.g., under a stringency condition described herein) to the nucleotide sequence shown in SEQ ID NO:101 or 103, thereby forming a stable duplex.
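By way of non-limiting illustration, the complement of a nucleotide sequence, read in the conventional 5′ to 3′ direction, can be computed as sketched below. The function name `reverse_complement` is hypothetical, and the sketch assumes a DNA sequence written with the letters A, C, G, and T only.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, reflecting that each strand of a duplex is the complement of the other.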

[5536] In one embodiment, an isolated nucleic acid molecule of the present invention includes a nucleotide sequence which is at least about: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to the entire length of the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103, or a portion, preferably of the same length, of any of these nucleotide sequences.

[5537] 46508 Nucleic Acid Fragments

[5538] A nucleic acid molecule of the invention can include only a portion of the nucleic acid sequence of SEQ ID NO: 101 or 103. For example, such a nucleic acid molecule can include a fragment which can be used as a probe or primer or a fragment encoding a portion of a 46508 protein, e.g., an immunogenic or biologically active portion of a 46508 protein. A fragment can comprise those nucleotides of SEQ ID NO:101, which encode amino acids from about 44 to about 221 of SEQ ID NO: 102, which encompasses a peptidyl-tRNA hydrolase domain of human 46508. The nucleotide sequence determined from the cloning of the 46508 gene allows for the generation of probes and primers designed for use in identifying and/or cloning other 46508 family members, or fragments thereof, as well as 46508 homologues, or fragments thereof, from other species.
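By way of non-limiting illustration, the nucleotide positions that encode a given amino acid range can be derived as sketched below. The sketch assumes a coding-region-only sequence in which nucleotide 1 is the first base of the codon for amino acid 1 (as would be the case for a sequence consisting solely of the coding region); positions within a sequence that also carries 5′ untranslated sequence would be shifted by the length of that untranslated sequence, which is not assumed here. The function name `codon_span` is hypothetical.

```python
def codon_span(first_aa: int, last_aa: int) -> tuple:
    """Return 1-based, inclusive nucleotide positions in a coding region
    that encode amino acids first_aa..last_aa (also 1-based, inclusive)."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    start = (first_aa - 1) * 3 + 1   # first base of the first codon
    end = last_aa * 3                # last base of the last codon
    return start, end


if __name__ == "__main__":
    # Amino acids 44 to 221 span 178 codons, i.e., 534 nucleotides.
    print(codon_span(44, 221))
```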

[5539] In another embodiment, a nucleic acid includes a nucleotide sequence that includes part, or all, of the coding region and extends into either (or both) the 5′ or 3′ noncoding region. Other embodiments include a fragment which includes a nucleotide sequence encoding an amino acid fragment described herein. Nucleic acid fragments can encode a specific domain or site described herein or fragments thereof, particularly fragments thereof which are at least 20 amino acids in length. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments should not be construed as encompassing those fragments that may have been disclosed prior to the invention.

[5540] A nucleic acid fragment can include a sequence corresponding to a domain, region, or functional site described herein. A nucleic acid fragment can also include one or more domain, region, or functional site described herein. Thus, for example, a 46508 nucleic acid fragment can include a sequence corresponding to a peptidyl-tRNA hydrolase domain at locations in the translated 46508 polypeptide described herein.

[5541] 46508 probes and primers are provided. Typically a probe/primer is an isolated or purified oligonucleotide. The oligonucleotide typically includes a region of nucleotide sequence that hybridizes under a stringency condition described herein to at least about 7, 12 or 15, preferably about 20 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 consecutive nucleotides of a sense or antisense sequence of SEQ ID NO: 101 or SEQ ID NO:103, or of a naturally occurring allelic variant or mutant of SEQ ID NO:101 or SEQ ID NO:103.

[5542] In a preferred embodiment, the nucleic acid is a probe which is at least 5 or 10, and less than 200, more preferably less than 100, or less than 50, base pairs in length. It should be identical to, or differ by 1, or by fewer than 5 or 10, bases from a sequence disclosed herein. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5543] A probe or primer can be derived from the sense or anti-sense strand of a nucleic acid which encodes, for example, a peptidyl-tRNA hydrolase domain from about amino acid 44 to about 221 of SEQ ID NO:102.

[5544] In another embodiment, a set of primers is provided, e.g., primers suitable for use in a PCR, which can be used to amplify a selected region of a 46508 sequence, e.g., a domain, region, site or other sequence described herein. The primers should be at least 5, 10, or 50 base pairs in length and less than 100, or less than 200, base pairs in length. The primers should be identical to, or differ by one base from, a sequence disclosed herein or a naturally occurring variant. For example, primers suitable for amplifying all or a portion of any of the following regions are provided: a peptidyl-tRNA hydrolase domain from about amino acid 44 to 221 of SEQ ID NO:102.

[5545] A nucleic acid fragment can encode an epitope bearing region of a polypeptide described herein.

[5546] A nucleic acid fragment encoding a “biologically active portion of a 46508 polypeptide” can be prepared by isolating a portion of the nucleotide sequence of SEQ ID NO:101 or 103, which encodes a polypeptide having a 46508 biological activity (e.g., the biological activities of the 46508 proteins are described herein), expressing the encoded portion of the 46508 protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the 46508 protein. For example, a nucleic acid fragment encoding a biologically active portion of 46508 includes a peptidyl-tRNA hydrolase domain, e.g., amino acid residues about 44 to about 221 of SEQ ID NO: 102. A nucleic acid fragment encoding a biologically active portion of a 46508 polypeptide may comprise a nucleotide sequence which is 300 or more nucleotides in length.

[5547] In preferred embodiments, the nucleic acid fragment includes a nucleotide sequence that is other than, e.g., differs by at least one, two, three or more nucleotides from, the sequence of AW665597 or AA479713.

[5548] In preferred embodiments, the fragment comprises the coding region of 46508, e.g., the nucleotide sequence of SEQ ID NO: 103.

[5549] In preferred embodiments, a nucleic acid includes a nucleotide sequence which is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1180, 1182 nucleotides in length and hybridizes under a stringency condition described herein to a nucleic acid molecule of SEQ ID NO:101, or SEQ ID NO:103.

[5550] 46508 Nucleic Acid Variants

[5551] The invention further encompasses nucleic acid molecules that differ from the nucleotide sequence shown in SEQ ID NO:101 or SEQ ID NO:103. Such differences can be due to degeneracy of the genetic code (and result in a nucleic acid which encodes the same 46508 proteins as those encoded by the nucleotide sequence disclosed herein). In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide sequence encoding a protein having an amino acid sequence which differs, by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues, from that shown in SEQ ID NO:102. If alignment is needed for this comparison, the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5552] Nucleic acids of the invention can be chosen for having codons which are preferred, or non-preferred, for a particular expression system. E.g., the nucleic acid can be one in which at least one codon, and preferably at least 10%, or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells.
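By way of non-limiting illustration, the proportion of codons altered between an original coding sequence and a codon-modified version can be computed as sketched below. The function name `codon_change_fraction` and the example sequences are hypothetical; the sketch only counts codons that differ and does not itself verify that the altered codons encode the same amino acids.

```python
def codon_change_fraction(original: str, modified: str) -> float:
    """Fraction of codons that differ between two in-frame coding
    sequences of equal length."""
    if len(original) != len(modified):
        raise ValueError("sequences must have equal length")
    if len(original) == 0 or len(original) % 3 != 0:
        raise ValueError("length must be a non-zero multiple of three")
    # Split both sequences into consecutive three-nucleotide codons.
    codons_a = [original[i:i + 3] for i in range(0, len(original), 3)]
    codons_b = [modified[i:i + 3] for i in range(0, len(modified), 3)]
    changed = sum(1 for a, b in zip(codons_a, codons_b) if a != b)
    return changed / len(codons_a)


if __name__ == "__main__":
    # Hypothetical example: the second and third codons are replaced
    # by synonymous codons, so two of three codons are altered.
    print(codon_change_fraction("ATGCTGAAA", "ATGTTAAAG"))
```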

[5553] Nucleic acid variants can be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism) or can be non naturally occurring. Non-naturally occurring variants can be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. The variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions (as compared in the encoded product).

[5554] In a preferred embodiment, the nucleic acid differs from that of SEQ ID NO:101 or 103, e.g., as follows: by at least one but less than 10, 20, 30, or 40 nucleotides; at least one but less than 1%, 5%, 10% or 20% of the nucleotides in the subject nucleic acid. If necessary for this analysis the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.

[5555] Orthologs, homologs, and allelic variants can be identified using methods known in the art. These variants comprise a nucleotide sequence encoding a polypeptide that is 50%, at least about 55%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more identical to the amino acid sequence shown in SEQ ID NO: 102 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under a stringency condition described herein, to the nucleotide sequence shown in SEQ ID NO: 102 or a fragment of the sequence. Nucleic acid molecules corresponding to orthologs, homologs, and allelic variants of the 46508 cDNAs of the invention can further be isolated by mapping to the same chromosome or locus as the 46508 gene.

[5556] Preferred variants include those that are correlated with modulating (stimulating and/or enhancing or inhibiting) cellular proliferation, differentiation, or tumorigenesis; modulating metabolism; modulating the course of viral and/or bacterial infection; and modulating cellular responses to environmental stress and toxins.

[5557] Allelic variants of 46508, e.g., human 46508, include both functional and non-functional proteins. Functional allelic variants are naturally occurring amino acid sequence variants of the 46508 protein within a population that maintain the ability to bind tRNAs, short peptides, peptidyl-tRNAs, and to catalyze peptidyl-tRNA hydrolysis. Functional allelic variants will typically contain only conservative substitution of one or more amino acids of SEQ ID NO:102, or substitution, deletion or insertion of non-critical residues in non-critical regions of the protein. Non-functional allelic variants are naturally-occurring amino acid sequence variants of the 46508, e.g., human 46508, protein within a population that do not have the ability to bind tRNAs, short peptides, peptidyl-tRNAs, and to catalyze peptidyl-tRNA hydrolysis. Non-functional allelic variants will typically contain a non-conservative substitution, a deletion, or insertion, or premature truncation of the amino acid sequence of SEQ ID NO: 102, or a substitution, insertion, or deletion in critical residues or critical regions of the protein.

[5558] Moreover, nucleic acid molecules encoding other 46508 family members and, thus, which have a nucleotide sequence which differs from the 46508 sequences of SEQ ID NO:101 or SEQ ID NO:103 are intended to be within the scope of the invention.

[5559] Antisense Nucleic Acid Molecules, Ribozymes and Modified 46508 Nucleic Acid Molecules

[5560] In another aspect, the invention features an isolated nucleic acid molecule which is antisense to 46508. An “antisense” nucleic acid can include a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. The antisense nucleic acid can be complementary to an entire 46508 coding strand, or to only a portion thereof (e.g., the coding region of human 46508 corresponding to SEQ ID NO:103). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding 46508 (e.g., the 5′ and 3′ untranslated regions).

[5561] An antisense nucleic acid can be designed such that it is complementary to the entire coding region of 46508 mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of 46508 mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of 46508 mRNA, e.g., between the −10 and +10 regions of the target gene nucleotide sequence of interest. An antisense oligonucleotide can be, for example, about 7, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, or more nucleotides in length.
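By way of non-limiting illustration, an antisense oligonucleotide complementary to the region surrounding a translation start site can be derived as sketched below. The sketch assumes a numbering convention in which +1 is the first nucleotide of the start codon and there is no position 0, so that the −10 to +10 region spans 20 nucleotides. The function name and the example mRNA are hypothetical and are not taken from SEQ ID NO:101.

```python
# Map each mRNA (or cDNA) base to its DNA complement.
_TO_DNA_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_around_start(mrna: str, start_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Return a DNA oligonucleotide, written 5' to 3', complementary to
    the region from -upstream to +downstream around the start codon.

    start_index is the 0-based index of the first base of the start codon.
    """
    begin = start_index - upstream
    end = start_index + downstream
    if begin < 0 or end > len(mrna):
        raise ValueError("requested region extends beyond the sequence")
    target = mrna[begin:end].upper()
    # Complement each base, then reverse to read the oligo 5' to 3'.
    return target.translate(_TO_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example_mrna = "GGGCCCAAAG" + "AUGGCUUCCA" + "UUU"  # hypothetical
    print(antisense_around_start(example_mrna, start_index=10))
```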

[5562] An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. The antisense nucleic acid also can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

[5563] The antisense nucleic acid molecules of the invention are typically administered to a subject (e.g., by direct injection at a tissue site), or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a 46508 protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies which bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

[5564] In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

[5565] In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. A ribozyme having specificity for a 46508-encoding nucleic acid can include one or more sequences complementary to the nucleotide sequence of a 46508 cDNA disclosed herein (i.e., SEQ ID NO:101 or SEQ ID NO:103), and a sequence having known catalytic sequence responsible for mRNA cleavage (see U.S. Pat. No. 5,093,246 or Haselhoff and Gerlach (1988) Nature 334:585-591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a 46508-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, 46508 mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J. W. (1993) Science 261:1411-1418.

[5566] 46508 gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the 46508 gene (e.g., the 46508 promoter and/or enhancers) to form triple helical structures that prevent transcription of the 46508 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des. 6:569-84; Helene, C. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. (1992) Bioessays 14:807-15. The potential sequences that can be targeted for triple helix formation can be increased by creating a so-called “switchback” nucleic acid molecule. Switchback molecules are synthesized in an alternating 5′-3′, 3′-5′ manner, such that they base pair with first one strand of a duplex and then the other, eliminating the necessity for a sizeable stretch of either purines or pyrimidines to be present on one strand of a duplex.

[5567] The invention also provides detectably labeled oligonucleotide primer and probe molecules. Typically, such labels are chemiluminescent, fluorescent, radioactive, or colorimetric.

[5568] A 46508 nucleic acid molecule can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For non-limiting examples of synthetic oligonucleotides with modifications see Toulmé (2001) Nature Biotech. 19:17 and Faria et al. (2001) Nature Biotech. 19:40-44. Such phosphoramidite oligonucleotides can be effective antisense agents.

[5569] For example, the deoxyribose phosphate backbone of the nucleic acid molecules can be modified to generate peptide nucleic acids (see Hyrup B. et al. (1996) Bioorganic & Medicinal Chemistry 4: 5-23). As used herein, the terms “peptide nucleic acid” or “PNA” refer to a nucleic acid mimic, e.g., a DNA mimic, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of a PNA can allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup B. et al. (1996) supra and Perry-O'Keefe et al. Proc. Natl. Acad. Sci. 93: 14670-675.

[5570] PNAs of 46508 nucleic acid molecules can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, for example, inducing transcription or translation arrest or inhibiting replication. PNAs of 46508 nucleic acid molecules can also be used in the analysis of single base pair mutations in a gene, (e.g., by PNA-directed PCR clamping); as ‘artificial restriction enzymes’ when used in combination with other enzymes, (e.g., S1 nucleases (Hyrup B. et al. (1996) supra)); or as probes or primers for DNA sequencing or hybridization (Hyrup B. et al. (1996) supra; Perry-O'Keefe supra).

[5571] In other embodiments, the oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, or hybridization-triggered cleavage agent).

[5572] The invention also includes molecular beacon oligonucleotide primer and probe molecules having at least one region which is complementary to a 46508 nucleic acid of the invention, two complementary regions one having a fluorophore and one a quencher such that the molecular beacon is useful for quantitating the presence of the 46508 nucleic acid of the invention in a sample. Molecular beacon nucleic acids are described, for example, in Lizardi et al., U.S. Pat. No. 5,854,033; Nazarenko et al., U.S. Pat. No. 5,866,336, and Livak et al., U.S. Pat. No. 5,876,930.

[5573] Isolated 46508 Polypeptides

[5574] In another aspect, the invention features an isolated 46508 protein, or fragment, e.g., a biologically active portion, for use as immunogens or antigens to raise or test (or more generally to bind) anti-46508 antibodies. 46508 protein can be isolated from cells or tissue sources using standard protein purification techniques. 46508 protein or fragments thereof can be produced by recombinant DNA techniques or synthesized chemically.

[5575] Polypeptides of the invention include those which arise as a result of the existence of multiple genes, alternative transcription events, alternative RNA splicing events, and alternative translational and post-translational events. The polypeptide can be expressed in systems, e.g., cultured cells, which result in substantially the same post-translational modifications present when the polypeptide is expressed in a native cell, or in systems which result in the alteration or omission of post-translational modifications, e.g., glycosylation or cleavage, present when expressed in a native cell.

[5576] In a preferred embodiment, a 46508 polypeptide has one or more of the following characteristics:

[5577] (i) it has the ability to bind tRNA, and peptidyl-tRNAs;

[5578] (ii) it catalyzes the hydrolysis of peptidyl-tRNAs;

[5579] (iii) it has a molecular weight, e.g., a deduced molecular weight, preferably ignoring any contribution of post translational modifications, amino acid composition or other physical characteristic of SEQ ID NO: 102;

[5580] (iv) it has an overall sequence similarity of at least 60%, more preferably at least 70, 80, 90, or 95%, with a polypeptide of SEQ ID NO: 102;

[5581] (v) it has a peptidyl-tRNA hydrolase domain with a sequence homology which is preferably about 70%, 80%, 90% or 95% with amino acid residues about 44 to about 221 of SEQ ID NO:102;

[5582] (vii) it has at least one, preferably two, and most preferably three of the conserved residues found at positions 51, 109, and 155 of SEQ ID NO: 102; or

[5583] (viii) it has at least one, preferably two, and most preferably three of the conserved residues, histidine, aspartic acid, and arginine, respectively, located at positions 59, 134, and 173 of SEQ ID NO: 102.
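By way of non-limiting illustration, the presence of conserved residues at specified positions of a candidate polypeptide can be checked as sketched below. The function name `count_conserved` is hypothetical, positions are 1-based, and the candidate is assumed to share the position numbering of the reference sequence (i.e., any alignment has already been applied). The test sequence in the example is synthetic and is not SEQ ID NO: 102.

```python
def count_conserved(sequence: str, expected: dict) -> int:
    """Count how many of the expected residues occur at their 1-based
    positions in the given amino acid sequence."""
    found = 0
    for position, residue in expected.items():
        # Positions beyond the end of the sequence count as absent.
        if 1 <= position <= len(sequence) and sequence[position - 1] == residue:
            found += 1
    return found


if __name__ == "__main__":
    # Histidine, aspartic acid, and arginine at positions 59, 134, and 173.
    expected = {59: "H", 134: "D", 173: "R"}
    synthetic = list("A" * 200)           # synthetic placeholder sequence
    synthetic[58], synthetic[133], synthetic[172] = "H", "D", "R"
    print(count_conserved("".join(synthetic), expected))
```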

[5584] In a preferred embodiment the 46508 protein, or fragment thereof, differs from the corresponding sequence in SEQ ID NO:102. In one embodiment it differs by at least one but by less than 15, 10 or 5 amino acid residues. In another it differs from the corresponding sequence in SEQ ID NO:102 by at least one residue but less than 20%, 15%, 10% or 5% of the residues in it differ from the corresponding sequence in SEQ ID NO: 102. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) The differences are, preferably, differences or changes at a non essential residue or a conservative substitution. In a preferred embodiment the differences are not in the peptidyl-tRNA hydrolase domain. In another preferred embodiment one or more differences are in the peptidyl-tRNA hydrolase domain.

[5585] Other embodiments include a protein that contains one or more changes in amino acid sequence, e.g., a change in an amino acid residue which is not essential for activity. Such 46508 proteins differ in amino acid sequence from SEQ ID NO: 102, yet retain biological activity.

[5586] In one embodiment, the protein includes an amino acid sequence at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or more homologous to SEQ ID NO:102.

[5587] A 46508 protein or fragment is provided which varies from the sequence of SEQ ID NO:102 in regions defined by amino acids about 1 to about 43, and from about 85 to about 97, by at least one but by less than 15, 10 or 5 amino acid residues in the protein or fragment but which does not differ from SEQ ID NO: 102 in regions defined by amino acids about 44 to about 84, and from about 98 to about 217. (If this comparison requires alignment the sequences should be aligned for maximum homology. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences.) In some embodiments the difference is at a non-essential residue or is a conservative substitution, while in others the difference is at an essential residue or is a non-conservative substitution.
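
One way to check the region restriction just described is sketched below in Python. It assumes a pre-computed pairwise alignment with "-" as the gap symbol and treats the "about" boundaries as exact residue numbers. The names, the handling of insertions (attributed to the preceding reference residue), and the rejection of differences beyond residue 217 are illustrative assumptions, not requirements stated in the text.

```python
# Residue ranges (1-based, inclusive) in which differences are permitted.
# The complementary ranges, about 44 to 84 and about 98 to 217, must be unchanged.
VARIABLE_REGIONS = [(1, 43), (85, 97)]


def difference_positions(aligned_ref: str, aligned_var: str, gap: str = "-") -> list:
    """Return the 1-based reference positions at which the variant differs."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    positions = []
    ref_pos = 0
    for r, v in zip(aligned_ref, aligned_var):
        if r != gap:
            ref_pos += 1
        if r != v:
            # Assumption: an insertion relative to the reference is attributed
            # to the preceding reference residue (or to residue 1 at the start).
            positions.append(max(ref_pos, 1))
    return positions


def satisfies_region_rule(aligned_ref: str, aligned_var: str, max_differences: int = 15) -> bool:
    """True if there is at least one difference, fewer than max_differences,
    and every difference lies inside a permitted variable region."""
    diffs = difference_positions(aligned_ref, aligned_var)
    if not 1 <= len(diffs) < max_differences:
        return False
    return all(
        any(start <= pos <= end for start, end in VARIABLE_REGIONS)
        for pos in diffs
    )
```

Passing max_differences as 10 or 5 applies the narrower limits mentioned in the paragraph.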

[5588] In one embodiment, a biologically active portion of a 46508 protein includes a peptidyl-tRNA hydrolase domain. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native 46508 protein.

[5589] In a preferred embodiment, the 46508 protein has an amino acid sequence shown in SEQ ID NO:102. In other embodiments, the 46508 protein is substantially identical to SEQ ID NO:102. In yet another embodiment, the 46508 protein is substantially identical to SEQ ID NO: 102 and retains the functional activity of the protein of SEQ ID NO: 102, as described in detail in the subsections above.

[5590] 46508 Chimeric or Fusion Proteins

[5591] In another aspect, the invention provides 46508 chimeric or fusion proteins. As used herein, a 46508 “chimeric protein” or “fusion protein” includes a 46508 polypeptide linked to a non-46508 polypeptide. A “non-46508 polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein which is not substantially homologous to the 46508 protein, e.g., a protein which is different from the 46508 protein and which is derived from the same or a different organism. The 46508 polypeptide of the fusion protein can correspond to all or a portion (e.g., a fragment described herein) of a 46508 amino acid sequence. In a preferred embodiment, a 46508 fusion protein includes at least one (or two) biologically active portions of a 46508 protein. The non-46508 polypeptide can be fused to the N-terminus or C-terminus of the 46508 polypeptide.

[5592] The fusion protein can include a moiety which has a high affinity for a ligand. For example, the fusion protein can be a GST-46508 fusion protein in which the 46508 sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant 46508. Alternatively, the fusion protein can be a 46508 protein containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression and/or secretion of 46508 can be increased through use of a heterologous signal sequence.

[5593] Fusion proteins can include all or a part of a serum protein, e.g., an IgG constant region, or human serum albumin.

[5594] The 46508 fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject in vivo. The 46508 fusion proteins can be used to affect the bioavailability of a 46508 substrate. 46508 fusion proteins may be useful therapeutically for the treatment of disorders caused by, for example, (i) aberrant modification or mutation of a gene encoding a 46508 protein; (ii) mis-regulation of the 46508 gene; and (iii) aberrant post-translational modification of a 46508 protein.

[5595] Moreover, the 46508-fusion proteins of the invention can be used as immunogens to produce anti-46508 antibodies in a subject, to purify 46508 ligands and in screening assays to identify molecules which inhibit the interaction of 46508 with a 46508 substrate.

[5596] Expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A 46508-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the 46508 protein.

[5597] Variants of 46508 Proteins

[5598] In another aspect, the invention also features a variant of a 46508 polypeptide, e.g., which functions as an agonist (mimetics) or as an antagonist. Variants of the 46508 proteins can be generated by mutagenesis, e.g., discrete point mutation, the insertion or deletion of sequences or the truncation of a 46508 protein. An agonist of the 46508 proteins can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of a 46508 protein. An antagonist of a 46508 protein can inhibit one or more of the activities of the naturally occurring form of the 46508 protein by, for example, competitively modulating a 46508-mediated activity of a 46508 protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Preferably, treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein has fewer side effects in a subject relative to treatment with the naturally occurring form of the 46508 protein.

[5599] Variants of a 46508 protein can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of a 46508 protein for agonist or antagonist activity.

[5600] Libraries of fragments, e.g., N-terminal, C-terminal, or internal fragments, of a 46508 protein coding sequence can be used to generate a variegated population of fragments for screening and subsequent selection of variants of a 46508 protein. Variants in which a cysteine residue is added or deleted or in which a residue which is glycosylated is added or deleted are particularly preferred.

[5601] Methods for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property are known in the art. Such methods are adaptable for rapid screening of the gene libraries generated by combinatorial mutagenesis of 46508 proteins. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify 46508 variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6:327-331).

[5602] Cell-based assays can be exploited to analyze a variegated 46508 library. For example, a library of expression vectors can be transfected into a cell line which ordinarily responds to 46508 in a substrate-dependent manner. The transfected cells are then contacted with 46508 and the effect of the expression of the mutant on signaling by the 46508 substrate can be detected. Plasmid DNA can then be recovered from the cells which score for inhibition, or alternatively, potentiation of signaling by the 46508 substrate, and the individual clones further characterized.

[5603] In another aspect, the invention features a method of making a 46508 polypeptide, e.g., a peptide having a non-wild type activity, e.g., an antagonist, agonist, or super agonist of a naturally occurring 46508 polypeptide. The method includes: altering the sequence of a 46508 polypeptide, e.g., by substitution or deletion of one or more residues of a non-conserved region, a domain or residue disclosed herein, and testing the altered polypeptide for the desired activity.

[5604] In another aspect, the invention features a method of making a fragment or analog of a 46508 polypeptide having a biological activity of a naturally occurring 46508 polypeptide. The method includes: altering the sequence, e.g., by substitution or deletion of one or more residues, of a 46508 polypeptide, e.g., altering the sequence of a non-conserved region, or a domain or residue described herein, and testing the altered polypeptide for the desired activity.

[5605] Anti-46508 Antibodies

[5606] In another aspect, the invention provides an anti-46508 antibody, or a fragment thereof (e.g., an antigen-binding fragment thereof). The term “antibody” as used herein refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion. As used herein, the term “antibody” refers to a protein comprising at least one, and preferably two, heavy (H) chain variable regions (abbreviated herein as VH), and at least one and preferably two light (L) chain variable regions (abbreviated herein as VL). The VH and VL regions can be further subdivided into regions of hypervariability, termed “complementarity determining regions” (“CDR”), interspersed with regions that are more conserved, termed “framework regions” (FR). The extent of the framework region and CDR's has been precisely defined (see, Kabat, E. A., et al. (1991) Sequences of Proteins of Immunological Interest, Fifth Edition, U.S. Department of Health and Human Services, NIH Publication No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. 196:901-917, which are incorporated herein by reference). Each VH and VL is composed of three CDR's and four FRs, arranged from amino-terminus to carboxy-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.

[5607] The anti-46508 antibody can further include a heavy and light chain constant region, to thereby form a heavy and light immunoglobulin chain, respectively. In one embodiment, the antibody is a tetramer of two heavy immunoglobulin chains and two light immunoglobulin chains, wherein the heavy and light immunoglobulin chains are inter-connected by, e.g., disulfide bonds. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. The light chain constant region is comprised of one domain, CL. The variable region of the heavy and light chains contains a binding domain that interacts with an antigen. The constant regions of the antibodies typically mediate the binding of the antibody to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system.

[5608] As used herein, the term “immunoglobulin” refers to a protein consisting of one or more polypeptides substantially encoded by immunoglobulin genes. The recognized human immunoglobulin genes include the kappa, lambda, alpha (IgA1 and IgA2), gamma (IgG1, IgG2, IgG3, IgG4), delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Full-length immunoglobulin “light chains” (about 25 kDa or 214 amino acids) are encoded by a variable region gene at the NH2-terminus (about 110 amino acids) and a kappa or lambda constant region gene at the COOH-terminus.

[5609] Full-length immunoglobulin “heavy chains” (about 50 kDa or 446 amino acids), are similarly encoded by a variable region gene (about 116 amino acids) and one of the other aforementioned constant region genes, e.g., gamma (encoding about 330 amino acids).

[5610] The term “antigen-binding fragment” of an antibody (or simply “antibody portion,” or “fragment”), as used herein, refers to one or more fragments of a full-length antibody that retain the ability to specifically bind to the antigen, e.g., 46508 polypeptide or fragment thereof. Examples of antigen-binding fragments of the anti-46508 antibody include, but are not limited to: (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab′)2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody, (v) a dAb fragment (Ward et al., (1989) Nature 341:544-546), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR). Furthermore, although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see e.g., Bird et al. (1988) Science 242:423-426; and Huston et al. (1988) Proc. Natl. Acad. Sci. USA 85:5879-5883). Such single chain antibodies are also encompassed within the term “antigen-binding fragment” of an antibody. These antibody fragments are obtained using conventional techniques known to those with skill in the art, and the fragments are screened for utility in the same manner as are intact antibodies.

[5611] The anti-46508 antibody can be a polyclonal or a monoclonal antibody. In other embodiments, the antibody can be recombinantly produced, e.g., produced by phage display or by combinatorial methods.

[5612] Phage display and combinatorial methods for generating anti-46508 antibodies are known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. International Publication No. WO 92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791; Markland et al. International Publication No. WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al. International Publication No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137; and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are incorporated by reference herein).

[5613] In one embodiment, the anti-46508 antibody is a fully human antibody (e.g., an antibody made in a mouse which has been genetically engineered to produce an antibody from a human immunoglobulin sequence), or a non-human antibody, e.g., a rodent (mouse or rat), goat, primate (e.g., monkey), or camel antibody. Preferably, the non-human antibody is a rodent (mouse or rat) antibody. Methods of producing rodent antibodies are known in the art.

[5614] Human monoclonal antibodies can be generated using transgenic mice carrying the human immunoglobulin genes rather than the mouse system. Splenocytes from these transgenic mice immunized with the antigen of interest are used to produce hybridomas that secrete human mAbs with specific affinities for epitopes from a human protein (see, e.g., Wood et al. International Application WO 91/00906, Kucherlapati et al. PCT publication WO 91/10741; Lonberg et al. International Application WO 92/03918; Kay et al. International Application WO 92/03917; Lonberg, N. et al. 1994 Nature 368:856-859; Green, L. L. et al. 1994 Nature Genet. 7:13-21; Morrison, S. L. et al. 1984 Proc. Natl. Acad. Sci. USA 81:6851-6855; Bruggeman et al. 1993 Year Immunol 7:33-40; Tuaillon et al. 1993 PNAS 90:3720-3724; Bruggeman et al. 1991 Eur J Immunol 21:1323-1326).

[5615] An anti-46508 antibody can be one in which the variable region, or a portion thereof, e.g., the CDR's, are generated in a non-human organism, e.g., a rat or mouse. Chimeric, CDR-grafted, and humanized antibodies are within the invention. Antibodies generated in a non-human organism, e.g., a rat or mouse, and then modified, e.g., in the variable framework or constant region, to decrease antigenicity in a human are within the invention.

[5616] Chimeric antibodies can be produced by recombinant DNA techniques known in the art. For example, a gene encoding the Fc constant region of a murine (or other species) monoclonal antibody molecule is digested with restriction enzymes to remove the region encoding the murine Fc, and the equivalent portion of a gene encoding a human Fc constant region is substituted (see Robinson et al., International Patent Publication PCT/US86/02269; Akira, et al., European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al., European Patent Application 173,494; Neuberger et al., International Application WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al., European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al. (1987) PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; and Shaw et al., 1988, J. Natl Cancer Inst. 80:1553-1559).

[5617] A humanized or CDR-grafted antibody will have at least one or two but generally all three recipient CDR's (of heavy and/or light immunoglobulin chains) replaced with a donor CDR. The antibody may be replaced with at least a portion of a non-human CDR or only some of the CDR's may be replaced with non-human CDR's. It is only necessary to replace the number of CDR's required for binding of the humanized antibody to a 46508 or a fragment thereof. Preferably, the donor will be a rodent antibody, e.g., a rat or mouse antibody, and the recipient will be a human framework or a human consensus framework. Typically, the immunoglobulin providing the CDR's is called the “donor” and the immunoglobulin providing the framework is called the “acceptor.” In one embodiment, the donor immunoglobulin is non-human (e.g., rodent). The acceptor framework is a naturally-occurring (e.g., a human) framework or a consensus framework, or a sequence about 85% or higher, preferably 90%, 95%, 99% or higher identical thereto.

[5618] As used herein, the term “consensus sequence” refers to the sequence formed from the most frequently occurring amino acids (or nucleotides) in a family of related sequences (see, e.g., Winnacker, From Genes to Clones (Verlagsgesellschaft, Weinheim, Germany 1987)). In a family of proteins, each position in the consensus sequence is occupied by the amino acid occurring most frequently at that position in the family. If two amino acids occur equally frequently, either can be included in the consensus sequence. A “consensus framework” refers to the framework region in the consensus immunoglobulin sequence.
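
The definition of a consensus sequence given above amounts to a per-position majority vote, which can be sketched in Python as follows. The sketch assumes the family members have already been aligned to equal length. The function name is hypothetical, and resolving ties by taking the alphabetically first residue is an arbitrary choice made only so that the output is deterministic; the definition permits either residue.

```python
from collections import Counter


def consensus_sequence(aligned_family: list) -> str:
    """Build a consensus from equal-length, pre-aligned sequences.

    Each position takes the residue occurring most frequently at that
    position across the family.
    """
    if not aligned_family:
        raise ValueError("at least one sequence is required")
    length = len(aligned_family[0])
    if any(len(seq) != length for seq in aligned_family):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned_family):
        counts = Counter(column)
        top = max(counts.values())
        # Tie-break (assumption): alphabetically first of the equally frequent residues.
        consensus.append(min(res for res, n in counts.items() if n == top))
    return "".join(consensus)
```

Applied to the toy family "ACDE", "ACDF" and "GCDE", the function returns "ACDE".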

[5619] An antibody can be humanized by methods known in the art. Humanized antibodies can be generated by replacing sequences of the Fv variable region which are not directly involved in antigen binding with equivalent sequences from human Fv variable regions. General methods for generating humanized antibodies are provided by Morrison, S. L., 1985, Science 229:1202-1207, by Oi et al., 1986, BioTechniques 4:214, and by Queen et al. U.S. Pat. No. 5,585,089, U.S. Pat. No. 5,693,761 and U.S. Pat. No. 5,693,762, the contents of all of which are hereby incorporated by reference. Those methods include isolating, manipulating, and expressing the nucleic acid sequences that encode all or part of immunoglobulin Fv variable regions from at least one of a heavy or light chain. Sources of such nucleic acid are well known to those skilled in the art and, for example, may be obtained from a hybridoma producing an antibody against a 46508 polypeptide or fragment thereof. The recombinant DNA encoding the humanized antibody, or fragment thereof, can then be cloned into an appropriate expression vector.

[5620] Humanized or CDR-grafted antibodies can be produced by CDR-grafting or CDR substitution, wherein one, two, or all CDR's of an immunoglobulin chain can be replaced. See e.g., U.S. Pat. No. 5,225,539; Jones et al. 1986 Nature 321:522-525; Verhoeyen et al. 1988 Science 239:1534; Beidler et al. 1988 J. Immunol. 141:4053-4060; Winter U.S. Pat. No. 5,225,539, the contents of all of which are hereby expressly incorporated by reference. Winter describes a CDR-grafting method which may be used to prepare the humanized antibodies of the present invention (UK Patent Application GB 2188638A, filed on Mar. 26, 1987; Winter U.S. Pat. No. 5,225,539), the contents of which are expressly incorporated by reference.
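
At the sequence level, CDR substitution of the kind described above can be pictured with the following Python sketch. It assumes each variable region has already been divided into the seven segments FR1, CDR1, FR2, CDR2, FR3, CDR3 and FR4 (for example, using boundaries defined according to Kabat). The dictionary layout, the function name, and the toy segments in the usage note are hypothetical; this is a string-level illustration only and is not the method of any cited reference.

```python
# Segment order of a VH or VL region from amino-terminus to carboxy-terminus.
SEGMENT_ORDER = ("FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4")


def graft_cdrs(acceptor: dict, donor: dict, cdrs_to_replace=("CDR1", "CDR2", "CDR3")) -> str:
    """Return the acceptor variable region with selected CDRs taken from the donor.

    Both arguments map each name in SEGMENT_ORDER to its amino acid sequence.
    Framework segments always come from the acceptor.
    """
    grafted = dict(acceptor)
    for name in cdrs_to_replace:
        if not name.startswith("CDR"):
            raise ValueError(f"{name!r} is not a CDR segment")
        grafted[name] = donor[name]
    return "".join(grafted[name] for name in SEGMENT_ORDER)
```

With a toy acceptor whose segments are EVQL, AAA, WVRQ, GGG, RFTI, SSS, WGQG and a toy donor whose CDRs are DYY, NPN and ARW, grafting all three CDRs gives "EVQLDYYWVRQNPNRFTIARWWGQG"; passing cdrs_to_replace=("CDR3",) replaces only the third CDR.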

[5621] Also within the scope of the invention are humanized antibodies in which specific amino acids have been substituted, deleted or added. Preferred humanized antibodies have amino acid substitutions in the framework region, such as to improve binding to the antigen. For example, a humanized antibody will have framework residues identical to the donor framework residue or to another amino acid other than the recipient framework residue. To generate such antibodies, a selected, small number of acceptor framework residues of the humanized immunoglobulin chain can be replaced by the corresponding donor amino acids. Preferred locations of the substitutions include amino acid residues adjacent to the CDR, or which are capable of interacting with a CDR (see e.g., U.S. Pat. No. 5,585,089). Criteria for selecting amino acids from the donor are described in U.S. Pat. No. 5,585,089, e.g., columns 12-16 of U.S. Pat. No. 5,585,089, the contents of which are hereby incorporated by reference. Other techniques for humanizing antibodies are described in Padlan et al. EP 519596 A1, published on Dec. 23, 1992.

[5622] In preferred embodiments an antibody can be made by immunizing with purified 46508 antigen, or a fragment thereof, e.g., a fragment described herein.

[5623] A full-length 46508 protein or an antigenic peptide fragment of 46508 can be used as an immunogen or can be used to identify anti-46508 antibodies made with other immunogens, e.g., cells, membrane preparations, and the like. The antigenic peptide of 46508 should include at least 8 amino acid residues of the amino acid sequence shown in SEQ ID NO: 102 and encompass an epitope of 46508. Preferably, the antigenic peptide includes at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
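
The minimum-length requirement above can be pictured as a sliding window over the protein sequence. The Python sketch below enumerates every contiguous peptide of a chosen length together with its 1-based residue coordinates. The function name and the toy sequence in the usage note are hypothetical, and the sketch makes no attempt to predict which windows actually contain an epitope.

```python
def candidate_peptides(protein: str, length: int = 8) -> list:
    """List all contiguous peptides of the given length.

    Returns (start, end, peptide) tuples using 1-based, inclusive residue
    numbers. Lengths below 8 are rejected, following the stated minimum.
    """
    if length < 8:
        raise ValueError("an antigenic peptide should include at least 8 residues")
    return [
        (i + 1, i + length, protein[i:i + length])
        for i in range(len(protein) - length + 1)
    ]
```

For the 11-residue toy sequence "MKTAYIAKQRQ", a window of 8 yields four candidates, from residues 1-8 ("MKTAYIAK") through residues 4-11 ("AYIAKQRQ"). Longer windows of 10, 15, 20 or 30 residues correspond to the preferred lengths.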

[5624] Antibodies can be made against the peptidyl-tRNA hydrolase domain of 46508. Fragments of 46508 which include residues from about 77 to about 85, and from about 217 to about 224 of SEQ ID NO:102 can be used to make antibodies against hydrophilic regions of the 46508 protein, e.g., used as immunogens or used to characterize the specificity of an antibody. Similarly, a fragment of 46508 which includes residues from about 60 to 70, from about 86 to 102, and from about 189 to 195 of SEQ ID NO:102 can be used to make an antibody against a hydrophobic region of the 46508 protein. Similarly, a fragment of 46508 which includes residues about 44 to 221 of SEQ ID NO:102 (or a fragment thereof, e.g., 44-100, 100-150, 150-200, 200-221 of SEQ ID NO:102) can be used to make an antibody against the peptidyl-tRNA hydrolase region of the 46508 protein.

[5625] Antibodies reactive with, or specific for, any of these regions, or other regions or domains described herein are provided.

[5626] Antibodies which bind only native 46508 protein, only denatured or otherwise non-native 46508 protein, or which bind both, are within the invention. Antibodies with linear or conformational epitopes are within the invention. Conformational epitopes can sometimes be identified by identifying antibodies which bind to native but not denatured 46508 protein.

[5627] Preferred epitopes encompassed by the antigenic peptide are regions of 46508 that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity. For example, an Emini surface probability analysis of the human 46508 protein sequence can be used to indicate the regions that have a particularly high probability of being localized to the surface of the 46508 protein and are thus likely to constitute surface residues useful for targeting antibody production.

[5628] The anti-46508 antibody can be a single chain antibody. A single-chain antibody (scFv) may be engineered (see, for example, Colcher, D. et al. (1999) Ann N Y Acad Sci 880:263-80; and Reiter, Y. (1996) Clin Cancer Res 2:245-52). The single chain antibody can be dimerized or multimerized to generate multivalent antibodies having specificities for different epitopes of the same target 46508 protein.

[5629] In a preferred embodiment the antibody has effector function and can fix complement. In other embodiments the antibody does not recruit effector cells or fix complement.

[5630] In a preferred embodiment, the antibody has reduced or no ability to bind an Fc receptor. For example, it is an isotype or subtype, fragment or other mutant, which does not support binding to an Fc receptor, e.g., it has a mutagenized or deleted Fc receptor binding region.

[5631] In a preferred embodiment, an anti-46508 antibody alters (e.g., increases or decreases) the peptidyl-tRNA hydrolase activity of a 46508 polypeptide. For example, the antibody can bind at or in proximity to the active site, e.g., to an epitope that includes a residue located from about 44 to 221 of SEQ ID NO: 102.

[5632] The antibody can be coupled to a toxin, e.g., a polypeptide toxin, e.g., ricin or diphtheria toxin or an active fragment thereof, or to a radioactive nucleus or an imaging agent, e.g., a radioactive, enzymatic, or other imaging agent, e.g., an NMR contrast agent. Labels which produce detectable radioactive emissions or fluorescence are preferred.

[5633] An anti-46508 antibody (e.g., monoclonal antibody) can be used to isolate 46508 by standard techniques, such as affinity chromatography or immunoprecipitation. Moreover, an anti-46508 antibody can be used to detect 46508 protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Anti-46508 antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labelling). Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

[5634] The invention also includes a nucleic acid which encodes an anti-46508 antibody, e.g., an anti-46508 antibody described herein. Also included are vectors which include the nucleic acid and cells transformed with the nucleic acid, particularly cells which are useful for producing an antibody, e.g., mammalian cells, e.g., CHO or lymphatic cells.

[5635] The invention also includes cell lines, e.g., hybridomas, which make an anti-46508 antibody, e.g., an antibody described herein, and methods of using said cells to make an anti-46508 antibody.

[5636] 46508 Recombinant Expression Vectors, Host Cells and Genetically Engineered Cells

[5637] In another aspect, the invention includes vectors, preferably expression vectors, containing a nucleic acid encoding a polypeptide described herein. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked and can include a plasmid, cosmid or viral vector. The vector can be capable of autonomous replication or it can integrate into a host DNA. Viral vectors include, e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses.

[5638] A vector can include a 46508 nucleic acid in a form suitable for expression of the nucleic acid in a host cell. Preferably the recombinant expression vector includes one or more regulatory sequences operatively linked to the nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence, as well as tissue-specific regulatory and/or inducible sequences. The design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, and the like. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 46508 proteins, mutant forms of 46508 proteins, fusion proteins, and the like).

[5639] The recombinant expression vectors of the invention can be designed for expression of 46508 proteins in prokaryotic or eukaryotic cells. For example, polypeptides of the invention can be expressed in E. coli, insect cells (e.g., using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[5640] Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.

[5641] Purified fusion proteins can be used in 46508 activity assays, (e.g., direct assays or competitive assays described in detail below), or to generate antibodies specific for 46508 proteins. In a preferred embodiment, a fusion protein expressed in a retroviral expression vector of the present invention can be used to infect bone marrow cells which are subsequently transplanted into irradiated recipients. The pathology of the subject recipient is then examined after sufficient time has passed (e.g., six weeks).

[5642] One strategy to maximize recombinant protein expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in E. coli (Wada et al., (1992) Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
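
By way of illustration only, the codon-alteration strategy described above can be sketched in Python as follows. The sketch translates an open reading frame with the standard genetic code and rewrites each codon as a single "preferred" codon for the same amino acid, so that the encoded protein is unchanged. The PREFERRED_CODON table and the function names are assumptions made for this example; the table holds illustrative choices only, is not taken from Wada et al., and would in practice be replaced by measured codon-usage data for the chosen host strain.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: one illustrative "preferred" codon per amino acid.
# These are placeholders, not values from the cited reference.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Rewrite every codon as the preferred codon for the same amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGGCTAGATAA"  # hypothetical toy reading frame
    print(original, "->", recode(original))
```

Because recoding changes only synonymous codons, translating the output yields the same amino acid sequence as translating the input, which is a convenient check before ordering a synthetic gene.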

[5643] The 46508 expression vector can be a yeast expression vector, a vector for expression in insect cells (e.g., a baculovirus expression vector), or a vector suitable for expression in mammalian cells.

[5644] When used in mammalian cells, the expression vector's control functions can be provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40.

[5645] In another embodiment, the promoter is an inducible promoter, e.g., a promoter regulated by a steroid hormone, by a polypeptide hormone (e.g., by means of a signal transduction pathway), or by a heterologous polypeptide (e.g., the tetracycline-inducible systems, “Tet-On” and “Tet-Off”; see, e.g., Clontech Inc., CA, Gossen and Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547, and Paillard (1998) Human Gene Therapy 9:983).

[5646] In another embodiment, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example, the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546).

[5647] The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. Regulatory sequences (e.g., viral promoters and/or enhancers) operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the constitutive, tissue specific or cell type specific expression of antisense RNA in a variety of cell types. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus.

[5648] In another aspect, the invention provides a host cell which includes a nucleic acid molecule described herein, e.g., a 46508 nucleic acid molecule within a recombinant expression vector or a 46508 nucleic acid molecule containing sequences which allow it to homologously recombine into a specific site of the host cell's genome. The terms “host cell” and “recombinant host cell” are used interchangeably herein. Such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[5649] A host cell can be any prokaryotic or eukaryotic cell. For example, a 46508 protein can be expressed in bacterial cells (such as E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells (African green monkey kidney cells CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182)). Other suitable host cells are known to those skilled in the art.

[5650] Vector DNA can be introduced into host cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation.

[5651] A host cell of the invention can be used to produce (i.e., express) a 46508 protein. Accordingly, the invention further provides methods for producing a 46508 protein using the host cells of the invention. In one embodiment, the method includes culturing the host cell of the invention (into which a recombinant expression vector encoding a 46508 protein has been introduced) in a suitable medium such that a 46508 protein is produced. In another embodiment, the method further includes isolating a 46508 protein from the medium or the host cell.

[5652] In another aspect, the invention features a cell or purified preparation of cells which include a 46508 transgene, or which otherwise misexpress 46508. The cell preparation can consist of human or non-human cells, e.g., rodent cells, e.g., mouse or rat cells, rabbit cells, or pig cells. In preferred embodiments, the cell or cells include a 46508 transgene, e.g., a heterologous form of a 46508, e.g., a gene derived from humans (in the case of a non-human cell). The 46508 transgene can be misexpressed, e.g., overexpressed or underexpressed. In other preferred embodiments, the cell or cells include a gene that misexpresses an endogenous 46508, e.g., a gene the expression of which is disrupted, e.g., a knockout. Such cells can serve as a model for studying disorders that are related to mutated or misexpressed 46508 alleles or for use in drug screening.

[5653] In another aspect, the invention features a human cell, e.g., a hematopoietic stem cell, transformed with nucleic acid which encodes a subject 46508 polypeptide.

[5654] Also provided are cells, preferably human cells, e.g., human hematopoietic or fibroblast cells, in which an endogenous 46508 is under the control of a regulatory sequence that does not normally control the expression of the endogenous 46508 gene. The expression characteristics of an endogenous gene within a cell, e.g., a cell line or microorganism, can be modified by inserting a heterologous DNA regulatory element into the genome of the cell such that the inserted regulatory element is operably linked to the endogenous 46508 gene. For example, an endogenous 46508 gene which is “transcriptionally silent,” e.g., not normally expressed, or expressed only at very low levels, may be activated by inserting a regulatory element which is capable of promoting the expression of a normally expressed gene product in that cell. Techniques such as targeted homologous recombination can be used to insert the heterologous DNA as described in, e.g., Chappel, U.S. Pat. No. 5,272,071; WO 91/06667, published on May 16, 1991.

[5655] In a preferred embodiment, recombinant cells described herein can be used for replacement therapy in a subject. For example, a nucleic acid encoding a 46508 polypeptide operably linked to an inducible promoter (e.g., a steroid hormone receptor-regulated promoter) is introduced into a human or nonhuman, e.g., mammalian, e.g., porcine recombinant cell. The cell is cultivated and encapsulated in a biocompatible material, such as poly-lysine alginate, and subsequently implanted into the subject. See, e.g., Lanza (1996) Nat. Biotechnol. 14:1107; Joki et al. (2001) Nat. Biotechnol. 19:35; and U.S. Pat. No. 5,876,742. Production of 46508 polypeptide can be regulated in the subject by administering an agent (e.g., a steroid hormone) to the subject. In another preferred embodiment, the implanted recombinant cells express and secrete an antibody specific for a 46508 polypeptide. The antibody can be any antibody or any antibody derivative described herein.

[5656] 46508 Transgenic Animals

[5657] The invention provides non-human transgenic animals. Such animals are useful for studying the function and/or activity of a 46508 protein and for identifying and/or evaluating modulators of 46508 activity. As used herein, a “transgenic animal” is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, amphibians, and the like. A transgene is exogenous DNA or a rearrangement, e.g., a deletion of endogenous chromosomal DNA, which preferably is integrated into or occurs in the genome of the cells of a transgenic animal. A transgene can direct the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal; other transgenes, e.g., a knockout, reduce expression. Thus, a transgenic animal can be one in which an endogenous 46508 gene has been altered, e.g., by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

[5658] Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to a transgene of the invention to direct expression of a 46508 protein to particular cells. A transgenic founder animal can be identified based upon the presence of a 46508 transgene in its genome and/or expression of 46508 mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding a 46508 protein can further be bred to other transgenic animals carrying other transgenes.

[5659] 46508 proteins or polypeptides can be expressed in transgenic animals or plants, e.g., a nucleic acid encoding the protein or polypeptide can be introduced into the genome of an animal. In preferred embodiments the nucleic acid is placed under the control of a tissue specific promoter, e.g., a milk or egg specific promoter, and recovered from the milk or eggs produced by the animal. Suitable animals are mice, pigs, cows, goats, and sheep.

[5660] The invention also includes a population of cells from a transgenic animal, as discussed, e.g., below.

[5661] Uses of 46508

[5662] The nucleic acid molecules, proteins, protein homologues, and antibodies described herein can be used in one or more of the following methods: a) screening assays; b) predictive medicine (e.g., diagnostic assays, prognostic assays, monitoring clinical trials, and pharmacogenetics); and c) methods of treatment (e.g., therapeutic and prophylactic).

[5663] The isolated nucleic acid molecules of the invention can be used, for example, to express a 46508 protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect a 46508 mRNA (e.g., in a biological sample) or a genetic alteration in a 46508 gene, and to modulate 46508 activity, as described further below. The 46508 proteins can be used to treat disorders characterized by insufficient or excessive production of a 46508 substrate or production of 46508 inhibitors. In addition, the 46508 proteins can be used to screen for naturally occurring 46508 substrates, to screen for drugs or compounds which modulate 46508 activity, as well as to treat disorders characterized by insufficient or excessive production of 46508 protein or production of 46508 protein forms which have decreased, aberrant or unwanted activity compared to 46508 wild type protein (e.g., a cell proliferative or differentiative disorder, viral infection, metabolic disorder). Moreover, the anti-46508 antibodies of the invention can be used to detect and isolate 46508 proteins, regulate the bioavailability of 46508 proteins, and modulate 46508 activity.

[5664] A method of evaluating a compound for the ability to interact with, e.g., bind, a subject 46508 polypeptide is provided. The method includes: contacting the compound with the subject 46508 polypeptide; and evaluating the ability of the compound to interact with, e.g., to bind or form a complex with, the subject 46508 polypeptide. This method can be performed in vitro, e.g., in a cell free system, or in vivo, e.g., in a two-hybrid interaction trap assay. This method can be used to identify naturally occurring molecules that interact with the subject 46508 polypeptide. It can also be used to find natural or synthetic inhibitors of the subject 46508 polypeptide. Screening methods are discussed in more detail below.

[5665] 46508 Screening Assays

[5666] The invention provides methods (also referred to herein as “screening assays”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to 46508 proteins, have a stimulatory or inhibitory effect on, for example, 46508 expression or 46508 activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a 46508 substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., 46508 genes) in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions.

[5667] In one embodiment, the invention provides assays for screening candidate or test compounds which are substrates of a 46508 protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate an activity of a 46508 protein or polypeptide or a biologically active portion thereof.

[5668] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann, R. N. et al. (1994) J. Med. Chem. 37:2678-85); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are limited to peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[5669] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

[5670] Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner, U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra.).

[5671] In one embodiment, an assay is a cell-based assay in which a cell which expresses a 46508 protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate 46508 activity is determined. Determining the ability of the test compound to modulate 46508 activity can be accomplished by monitoring, for example, hydrolytic activity. The cell, for example, can be of mammalian origin, e.g., human.

[5672] The ability of the test compound to modulate 46508 binding to a compound, e.g., a 46508 substrate, or to bind to 46508 can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to 46508 can be determined by detecting the labeled compound, e.g., substrate, in a complex. Alternatively, 46508 could be coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate 46508 binding to a 46508 substrate in a complex. For example, compounds (e.g., 46508 substrates) can be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[5673] The ability of a compound (e.g., a 46508 substrate) to interact with 46508 with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with 46508 without the labeling of either the compound or the 46508 (McConnell, H. M. et al. (1992) Science 257:1906-1912). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and 46508.

[5674] In yet another embodiment, a cell-free assay is provided in which a 46508 protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the 46508 protein or biologically active portion thereof is evaluated. Preferred biologically active portions of the 46508 proteins to be used in assays of the present invention include fragments which participate in interactions with non-46508 molecules, e.g., fragments with high surface probability scores.

[5675] Soluble and/or membrane-bound forms of isolated proteins (e.g., 46508 proteins or biologically active portions thereof) can be used in the cell-free assays of the invention. When membrane-bound forms of the protein are used, it may be desirable to utilize a solubilizing agent. Examples of such solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®, Isotridecylpoly(ethylene glycol ether)n, 3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate (CHAPS), 3-[(3-cholamidopropyl)dimethylammonio]-2-hydroxy-1-propane sulfonate (CHAPSO), or N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate.

[5676] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[5677] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos, et al., U.S. Pat. No. 4,868,103). A fluorophore label on the first, ‘donor’ molecule is selected such that its emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy. Alternatively, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. An FET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
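
By way of illustration only, the dependence of energy-transfer efficiency on donor-acceptor separation can be sketched in Python as follows. The specification states only that efficiency is related to distance; the sketch assumes the widely used Förster relation, E = 1 / (1 + (r / R0)^6), which is not recited in the text above. The function name, the units, and the Förster radius of 5 nm used in the example are likewise assumptions.

```python
def transfer_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Energy-transfer efficiency under the assumed Forster relation.

    distance_nm        donor-acceptor separation r
    forster_radius_nm  separation R0 at which efficiency is 50 percent
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Forster radius for an unspecified dye pair
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r:4.1f} nm  efficiency = {transfer_efficiency(r, r0):.3f}")
```

Under this assumed model, efficiency falls steeply once the separation exceeds R0, which is consistent with the acceptor emission being maximal when the two labeled molecules are bound and therefore close together.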

[5678] In another embodiment, determining the ability of the 46508 protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal which can be used as an indication of real-time reactions between biological molecules.

[5679] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[5680] It may be desirable to immobilize either 46508, an anti-46508 antibody or its target molecule to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a 46508 protein, or interaction of a 46508 protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase/46508 fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or 46508 protein, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above. Alternatively, the complexes can be dissociated from the matrix, and the level of 46508 binding or activity determined using standard techniques.

[5681] Other techniques for immobilizing either a 46508 protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated 46508 protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical).

[5682] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody).

[5683] In one embodiment, this assay is performed utilizing antibodies reactive with 46508 protein or target molecules but which do not interfere with binding of the 46508 protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or 46508 protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the 46508 protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the 46508 protein or target molecule.

[5684] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including but not limited to: differential centrifugation (see, for example, Rivas, G., and Minton, A. P., (1993) Trends Biochem Sci 18:284-7); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York); and immunoprecipitation (see, for example, Ausubel, F. et al., eds. (1999) Current Protocols in Molecular Biology, J. Wiley: New York). Such resins and chromatographic techniques are known to one skilled in the art (see, e.g., Heegaard, N. H., (1998) J Mol Recognit 11: 141-8; Hage, D. S., and Tweed, S. A. (1997) J Chromatogr B Biomed Sci Appl. 699:499-525). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[5685] In a preferred embodiment, the assay includes contacting the 46508 protein or biologically active portion thereof with a known compound which binds 46508 to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a 46508 protein, wherein determining the ability of the test compound to interact with a 46508 protein includes determining the ability of the test compound to preferentially bind to 46508 or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[5686] The target gene products of the invention can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins. For the purposes of this discussion, such cellular and extracellular macromolecules are referred to herein as “binding partners.” Compounds that disrupt such interactions can be useful in regulating the activity of the target gene product. Such compounds can include, but are not limited to molecules such as antibodies, peptides, and small molecules. The preferred target genes/products for use in this embodiment are the 46508 genes herein identified. In an alternative embodiment, the invention provides methods for determining the ability of the test compound to modulate the activity of a 46508 protein through modulation of the activity of a downstream effector of a 46508 target molecule. For example, the activity of the effector molecule on an appropriate target can be determined, or the binding of the effector to an appropriate target can be determined, as previously described.

[5687] To identify compounds that interfere with the interaction between the target gene product and its cellular or extracellular binding partner(s), a reaction mixture containing the target gene product and the binding partner is prepared, under conditions and for a time sufficient to allow the two products to form a complex. In order to test an inhibitory agent, the reaction mixture is provided in the presence and absence of the test compound. The test compound can be initially included in the reaction mixture, or can be added at a time subsequent to the addition of the target gene product and its cellular or extracellular binding partner. Control reaction mixtures are incubated without the test compound or with a placebo. The formation of any complexes between the target gene product and the cellular or extracellular binding partner is then detected. The formation of a complex in the control reaction, but not in the reaction mixture containing the test compound, indicates that the compound interferes with the interaction of the target gene product and the interactive binding partner. Additionally, complex formation within reaction mixtures containing the test compound and normal target gene product can also be compared to complex formation within reaction mixtures containing the test compound and mutant target gene product. This comparison can be important in those cases wherein it is desirable to identify compounds that disrupt interactions of mutant but not normal target gene products.
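
By way of illustration only, the comparison of complex formation in the control reaction and in the reaction containing the test compound can be expressed numerically, as sketched in Python below. The function name, the optional background correction, and the example readings are assumptions introduced for this sketch; the specification does not prescribe any particular calculation.

```python
def percent_inhibition(control_signal: float,
                       test_signal: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex signal relative to the control reaction.

    control_signal  complex signal without test compound (or with placebo)
    test_signal     complex signal in the presence of the test compound
    background      signal from a blank with no complex (assumed optional)
    """
    window = control_signal - background
    if window <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / window)


if __name__ == "__main__":
    # Hypothetical plate-reader values in arbitrary units.
    print(percent_inhibition(control_signal=1000.0, test_signal=250.0))
```

The same function can be applied separately to reactions containing normal and mutant target gene product; a compound showing high inhibition with the mutant product but low inhibition with the normal product would be of the selective kind discussed above.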

[5688] These assays can be conducted in a heterogeneous or homogeneous format. Heterogeneous assays involve anchoring either the target gene product or the binding partner onto a solid phase, and detecting complexes anchored on the solid phase at the end of the reaction. In homogeneous assays, the entire reaction is carried out in a liquid phase. In either approach, the order of addition of reactants can be varied to obtain different information about the compounds being tested. For example, test compounds that interfere with the interaction between the target gene products and the binding partners, e.g., by competition, can be identified by conducting the reaction in the presence of the test substance. Alternatively, test compounds that disrupt preformed complexes, e.g., compounds with higher binding constants that displace one of the components from the complex, can be tested by adding the test compound to the reaction mixture after complexes have been formed. The various formats are briefly described below.

[5689] In a heterogeneous assay system, either the target gene product or the interactive cellular or extracellular binding partner, is anchored onto a solid surface (e.g., a microtiter plate), while the non-anchored species is labeled, either directly or indirectly. The anchored species can be immobilized by non-covalent or covalent attachments. Alternatively, an immobilized antibody specific for the species to be anchored can be used to anchor the species to the solid surface.

[5690] In order to conduct the assay, the partner of the immobilized species is exposed to the coated surface with or without the test compound. After the reaction is complete, unreacted components are removed (e.g., by washing) and any complexes formed will remain immobilized on the solid surface. Where the non-immobilized species is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the non-immobilized species is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the initially non-immobilized species (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-Ig antibody). Depending upon the order of addition of reaction components, test compounds that inhibit complex formation or that disrupt preformed complexes can be detected.

[5691] Alternatively, the reaction can be conducted in a liquid phase in the presence or absence of the test compound, the reaction products separated from unreacted components, and complexes detected; e.g., using an immobilized antibody specific for one of the binding components to anchor any complexes formed in solution, and a labeled antibody specific for the other partner to detect anchored complexes. Again, depending upon the order of addition of reactants to the liquid phase, test compounds that inhibit complex formation or that disrupt preformed complexes can be identified.

[5692] In an alternate embodiment of the invention, a homogeneous assay can be used. For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared in which either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496 that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified.

[5693] In yet another aspect, the 46508 proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with 46508 (“46508-binding proteins” or “46508-bp”) and are involved in 46508 activity. Such 46508-bps can be activators or inhibitors of signals by the 46508 proteins or 46508 targets as, for example, downstream elements of a 46508-mediated signaling pathway.

[5694] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a 46508 protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. (Alternatively, the 46508 protein can be fused to the activator domain.) If the “bait” and the “prey” proteins are able to interact, in vivo, forming a 46508-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., lacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the 46508 protein.

[5695] In another embodiment, modulators of 46508 expression are identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of 46508 mRNA or protein evaluated relative to the level of expression of 46508 mRNA or protein in the absence of the candidate compound. When expression of 46508 mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of 46508 mRNA or protein expression. Alternatively, when expression of 46508 mRNA or protein is less (statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of 46508 mRNA or protein expression. The level of 46508 mRNA or protein expression can be determined by methods described herein for detecting 46508 mRNA or protein.
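
The comparison described above can be sketched in code. The following is a minimal illustration only, assuming replicate expression measurements are available for cells contacted and not contacted with the candidate compound. The function names, the exact permutation test, and the 0.05 threshold are assumptions made for this sketch and are not specified in this disclosure; the same significance test is applied here in both directions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    observed = abs(mean(treated) - mean(untreated))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating point round-off.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels measured with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical 46508 mRNA levels (arbitrary units), four replicates per condition.
with_compound = [9.8, 10.1, 10.4, 9.9]
without_compound = [5.0, 5.2, 4.9, 5.1]
label = classify_compound(with_compound, without_compound)
```

With only a few replicates per condition, an exact permutation test cannot produce small p-values, so at least four replicates per group are needed in this sketch before any compound can be called a stimulator or inhibitor at the assumed threshold.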

[5696] In another aspect, the invention pertains to a combination of two or more of the assays described herein. For example, a modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a 46508 protein can be confirmed in vivo, e.g., in an animal such as an animal model for cancer or viral disease.

[5697] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a 46508 modulating agent, an antisense 46508 nucleic acid molecule, a 46508-specific antibody, or a 46508-binding partner) in an appropriate animal model to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be used for treatments as described herein.

[5698] 46508 Detection Assays

[5699] Portions or fragments of the nucleic acid sequences identified herein can be used as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome, e.g., to locate gene regions associated with genetic disease or to associate 46508 with a disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. These applications are described in the subsections below.

[5700] 46508 Chromosome Mapping

[5701] The 46508 nucleotide sequences or portions thereof can be used to map the location of the 46508 genes on a chromosome. This process is called chromosome mapping. Chromosome mapping is useful in correlating the 46508 sequences with genes associated with disease.

[5702] Briefly, 46508 genes can be mapped to chromosomes by preparing PCR primers (preferably 15-25 bp in length) from the 46508 nucleotide sequences. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the 46508 sequences will yield an amplified fragment.

[5703] A panel of somatic cell hybrids in which each cell line contains either a single human chromosome or a small number of human chromosomes, and a full set of mouse chromosomes, can allow easy mapping of individual genes to specific human chromosomes. (D'Eustachio P. et al. (1983) Science 220:919-924).
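
The logic of assigning a gene to a chromosome from such a panel can be sketched as follows. This is an illustrative sketch only: the function name, the panel layout, and the hybrid names are hypothetical, and the sketch assumes an idealized panel with no false positive or false negative amplifications.

```python
def map_by_concordance(panel, pcr_positive):
    """Return the human chromosomes whose presence matches the PCR result in every hybrid.

    panel: dict mapping a hybrid cell line name to the set of human
           chromosomes that line retains.
    pcr_positive: set of hybrid names that yielded an amplified fragment.
    """
    chromosomes = set().union(*panel.values())
    return sorted(
        chrom
        for chrom in chromosomes
        if all(
            (chrom in retained) == (name in pcr_positive)
            for name, retained in panel.items()
        )
    )


# Hypothetical panel of four hybrids and a hypothetical PCR screening result.
panel = {
    "H1": {"1", "7"},
    "H2": {"7", "X"},
    "H3": {"1", "X"},
    "H4": {"12"},
}
candidates = map_by_concordance(panel, pcr_positive={"H1", "H2"})
```

A chromosome is reported only when it is present in every hybrid that amplified and absent from every hybrid that did not, which is why panels with few chromosomes per line resolve the assignment most easily.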

[5704] Other mapping strategies, e.g., in situ hybridization (described in Fan, Y. et al. (1990) Proc. Natl. Acad. Sci. USA, 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries, can be used to map 46508 to a chromosomal location.

[5705] Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can further be used to provide a precise chromosomal location in one step. The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases will suffice to get good results in a reasonable amount of time. For a review of this technique, see Verma et al., Human Chromosomes: A Manual of Basic Techniques ((1988) Pergamon Press, New York).

[5706] Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

[5707] Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland, J. et al. (1987) Nature, 325:783-787.

[5708] Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with the 46508 gene, can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

[5709] 46508 Tissue Typing

[5710] 46508 sequences can be used to identify individuals from biological samples using, e.g., restriction fragment length polymorphism (RFLP). In this technique, an individual's genomic DNA is digested with one or more restriction enzymes, the fragments separated, e.g., in a Southern blot, and probed to yield bands for identification. The sequences of the present invention are useful as additional DNA markers for RFLP (described in U.S. Pat. No. 5,272,057).
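
The principle behind RFLP can be illustrated with a short in silico digest. The sketch below is illustrative only: the function name and both allele sequences are invented for the example, only the top strand of a linear molecule is considered, and EcoRI (recognition site GAATTC, cutting after the first G) is used simply as a familiar enzyme.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the top strand.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    position = sequence.find(site)
    while position != -1:
        cuts.append(position + cut_offset)
        position = sequence.find(site, position + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two hypothetical alleles that differ by one base in the second EcoRI site.
allele_a = "AAAAGAATTCCCCCCCGAATTCTT"
allele_b = "AAAAGAATTCCCCCCCGAGTTCTT"

fragments_a = digest(allele_a, "GAATTC", cut_offset=1)
fragments_b = digest(allele_b, "GAATTC", cut_offset=1)
```

The single base difference removes one recognition site, so the two alleles yield different sets of fragment lengths, which is the difference that appears as a distinct banding pattern after separation and probing.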

[5711] Furthermore, the sequences of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the 46508 nucleotide sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it. Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences.

[5712] Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. Because greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to differentiate individuals. The noncoding sequences of SEQ ID NO:101 can provide positive individual identification with a panel of perhaps 10 to 1,000 primers which each yield a noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NO:103, are used, a more appropriate number of primers for positive individual identification would be 500-2,000.

[5713] If a panel of reagents from 46508 nucleotide sequences described herein is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

[5714] Use of Partial 46508 Sequences in Forensic Biology

[5715] DNA-based identification techniques can also be used in forensic biology. To make such an identification, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

[5716] The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:101 (e.g., fragments derived from the noncoding regions of SEQ ID NO:101 having a length of at least 20 bases, preferably at least 30 bases) are particularly appropriate for this use.

[5717] The 46508 nucleotide sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such 46508 probes can be used to identify tissue by species and/or by organ type.

[5718] In a similar fashion, these reagents, e.g., 46508 primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

[5719] Predictive Medicine of 46508

[5720] The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual.

[5721] Generally, the invention provides a method of determining if a subject is at risk for a disorder related to a lesion in or the misexpression of a gene which encodes 46508.

[5722] Such disorders include, e.g., a disorder associated with the misexpression of a 46508 molecule; a disorder of cell growth and proliferation (e.g., cancer); a disorder resulting from viral or bacterial infection; a disorder resulting from environmental toxins; a disorder of metabolism.

[5723] The method includes one or more of the following:

[5724] detecting, in a tissue of the subject, the presence or absence of a mutation which affects the expression of the 46508 gene, or detecting the presence or absence of a mutation in a region which controls the expression of the gene, e.g., a mutation in the 5′ control region;

[5725] detecting, in a tissue of the subject, the presence or absence of a mutation which alters the structure of the 46508 gene;

[5726] detecting, in a tissue of the subject, the misexpression of the 46508 gene, at the mRNA level, e.g., detecting a non-wild type level of a mRNA;

[5727] detecting, in a tissue of the subject, the misexpression of the gene, at the protein level, e.g., detecting a non-wild type level of a 46508 polypeptide.

[5728] In preferred embodiments the method includes ascertaining the existence of at least one of: a deletion of one or more nucleotides from the 46508 gene; an insertion of one or more nucleotides into the gene; a point mutation, e.g., a substitution of one or more nucleotides of the gene; or a gross chromosomal rearrangement of the gene, e.g., a translocation, inversion, or deletion.

[5729] For example, detecting the genetic lesion can include: (i) providing a probe/primer including an oligonucleotide containing a region of nucleotide sequence which hybridizes to a sense or antisense sequence from SEQ ID NO:101, or naturally occurring mutants thereof or 5′ or 3′ flanking sequences naturally associated with the 46508 gene; (ii) exposing the probe/primer to nucleic acid of the tissue; and (iii) detecting, by hybridization, e.g., in situ hybridization, of the probe/primer to the nucleic acid, the presence or absence of the genetic lesion.

[5730] In preferred embodiments detecting the misexpression includes ascertaining the existence of at least one of: an alteration in the level of a messenger RNA transcript of the 46508 gene; the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene; or a non-wild type level of 46508.

[5731] Methods of the invention can be used prenatally or to determine if a subject's offspring will be at risk for a disorder.

[5732] In preferred embodiments the method includes determining the structure of a 46508 gene, an abnormal structure being indicative of risk for the disorder.

[5733] In preferred embodiments the method includes contacting a sample from the subject with an antibody to the 46508 protein or a nucleic acid, which hybridizes specifically with the gene. These and other embodiments are discussed below.

[5734] Diagnostic and Prognostic Assays of 46508

[5735] Diagnostic and prognostic assays of the invention include methods for assessing the expression level of 46508 molecules and for identifying variations and mutations in the sequence of 46508 molecules.

[5736] Expression Monitoring and Profiling:

[5737] The presence, level, or absence of 46508 protein or nucleic acid in a biological sample can be evaluated by obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting 46508 protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes 46508 protein such that the presence of 46508 protein or nucleic acid is detected in the biological sample. The term “biological sample” includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject. A preferred biological sample is serum. The level of expression of the 46508 gene can be measured in a number of ways, including, but not limited to: measuring the mRNA encoded by the 46508 genes; measuring the amount of protein encoded by the 46508 genes; or measuring the activity of the protein encoded by the 46508 genes.

[5738] The level of mRNA corresponding to the 46508 gene in a cell can be determined both by in situ and by in vitro formats.

[5739] The isolated mRNA can be used in hybridization or amplification assays that include, but are not limited to, Southern or Northern analyses, polymerase chain reaction analyses and probe arrays. One preferred diagnostic method for the detection of mRNA levels involves contacting the isolated mRNA with a nucleic acid molecule (probe) that can hybridize to the mRNA encoded by the gene being detected. The nucleic acid probe can be, for example, a full-length 46508 nucleic acid, such as the nucleic acid of SEQ ID NO:101, or a portion thereof, such as an oligonucleotide of at least 7, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to 46508 mRNA or genomic DNA. The probe can be disposed on an address of an array, e.g., an array described below. Other suitable probes for use in the diagnostic assays are described herein.

[5740] In one format, mRNA (or cDNA) is immobilized on a surface and contacted with the probes, for example by running the isolated mRNA on an agarose gel and transferring the mRNA from the gel to a membrane, such as nitrocellulose. In an alternative format, the probes are immobilized on a surface and the mRNA (or cDNA) is contacted with the probes, for example, in a two-dimensional gene chip array described below. A skilled artisan can adapt known mRNA detection methods for use in detecting the level of mRNA encoded by the 46508 genes.

[5741] The level of mRNA in a sample that is encoded by 46508 can be evaluated with nucleic acid amplification, e.g., by rtPCR (Mullis (1987) U.S. Pat. No. 4,683,202), ligase chain reaction (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189-193), self sustained sequence replication (Guatelli et al., (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al., (1989), Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al., (1988) Bio/Technology 6:1197), rolling circle replication (Lizardi et al., U.S. Pat. No. 5,854,033) or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques known in the art. As used herein, amplification primers are defined as being a pair of nucleic acid molecules that can anneal to 5′ or 3′ regions of a gene (plus and minus strands, respectively, or vice-versa) and contain a short region in between. In general, amplification primers are from about 10 to 30 nucleotides in length and flank a region from about 50 to 200 nucleotides in length. Under appropriate conditions and with appropriate reagents, such primers permit the amplification of a nucleic acid molecule comprising the nucleotide sequence flanked by the primers.
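
The general primer dimensions given above can be checked in code for a candidate primer pair. The sketch below is illustrative only: the function names and the template sequence are hypothetical, exact matching stands in for annealing, and the "about" in the stated ranges is treated as a hard bound purely for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanked_region_length(template, forward, reverse):
    """Length of the plus-strand region between the two primers, or None.

    The forward primer must match the plus strand; the reverse primer must
    match the minus strand downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    inner_start = start + len(forward)
    end = template.find(reverse_complement(reverse), inner_start)
    if end == -1:
        return None
    return end - inner_start


def is_typical_pair(template, forward, reverse):
    """Check the 10-30 nucleotide primer length and 50-200 nucleotide flanked region."""
    primers_ok = all(10 <= len(primer) <= 30 for primer in (forward, reverse))
    flanked = flanked_region_length(template, forward, reverse)
    return primers_ok and flanked is not None and 50 <= flanked <= 200


# Hypothetical template: two 20-base primer sites around a 60-base region.
template = "ATGCCGTAAGCTTGACCTGA" + "ACGT" * 15 + "GGATCCTTAGCAATCGGTCA"
forward_primer = template[:20]
reverse_primer = reverse_complement(template[-20:])
pair_ok = is_typical_pair(template, forward_primer, reverse_primer)
```

A practical primer design tool would also consider melting temperature, secondary structure, and specificity against the rest of the genome, none of which is modeled here.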

[5742] For in situ methods, a cell or tissue sample can be prepared/processed and immobilized on a support, typically a glass slide, and then contacted with a probe that can hybridize to mRNA that encodes the 46508 gene being analyzed.

[5743] In another embodiment, the methods further include contacting a control sample with a compound or agent capable of detecting 46508 mRNA, or genomic DNA, and comparing the presence of 46508 mRNA or genomic DNA in the control sample with the presence of 46508 mRNA or genomic DNA in the test sample. In still another embodiment, serial analysis of gene expression, as described in U.S. Pat. No. 5,695,937, is used to detect 46508 transcript levels.

[5744] A variety of methods can be used to determine the level of protein encoded by 46508. In general, these methods include contacting an agent that selectively binds to the protein, such as an antibody, with a sample to evaluate the level of protein in the sample. In a preferred embodiment, the antibody bears a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)2) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with a detectable substance. Examples of detectable substances are provided herein.

[5745] The detection methods can be used to detect 46508 protein in a biological sample in vitro as well as in vivo. In vitro techniques for detection of 46508 protein include enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis. In vivo techniques for detection of 46508 protein include introducing into a subject a labeled anti-46508 antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. In another embodiment, the sample is labeled, e.g., biotinylated and then contacted to the antibody, e.g., an anti-46508 antibody positioned on an antibody array (as described below). The sample can be detected, e.g., with avidin coupled to a fluorescent label.

[5746] In another embodiment, the methods further include contacting the control sample with a compound or agent capable of detecting 46508 protein, and comparing the presence of 46508 protein in the control sample with the presence of 46508 protein in the test sample.

[5747] The invention also includes kits for detecting the presence of 46508 in a biological sample. For example, the kit can include a compound or agent capable of detecting 46508 protein or mRNA in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect 46508 protein or nucleic acid.

[5748] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker of the invention; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.

[5749] For oligonucleotide-based kits, the kit can include: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic acid sequence encoding a polypeptide corresponding to a marker of the invention or (2) a pair of primers useful for amplifying a nucleic acid molecule corresponding to a marker of the invention. The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.

[5750] The diagnostic methods described herein can identify subjects having, or at risk of developing, a disease or disorder associated with misexpressed or aberrant or unwanted 46508 expression or activity. As used herein, the term “unwanted” includes an unwanted phenomenon involved in a biological response such as pain or deregulated cell proliferation.

[5751] In one embodiment, a disease or disorder associated with aberrant or unwanted 46508 expression or activity is identified. A test sample is obtained from a subject and 46508 protein or nucleic acid (e.g., mRNA or genomic DNA) is evaluated, wherein the level, e.g., the presence or absence, of 46508 protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant or unwanted 46508 expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest, including a biological fluid (e.g., serum), cell sample, or tissue.

[5752] The prognostic assays described herein can be used to determine whether a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder associated with aberrant or unwanted 46508 expression or activity. For example, such methods can be used to determine whether a subject can be effectively treated with an agent for a cell proliferative disorder.

[5753] In another aspect, the invention features a computer medium having a plurality of digitally encoded data records. Each data record includes a value representing the level of expression of 46508 in a sample, and a descriptor of the sample. The descriptor of the sample can be an identifier of the sample, a subject from which the sample was derived (e.g., a patient), a diagnosis, or a treatment (e.g., a preferred treatment). In a preferred embodiment, the data record further includes values representing the level of expression of genes other than 46508 (e.g., other genes associated with a 46508-disorder, or other genes on an array). The data record can be structured as a table, e.g., a table that is part of a database such as a relational database (e.g., a SQL database of the Oracle or Sybase database environments).
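
One possible layout for such data records is sketched below. This is an assumption-laden illustration rather than a schema from this disclosure: SQLite (via the Python standard library) is substituted for the Oracle or Sybase environments named above so that the example is self-contained, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def create_store():
    """Create an in-memory database holding one row per sample and gene."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE expression_record (
            sample_id        TEXT NOT NULL,  -- identifier of the sample
            subject_id       TEXT,           -- subject the sample came from
            diagnosis        TEXT,
            treatment        TEXT,
            gene             TEXT NOT NULL,  -- e.g. '46508' or another gene
            expression_level REAL NOT NULL,
            PRIMARY KEY (sample_id, gene)
        )
        """
    )
    return conn


def add_record(conn, sample_id, gene, level,
               subject_id=None, diagnosis=None, treatment=None):
    """Insert one expression value together with its sample descriptors."""
    conn.execute(
        "INSERT INTO expression_record VALUES (?, ?, ?, ?, ?, ?)",
        (sample_id, subject_id, diagnosis, treatment, gene, level),
    )


def profile_for(conn, sample_id):
    """Return the expression profile of a sample as a gene-to-level mapping."""
    rows = conn.execute(
        "SELECT gene, expression_level FROM expression_record "
        "WHERE sample_id = ? ORDER BY gene",
        (sample_id,),
    )
    return dict(rows.fetchall())


conn = create_store()
add_record(conn, "S1", "46508", 2.5, subject_id="P1")
add_record(conn, "S1", "GENE_B", 1.0, subject_id="P1")
profile = profile_for(conn, "S1")
```

Storing one row per sample and gene lets a record carry values for genes other than 46508 without changing the table definition.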

[5754] Also featured is a method of evaluating a sample. The method includes providing a sample, e.g., from the subject, and determining a gene expression profile of the sample, wherein the profile includes a value representing the level of 46508 expression. The method can further include comparing the value or the profile (i.e., multiple values) to a reference value or reference profile. The gene expression profile of the sample can be obtained by any of the methods described herein (e.g., by providing a nucleic acid from the sample and contacting the nucleic acid to an array). The method can be used to diagnose, for example, a cell differentiative or proliferative disorder in a subject wherein an increase in 46508 expression is an indication that the subject has or is disposed to having a disorder. The method can be used to monitor a treatment for, e.g., a cell proliferative or differentiative disorder in a subject. For example, the gene expression profile can be determined for a sample from a subject undergoing treatment. The profile can be compared to a reference profile or to a profile obtained from the subject prior to treatment or prior to onset of the disorder (see, e.g., Golub et al. (1999) Science 286:531).

[5755] In yet another aspect, the invention features a method of evaluating a test compound (see also, “Screening Assays”, above). The method includes providing a cell and a test compound; contacting the test compound to the cell; obtaining a subject expression profile for the contacted cell; and comparing the subject expression profile to one or more reference profiles. The profiles include a value representing the level of 46508 expression. In a preferred embodiment, the subject expression profile is compared to a target profile, e.g., a profile for a normal cell or for a desired condition of a cell. The test compound is evaluated favorably if the subject expression profile is more similar to the target profile than an expression profile obtained from an uncontacted cell.

[5756] In another aspect, the invention features a method of evaluating a subject. The method includes: a) obtaining a sample from a subject, e.g., from a caregiver, e.g., a caregiver who obtains the sample from the subject; b) determining a subject expression profile for the sample. Optionally, the method further includes either or both of steps: c) comparing the subject expression profile to one or more reference expression profiles; and d) selecting the reference profile most similar to the subject expression profile. The subject expression profile and the reference profiles include a value representing the level of 46508 expression. A variety of routine statistical measures can be used to compare two profiles. One possible metric is the length of the distance vector that is the difference between the two profiles. Each of the subject and reference profile is represented as a multi-dimensional vector, wherein each dimension is a value in the profile.
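
The distance metric described above can be written out as a short sketch. The code is illustrative only: the function names, the gene labels other than 46508, and the expression values are hypothetical, and the sketch assumes that every profile reports the same set of genes.

```python
import math


def profile_distance(profile_a, profile_b):
    """Length of the difference vector between two expression profiles.

    Each profile maps a gene name to an expression level, so each gene is
    one dimension of the vector.
    """
    if set(profile_a) != set(profile_b):
        raise ValueError("profiles must report the same genes")
    return math.sqrt(
        sum((profile_a[gene] - profile_b[gene]) ** 2 for gene in profile_a)
    )


def most_similar(subject, references):
    """Return the name of the reference profile closest to the subject profile."""
    return min(
        references,
        key=lambda name: profile_distance(subject, references[name]),
    )


# Hypothetical profiles; values are arbitrary expression units.
subject_profile = {"46508": 3.0, "GENE_B": 4.0}
reference_profiles = {
    "reference_1": {"46508": 0.0, "GENE_B": 0.0},
    "reference_2": {"46508": 3.0, "GENE_B": 5.0},
}
closest = most_similar(subject_profile, reference_profiles)
```

Other routine measures, such as a correlation coefficient, could replace the Euclidean length used here without changing the selection step.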

[5757] The method can further include transmitting a result to a caregiver. The result can be the subject expression profile, a result of a comparison of the subject expression profile with another profile, a most similar reference profile, or a descriptor of any of the aforementioned. The result can be transmitted across a computer network, e.g., the result can be in the form of a computer transmission, e.g., a computer data signal embedded in a carrier wave.

[5758] Also featured is a computer medium having executable code for effecting the following steps: receive a subject expression profile; access a database of reference expression profiles; and either i) select a matching reference profile most similar to the subject expression profile or ii) determine at least one comparison score for the similarity of the subject expression profile to at least one reference profile. The subject expression profile, and the reference expression profiles each include a value representing the level of 46508 expression.

[5759] 46508 Arrays and Uses Thereof

[5760] In another aspect, the invention features an array that includes a substrate having a plurality of addresses. At least one address of the plurality includes a capture probe that binds specifically to a 46508 molecule (e.g., a 46508 nucleic acid or a 46508 polypeptide). The array can have a density of at least 10, 50, 100, 200, 500, 1,000, 2,000, or 10,000 or more addresses/cm², and ranges between. In a preferred embodiment, the plurality of addresses includes at least 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. In a preferred embodiment, the plurality of addresses includes equal to or less than 10, 100, 500, 1,000, 5,000, 10,000, or 50,000 addresses. The substrate can be a two-dimensional substrate such as a glass slide, a wafer (e.g., silica or plastic), a mass spectroscopy plate, or a three-dimensional substrate such as a gel pad. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5761] In a preferred embodiment, at least one address of the plurality includes a nucleic acid capture probe that hybridizes specifically to a 46508 nucleic acid, e.g., the sense or anti-sense strand. In one preferred embodiment, a subset of addresses of the plurality of addresses has a nucleic acid capture probe for 46508. Each address of the subset can include a capture probe that hybridizes to a different region of a 46508 nucleic acid. In another preferred embodiment, addresses of the subset include a capture probe for a 46508 nucleic acid. Each address of the subset is unique, overlapping, and complementary to a different variant of 46508 (e.g., an allelic variant, or all possible hypothetical variants). The array can be used to sequence 46508 by hybridization (see, e.g., U.S. Pat. No. 5,695,940).

[5762] An array can be generated by various methods, e.g., by photolithographic methods (see, e.g., U.S. Pat. Nos. 5,143,854; 5,510,270; and 5,527,681), mechanical methods (e.g., directed-flow methods as described in U.S. Pat. No. 5,384,261), pin-based methods (e.g., as described in U.S. Pat. No. 5,288,514), and bead-based techniques (e.g., as described in PCT US/93/04145).

[5763] In another preferred embodiment, at least one address of the plurality includes a polypeptide capture probe that binds specifically to a 46508 polypeptide or fragment thereof. The polypeptide can be a naturally-occurring interaction partner of 46508 polypeptide. Preferably, the polypeptide is an antibody, e.g., an antibody described herein (see “Anti-46508 Antibodies,” above), such as a monoclonal antibody or a single-chain antibody.

[5764] In another aspect, the invention features a method of analyzing the expression of 46508. The method includes providing an array as described above; contacting the array with a sample; and detecting binding of a 46508 molecule (e.g., nucleic acid or polypeptide) to the array. In a preferred embodiment, the array is a nucleic acid array. Optionally, the method further includes amplifying nucleic acid from the sample prior to or during contact with the array.

[5765] In another embodiment, the array can be used to assay gene expression in a tissue to ascertain tissue specificity of genes in the array, particularly the expression of 46508. If a sufficient number of diverse samples is analyzed, clustering (e.g., hierarchical clustering, k-means clustering, Bayesian clustering and the like) can be used to identify other genes which are co-regulated with 46508. For example, the array can be used for the quantitation of the expression of multiple genes. Thus, not only tissue specificity, but also the level of expression of a battery of genes in the tissue is ascertained. Quantitative data can be used to group (e.g., cluster) genes on the basis of their tissue expression per se and level of expression in that tissue.
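As a hedged illustration of how quantitative array data might be used to find genes co-regulated with 46508, the sketch below ranks genes by the Pearson correlation of their expression across samples. It is a simpler stand-in for the clustering methods named above (hierarchical, k-means, Bayesian), not an implementation of any of them; the gene names other than 46508, the expression values, and the 0.9 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def coregulated_with(target, expression, threshold=0.9):
    """Genes whose expression across samples correlates with the target gene."""
    reference = expression[target]
    return sorted(
        gene
        for gene, values in expression.items()
        if gene != target and pearson(reference, values) >= threshold
    )


# Hypothetical expression levels measured in four diverse samples.
expression = {
    "46508": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.1, 3.9, 6.2, 8.0],
    "geneB": [4.0, 3.0, 2.0, 1.0],
    "geneC": [1.0, 1.0, 1.0, 1.0],
}
partners = coregulated_with("46508", expression)
print(partners)
```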

[5766] For example, array analysis of gene expression can be used to assess the effect of cell-cell interactions on 46508 expression. A first tissue can be perturbed and nucleic acid from a second tissue that interacts with the first tissue can be analyzed. In this context, the effect of one cell type on another cell type in response to a biological stimulus can be determined, e.g., to monitor the effect of cell-cell interaction at the level of gene expression.

[5767] In another embodiment, cells are contacted with a therapeutic agent. The expression profile of the cells is determined using the array, and the expression profile is compared to the profile of like cells not contacted with the agent. For example, the assay can be used to determine or analyze the molecular basis of an undesirable effect of the therapeutic agent. If an agent is administered therapeutically to treat one cell type but has an undesirable effect on another cell type, the invention provides an assay to determine the molecular basis of the undesirable effect and thus provides the opportunity to co-administer a counteracting agent or otherwise treat the undesired effect. Similarly, even within a single cell type, undesirable biological effects can be determined at the molecular level. Thus, the effects of an agent on expression of other than the target gene can be ascertained and counteracted.
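For illustration, the comparison of contacted and uncontacted cells can be sketched as a fold-change screen over genes other than the target gene. This is a minimal sketch under stated assumptions: the gene names, expression values, and two-fold cutoff are hypothetical, and a log2 ratio is only one of several reasonable measures of change.

```python
import math


def off_target_changes(treated, untreated, target, min_abs_log2=1.0):
    """Log2 fold changes for genes other than the target that shift on treatment."""
    changes = {}
    for gene, treated_level in treated.items():
        if gene == target:
            continue
        untreated_level = untreated.get(gene)
        if untreated_level is None or untreated_level <= 0 or treated_level <= 0:
            continue  # no ratio can be formed for missing or non-positive values
        log2_fold = math.log2(treated_level / untreated_level)
        if abs(log2_fold) >= min_abs_log2:
            changes[gene] = log2_fold
    return changes


# Hypothetical expression profiles of like cells with and without the agent.
untreated = {"target_gene": 10.0, "geneX": 5.0, "geneY": 8.0}
treated = {"target_gene": 2.0, "geneX": 20.0, "geneY": 9.0}

changes = off_target_changes(treated, untreated, target="target_gene")
print(changes)
```

Genes reported by such a screen would be candidates for explaining an undesirable effect, subject to confirmation by other means.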

[5768] In another embodiment, the array can be used to monitor expression of one or more genes in the array with respect to time. For example, samples obtained from different time points can be probed with the array. Such analysis can identify and/or characterize the development of a 46508-associated disease or disorder; and processes, such as a cellular transformation associated with a 46508-associated disease or disorder. The method can also evaluate the treatment and/or progression of a 46508-associated disease or disorder.

[5769] The array is also useful for ascertaining differential expression patterns of one or more genes in normal and abnormal cells. This provides a battery of genes (e.g., including 46508) that could serve as a molecular target for diagnosis or therapeutic intervention.

[5770] In another aspect, the invention features an array having a plurality of addresses. Each address of the plurality includes a unique polypeptide. At least one address of the plurality has disposed thereon a 46508 polypeptide or fragment thereof. Methods of producing polypeptide arrays are described in the art, e.g., in De Wildt et al. (2000). Nature Biotech. 18, 989-994; Lueking et al. (1999). Anal. Biochem. 270, 103-111; Ge, H. (2000). Nucleic Acids Res. 28, e3, I-VII; MacBeath, G., and Schreiber, S. L. (2000). Science 289, 1760-1763; and WO 99/51773A1. In a preferred embodiment, each address of the plurality has disposed thereon a polypeptide at least 60, 70, 80, 85, 90, 95 or 99% identical to a 46508 polypeptide or fragment thereof. For example, multiple variants of a 46508 polypeptide (e.g., encoded by allelic variants, site-directed mutants, random mutants, or combinatorial mutants) can be disposed at individual addresses of the plurality. Addresses in addition to the addresses of the plurality can be disposed on the array.

[5771] The polypeptide array can be used to detect a 46508 binding compound, e.g., an antibody in a sample from a subject with specificity for a 46508 polypeptide or the presence of a 46508-binding protein or ligand.

[5772] The array is also useful for ascertaining the effect of the expression of a gene on the expression of other genes in the same cell or in different cells (e.g., ascertaining the effect of 46508 expression on the expression of other genes). This provides, for example, for a selection of alternate molecular targets for therapeutic intervention if the ultimate or downstream target cannot be regulated.

[5773] In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses 46508 or from a cell or subject in which a 46508-mediated response has been elicited, e.g., by contact of the cell with 46508 nucleic acid or protein, or administration to the cell or subject of 46508 nucleic acid or protein; providing a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 46508 (or does not express as highly as in the case of the 46508 positive plurality of capture probes) or from a cell or subject in which a 46508-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 46508 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

[5774] In another aspect, the invention features a method of analyzing a plurality of probes or a sample. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a first sample from a cell or subject which expresses or mis-expresses 46508 or from a cell or subject in which a 46508-mediated response has been elicited, e.g., by contact of the cell with 46508 nucleic acid or protein, or administration to the cell or subject of 46508 nucleic acid or protein; providing a two-dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe; contacting the array with a second sample from a cell or subject which does not express 46508 (or does not express as highly as in the case of the 46508 positive plurality of capture probes) or from a cell or subject in which a 46508-mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); and comparing the binding of the first sample with the binding of the second sample. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody. The same array can be used for both samples or different arrays can be used. If different arrays are used, the plurality of addresses with capture probes should be present on both arrays.

[5775] In another aspect, the invention features a method of analyzing 46508, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 46508 nucleic acid or amino acid sequence; and comparing the 46508 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46508.

[5776] Detection of 46508 Variations or Mutations

[5777] The methods of the invention can also be used to detect genetic alterations in a 46508 gene, thereby determining if a subject with the altered gene is at risk for a disorder characterized by misregulation in 46508 protein activity or nucleic acid expression, such as a cell proliferative or differentiative disorder. In preferred embodiments, the methods include detecting, in a sample from the subject, the presence or absence of a genetic alteration characterized by at least one of an alteration affecting the integrity of a gene encoding a 46508-protein, or the mis-expression of the 46508 gene. For example, such genetic alterations can be detected by ascertaining the existence of at least one of 1) a deletion of one or more nucleotides from a 46508 gene; 2) an addition of one or more nucleotides to a 46508 gene; 3) a substitution of one or more nucleotides of a 46508 gene, 4) a chromosomal rearrangement of a 46508 gene; 5) an alteration in the level of a messenger RNA transcript of a 46508 gene, 6) aberrant modification of a 46508 gene, such as of the methylation pattern of the genomic DNA, 7) the presence of a non-wild type splicing pattern of a messenger RNA transcript of a 46508 gene, 8) a non-wild type level of a 46508-protein, 9) allelic loss of a 46508 gene, and 10) inappropriate post-translational modification of a 46508-protein.

[5778] An alteration can be detected without a probe/primer in a polymerase chain reaction, such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR), the latter of which can be particularly useful for detecting point mutations in the 46508-gene. This method can include the steps of collecting a sample of cells from a subject, isolating nucleic acid (e.g., genomic, mRNA or both) from the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a 46508 gene under conditions such that hybridization and amplification of the 46508-gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein. Alternatively, other amplification methods described herein or known in the art can be used.

[5779] In another embodiment, mutations in a 46508 gene from a sample cell can be identified by detecting alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated, amplified (optionally), and digested with one or more restriction endonucleases, and fragment lengths are determined, e.g., by gel electrophoresis, and compared. Differences in fragment lengths between sample and control DNA indicate mutations in the sample DNA. Moreover, sequence specific ribozymes (see, for example, U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
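The fragment-length comparison described above can be illustrated in silico. The following is a minimal sketch, assuming a single enzyme with the recognition site GAATTC that cuts after the first base of the site; the sequences are hypothetical, and a real assay compares measured band sizes rather than computed ones.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Sorted lengths of the fragments produced by cutting at every site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(end - begin for begin, end in zip(bounds, bounds[1:]))


# Hypothetical control DNA with two recognition sites, and sample DNA in which
# a single base substitution destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCC" + "GAGTTC" + "TTTT"

control_fragments = fragment_lengths(control)
sample_fragments = fragment_lengths(sample)
mutation_suspected = control_fragments != sample_fragments
print(control_fragments, sample_fragments, mutation_suspected)
```

Only mutations that create or destroy a recognition site, or that change the distance between sites, alter the pattern; other mutations are invisible to this comparison.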

[5780] In other embodiments, genetic mutations in 46508 can be identified by hybridizing sample and control nucleic acids, e.g., DNA or RNA, to two-dimensional arrays, e.g., chip based arrays. Such arrays include a plurality of addresses, each of which is positionally distinguishable from the other. A different probe is located at each address of the plurality. A probe can be complementary to a region of a 46508 nucleic acid or a putative variant (e.g., allelic variant) thereof. A probe can have one or more mismatches to a region of a 46508 nucleic acid (e.g., a destabilizing mismatch). The arrays can have a high density of addresses, e.g., can contain hundreds or thousands of oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation 7: 244-255; Kozal, M. J. et al. (1996) Nature Medicine 2: 753-759). For example, genetic mutations in 46508 can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin, M. T. et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
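The first scanning step, in which sequential overlapping probes localize a base change, can be illustrated as follows. This is a sketch under simplifying assumptions: hybridization is modeled as an exact string match at the aligned position, the sample differs from the reference by a single substitution, and the sequences and probe length are hypothetical.

```python
def tile_probes(reference, length, step=1):
    """Sequential overlapping probes as (start position, probe sequence) pairs."""
    return [
        (start, reference[start:start + length])
        for start in range(0, len(reference) - length + 1, step)
    ]


def failed_probes(probes, sample):
    """Start positions of probes that do not match the sample exactly."""
    return [start for start, probe in probes if sample[start:start + len(probe)] != probe]


def candidate_positions(failed, length):
    """Positions covered by every failed probe (assumes one substitution)."""
    if not failed:
        return []
    return list(range(max(failed), min(failed) + length))


reference = "ACGTACGTACGT"   # hypothetical wild-type (control) sequence
sample = "ACGTACTTACGT"      # hypothetical sample with one substitution

probes = tile_probes(reference, length=4)
failed = failed_probes(probes, sample)
positions = candidate_positions(failed, length=4)
print(failed, positions)
```

The candidate positions would then be characterized on a second, smaller array carrying parallel wild-type and mutant probe sets, as described above.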

[5781] In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the 46508 gene and detect mutations by comparing the sequence of the sample 46508 with the corresponding wild-type (control) sequence. Automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry.
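Once sample and wild-type sequences are in hand, the comparison step reduces to listing the positions at which they differ. The sketch below assumes the two sequences are already aligned and of equal length, so it reports substitutions only; insertions and deletions would require an alignment step that is not shown. The sequences are hypothetical.

```python
def substitutions(wild_type, sample):
    """List (1-based position, wild-type base, sample base) for each mismatch."""
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, wt_base, sample_base)
        for index, (wt_base, sample_base) in enumerate(zip(wild_type, sample))
        if wt_base != sample_base
    ]


wild_type = "ATGGCCAAG"   # hypothetical control sequence
sample = "ATGGCTAAG"      # hypothetical sample sequence

found = substitutions(wild_type, sample)
print(found)
```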

[5782] Other methods for detecting mutations in the 46508 gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) Proc. Natl. Acad Sci USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295).

[5783] In still another embodiment, the mismatch cleavage reaction employs one or more proteins that recognize mismatched base pairs in double-stranded DNA (so called “DNA mismatch repair” enzymes) in defined systems for detecting and mapping point mutations in 46508 cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994) Carcinogenesis 15:1657-1662; U.S. Pat. No. 5,459,039).

[5784] In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in 46508 genes. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acids (Orita et al. (1989) Proc Natl. Acad. Sci USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control 46508 nucleic acids will be denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence; the resulting alteration in electrophoretic mobility enables the detection of even a single base change. The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet 7:5).

[5785] In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys Chem 265:12753).

[5786] Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl Acad. Sci USA 86:6230). A further method of detecting point mutations is the chemical ligation of oligonucleotides as described in Xu et al. ((2001) Nature Biotechnol. 19:148). Adjacent oligonucleotides, one of which selectively anneals to the query site, are ligated together if the nucleotide at the query site of the sample nucleic acid is complementary to the query oligonucleotide; ligation can be monitored, e.g., by fluorescent dyes coupled to the oligonucleotides.

[5787] Alternatively, allele specific amplification technology that depends on selective PCR amplification may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule (so that amplification depends on differential hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.

[5788] In another aspect, the invention features a set of oligonucleotides. The set includes a plurality of oligonucleotides, each of which is at least partially complementary (e.g., at least 50%, 60%, 70%, 80%, 90%, 92%, 95%, 97%, 98%, or 99% complementary) to a 46508 nucleic acid.

[5789] In a preferred embodiment the set includes a first and a second oligonucleotide. The first and second oligonucleotide can hybridize to the same or to different locations of SEQ ID NO:101 or the complement of SEQ ID NO:101. Different locations can be different but overlapping, or non-overlapping on the same strand. The first and second oligonucleotide can hybridize to sites on the same or on different strands.

[5790] The set can be useful, e.g., for identifying SNPs, or identifying specific alleles of 46508. In a preferred embodiment, each oligonucleotide of the set has a different nucleotide at an interrogation position. In one embodiment, the set includes two oligonucleotides, each complementary to a different allele at a locus, e.g., a biallelic or polymorphic locus.

[5791] In another embodiment, the set includes four oligonucleotides, each having a different nucleotide (e.g., adenine, guanine, cytosine, or thymidine) at the interrogation position. The interrogation position can be a SNP or the site of a mutation. In another preferred embodiment, the oligonucleotides of the plurality are identical in sequence to one another (except for differences in length). The oligonucleotides can be provided with differential labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide that hybridizes to a second allele. In still another embodiment, at least one of the oligonucleotides of the set has a nucleotide change at a position in addition to a query position, e.g., a destabilizing mutation to decrease the Tm of the oligonucleotide. In another embodiment, at least one oligonucleotide of the set has a non-natural nucleotide, e.g., inosine. In a preferred embodiment, the oligonucleotides are attached to a solid support, e.g., to different addresses of an array or to different beads or nanoparticles.
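A set of four oligonucleotides differing only at the interrogation position can be sketched as follows. This is an illustration only: the oligonucleotide and target sequences are hypothetical, and perfect reverse complementarity stands in for hybridization, which in practice depends on conditions such as temperature and salt.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def interrogation_set(oligo, position):
    """Four oligonucleotides, one per nucleotide at the interrogation position."""
    return {
        base: oligo[:position] + base + oligo[position + 1:]
        for base in "ACGT"
    }


def matching_bases(oligos, target):
    """Bases whose oligonucleotide is perfectly complementary to the target."""
    return [base for base, oligo in oligos.items() if reverse_complement(oligo) == target]


# Hypothetical oligonucleotide with the interrogation position at index 3.
oligos = interrogation_set("AGCTTAGG", position=3)
# Hypothetical target strand carrying a C opposite the interrogation position.
target = "CCTACGCT"

called = matching_bases(oligos, target)
print(oligos, called)
```

With differential labels on the four oligonucleotides, the label of the matching member would identify the allele present at the interrogation position.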

[5792] In a preferred embodiment, the set of oligonucleotides can be used to specifically amplify, e.g., by PCR, or detect, a 46508 nucleic acid.

[5793] The methods described herein may be performed, for example, by utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving a 46508 gene.

[5794] Use of 46508 Molecules as Surrogate Markers

[5795] The 46508 molecules of the invention are also useful as markers of disorders or disease states, as markers for precursors of disease states, as markers for predisposition of disease states, as markers of drug activity, or as markers of the pharmacogenomic profile of a subject. Using the methods described herein, the presence, absence and/or quantity of the 46508 molecules of the invention may be detected, and may be correlated with one or more biological states in vivo. For example, the 46508 molecules of the invention may serve as surrogate markers for one or more disorders or disease states or for conditions leading up to disease states. As used herein, a “surrogate marker” is an objective biochemical marker which correlates with the absence or presence of a disease or disorder, or with the progression of a disease or disorder (e.g., with the presence or absence of a tumor). The presence or quantity of such markers is independent of the disease. Therefore, these markers may serve to indicate whether a particular course of treatment is effective in lessening a disease state or disorder. Surrogate markers are of particular use when the presence or extent of a disease state or disorder is difficult to assess through standard methodologies (e.g., early stage tumors), or when an assessment of disease progression is desired before a potentially dangerous clinical endpoint is reached (e.g., an assessment of cardiovascular disease may be made using cholesterol levels as a surrogate marker, and an analysis of HIV infection may be made using HIV RNA levels as a surrogate marker, well in advance of the undesirable clinical outcomes of myocardial infarction or fully-developed AIDS). Examples of the use of surrogate markers in the art include: Koomen et al. (2000) J. Mass. Spectrom. 35: 258-264; and James (1994) AIDS Treatment News Archive 209.

[5796] The 46508 molecules of the invention are also useful as pharmacodynamic markers. As used herein, a “pharmacodynamic marker” is an objective biochemical marker which correlates specifically with drug effects. The presence or quantity of a pharmacodynamic marker is not related to the disease state or disorder for which the drug is being administered; therefore, the presence or quantity of the marker is indicative of the presence or activity of the drug in a subject. For example, a pharmacodynamic marker may be indicative of the concentration of the drug in a biological tissue, in that the marker is either expressed or transcribed or not expressed or transcribed in that tissue in relationship to the level of the drug. In this fashion, the distribution or uptake of the drug may be monitored by the pharmacodynamic marker. Similarly, the presence or quantity of the pharmacodynamic marker may be related to the presence or quantity of the metabolic product of a drug, such that the presence or quantity of the marker is indicative of the relative breakdown rate of the drug in vivo. Pharmacodynamic markers are of particular use in increasing the sensitivity of detection of drug effects, particularly when the drug is administered in low doses. Since even a small amount of a drug may be sufficient to activate multiple rounds of marker (e.g., a 46508 marker) transcription or expression, the amplified marker may be in a quantity which is more readily detectable than the drug itself. Also, the marker may be more easily detected due to the nature of the marker itself; for example, using the methods described herein, anti-46508 antibodies may be employed in an immune-based detection system for a 46508 protein marker, or 46508-specific radiolabeled probes may be used to detect a 46508 mRNA marker. Furthermore, the use of a pharmacodynamic marker may offer mechanism-based prediction of risk due to drug treatment beyond the range of possible direct observations. Examples of the use of pharmacodynamic markers in the art include: Matsuda et al. U.S. Pat. No. 6,033,862; Hattis et al. (1991) Env. Health Perspect. 90: 229-238; Schentag (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S21-S24; and Nicolau (1999) Am. J. Health-Syst. Pharm. 56 Suppl. 3: S16-S20.

[5797] The 46508 molecules of the invention are also useful as pharmacogenomic markers. As used herein, a “pharmacogenomic marker” is an objective biochemical marker which correlates with a specific clinical drug response or susceptibility in a subject (see, e.g., McLeod et al. (1999) Eur. J. Cancer 35:1650-1652). The presence or quantity of the pharmacogenomic marker is related to the predicted response of the subject to a specific drug or class of drugs prior to administration of the drug. By assessing the presence or quantity of one or more pharmacogenomic markers in a subject, a drug therapy which is most appropriate for the subject, or which is predicted to have a greater degree of success, may be selected. For example, based on the presence or quantity of RNA or protein (e.g., 46508 protein or RNA) for specific tumor markers in a subject, a drug or course of treatment may be selected that is optimized for the treatment of the specific tumor likely to be present in the subject. Similarly, the presence or absence of a specific sequence mutation in 46508 DNA may correlate with 46508 drug response. The use of pharmacogenomic markers therefore permits the application of the most appropriate treatment for each subject without having to administer the therapy.

[5798] Pharmaceutical Compositions of 46508

[5799] The nucleic acids and polypeptides, fragments thereof, as well as anti-46508 antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions. Such compositions typically include the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein, the language “pharmaceutically acceptable carrier” includes solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

[5800] A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

[5801] Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

[5802] Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

[5803] Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

[5804] For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

[5805] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

[5806] The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

[5807] In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

[5808] It is advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

[5809] Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to uninfected cells and, thereby, reduce side effects.
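
The therapeutic index described above is a simple ratio, and a minimal Python sketch can make the comparison between compounds concrete. The compound names and the LD50 and ED50 values below are invented purely for illustration and are not taken from this disclosure; the function name `therapeutic_index` is likewise a hypothetical helper.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical compounds mapped to (LD50, ED50) in mg/kg; values are illustrative only.
compounds = {
    "compound_a": (400.0, 8.0),
    "compound_b": (90.0, 30.0),
}

# Rank compounds so that the highest therapeutic index (the preferred one) comes first.
ranked = sorted(
    compounds,
    key=lambda name: therapeutic_index(*compounds[name]),
    reverse=True,
)
```

Under these assumed values, `compound_a` has an index of 50 and `compound_b` an index of 3, so `compound_a` would be ranked first.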

[5810] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
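
As one illustration of how an IC50 might be estimated from cell culture data, the following Python sketch interpolates on a log-concentration scale between the two measurements that bracket 50% inhibition. This is only one of several possible estimation approaches (curve fitting is another) and is not stated in this disclosure to be the method used; the function name `estimate_ic50` and the data points are assumptions made for the example.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate IC50 by log-linear interpolation between bracketing points.

    concentrations: ascending, positive test concentrations (one consistent unit).
    inhibitions: fractional inhibition (0.0 to 1.0) observed at each concentration.
    Returns None when the data never cross 50% inhibition.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            # Fraction of the way from the lower to the upper point at 50% inhibition.
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data (concentration in micromolar, fractional inhibition).
example_concentrations = [0.1, 1.0, 10.0, 100.0]
example_inhibitions = [0.05, 0.25, 0.75, 0.95]
example_ic50 = estimate_ic50(example_concentrations, example_inhibitions)
```

With these assumed data the 50% point lies halfway, on the log scale, between 1 and 10 micromolar. A target plasma concentration range for an animal model could then be chosen so that it includes the estimated value.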

[5811] As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight. The protein or polypeptide can be administered one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.
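
The weight-based arithmetic behind such a regimen can be sketched in Python as follows. The sketch treats the broadest stated range (about 0.001 to 30 mg/kg) and the once-weekly schedule of about 1 to 10 weeks as hard limits, which is a simplification of the "about" language in the text; the function name `weekly_schedule`, the 70 kg body weight, and the 5 mg/kg dose are assumptions chosen only for the example.

```python
# Broadest dosage range and schedule length stated in the text, treated here
# as hard limits for illustration only.
BROAD_RANGE_MG_PER_KG = (0.001, 30.0)
WEEKS_RANGE = (1, 10)


def weekly_schedule(dose_mg_per_kg: float, body_weight_kg: float, weeks: int):
    """Return a list of (week number, dose in mg) for once-weekly administration."""
    low, high = BROAD_RANGE_MG_PER_KG
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose is outside the illustrative 0.001 to 30 mg/kg range")
    if not WEEKS_RANGE[0] <= weeks <= WEEKS_RANGE[1]:
        raise ValueError("schedule is outside the illustrative 1 to 10 week range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    per_dose_mg = dose_mg_per_kg * body_weight_kg
    return [(week, per_dose_mg) for week in range(1, weeks + 1)]


# Hypothetical subject: 70 kg body weight, 5 mg/kg once per week for 4 weeks.
schedule = weekly_schedule(5.0, 70.0, 4)
```

In this assumed case each weekly administration is 350 mg. In practice the factors listed above (severity of the disorder, previous treatments, general health, age, and other diseases present) would inform the dose and timing actually chosen.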

[5812] For antibodies, the preferred dosage is 0.1 mg/kg of body weight (generally 10 mg/kg to 20 mg/kg). If the antibody is to act in the brain, a dosage of 50 mg/kg to 100 mg/kg is usually appropriate. Generally, partially human antibodies and fully human antibodies have a longer half-life within the human body than other antibodies. Accordingly, lower dosages and less frequent administration are often possible. Modifications such as lipidation can be used to stabilize antibodies and to enhance uptake and tissue penetration (e.g., into the brain). A method for lipidation of antibodies is described by Cruikshank et al. ((1997) J. Acquired Immune Deficiency Syndromes and Human Retrovirology 14:193).

[5813] The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics (e.g., peptoids), amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

[5814] Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

[5815] An antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, puromycin, maytansinoids, e.g., maytansinol (see U.S. Pat. No. 5,208,020), CC-1065 (see U.S. Pat. Nos. 5,475,092, 5,585,499, 5,846,545) and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, CC-1065, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine, vinblastine, taxol and maytansinoids). Radioactive ions include, but are not limited to, iodine, yttrium and praseodymium.

[5816] The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, Pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or, biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

[5817] Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

[5818] The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

[5819] The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

[5820] Methods of Treatment for 46508

[5821] The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant or unwanted 46508 expression or activity. As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

[5822] With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. “Pharmacogenomics”, as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's “drug response phenotype”, or “drug response genotype”). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment with either the 46508 molecules of the present invention or 46508 modulators according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

[5823] In one aspect, the invention provides a method for preventing in a subject, a disease or condition associated with an aberrant or unwanted 46508 expression or activity, by administering to the subject a 46508 or an agent which modulates 46508 expression or at least one 46508 activity. Subjects at risk for a disease which is caused or contributed to by aberrant or unwanted 46508 expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the 46508 aberrance, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of 46508 aberrance, for example, a 46508, 46508 agonist or 46508 antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

[5824] It is possible that some 46508 disorders can be caused, at least in part, by an abnormal level of gene product, or by the presence of a gene product exhibiting abnormal activity. As such, the reduction in the level and/or activity of such gene products would bring about the amelioration of disorder symptoms.

[5825] The 46508 molecules can also act as novel diagnostic targets and therapeutic agents for controlling one or more immune disorders or metabolic disorders.

[5826] The 46508 nucleic acid and protein of the invention can be used to treat and/or diagnose a variety of immune (or inflammatory) disorders. Examples of immune disorders or diseases include, but are not limited to, autoimmune diseases (including, for example, diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, encephalomyelitis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjögren's Syndrome, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Graves' disease, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis), graft-versus-host disease, cases of transplantation, and allergy such as atopic allergy.

[5827] As used herein, the term “erythroid associated disorders” includes disorders involving aberrant (increased or deficient) erythroblast proliferation, e.g., an erythroleukemia, and aberrant (increased or deficient) erythroblast differentiation, e.g., an anemia. Erythrocyte-associated disorders include anemias such as, for example, drug- (chemotherapy-) induced anemias, hemolytic anemias due to hereditary cell membrane abnormalities, such as hereditary spherocytosis, hereditary elliptocytosis, and hereditary pyropoikilocytosis; hemolytic anemias due to acquired cell membrane defects, such as paroxysmal nocturnal hemoglobinuria and spur cell anemia; hemolytic anemias caused by antibody reactions, for example to the RBC antigens, or antigens of the ABO system, Lewis system, Ii system, Rh system, Kidd system, Duffy system, and Kell system; methemoglobinemia; a failure of erythropoiesis, for example, as a result of aplastic anemia, pure red cell aplasia, myelodysplastic syndromes, sideroblastic anemias, and congenital dyserythropoietic anemia; secondary anemia in non-hematologic disorders, for example, as a result of chemotherapy, alcoholism, or liver disease; anemia of chronic disease, such as chronic renal failure; and endocrine deficiency diseases.

[5828] Additionally, 46508 may play an important role in the regulation of metabolism. Diseases of metabolic imbalance include, but are not limited to, obesity, anorexia nervosa, cachexia, lipid disorders, and diabetes.

[5829] As discussed, successful treatment of 46508 disorders can be brought about by techniques that serve to inhibit the expression or activity of target gene products. For example, compounds, e.g., an agent identified using an assay described above, that prove to exhibit negative modulatory activity, can be used in accordance with the invention to prevent and/or ameliorate symptoms of 46508 disorders. Such molecules can include, but are not limited to, peptides, phosphopeptides, small organic or inorganic molecules, or antibodies (including, for example, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab′)2 and Fab expression library fragments, scFV molecules, and epitope-binding fragments thereof).

[5830] Further, antisense and ribozyme molecules that inhibit expression of the target gene can also be used in accordance with the invention to reduce the level of target gene expression, thus effectively reducing the level of target gene activity. Still further, triple helix molecules can be utilized in reducing the level of target gene activity. Antisense, ribozyme and triple helix molecules are discussed above.

[5831] It is possible that the use of antisense, ribozyme, and/or triple helix molecules to reduce or inhibit mutant gene expression can also reduce or inhibit the transcription (triple helix) and/or translation (antisense, ribozyme) of mRNA produced by normal target gene alleles, such that the concentration of normal target gene product present can be lower than is necessary for a normal phenotype. In such cases, nucleic acid molecules that encode and express target gene polypeptides exhibiting normal target gene activity can be introduced into cells via gene therapy methods. Alternatively, in instances in which the target gene encodes an extracellular protein, it can be preferable to co-administer normal target gene protein into the cell or tissue in order to maintain the requisite level of cellular or tissue target gene activity.

[5832] Another method by which nucleic acid molecules may be utilized in treating or preventing a disease characterized by 46508 expression is through the use of aptamer molecules specific for 46508 protein. Aptamers are nucleic acid molecules having a tertiary structure which permits them to specifically bind to protein ligands (see, e.g., Osborne, et al. (1997) Curr. Opin. Chem Biol. 1: 5-9; and Patel, D. J. (1997) Curr Opin Chem Biol 1:32-46). Since nucleic acid molecules may in many cases be more conveniently introduced into target cells than therapeutic protein molecules may be, aptamers offer a method by which 46508 protein activity may be specifically decreased without the introduction of drugs or other molecules which may have pluripotent effects.

[5833] Antibodies can be generated that are both specific for target gene product and that reduce target gene product activity. Such antibodies may, therefore, be administered in instances whereby negative modulatory techniques are appropriate for the treatment of 46508 disorders. For a description of antibodies, see the Antibody section above.

[5834] In circumstances wherein injection of an animal or a human subject with a 46508 protein or epitope for stimulating antibody production is harmful to the subject, it is possible to generate an immune response against 46508 through the use of anti-idiotypic antibodies (see, for example, Herlyn, D. (1999) Ann Med 31:66-78; and Bhattacharya-Chatterjee, M., and Foon, K. A. (1998) Cancer Treat Res. 94:51-68). If an anti-idiotypic antibody is introduced into a mammal or human subject, it should stimulate the production of anti-anti-idiotypic antibodies, which should be specific to the 46508 protein. Vaccines directed to a disease characterized by 46508 expression may also be generated in this fashion.

[5835] In instances where the target antigen is intracellular and whole antibodies are used, internalizing antibodies may be preferred. Lipofectin or liposomes can be used to deliver the antibody or a fragment of the Fab region that binds to the target antigen into cells. Where fragments of the antibody are used, the smallest inhibitory fragment that binds to the target antigen is preferred. For example, peptides having an amino acid sequence corresponding to the Fv region of the antibody can be used. Alternatively, single chain neutralizing antibodies that bind to intracellular target antigens can also be administered. Such single chain antibodies can be administered, for example, by expressing nucleotide sequences encoding single-chain antibodies within the target cell population (see e.g., Marasco et al. (1993) Proc. Natl. Acad. Sci. USA 90:7889-7893).

[5836] The identified compounds that inhibit target gene expression, synthesis and/or activity can be administered to a patient at therapeutically effective doses to prevent, treat or ameliorate 46508 disorders. A therapeutically effective dose refers to that amount of the compound sufficient to result in amelioration of symptoms of the disorders. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures as described above.

[5837] The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose can be formulated in animal models to achieve a circulating plasma concentration range that includes the IC50 (i.e., the concentration of the test compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma can be measured, for example, by high performance liquid chromatography.

[5838] Another example of determination of effective dose for an individual is the ability to directly assay levels of “free” and “bound” compound in the serum of the test subject. Such assays may utilize antibody mimics and/or “biosensors” that have been created through molecular imprinting techniques. The compound which is able to modulate 46508 activity is used as a template, or “imprinting molecule”, to spatially organize polymerizable monomers prior to their polymerization with catalytic reagents. The subsequent removal of the imprinted molecule leaves a polymer matrix which contains a repeated “negative image” of the compound and is able to selectively rebind the molecule under biological assay conditions. A detailed review of this technique can be seen in Ansell, R. J. et al (1996) Current Opinion in Biotechnology 7:89-94 and in Shea, K. J. (1994) Trends in Polymer Science 2:166-173. Such “imprinted” affinity matrixes are amenable to ligand-binding assays, whereby the immobilized monoclonal antibody component is replaced by an appropriately imprinted matrix. An example of the use of such matrixes in this way can be seen in Vlatakis, G. et al (1993) Nature 361:645-647. Through the use of isotope-labeling, the “free” concentration of compound which modulates the expression or activity of 46508 can be readily monitored and used in calculations of IC50.

[5839] Such “imprinted” affinity matrixes can also be designed to include fluorescent groups whose photon-emitting properties measurably change upon local and selective binding of target compound. These changes can be readily assayed in real time using appropriate fiberoptic devices, in turn allowing the dose in a test subject to be quickly optimized based on its individual IC50. A rudimentary example of such a “biosensor” is discussed in Kriz, D. et al (1995) Analytical Chemistry 67:2142-2144.

[5840] Another aspect of the invention pertains to methods of modulating 46508 expression or activity for therapeutic purposes. Accordingly, in an exemplary embodiment, the modulatory method of the invention involves contacting a cell with a 46508 or agent that modulates one or more of the activities of 46508 protein associated with the cell. An agent that modulates 46508 protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring target molecule of a 46508 protein (e.g., a 46508 substrate or receptor), a 46508 antibody, a 46508 agonist or antagonist, a peptidomimetic of a 46508 agonist or antagonist, or other small molecule.

[5841] In one embodiment, the agent stimulates one or more 46508 activities. Examples of such stimulatory agents include active 46508 protein and a nucleic acid molecule encoding 46508. In another embodiment, the agent inhibits one or more 46508 activities. Examples of such inhibitory agents include antisense 46508 nucleic acid molecules, anti-46508 antibodies, and 46508 inhibitors. These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant or unwanted expression or activity of a 46508 protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., up regulates or down regulates) 46508 expression or activity. In another embodiment, the method involves administering a 46508 protein or nucleic acid molecule as therapy to compensate for reduced, aberrant, or unwanted 46508 expression or activity.

[5842] Stimulation of 46508 activity is desirable in situations in which 46508 is abnormally downregulated and/or in which increased 46508 activity is likely to have a beneficial effect. For example, stimulation of 46508 activity is desirable in situations in which a 46508 is downregulated and/or in which increased 46508 activity is likely to have a beneficial effect. Likewise, inhibition of 46508 activity is desirable in situations in which 46508 is abnormally upregulated and/or in which decreased 46508 activity is likely to have a beneficial effect.

[5843] 46508 Pharmacogenomics

[5844] The 46508 molecules of the present invention, as well as agents, or modulators which have a stimulatory or inhibitory effect on 46508 activity (e.g., 46508 gene expression) as identified by a screening assay described herein can be administered to individuals to treat (prophylactically or therapeutically) 46508 associated disorders (e.g., cancer) associated with aberrant or unwanted 46508 activity. In conjunction with such treatment, pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenomics studies in determining whether to administer a 46508 molecule or 46508 modulator as well as tailoring the dosage and/or therapeutic regimen of treatment with a 46508 molecule or 46508 modulator.

[5845] Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, for example, Eichelbaum, M. et al. (1996) Clin. Exp. Pharmacol. Physiol. 23:983-985 and Linder, M. W. et al. (1997) Clin. Chem. 43:254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic conditions transmitted as a single factor altering the way drugs act on the body (altered drug action), and genetic conditions transmitted as a single factor altering the way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

[5846] One pharmacogenomics approach to identifying genes that predict drug response, known as a “genome-wide association,” relies primarily on a high-resolution map of the human genome consisting of already known gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). Such a high-resolution genetic map can be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, such a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. As used herein, a “SNP” is a common alteration that occurs in a single nucleotide base in a stretch of DNA. For example, a SNP may occur once per every 1000 bases of DNA. A SNP may be involved in a disease process; however, the vast majority may not be disease-associated. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In such a manner, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals.
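By way of illustration only, the grouping of individuals into genetic categories according to their pattern of SNPs can be sketched as follows. The function name, the marker names, and the subject data are hypothetical and are not part of the disclosure; the sketch assumes that an allele call is available for every subject at every marker considered.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group subject identifiers by their alleles at the chosen markers.

    genotypes: dict mapping subject id -> dict of marker name -> allele
    markers:   ordered list of marker names that define the pattern
    """
    groups = defaultdict(list)
    for subject, calls in genotypes.items():
        # The tuple of alleles, in marker order, is the subject's pattern.
        pattern = tuple(calls[m] for m in markers)
        groups[pattern].append(subject)
    return dict(groups)


# Hypothetical allele calls for three subjects at two bi-allelic markers.
genotypes = {
    "subject_1": {"snp_a": "A", "snp_b": "C"},
    "subject_2": {"snp_a": "G", "snp_b": "C"},
    "subject_3": {"snp_a": "A", "snp_b": "C"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
```

Subjects sharing a pattern fall into the same category, and each category could then be examined for a common drug response.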

[5847] Alternatively, a method termed the “candidate gene approach” can be utilized to identify genes that predict drug response. According to this method, if a gene that encodes a drug's target is known (e.g., a 46508 protein of the present invention), all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

[5848] Alternatively, a method termed “gene expression profiling” can be utilized to identify genes that predict drug response. For example, the gene expression of an animal dosed with a drug (e.g., a 46508 molecule or 46508 modulator of the present invention) can give an indication whether gene pathways related to toxicity have been turned on.

[5849] Information generated from more than one of the above pharmacogenomics approaches can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 46508 molecule or 46508 modulator, such as a modulator identified by one of the exemplary screening assays described herein.

[5850] The present invention further provides methods for identifying new agents, or combinations, that are based on identifying agents that modulate the activity of one or more of the gene products encoded by one or more of the 46508 genes of the present invention, wherein these products may be associated with resistance of the cells to a therapeutic agent. Specifically, the activity of the proteins encoded by the 46508 genes of the present invention can be used as a basis for identifying agents for overcoming agent resistance. By blocking the activity of one or more of the resistance proteins, target cells, e.g., human cells, will become sensitive to treatment with an agent that the unmodified target cells were resistant to.

[5851] Monitoring the influence of agents (e.g., drugs) on the expression or activity of a 46508 protein can be applied in clinical trials. For example, the effectiveness of an agent determined by a screening assay as described herein to increase 46508 gene expression, protein levels, or upregulate 46508 activity, can be monitored in clinical trials of subjects exhibiting decreased 46508 gene expression, protein levels, or downregulated 46508 activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 46508 gene expression, protein levels, or downregulate 46508 activity, can be monitored in clinical trials of subjects exhibiting increased 46508 gene expression, protein levels, or upregulated 46508 activity. In such clinical trials, the expression or activity of a 46508 gene, and preferably, other genes that have been implicated in, for example, a 46508-associated disorder can be used as a “read out” or markers of the phenotype of a particular cell.

[5852] 46508 Informatics

[5853] The sequence of a 46508 molecule is provided in a variety of media to facilitate use thereof. A sequence can be provided as a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a 46508 sequence. Such a manufacture can provide a nucleotide or amino acid sequence, e.g., an open reading frame, in a form which allows examination of the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form. The sequence information can include, but is not limited to, 46508 full-length nucleotide and/or amino acid sequences, partial nucleotide and/or amino acid sequences, polymorphic sequences including single nucleotide polymorphisms (SNPs), epitope sequence, and the like. In a preferred embodiment, the manufacture is a machine-readable medium, e.g., a magnetic, optical, chemical or mechanical information storage device.

[5854] As used herein, “machine-readable media” refers to any medium that can be read and accessed directly by a machine, e.g., a digital computer or analogue computer. Non-limiting examples of a computer include a desktop PC, laptop, mainframe, server (e.g., a web server, network server, or server farm), handheld digital assistant, pager, mobile telephone, and the like. The computer can be stand-alone or connected to a communications network, e.g., a local area network (such as a VPN or intranet), a wide area network (e.g., an Extranet or the Internet), or a telephone network (e.g., a wireless, DSL, or ISDN network). Machine-readable media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM, ROM, EPROM, EEPROM, flash memory, and the like; and hybrids of these categories such as magnetic/optical storage media.

[5855] A variety of data storage structures are available to a skilled artisan for creating a machine-readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

[5856] In a preferred embodiment, the sequence information is stored in a relational database (such as Sybase or Oracle). The database can have a first table for storing sequence (nucleic acid and/or amino acid sequence) information. The sequence information can be stored in one field (e.g., a first column) of a table row and an identifier for the sequence can be stored in another field (e.g., a second column) of the table row. The database can have a second table, e.g., storing annotations. The second table can have a field for the sequence identifier, a field for a descriptor or annotation text (e.g., the descriptor can refer to a functionality of the sequence), a field for the initial position in the sequence to which the annotation refers, and a field for the ultimate position in the sequence to which the annotation refers. Non-limiting examples of annotations to nucleic acid sequences include polymorphisms (e.g., SNPs), translational regulatory sites, and splice junctions. Non-limiting examples of annotations to amino acid sequences include polypeptide domains, e.g., a domain described herein; active sites and other functional amino acids; and modification sites.
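A minimal sketch of such a two-table arrangement is given below, using the SQLite engine bundled with Python in place of Sybase or Oracle. The table names, column names, identifier, and the short example sequence are assumptions made for illustration and are not taken from the disclosure; positions are treated as 1-based and inclusive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- First table: the sequence in one column, its identifier in another.
    CREATE TABLE sequence (
        residues TEXT NOT NULL,
        seq_id   TEXT PRIMARY KEY
    );
    -- Second table: one row per annotation, keyed by sequence identifier.
    CREATE TABLE annotation (
        seq_id     TEXT NOT NULL REFERENCES sequence(seq_id),
        descriptor TEXT NOT NULL,
        start_pos  INTEGER NOT NULL,  -- initial position (1-based)
        end_pos    INTEGER NOT NULL   -- ultimate position (inclusive)
    );
    """
)

# Hypothetical record: a short nucleotide fragment and one annotation.
conn.execute(
    "INSERT INTO sequence (residues, seq_id) VALUES (?, ?)",
    ("ATGGCTGGCTACCTGAGTGAA", "example_seq"),
)
conn.execute(
    "INSERT INTO annotation VALUES (?, ?, ?, ?)",
    ("example_seq", "initiation codon", 1, 3),
)


def annotated_regions(connection, seq_id):
    """Return (descriptor, annotated residues) pairs for one sequence."""
    rows = connection.execute(
        "SELECT a.descriptor, "
        "       substr(s.residues, a.start_pos, a.end_pos - a.start_pos + 1) "
        "FROM annotation a JOIN sequence s ON s.seq_id = a.seq_id "
        "WHERE a.seq_id = ?",
        (seq_id,),
    )
    return rows.fetchall()
```

Keeping annotations in a separate table allows any number of annotations per sequence without altering the sequence record itself.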

[5857] By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. A search is used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif. The search can be a BLAST search or other routine sequence comparison, e.g., a search described herein.

[5858] Thus, in one aspect, the invention features a method of analyzing 46508, e.g., analyzing structure, function, or relatedness to one or more other nucleic acid or amino acid sequences. The method includes: providing a 46508 nucleic acid or amino acid sequence; and comparing the 46508 sequence with a second sequence, e.g., one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database, to thereby analyze 46508. The method can be performed in a machine, e.g., a computer, or manually by a skilled artisan.

[5859] The method can include evaluating the sequence identity between a 46508 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the Internet.
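As a simplified illustration of the comparisons described above, the following sketch evaluates percent identity between two pre-aligned sequences and locates exact occurrences of a target sequence in a small collection. It is an exact-match stand-in and does not reproduce BLAST or any other search program; the function names and the example data are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned, equal-length
    sequences. Gap characters ('-') never count as matches."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def find_matches(target, database):
    """Return (identifier, 0-based offset) for each database entry that
    contains the target sequence as an exact substring."""
    hits = []
    for identifier, sequence in database.items():
        offset = sequence.find(target)
        if offset != -1:
            hits.append((identifier, offset))
    return hits


# Hypothetical two-entry collection of nucleotide sequences.
database = {"s1": "ATGGCTGGC", "s2": "TTTT"}
hits = find_matches("GCT", database)
```

In practice, an alignment step would precede the identity calculation, and the collection could reside at a second site and be queried over a network.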

[5860] As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Typical sequence lengths of a target sequence are from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
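The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that database residues are independent and equally frequent, which real sequences are not; the function name and the database size are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size):
    """Expected number of chance occurrences of one specific target in a
    random database with independent, equally frequent residues."""
    # Number of positions at which the target could start.
    positions = max(database_length - target_length + 1, 0)
    # Probability that one given position matches the target exactly.
    return positions * alphabet_size ** (-target_length)


# Six nucleotides versus thirty nucleotides against a hypothetical
# database of one billion nucleotides (alphabet of four bases).
short_target = expected_random_hits(6, 10**9, 4)
long_target = expected_random_hits(30, 10**9, 4)
```

Under these assumptions a six-nucleotide target is expected to occur by chance many thousands of times, whereas a thirty-nucleotide target is expected to occur far less than once.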

[5861] Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly and a variety of commercially available software for conducting search means are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

[5862] Thus, the invention features a method of making a computer readable record of a 46508 sequence which includes recording the sequence on a computer readable matrix. In a preferred embodiment, the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; and the 5′ end of the translated region.

[5863] In another aspect, the invention features a method of analyzing a sequence. The method includes: providing a 46508 sequence, or record, in machine-readable form; and comparing a second sequence to the 46508 sequence, thereby analyzing a sequence. Comparison can include comparing the sequences for sequence identity or determining if one sequence is included within the other, e.g., determining if the 46508 sequence includes a sequence being compared. In a preferred embodiment, the 46508 or second sequence is stored on a first computer, e.g., at a first site, and the comparison is performed, read, or recorded on a second computer, e.g., at a second site. For example, the 46508 or second sequence can be stored in a public or proprietary database in one computer, and the results of the comparison performed, read, or recorded on a second computer. In a preferred embodiment, the record includes one or more of the following: identification of an ORF; identification of a domain, region, or site; identification of the start of transcription; identification of the transcription terminator; the full length amino acid sequence of the protein, or a mature form thereof; and the 5′ end of the translated region.

[5864] In another aspect, the invention provides a machine-readable medium for holding instructions for performing a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, wherein the method comprises the steps of determining 46508 sequence information associated with the subject and based on the 46508 sequence information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder and/or recommending a particular treatment for the disease, disorder or pre-disease condition.

[5865] The invention further provides, in an electronic system and/or in a network, a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, wherein the method comprises the steps of determining 46508 sequence information associated with the subject, and based on the 46508 sequence information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, and/or recommending a particular treatment for the disease, disorder or pre-disease condition. In a preferred embodiment, the method further includes the step of receiving information, e.g., phenotypic or genotypic information, associated with the subject and/or acquiring from a network phenotypic information associated with the subject. The information can be stored in a database, e.g., a relational database. In another embodiment, the method further includes accessing the database, e.g., for records relating to other subjects, and comparing the 46508 sequence of the subject to the 46508 sequences in the database to thereby determine whether the subject has a 46508-associated disease or disorder, or a pre-disposition for such.

[5866] The present invention also provides, in a network, a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, said method comprising the steps of receiving 46508 sequence information from the subject and/or information related thereto, receiving phenotypic information associated with the subject, acquiring information from the network corresponding to 46508 and/or corresponding to a 46508-associated disease or disorder, and based on one or more of the phenotypic information, the 46508 information (e.g., sequence information and/or information related thereto), and the acquired information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5867] The present invention also provides a method for determining whether a subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder, said method comprising the steps of receiving information related to 46508 (e.g., sequence information and/or information related thereto), receiving phenotypic information associated with the subject, acquiring information from the network related to 46508 and/or related to a 46508-associated disease or disorder, and based on one or more of the phenotypic information, the 46508 information, and the acquired information, determining whether the subject has a 46508-associated disease or disorder or a pre-disposition to a 46508-associated disease or disorder. The method may further comprise the step of recommending a particular treatment for the disease, disorder or pre-disease condition.

[5868] This invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

EXAMPLES

Examples for 26443 and 46873

Example 1 Identification and Characterization of Human 26443 or 46873 cDNA

[5869] The human 26443 sequence (FIG. 1; SEQ ID NO:1), which is approximately 1888 nucleotides long, including 5′ and 3′ untranslated regions, contains a predicted methionine-initiated coding sequence of about 1254 nucleotides (SEQ ID NO:3 and nucleotides 91 to 1344 of SEQ ID NO:1). The coding sequence encodes a 418 amino acid protein (SEQ ID NO:2).

[5870] The human 46873 sequence (FIG. 5; SEQ ID NO:4), which is approximately 1358 nucleotides long, including 5′ and 3′ untranslated regions, contains a predicted methionine-initiated coding sequence of about 924 nucleotides (SEQ ID NO:3 and nucleotides 134 to 1057 of SEQ ID NO:1). The coding sequence encodes a 308 amino acid protein (SEQ ID NO:2).
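The stated coordinates, coding lengths, and protein lengths can be checked against one another with simple arithmetic. The following sketch assumes 1-based, inclusive nucleotide coordinates and three nucleotides per codon; the function name and the optional stop-codon flag are hypothetical.

```python
def coding_summary(start, end, includes_stop=False):
    """Return (coding length in nucleotides, number of encoded residues)
    for 1-based, inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    codons = length // 3
    # A terminal stop codon, if counted in the range, encodes no residue.
    residues = codons - 1 if includes_stop else codons
    return length, residues


summary_26443 = coding_summary(91, 1344)   # coordinates recited for 26443
summary_46873 = coding_summary(134, 1057)  # coordinates recited for 46873
```

With these inputs the calculation returns 1254 nucleotides and 418 residues for 26443, and 924 nucleotides and 308 residues for 46873, in agreement with the values recited above.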

Example 2 Tissue Distribution of 26443 or 46873 mRNA by Large-Scale Tissue-Specific Library Sequencing and by Northern Blot Hybridization

[5871] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 26443 or 46873 cDNA (SEQ ID NO:1 or SEQ ID NO:4, respectively) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse liver, hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 3 Recombinant Expression of 26443 or 46873 in Bacterial Cells

[5872] In this example, 26443 or 46873 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 26443 or 46873 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-26443 or -46873 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 4 Expression of Recombinant 26443 or 46873 Protein in COS Cells

[5873] To express the 26443 or 46873 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 26443 or 46873 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5874] To construct the plasmid, the 26443 or 46873 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 26443 or 46873 coding sequence starting from the initiation codon; the 3′ end sequence contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 26443 or 46873 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 26443 or 46873 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
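The primer layout described above can be sketched as follows. This is an illustration under stated assumptions rather than a validated primer design: the EcoRI (GAATTC) and XhoI (CTCGAG) sites are chosen arbitrarily as the two different restriction sites, the HA tag is written with one possible codon choice for the peptide YPYDVPDYA, the coding sequence passed in is a toy placeholder assumed to lack its native stop codon, and the function names are hypothetical. Melting temperature, secondary structure, and flanking bases for efficient digestion are not considered.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# One possible nucleotide encoding of the HA epitope YPYDVPDYA (assumed).
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(coding_seq, site_5, site_3, tag, stop="TGA", anneal=20):
    """Return (5' primer, 3' primer), both written 5' to 3'.

    coding_seq is assumed to start at the initiation codon and to
    exclude the native stop codon.
    """
    # 5' primer: restriction site, then the start of the coding sequence.
    forward = site_5 + coding_seq[:anneal]
    # Sense-strand end of the amplified fragment: end of the coding
    # sequence, in-frame tag, stop codon, second restriction site.
    sense_tail = coding_seq[-anneal:] + tag + stop + site_3
    # The 3' primer is complementary to that sense-strand end.
    reverse = reverse_complement(sense_tail)
    return forward, reverse


# Toy placeholder coding sequence (not a sequence of the invention).
toy_coding = "ATG" + "GCT" * 10
fwd, rev = design_primers(toy_coding, "GAATTC", "CTCGAG", HA_TAG)
```

Because the two sites differ, the digested fragment can ligate into the digested vector in only one orientation.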

[5875] COS cells are subsequently transfected with the 26443- or 46873-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 26443 or 46873 polypeptide is detected by radiolabeling (35S-methionine or 35S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5876] Alternatively, DNA containing the 26443 or 46873 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 26443 or 46873 polypeptide is detected by radiolabeling and immunoprecipitation using a 26443 or 46873 specific monoclonal antibody.

Examples for 61833

Example 5 Identification and Characterization of Human 61833 cDNA

[5877] The human 61833 nucleic acid sequence is recited as follows: 3 GGCTGCTGGGCTGGCGGGGCGCAGGCCGCGGGACCCGAGCCCGGGGAAGCGAG (SEQ ID NO: 10) AGAGCGGAGGCGCCGAGGATCCGATTCACTCCCTGGGGAGACCTATGGGCCGAA GCCGTGTAAATGCGTTTTAAGCAGAGGCCTCGGCTCCGCAACTGCCACTCCTCCT CGGGGTGTTGCACAAGTTTCGAGGTCACCGGCGACCCCCCCTAGCAGCGCGCCT GGCTCTGGCCCCCGCGAAGGAGGACGGAGTTTGTGTGTTGCATACTTTCTAAGGC GGCGGCTGCAGCAGCGGCTCCATCCAGCCCGTCAGCTCCTCCTGCAAGGCATGG CTGGCTACCTGAGTGAATCGGACTTTGTGATGGTGGAGGAGGGCTTCAGTACCC GAGACCTGCTGAAGGAACTCACTCTGGGGGCCTCACAGGCCACCACGGACGAGG TAGCTGCCTTCTTCGTGGCTGACCTGGGTGCCATAGTGAGGAAGCACTTTTGCTT TCTGAAGTGCCTGCCACGAGTCCGGCCCTTTTATGCTGTCAAGTGCAACAGCAGC CCAGGTGTGCTGAAGGTTCTGGCCCAGCTGGGGCTGGGCTTTAGCTGTGCCAACA AGGCAGAGATGGAGTTGGTCCAGCATATTGGAATCCCTGCCAGTAAGATCATCT GCGCCAACCCCTGTAAGCAAATTGCACAGATCAAATATGCTGCCAAGCATGGGA TCCAGCTGCTGAGCTTTGACAATGAGATGGAGCTGGCAAAGGTGGTAAAGAGCC ACCCCAGTGCCAAGATGGTTCTGTGCATTGCTACCGATGACTCCCACTCCCTGAG CTGCCTGAGCCTAAAGTTTGGAGTGTCACTGAAATCCTGCAGACACCTGCTTGAA AATGCGAAGAAGCACCATGTGGAGGTGGTGGGTGTGAGTTTTCACATTGGCAGT GGCTGTCCTGACCCTCAGGCCTATGCTCAGTCCATCGCAGACGCCCGGCTCGTGT TTGAAATGGGCACCGAGCTGGGTCACAAGATGCACGTTCTGGACCTTGGTGGTG GCTTCCCTGGCACAGAAGGGGCCAAAGTGAGATTTGAAGAGATTGCTTCCGTGA TCAACTCAGCCTTGGACCTGTACTTCCCAGAGGGCTGTGGCGTGGACATCTTTGC TGAGCTGGGGCGCTACTACGTGACCTCGGCCTTCACTGTGGCAGTCAGCATCATT GCCAAGAAGGAGGTTCTGCTAGACCAGCCTGGCAGGGAGGAGGAAAATGGTTCC ACCTCAAGACCCATCGTGTACCACCTTGATGAGGGCGTGTATGGGATCTTCAACT CAGTCCTGTTTGACAACATCTGCCCTACCCCCATCCTGCAGAAGAAACCATCCAC GGAGCAGCCCCTGTACAGCAGCAGCCTGTGGGGCCCGGCGGTTGATGGCTGTGA TTGCGTGGCTGAGGGCCTGTGGCTGCCGCAACTACACGTAGGGGACTGGCTGGT CTTTGACAACATGGGCGCCTACACTGTGGGCATGGGTTCCCCCTTTTGGGGGACC CAGGCCTGCCACATCACCTATGCCATGTCCCGGGTGGCCTGGGAAGCGCTGCGA AGGCAGCTGATGGCTGCAGAACAGGAGGATGACGTGGAGGGTGTGTGCAAGCCT CTGTCCTGCGGCTGGGAGATCACAGACACCCTGTGCGTGGGCCCTGTCTTCACCC CAGCGAGCATCATGTGAGTGGGCCTCGTTCCCCCCGGAGAATCCCAGCGGGGCC TCAGAGATGCATCTGGGAGAGGTGGGCGAGGCAGCGAGCTGGTACCCTCTGGCC AGGACTTCTGGTGCTCGCTCTGCCGCCCCACGCTCCACCTGTAGTGTTTCTGCCCT 
GTAAATAGGACCAGTCTTACACTCGCTTGTAGTTTCAAGTATGCAACATAAATCC TGTCCCTTCCAAAAAAAAAAAAAAAAAAAAA.

[5878] The human 61833 sequence (FIG. 9; SEQ ID NO:10) is approximately 1937 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1383 nucleotides (nucleotides indicated as “coding” of SEQ ID NO: 10; SEQ ID NO: 12). The coding sequence encodes a 460 amino acid protein (SEQ ID NO:11), which is recited as follows: 4 MAGYLSESDFVMVEEGFSTRDLLKELTLGASQATTDEVAAFFVADLGAIVRKHFCF (SEQ ID NO: 11) LKCLPRVRPFYAVKCNSSPGVLKVLAQLGLGFSCANKAEMELVQHIGIPASKIICANP CKQIAQIKYAAKHGIQLLSFDNEMELAKVVKSHPSAKMVLCIATDDSHSLSCLSLKF GVSLKSCRHLLENAKKHHVEVVGVSFHIGSGCPDPQAYAQSIADARLVFEMGTELG HKMHVLDLGGGFPGTEGAKVRFEEIASVINSALDLYFPEGCGVDIFAELGRYYVTSA FTVAVSIIAKKEVLLDQPGREEENGSTSRPIVYHLDEGVYGIFNSVLFDNICPTPILQK KPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDWLVFDNMGAYTVGMGSP FWGTQACHITYAMSRVAWEALRRQLMAAEQEDDVEGVCKPLSCGWEITDTLCVGP VFTPASIM.

Example 6 Tissue Distribution of 61833 mRNA by TaqMan Analysis

[5879] Endogenous human 61833 gene expression can be determined using the Perkin-Elmer/ABI 7700 Sequence Detection System which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample (the more template present initially, the earlier the signal appears), thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
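One common way to turn such measurements into a relative expression value is to normalize the detection cycle of the target gene to that of the housekeeping control. The sketch below illustrates this calculation under the assumption that the amount of product doubles with every cycle (an amplification factor of 2), which is an idealization; the function name and the cycle values are hypothetical and are not data from the disclosure.

```python
def relative_expression(ct_target, ct_reference, amplification_factor=2.0):
    """Expression of the target relative to the reference gene.

    ct_target and ct_reference are the cycles at which each signal is
    first detected. A perfect doubling per cycle is assumed by default.
    """
    delta_ct = ct_target - ct_reference
    return amplification_factor ** (-delta_ct)


# Hypothetical detection cycles for the target gene and GAPDH in two tissues.
tissue_a = relative_expression(ct_target=25.0, ct_reference=20.0)
tissue_b = relative_expression(ct_target=23.0, ct_reference=20.0)
fold_difference = tissue_b / tissue_a
```

A target detected two cycles earlier relative to the control corresponds, under the doubling assumption, to a fourfold higher relative expression.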

Example 7 Tissue Distribution of 61833 mRNA by Northern Analysis

[5880] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 61833 cDNA (SEQ ID NO:10) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 8 Recombinant Expression of 61833 in Bacterial Cells

[5881] In this example, 61833 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 61833 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-61833 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 9 Expression of Recombinant 61833 Protein in COS Cells

[5882] To express the 61833 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 61833 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5883] To construct the plasmid, the 61833 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 61833 coding sequence starting from the initiation codon; the 3′ end sequence contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 61833 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 61833 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5884] COS cells are subsequently transfected with the 61833-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 61833 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5885] Alternatively, DNA containing the 61833 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 61833 polypeptide is detected by radiolabelling and immunoprecipitation using a 61833 specific monoclonal antibody.

Examples for 26493

Example 10 Identification and Characterization of Human 26493 cDNA

[5886] The human 26493 nucleic acid sequence is recited as follows: 5 CGACCCACGCGTCCGCGTCGGAGCTCCTGCAGACCAGTGCGCGCTCGGGGAGTT (SEQ ID NO: 16) GGCGAGCGGGTGGCGGCTGGGAGACGTCCCGAGCGCACGGGACTGACAGGCGG CAGAAGCCGGGCGGGGTCCGCTGGGCTCCGGACCCGTGCCCACCCAGTTCCAGG GCGGCCCCGGGCGGCCCCGCCCCCTCGGTGAATGCCGCGGGCCGGCCAATCCGG GCAGGCCGCGGCGCCGCGCAGCCTATCAGCGGCCAGAGCTCGCGTGCGCTTCCG CGTTCGCGTGCGCTTCCGCGTTCTCGTGAGCTCCCGGCCCGCTGCCGCAGGGACT GGGAGCGGTCTCCGCAGGGACTGGGAGCGGGCTCCGCAGCGCACTCTAGCCCGC GGCTCGGCTCAGTCGGTCTGCGAGGATCCGGCCCGCCGCCCCCCGGGGGACCCG ATGGCCTCGGAGGGCCTGGCGGGGGCGCTGGCTTCCGTGCTGGCTGGCCAGGGG TCCAGCGTGCACAGCTGCGACTCGGCGCCGGCCGGGGAGCCGCCGGCGCCCGTG CGGCTGCGGAAGAACGTGTGCTACGTGGTGCTGGCCGTGTTCCTCAGCGAGCAG GATGAGGTGCTACTGATCCAGGAGGCCAAGAGGGAGTGCCGGGGGTCGTGGTAC CTGCCTGCGGGGAGAATGGAGCCAGGGGAGACCATCGTGGAGGCGCTGCAGCG GGAGGTGAAGGAGGAGGCGGGGCTGCACTGTGAGCCCGAGACACTGCTGTCCGT GGAGGAGCGGGGCCCCTCCTGGGTCCGCTTCGTGTTCCTCGCTCGCCCCACAGGT GGAATTCTCAAGACTTCCAAGGAGGCCGATGCGGAGTCCCTGCAGGCTGCCTGG TACCCACGGACCTCCCTGCCCACTCCGCTGCGAGCCCATGACATCCTGCACCTGG TTGAACTAGCCGCCCAGTATCGCCAGCAAGCCAGGCACCCTCTCATTCTGCCCCA AGAGCTACCCTGTGATCTGGTCTGCCAGCGGCTCGTGGCTACCTTTACCAGCGCC CAGACAGTGTGGGTGTTAGTGGGCACAGTGGGGATGCCTCACTTGCCTGTCACTG CCTGTGGCCTCGACCCTATGGAGCAGAGGGGTGGCATGAAGATGGCCGTCCTGC GGCTGCTGCAGGAGTGTCTGACCCTGCACCACTTGGTGGTGGAGATCAAGGGGT TGCTTGGACTGCAGCACCTGGGCCGAGATCACAGTGATGGCATCTGTTTGAATGT GCTGGTGACCGTGGCTTTTCGGAGCCCAGGGATCCAGGATGAACCCCCAAAAGT TCGGGGTGAGAACTTCTCTTGGTGGAAGGTGATGGAGGAAGACCTGCAAAGCCA GCTCCTCCAGCGGCTTCAGGGATCCTCTGTTGTCCCAGTGAACAGATAGAGAGGT GGAGGAGGTGACAGGGAGCTAGGCAGCCGTGCTCCCTCCAGTGCGGACTTGTCT CCCTCTGAGGGAGGCAAGAGGCTGGCGATCAGGGATCTTGTTGCATTGGGAGCA GGGGCGGCTCTCCTGGTCCCCAGGAGAGATGCTTTGAGGAGCATTCCTCTAGATT GCACAAGGGACAGTGCCTTTAACCAAGCGAGGAGTCCAAAGCTCAGGACCTGAC TACCCTGAGGGCACGCTGACGCCTCTCCCCAGGGGGATGGGGAGCTTTCTGCAC CCCCAGTGGCATCTCCTCATCACGTTCTGTGCCGTCCTTGGGAAAGGCCTGCATT CTGATCCTTCCAGGCCCTTCGAGCATGGAGGGGCACTGGGGAAGGTCCCCCGAG GGAGGAGCACGTTGCTGAGTAAAGAGGTGTTACTCAMMATAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGCGGCCGCTAG ACTAGTC.

[5887] The human 26493 sequence (SEQ ID NO:16) is approximately 1902 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAG), which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1212 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO: 16; SEQ ID NO: 18). The coding sequence encodes a 404 amino acid protein (SEQ ID NO: 17), which is recited as follows: MPRAGQSGQAAAPRSLSAARARVRFRVRVRFRVLVSSRPAAAGTGSGLRRDWERA PQRTLARGSAQSVCEDPARRPPGDPMASEGLAGALASVLAGQGSSVHSCDSAPAGE PPAPVRLRKNVCYVVLAVFLSEQDEVLLIQEAKRECRGSWYLPAGRMEPGETIVEAL QREVKEEAGLHCEPETLLSVEERGPSWVRFVFLARPTGGILKTSKEADAESLQAAWY PRTSLPTPLRAHDILHLVELAAQYRQQARHPLILPQELPCDLVCQRLVATFTSAQTV WVLVGTVGMPHLPVTACGLDPMEQRGGMKMAVLRLLQECLTLHHLVVEIKGLLGL QHLGRDHSDGICLNVLVTVAFRSPGIQDEPPKVRGENFSWWKVMEEDLQSQLLQRL QGSSVVPVNR.

Example 11 Tissue Distribution of 26493 mRNA by TaqMan Analysis

[5888] Endogenous human 26493 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected (the threshold cycle) is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).
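As an illustration only, the relationship between threshold cycle and relative expression can be sketched with the comparative Ct calculation. The sketch assumes that target and reference amplify with roughly the same efficiency (a doubling per cycle by default); the function name and the Ct values are hypothetical and are not taken from the experiments described here.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Return target expression relative to a reference gene.

    ct_target / ct_reference: threshold cycles measured in the same sample.
    efficiency: assumed fold amplification per cycle (2.0 = perfect doubling).
    """
    # A lower Ct means more starting template, so the difference is negated.
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles after
    # the reference, i.e. it is about 8-fold less abundant.
    print(relative_expression(ct_target=24.0, ct_reference=21.0))
```

Expressing each sample relative to its own reference gene in this way compensates for differences in the amount of input cDNA between samples.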

[5889] To determine the level of 26493 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction.

[5890] Table 2 below shows expression of 26493 mRNA in various normal and diseased tissues, detected using TaqMan analysis. The mRNA expression data indicate that 26493 mRNA was highly expressed in many tissues, for example, coronary smooth muscle cells (SMC), human umbilical vein endothelial cells (HUVEC), kidney, pancreas, brain (hypothalamus and cortex), dorsal root ganglion (DRG), and erythroid cells. More moderate levels of expression were noted in normal artery, nerve, normal breast, prostate (normal and tumor), lung tumor and lung COPD, tonsil, lymph node, neutrophils and megakaryocytes. All tissues included in the panel exhibited at least low levels of expression.

TABLE 2 Tissue Distribution of 26493 mRNA by TaqMan Analysis

Sample                          Expression
Artery normal                   1.06
Aorta diseased                  0.41
Vein normal                     0.18
Coronary SMC                    5.30
HUVEC                           8.70
Hemangioma                      0.18
Heart normal                    0.31
Heart CHF                       0.35
Kidney                          2.34
Skeletal Muscle                 0.39
Adipose normal                  0.32
Pancreas                        5.10
primary osteoblasts             0.37
Osteoclasts (diff)              0.08
Skin normal                     0.46
Spinal cord normal              0.26
Brain Cortex normal             8.70
Brain Hypothalamus normal       3.67
Nerve                           0.56
DRG (Dorsal Root Ganglion)      4.00
Breast normal                   1.20
Breast tumor                    0.47
Ovary normal                    0.33
Ovary Tumor                     0.11
Prostate Normal                 1.05
Prostate Tumor                  0.88
Salivary glands                 0.22
Colon normal                    0.35
Colon Tumor                     0.61
Lung normal                     0.22
Lung tumor                      1.36
Lung COPD                       0.86
Colon IBD                       0.27
Liver normal                    0.12
Liver fibrosis                  0.18
Spleen normal                   0.26
Tonsil normal                   0.59
Lymph node normal               0.53
Small intestine normal          0.22
Macrophages                     0.03
Synovium                        0.09
BM-MNC                          0.29
Activated PBMC                  0.04
Neutrophils                     0.51
Megakaryocytes                  0.65
Erythroid                       4.73
positive control                2.78

Example 12 Tissue Distribution of 26493 mRNA by Northern Analysis

[5891] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 26493 cDNA (SEQ ID NO: 16) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 13 Recombinant Expression of 26493 in Bacterial Cells

[5892] In this example, 26493 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 26493 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB 199. Expression of the GST-26493 fusion protein in PEB 199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB 199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
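For orientation when reading the gel, the expected size of the fusion polypeptide can be roughly estimated in advance. The sketch below is only an approximation under two stated assumptions that do not come from this document: an average residue mass of about 110 Da and a GST moiety of about 26 kDa. The function name is hypothetical.

```python
AVG_RESIDUE_MASS_KDA = 0.110  # assumption: ~110 Da average mass per amino acid
GST_MASS_KDA = 26.0           # assumption: approximate mass of the GST moiety


def estimate_fusion_mass_kda(n_residues: int) -> float:
    """Rough expected mass (kDa) of a GST fusion with an n_residues partner."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return GST_MASS_KDA + n_residues * AVG_RESIDUE_MASS_KDA


if __name__ == "__main__":
    # 404 residues is the length given for the 26493 protein (SEQ ID NO:17).
    print(f"{estimate_fusion_mass_kda(404):.1f} kDa")
```

The apparent molecular weight observed by SDS-PAGE can deviate from such an estimate, so the calculation serves only as a guide to where the band is expected.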

Example 14 Expression of Recombinant 26493 Protein in COS Cells

[5893] To express the 26493 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 26493 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5894] To construct the plasmid, the 26493 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 26493 coding sequence starting from the initiation codon; the 3′ end sequence contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 26493 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 26493 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
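The primer layout described above can be sketched in code. This is a minimal illustration, not the primers actually used: the restriction sites (HindIII and XhoI), the toy coding sequence, and the function names are all assumptions chosen for the example, and the HA tag is written as one common nucleotide encoding of the peptide YPYDVPDYA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA
HINDIII = "AAGCTT"  # assumed 5' cloning site (illustrative)
XHOI = "CTCGAG"     # assumed 3' cloning site (illustrative)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds: str, site_5: str, site_3: str, tag: str,
                   stop: str = "TAG", overlap: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers for a C-terminally tagged insert.

    cds must begin at the initiation codon and exclude its own stop codon.
    """
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if len(cds) < overlap:
        raise ValueError("coding sequence shorter than the primer overlap")
    # 5' primer: restriction site followed by the first ~20 nt of the CDS.
    forward = site_5 + cds[:overlap]
    # Sense-strand layout at the 3' end: last 20 nt, tag, stop codon, site.
    sense_3 = cds[-overlap:] + tag + stop + site_3
    # The 3' primer anneals to the sense strand, so it is the reverse complement.
    reverse = reverse_complement(sense_3)
    return forward, reverse


if __name__ == "__main__":
    toy_cds = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGCGTTGTGCCG"  # hypothetical
    fwd, rev = design_primers(toy_cds, HINDIII, XHOI, HA_TAG)
    print("5' primer:", fwd)
    print("3' primer:", rev)
```

Using two different sites, as the paragraph recommends, is what fixes the orientation of the insert: each end of the digested fragment can pair with only one end of the digested vector.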

[5895] COS cells are subsequently transfected with the 26493-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 26493 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.
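The RIPA buffer composition given above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the final concentrations are those stated in the text, whereas the stock concentrations and the function name are assumptions.

```python
# Final concentrations from the text (NaCl and Tris in M, detergents in %).
FINAL = {"NaCl": 0.150, "NP-40": 1.0, "SDS": 0.1, "DOC": 0.5, "Tris pH 7.5": 0.050}
# Assumed stock solutions (same units as above); adjust to what is on hand.
STOCK = {"NaCl": 5.0, "NP-40": 10.0, "SDS": 10.0, "DOC": 10.0, "Tris pH 7.5": 1.0}


def ripa_volumes(total_ml: float) -> dict[str, float]:
    """Return the volume (mL) of each stock, plus water, for total_ml of buffer."""
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    # C1 * V1 = C2 * V2  ->  V1 = V2 * C2 / C1
    volumes = {name: total_ml * FINAL[name] / STOCK[name] for name in FINAL}
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for component, ml in ripa_volumes(100.0).items():
        print(f"{component}: {ml:.1f} mL")
```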

[5896] Alternatively, DNA containing the 26493 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 26493 polypeptide is detected by radiolabelling and immunoprecipitation using a 26493 specific monoclonal antibody.

Examples for 58224

Example 15 Identification and Characterization of Human 58224 cDNA

[5897] The human 58224 nucleic acid sequence is recited as follows: 8 TACTATAGGGAGTCGCCCACGCGTCCGGGCAGCGGTTGTGAGGAGTTAGCTCGC (SEQ ID NO: 22) GGCATTGCAGGCTCTGAGAGGAGGGGACCCGGTTCCCGGGTGAGTGTCCAGGCA TGCCAGCGGAACGGCCCGCGGGCAGCGGCGGCTCGGAGGCTCCAGCAATGGTT GAACAACTGGACACTGCTGTGATTACCCCGGCCATGCTAGAAGAGGAAGAACAG CTTGAAGCTGCTGGACTAGAGAGAGAGCGGAAGATGCTGGAAAAGGCTCGCATG TCTTGGGATAGAGAGTCGACAGAAATTCGGTACCGTAGACTTCAACATTTGCTTG AAAAAAGCAATATMTACTCCAAATTTTTATTGACGAAAATGGAACAGCAACAAT TAGAGGAACAGAAGAAGAAAGAAAAATTGGAGAGAAAAAAGGAGTCTTTAAAA GTTAAAAAGGGTAAAAATTCAATTGATGCAAGTGAAGAGAAGCCAGTTATGAGG AAAAAAAGAGGAAGAGAAGATGAATCATACAATATTTCAGAGGTCATGTCAAA AGAGGAAATTTTGTCTGTGGCTAAAAAAAATAAAAAGGAGAATGAGGATGAAA ACTCCTCCTCTACTAATCTCTGTGTGGAAGATCTTCAGAAAAATAAAGATTCGAA TAGTATAATTAAAGATAGATTGTCTGAAACGGTTAGGCAGAATACTAAATTCTTT TTTGACCCAGTCCGGAAGTGTAATGGTCAGCCAGTACCTTTTCAACAACCAAAGC ACTTCACTGGAGGAGTGATGCGATGGTACCAAGTAGAAGGCATGGAATGGCTTA GGATGCTTTGGGAAAATGGAATTAATGGCATTTTAGCAGATGAAATGGGATTGG GTAAGACAGTTCAGTGCATTGCTACTATTGCATTGATGATTCAGAGAGGAGTACC AGGACCTTTTCTTGTCTGTGGCCCTTTGTCTACACTTCCTAACTGGATGGCTGAAT TCAAAAGATTTACACCAGATATCCCTACAATGTTATATCATGGAACCCAGGAGG ACCGTCGAAAATTGGTAAGAAATATTTACAAAAGACAAGGGACACTGCAGATTC ATCCTGTGGTGGTCACATCATTCGAGATCGCTATGCGAGACCAGAATGCTTTACA GCATTGCTATTGGAAATACTTAATAGTAGATGAAGGACACAGGATTAAGAATAT GAAGTGCCGTCTAATCAGGGAGTTAAAACGATTCAATGCTGATAACAAACTTCTT TTGACTGGTACTCCCTTGCAAAACAATTTATCAGAACTTTGGTCATTGCTAAACT TTTTGTTGCCAGATGTATTTGATGACTTGAAAAGCTTTGAGTCTTGGTTTGACATC ACTAGTCTTTCTGAAACTGCTGAAGATATTATTGCTAAAGAAAGAGAACAGAAT GTATTGCATATGCTGCACCAGATTTTAACACCTTTCTTATTGAGAAGACTGAAGT CTGATGTTGCTCTTGAAGTTCCTCCTAAACGAGAAGTAGTCGTTTATGCTCCACTT TCAAAGAAGCAGGAGATCTTTTATACAGCCATTGTGAACCGTACAATTGCAAAC ATGTTTGGATCCAGTGAGAAAGAAACAATTGAGTTAAGTCCTACTGGTCGACCA AAACGACGAACTAGAAAATCAATAAATTACAGCAAAATAGATGATTTCCCTAAT GAATTGGAAAAACTGATCAGTCAAATACAGCCAGAGGTGGACCGAGAAAGAGC TGTTGTGGAAGTGAATATCCCTGTAGAATCTGAAGTTAATCTGAAGCTGCAGAAT ATAATGATGCTACTTCGTAAATGTTGTAATCATCCATATTTGATTGAATATCCTAT 
AGACCCTGTTACACAAGAATTTAAGATCGATGAAGAATTGGTAACAAATTCTGG GAAGTTCTTGATTTTGGATCGAATGCTGCCAGAACTAAAAAAAAGAGGTCACAA GGTGCTGCTTTTTTCACAAATGACAAGCATGTTGGACATTTTGATGGATTACTGC CATCTCAGAGATTTCAACTTCAGCAGGCTTGATGGGTCCATGTCTTACTCAGAGA GAGAAAAAAACATGCACAGCTTCAACACGGATCCAGAGGTGTTTATCTTCTTAG TGAGTACACGAGCTGGTGGCCTGGGCATTAATCTGACTGCAGCAGATACAGTTA TCATTTATGATAGTGATTGGAACCCCCAGTCGGATCTTCAGGCCCAGGATAGATG TCATAGAATTGGTCAGACAAAGCCAGTTGTTGTTTATCGCCTTGTTACAGCAAAT ACTATCGATCAGAAAATTGTGGAAAGAGCAGCTGCTAAAAGGAAACTGGAAAA GTTGATCATCCATAAAAATCATTTCAAAGGTGGTCAGTCTGGATTAAATCTGTCT AAGAATTTCTTAGATCCTAAGGAATTAATGGAATTATTAAAATCTAGAGATTATG AAAGGGAAATAAAAGGATCAAGAGAGAAGGTCATTAGTGATAAAGATCTAGAG TTGTTGTTAGATCGAAGTGATCTTATTGATCAAATGAATGCTTCAGGACCAATTA AAGAGAAGATGGGGATATTCAAGATATTAGAAAATTCTGAAGATTCCAGTCCTG AATGTTTGTTTTAAAGTGGAGCTCAAGAATAGCTTTTAAAAGTTCTTATTTACAT CTAGTGATTTCCCTGTATTGGGTTTGAAATACTGATTGTCCACTTCACCTTTTTTA TTATATCAGTTGACATGTAACTAGTACCATGCCGTACCTTAAATAGATGGTAATT TTCTGAGCCTTTCCCAAGAACA

[5898] The human 58224 sequence (SEQ ID NO:22), is approximately 2798 nucleotides long including untranslated regions. The nucleic acid sequence includes a preferred initiation codon (ATG) and a termination codon (TAA) which are double underlined and bolded above. Other methionine residues may also be used as initiation codons. The region between and inclusive of the preferred initiation codon and the termination codon is a methionine-initiated coding sequence of about 2514 nucleotides (nucleotides 108 to 2621 of SEQ ID NO:22) designated as SEQ ID NO:24. The coding sequence encodes a 838 amino acid protein (SEQ ID NO:23), the sequence of which is recited as follows: 9 MPAERPAGSGGSEAPAMVEQLDTAVITPAMLEEEEQLEAAGLERERKMLEKARMS (SEQ ID NO: 23) WDRESTEIRYRRLQHLLEKSNIYSKFLLTKMEQQQLEEQKKKEKLERKKESLKVKK GKNSIDASEEKPVMRKKRGREDESYNISEVMSKEEILSVAKKNKKENEDENSSSTNL CVEDLQKNKDSNSIIKDRLSETVRQNTKFFFDPVRKCNGQPVPFQQPKHFTGGVMR WYQVEGMEWLRMLWENGINGILADEMGLGKTVQCIATIALMIQRGVPGPFLVCGPL STLPNWMAEFKRFTPDIPTMLYHGTQEDRRKLVRNIYKRQGTLQIHPVVVTSFEIAM RDQNALQHCYWKYLIVDEGHRIKNMKCRLIRELKRFNADNKLLLTGTPLQNNLSEL WSLLNFLLPDVFDDLKSFESWFDITSLSETAEDIIAKEREQNVLHMLHQILTPFLLRRL KSDVALEVPPKREVVVYAPLSKKQEIFYTAIVNRTIANMFGSSEKETIELSPTGRPKR RTRKSINYSKIDDFPNELEKLISQIQPEVDRERAVVEVNIPVESEVNLKLQNIMMLLRK CCNHPYLIEYPIDPVTQEFKIDEELVTNSGKFLILDRMLPELKKRGHKVLLFSQMTSM LDILMDYCHLRDFNFSRLDGSMSYSEREKNMHSFNTDPEVFIFLVSTRAGGLGINLT AADTVIIYDSDWNPQSDLQAQDRCHRIGQTKPVVVYRLVTANTIDQKIVERAAAKR KLEKLIIHKNHFKGGQSGLNLSKNFLDPKELMELLKSRDYEREIKGSREKVISDKDLE LLLDRSDLIDQMNASGPIKEKMGIFKILENSEDSSPECLF
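The coordinates quoted above can be cross-checked with simple arithmetic: an inclusive span from nucleotide 108 to nucleotide 2621 contains 2621 − 108 + 1 = 2514 nucleotides, which is exactly 838 codons. A small sketch of that check follows; the function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based nucleotide span."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(n_nucleotides: int) -> int:
    """Number of whole codons; raises if the span is not a multiple of three."""
    if n_nucleotides % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return n_nucleotides // 3


if __name__ == "__main__":
    length = span_length(108, 2621)
    print(length, codon_count(length))
```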

Example 16 Tissue Distribution of 58224 mRNA

[5899] Endogenous human 58224 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected (the threshold cycle) is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples were internally controlled by the addition of a second set of primers/probe specific for a reference gene such as β2-macroglobulin or GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[5900] To determine the level of 58224 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in the left column of the tables below.

[5901] 58224 mRNA expression was analyzed by TaqMan in a number of clinical samples, including human breast, ovary, lung, colon, and liver tissue. In general, but not always, 58224 is likely to be more highly expressed in tumor tissue as compared to normal tissue (see Table 3). For example, 58224 mRNA was expressed at higher levels in some breast tumor, ovarian tumor, lung tumor, and colon tumor samples (T) than in control samples (N) of the same tissue. In addition, expression of 58224 mRNA was elevated in liver metastases and in fetal liver (Table 4).

TABLE 3

Tissue sample    Relative Expression
Breast N         0.00
Breast N         0.33
Breast N         0.00
Breast T         0.24
Breast T         0.03
Breast T         0.00
Breast T         0.00
Breast T         3.21
Breast T         0.62
Breast T         0.00
Breast T         1.48
Ovary N          0.11
Ovary N          0.01
Ovary N          0.17
Ovary N          0.00
Ovary T          0.02
Ovary T          0.01
Ovary T          1.31
Ovary T          0.00
Ovary T          0.04
Ovary T          0.55
Ovary T          0.38
Lung N           0.00
Lung N           0.00
Lung N           0.00
Lung N           0.00
Lung T           2.32
Lung T           3.54
Lung T           0.33
Lung T           0.10
Lung T           4.25
Lung T           1.67
Lung T           0.04
Lung T           0.05
Colon N          0.00
Colon N          0.00
Colon N          0.00
Colon N          0.00
Colon T          1.55
Colon T          0.01
Colon T          0.01
Colon T          0.00
Colon T          0.39
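One simple way to summarize Table 3 is to average the relative expression values within each normal (N) and tumor (T) group. The sketch below does this with the values transcribed from the table; the grouping and the use of an arithmetic mean are choices made for this illustration, not an analysis reported in the document.

```python
from statistics import mean

# Relative expression values transcribed from Table 3 (N = normal, T = tumor).
TABLE_3 = {
    "Breast": {"N": [0.00, 0.33, 0.00],
               "T": [0.24, 0.03, 0.00, 0.00, 3.21, 0.62, 0.00, 1.48]},
    "Ovary": {"N": [0.11, 0.01, 0.17, 0.00],
              "T": [0.02, 0.01, 1.31, 0.00, 0.04, 0.55, 0.38]},
    "Lung": {"N": [0.00, 0.00, 0.00, 0.00],
             "T": [2.32, 3.54, 0.33, 0.10, 4.25, 1.67, 0.04, 0.05]},
    "Colon": {"N": [0.00, 0.00, 0.00, 0.00],
              "T": [1.55, 0.01, 0.01, 0.00, 0.39]},
}


def group_means(table: dict) -> dict:
    """Mean relative expression per tissue and per sample group."""
    return {tissue: {group: mean(values) for group, values in groups.items()}
            for tissue, groups in table.items()}


if __name__ == "__main__":
    for tissue, groups in group_means(TABLE_3).items():
        print(f"{tissue}: normal {groups['N']:.2f}, tumor {groups['T']:.2f}")
```

Because each tumor group contains several samples at or near zero, the group means are driven by a subset of samples, which is consistent with the statement that expression is elevated in some, but not all, tumors.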

[5902] TABLE 3: 58224 expression is upregulated in some breast, ovary, lung, and colon tumor tissue samples. Expression is reported relative to that of β2-macroglobulin.

TABLE 4

Tissue sample    Relative Expression
Liver Nor        0.00
Liver Nor        0.00
Liver Met        0.08
Liver Met        0.75
Liver Met        0.65
Liver Met        0.71
Brain N          0.00
Brain N          0.88
Brain N          0.39
Astrocytes       6.07
Brain T          0.02
Brain T          0.00
Brain T          0.00
Brain T          1.72
Brain T          0.00
HMVEC-Arr        0.11
HMVEC-Prol       2.10
Fetal Liver      10.13
Fetal Liver      0.82

[5903] TABLE 4: 58224 expression is upregulated in some liver metastases. 58224 is also expressed in some brain tissues and fetal liver. Expression is reported relative to that of β2-macroglobulin.

Example 17 Recombinant Expression of 58224 in Bacterial Cells

[5904] In this example, 58224 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 58224 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-58224 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 18 Expression of Recombinant 58224 Protein in COS Cells

[5905] To express the 58224 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 58224 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5906] To construct the plasmid, the 58224 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 58224 coding sequence starting from the initiation codon; the 3′ end sequence contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 58224 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 58224 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5907] COS cells are subsequently transfected with the 58224-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 58224 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine, available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5908] Alternatively, DNA containing the 58224 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 58224 polypeptide is detected by radiolabelling and immunoprecipitation using a 58224 specific monoclonal antibody.

Examples for 46980

Example 19 Identification and Characterization of Human 46980 cDNA

[5909] The human 46980 nucleic acid sequence is recited as follows: 12 TCCGACCCACGCGTCCGACTAGTTCTAGATCGCGATCTAGAACTAGCGGGGACA (SEQ ID NO: 27) CACTATTGACAGCAGAAACAATGAATTTCCTCCAAACCCGGCAATGTTGGTGGCT CTTGCATTCCTCTGGATGAGCGAATCTAGTTGGGGGGTTCCCGAAGGGGAAGGC GCCTGGGCTTTCAATACATCCTCCTGAATCATACTGCGTTTCAGGTTCCTTAGAA AAATTTGGATGTGTAAAAAGAACTCTTAACGGCGATGCAGGTCTTCCACAGCTA AGGTTGCATTGGAGTTTTCGAAAGACTTATCTTTCTGCAGGCTCGCCTCTGAGCT TTGTCTCCTTGGAGCCACCTCACTTAGACAGCTTCGGATGTGGATGCAGATTTGA ACCATGTTGCGTCCCCAGGGACTGCTATGGCTCCCTTTGTTGTTCACCTCTGTCTG TGTCATGTTAAACTCCAATGTTCTTCTGTGGATAACTGCTCTTGCCATCAAGTTCA CCCTCATTGACAGCCAAGCACAGTATCCAGTTGTCAACACAAATTATGGTAAAAT CCAGGGCCTAAGAACACCATTACCCAGTGAGATCTTGGGTCCAGTGGAGCAGTA CTTAGGGGTCCCCTATGCCTCACCCCCAACTGGAGAGAGGCGGTTTCAGCCACC AGAATCCCCATCCTCCTGGACTGGCATCCGAAATGCTACTCAGTTTTCTGCTGTG TGCCCCCAGCACCTGGATGAAAGATTCTTATTGCATGACATGCTGCCCATCTGGT TTACCACCAGTTTGGATACTTTGATGACCTATGTTCAAGATCAAAATGAAGACTG CCTTTACTTAAACATCTATGTGCCCATGGAAGATGATATTCATGAACAGAACAGT AAGAAGCCTGTTATGGTCTATATCCATGGGGGATCTTACATGGAGGGAACCGGT AACATGATTGATGGCAGCATTTTGGCCAGCTATGGGAACGTCATCGTTATCACCA TTAACTACCGTCTGGGAATACTAGGGTTTTTAAGTACCGGTGACCAGGCAGCAA AAGGCAACTATGGGCTCCTGGATCAGATTCAAGCACTGAGGTGGATTGAGGAGA ATGTCGGAGCCTTTGGCGGGGACCCCAAGAGAGTGACTATCTTTGGCTCGGGGG CTGGGGCCTCCTGTGTCAGCCTGTTGACCCTGTCCCACTACTCAGAAGGTCTCTT CCAGAAGGCCATCATTCAGAGCGGCACTGCCCTGTCCAGCTGGGCAGTGAACTA CCAGCCGGCCAAGTACACTCGGATATTGGCAGACAAGGTCGGCTGCAACATGCT GGACACCACGGACATGGTAGAATGTCTGAAGAACAAGAACTACAAGGAGCTCAT CCAGCAGACCATCACCCCGGCCACCTACCACATAGCCTTTGGGCCGGTGATCGA CGGCGACGTCATCCCAGACGACCCCCAGATCCTGATGGAGCAAGGCGAGTTCCT CAACTACGACATCATGCTGGGCGTCAACCAAGGGGAAGGCCTGAAGTTCGTGGA CGGCATCGTGGATAACGAGGACGGTGTGACGCCCAACGACTTTGACTTCTCCGT GTCCAACTTCGTGGACAACCTTTACGGCTACCCTGAAGGGAAAGACACTTTGCG GGAGACTATCAAGTTCATGTACACAGACTGGGCCGATAAGGAAAACCCGGAGAC GCGGCGGAAAACCCTGGTGGCTCTCTTTACTGACCATCAGTGGGTGGCCCCCGCC GTGGCCACCGCCGACCTGCACGCGCAGTACGGCTCCCCCACCTACTTCTATGCCT TCTATCATCACTGCCAAAGCGAAATGAAGCCCAGCTGGGCAGATTCGGCCCATG 
GCGATGAAGTCCCCTATGTCTTCGGCATCCCCATGATCGGTCCCACAGAGCTCTT CAGTTGTAATTTCTCCAAGAACGACGTCATGCTCAGTGCCGTGGTGATGACCTAC TGGACGAACTTCGCCAAAACTGGTGATCCAAACCAACCAGTTCCTCAGGATACC AAGTTCATTCATACAAAACCCAATCGCTTTGAAGAAGTGGCCTGGTCCAAGTATA ATCCCAAAGACCAGCTCTATCTGCATATTGGCTTGAAACCCAGAGTGAGAGATC ACTACCGGGCAACGAAAGTGGCTTTCTGGTTGGAATTGGTTCCTCATTTGCACAA CTTGAACGAGATATTCCAGTATGTTTCAACAACCACAAAGGTTCCTCCACCAGAC ATGACATCATTTCCCTATGGCACCCGGCGATCTCCCGCCAAGATATGGCCAACCA CCAAACGCCCAGCAATCACTCCTGCCAACAATCCCAAACACTCTAAGGACCCTC ACAAAACAGGGCCCGAGGACACAACTGTCCTCATTGAAACCAAACGAGATTATT CCACCGAATTAAGTGTCACCATTGCCGTCGGGGCGTCGCTCCTCTTCCTCAACAT CTTAGCCTTTGCGGCGCTGTACTACAAAAAGGACAAGAGGCGCCATGAGACTCA CAGGCACCCCAGTCCCCAGAGAAACACCACAAATGATATCACTCACATCCAGAA CGAAGAGATCATGTCTCTGCAGATGAAGCAGCTGGAACACGATCACGAGTGTGA GTCGCTGCAGGCACACGACACGCTGAGGCTCACCTGCCCTCCAGACTACACCCT CACGCTGCGCCGGTCGCCGGATGACATCCCATTTATGACGCCAAACACCATCAC CATGATTCCAAACACATTGATGGGGATGCAGCCTTTACACACTTTTAAAACCTTC AGTGGAGGACAAAACAGTACAAATTTACCCCACGGACATTCCACCACTAGAGTA TAGCTTTTCCCTATTTCCCCTCCTATCCCTCTGCCCCTACTGCTCAGCAATGTAAA AGAGACAAATAAGGAGAAAGAAAATCTCCAAACCAGGAATGTTTTTGTGCCACT GACTTTAGATAAAAATGCAAAAGGGCAGTCATCCTGTCCCAGCAGACCCTTCTC ATTGGCATTTTCCAGTATTGTGAGATCAATTTCTGACCATATGAAATGTGAAAAG TATATGTTTCTGTTACAATACTGCTTTAAGATCTAAACCATGCCAACAGATGTTTC GTGTGACTAGGACATCACCATTTCAAGGAACTGTGTGTTTCCAACATCATGGTAG CAGCACACACTTCCAAAGCTCAGCCAGGGACACTTAATATTTTTTAATTACAATG GAAATTTAAACATTTTTATGTGGGCTACACAATGGATGGCTCTTCTTAAGTGAAG AAAGACTCTATAGGCTTTTACACAGCACATGAAGCAGTAATCCAGAAAGAAGGA AATGCAGAATTTTATTATCAAAGTAAGCGAATTGACTGTGCAGAAAAATTGTAG GGTTCTGTGGAAGGAGGTATTCTGCCAGCCTGAACTATATTTAAGAAACTTTGTA AAAAATAAAAATGTATATAGCTGTGAGCTCAAACAAAAACTGCAAAAAAAAAA AAAAAAAAAAAAA.

Example 20 Tissue Distribution of 46980 mRNA by TaqMan Analysis

[5910] Endogenous human 46980 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected (the threshold cycle) is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[5911] To determine the level of 46980 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 5 below. Expression levels are relative to β2-macroglobulin.

TABLE 5

Tissue            Relative Expression
Adrenal Gland     0.93
Brain             74.58
Heart             0.82
Kidney            0.41
Liver             0.24
Lung              0.40
Mammary Gland     1.82
Placenta          2.73
Prostate          3.87
Salivary Gland    0.56
Muscle            0.92
Sm. Intestine     2.72
Spleen            0.24
Stomach           2.45
Testis            12.05
Thymus            5.00
Trachea           0.38
Uterus            1.01
Spinal Cord       7.39
Skin              0.13
DRG               8.34

[5912] 46980 mRNA was detected as highly expressed in brain tissue. 46980 expression was also found in testis, spinal cord, dorsal root ganglia, prostate, and thymus (Table 5).

Example 21 Tissue Distribution of 46980 mRNA by Northern Analysis

[5913] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 46980 cDNA (SEQ ID NO:27) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 22 Recombinant Expression of 46980 in Bacterial Cells

[5914] In this example, 46980 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 46980 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-46980 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB 199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 23 Expression of Recombinant 46980 Protein in COS Cells

[5915] To express the 46980 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 46980 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5916] To construct the plasmid, the 46980 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 46980 coding sequence starting from the initiation codon; the 3′ primer contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 46980 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 46980 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5917] COS cells are subsequently transfected with the 46980-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 46980 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5918] Alternatively, DNA containing the 46980 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 46980 polypeptide is detected by radiolabelling and immunoprecipitation using a 46980 specific monoclonal antibody.

Examples for 32225

Example 24 Identification and Characterization of Human 32225 cDNA

[5919] The human 32225 nucleic acid sequence is recited as follows: 14 GGCTCGCCAGGACCTGGCAAGGCTTGTTTACTATGGCCGATGATCTGGAGCAGC (SEQ ID NO: 33) AGTCTCAAGGCTGGCTGAGTAGCTGGCTGCCCACGTGGCGCCCCACTTCCATGTC TCAGCTGAAGAATGTGGAAGCCAGGATCCTCCAGTGTCTCCAGAATAAGTTCCT GGCCAGATATGTATCCCTCCCAAACCAGAATAAGATCTGGACGGTGACTGTGAG CCCCGAGCAAAACGACCGCACCCCCTTGGTGATGGTGCATGGTTTTGGGGGCGG CGTGGGTCTCTGGATCCTCAACATGGACTCACTGAGTGCCCGCCGCACACTGCAC ACCTTCGATCTGCTTGGCTTCGGGCGAAGCTCAAGGCCAGCATTCCCAAGGGAC CCGGAGGGGGCTGAGGATGAGTTTGTGACATCGATAGAGACATGGCGGGAGACC ATGGGGATCCCCAGCATGATCCTCCTGGGGCACAGTTTGGGAGGATTCCTGGCC ACTTCTTACTCAATCAAGTACCCTGATAGAGTTAAACACCTCATCCTGGTGGACC CATGGGGCTTTCCCCTCCGACCAACTAACCCCAGTGAGATCCGTGCACCCCCAGC CTGGGTCAAAGCCGTGGCATCTGTCCTAGGACGTTCCAATCCATTGGCTGTTCTT CGAGTAGCTGGGCCCTGGGGGCCTGGTCTGGTGCAGCGATTCCGGCCGGACTTC AAACGCAAGTTTGCAGACTTCTTTGAAGATGATACCATATCAGAGTATATTTACC ACTGCAACGCACAGAATCCCAGTGGTGAGACAGCATTCAAAGCCATGATGGAGT CCTTTGGCTGGGCCCGGCGCCCTATGCTGGAGCGAATTCACTTGATTCGAAAAGA TGTGCCTATCACTATGATCTACGGGTCCGACACCTGGATAGATACCAGTACGGGA AAAAAGGTGAAGATGCAGCGGCCGGATTCCTATGTCCGAGACATGGAGATTAAG GGTGCCTCCCACCATGTCTATGCTGACCAGCCACACATCTTCAATGCTGTGGTGG AGGAGATCTGCGACTCAGTTGATTGAGCTGCTCTCTGAAGAGGAAGAGGAGAAA GCCAGAGAGTCACTCTTACCTCCCTGTCTGCTTACTCACCCACTCTGTCCTTTCCT CACCAACTAACATGTGCCAGCCAGGCAGAGTCTTGTGCTGTTCCCAGAACAGGA CGACAGTGAAAAGAACACTCTTGACCCTACACTGAAGGCTGAAGGCAGAAGCCA CAAGAGGCCTTGAGTGCCACCCCCAGGGAAGAACATAAAGGGTTGCACAATGCC ACCCATCCACTCCTTGCCAAGTGTTACCCAGATGGTGGAGGATGTGAAGGGATT GCACCAAGCCACATTCACTCTCTCTGTGGCCTTTCTTCCTCTGGGCAAAGAAGGG CTTCCAGTGGCCTTTCCTCACTCTGTAGTGTTTGTGGGGATAGGTTCCATGCAAG AACACCTTCCTCCTCCATCCCCCACTTCACCCCATCCCATACCAGTTCCATCCAG GGTCTGCTTAACTGCCAAGAGCAGGTCCTGGAGTTCCCTTCACCTGCAGAGTCCT TTTCATGACCTAGGAGGTCTTATTCAAAGCCCTCATTGACAGAGGAGGAAACAG GCCAAGGCAGGACATGGCTGGACCATGGTGATACAGCTCTGTGTGATTCAAGTT CTGGCAGAGCTTGTAAGGCTAGAGCCCAGGTCTGCCGACACCCTGTGCTTGTTGC ACACTTGATTTGCTAAGGCTGGAGACAGGCACCATTGCCATGGGGCTGGTCCTA GTCACTGGCCGAGGATAAGCCCGTCCCTGTCCCACATTCTAGCCCCACTATGCGG 
GGGTGCTGTTGTCCTGCCTGTGTCTCATCCCCAGCTGCCTAAGCTAGGGACACTC AAGTGCTTCCTTCCTTGCCCCATCTTCCTCCCAACTGGAGGCCTCTGAGCCTCCCC TGTGCCTTGGGCCCTGAAGCCCCATATGTAGTATAGAGCAAAGGTGGCTCCTGGT GAAGAGAGGGTGGAAAGGCCCTTCAGCCCCAGGGCCATGTCTGGGTTCTCCATG CCCATCAGTCTCTGCAGTTTCTCTACCTGCCCCCAGAGCTGAGGCCATCTGCAAG CCCCTGCCCATGGCCCAATGGGGAGCCTCCAGCCACAAGTTCCCTGTCCTTATCA GCCACTGGGTGGTTCCCACTGCATGACCCTCTATCCCTGCCATCTGTCCCCATGG TTTCCAGCTCAATCCACCCCTGACCCATCTGTCAGCTTTTTCCCAGGGAGCCGTTT CAGGGGTTCTG.

[5920] The human 32225 sequence (FIG. 18; SEQ ID NO:33) is approximately 2305 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1029 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:33; SEQ ID NO:35). The coding sequence encodes a 342 amino acid protein (SEQ ID NO:34), which is recited as follows (SEQ ID NO: 34): MADDLEQQSQGWLSSWLPTWRPTSMSQLKNVEARILQCLQNKFLARYVSLPNQNKI WTVTVSPEQNDRTPLVMVHGFGGGVGLWILNMDSLSARRTLHTFDLLGFGRSSRPA FPRDPEGAEDEFVTSIETWRETMGIPSMILLGHSLGGFLATSYSIKYPDRVKHLILVDP WGFPLRPTNPSEIRAPPAWVKAVASVLGRSNPLAVLRVAGPWGPGLVQRFRPDFKR KFADFFEDDTISEYIYHCNAQNPSGETAFKAMMESFGWARRPMLERIHLIRKDVPIT MIYGSDTWIDTSTGKKVKMQRPDSYVRDMEIKGASHHVYADQPHIFNAVVEEICDS VD.
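The relationship between the stated coding-sequence length and protein length can be checked with a short sketch. The Python below is an illustration only, not part of the disclosed methods: it builds the standard genetic code table, translates the first and last codons of the 32225 coding sequence as they appear in SEQ ID NO:33, and confirms that 1029 nucleotides including the termination codon correspond to 342 residues. The function names are hypothetical.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def protein_length(cds_length_including_stop):
    """Number of residues encoded by a CDS whose length includes the stop codon."""
    return cds_length_including_stop // 3 - 1


# First eight codons and last two codons plus stop, copied from SEQ ID NO:33.
n_terminus = translate("ATGGCCGATGATCTGGAGCAGCAG")
c_terminus = translate("GTTGATTGA")
residues = protein_length(1029)
```

Here `n_terminus` and `c_terminus` can be compared against the first and last residues of SEQ ID NO:34, and `residues` against the stated protein length.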

Example 25 Tissue Distribution of 32225 mRNA by TaqMan Analysis

[5921] Endogenous human 32225 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample: the more template present initially, the earlier the signal appears. This relationship provides a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
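The way a detection cycle maps to a relative template amount can be sketched as follows. This is a minimal illustration under stated assumptions, not the calculation used to produce the tables in this document, which the text does not describe. It assumes that product doubles each cycle (efficiency of 2.0) and that the target is normalized to a housekeeping gene measured in the same sample. The function name and the cycle values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Target amount relative to a reference gene, from threshold cycles.

    Each cycle of delay in the target signal corresponds to one fold of
    `efficiency` less starting template (assumed, idealized amplification).
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical cycle values: the target appears 5 cycles after the reference.
late_target = relative_expression(ct_target=25.0, ct_reference=20.0)
# Hypothetical cycle values: the target appears 3 cycles before the reference.
early_target = relative_expression(ct_target=22.0, ct_reference=25.0)
```

A target detected later than the reference yields a value below 1, and one detected earlier yields a value above 1, which is the inverse relationship between detection cycle and starting amount.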

[5922] To determine the level of 32225 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 6 and 7. 32225 mRNA was detected in brain, liver, breast, ovary, colon, and lung tissues. In addition, 32225 expression was altered in some tumor samples, as compared to the appropriate normal tissues.

TABLE 6
Tissue Type                     Relative Expression
PIT 400 Breast Normal           1.46
PIT 56 Breast Normal            3.08
MDA 106 Breast Tumor            2.06
MDA 234 Breast Tumor            1.33
NDR 57 Breast Tumor             0.23
MDA 304 Breast Tumor            0.73
NDR 58 Breast Tumor             0.40
NDR 132 Breast Tumor            13.23
NDR 07 Breast Tumor             0.36
NDR 12 Breast Tumor             21.94
PIT 208 Ovary Normal            8.61
CHT 620 Ovary Normal            8.14
CHT 619 Ovary Normal            15.63
CLN 03 Ovary Tumor              0.17
CLN 05 Ovary Tumor              1.53
CLN 17 Ovary Tumor              4.23
CLN 07 Ovary Tumor              0.31
CLN 08 Ovary Tumor              0.43
MDA 216 Ovary Tumor             1.84
CLN 012 Ovary Tumor             23.77
MDA 25 Ovary Tumor              21.72
MDA 183 Lung Normal             0.24
CLN 930 Lung Normal             0.66
MDA 185 Lung Normal             0.57
CHT 816 Lung Normal             0.07
MPI 215 Lung Tumor-SmC          11.60
MDA 259 Lung Tumor-PDNSCCL      5.70
CHT 832 Lung Tumor-PDNSCCL      2.48
MDA 253 Lung Tumor-PDNSCCL      32.46
CHT 814 Lung Tumor-SCC          68.63
CHT 911 Lung Tumor-SCC          65.38
CHT 726 Lung T-SCC              4.22
CHT 845 Lung T-AC               17.10

[5923] 32225 is expressed in ovary and breast tissues, and marginally expressed in lung tissue. The expression of 32225 in lung tumors is dramatically elevated in all of the samples tested. Similarly, 32225 expression is elevated in a subset of breast and ovary tumor samples. In a distinct subset of ovary tumors, however, 32225 expression is decreased relative to the levels in normal tissues.

TABLE 7
Tissue Type                     Relative Expression
CHT 396 Colon Normal            0.15
CHT 519 Colon Normal            0.15
CHT 416 Colon Normal            0.65
CHT 452 Colon Normal            0.19
CHT 398 Colon Tumor             26.83
CHT 807 Colon Tumor             0.04
CHT 528 Colon Tumor             9.01
CHT 368 Colon Tumor             0.09
CHT 372 Colon Tumor             0.84
CHT 01 Liver Metastasis         7.73
CHT 3 Liver Metastasis          12.13
CHT 896 Liver Metastasis        2.02
NDR 217 Liver Metastasis        0.47
PIT 260 Liver Normal            0.46
PIT 229 Liver Normal            43.89
MGH 16 Brain Normal             48.03
MCL 53 Brain Normal             72.54
MCL 390 Brain Normal            74.58
Astrocytes                      131.21
CHT 201 Brain Tumor             1.17
CHT 216 Brain Tumor             5.45
CHT 501 Brain Tumor             6.35
CHT 1273 Brain Tumor            185.57
A24 HMVEC-Arrested              31.36
C48 HMVEC-Proliferating         21.27
BWH 54 Fetal Liver              117.85
BWH 75 Fetal Liver              20.98
CHT 765 Wilms Tumor             32.35
CHT 1424 Endometrial AC         23.93
MCL 377 Brain Normal            129.86
BWH 58 Fetal Adrenal            496.55
PIT 251 Fetal Adrenal           19.57
PIT 213 Renal Tumor             0.07
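The statement that 32225 expression is elevated in all lung tumor samples can be checked directly against the lung rows of Table 6. The sketch below is illustrative only. The values are copied from Table 6, while the helper names and the choice of summary (every tumor value above the highest normal value, plus a ratio of group means) are assumptions made for this example, not an analysis described in the source.

```python
from statistics import mean

# Relative expression values for lung samples, copied from Table 6.
lung_normal = [0.24, 0.66, 0.57, 0.07]
lung_tumor = [11.60, 5.70, 2.48, 32.46, 68.63, 65.38, 4.22, 17.10]


def all_above(reference, samples):
    """True when every sample exceeds the highest reference value."""
    return min(samples) > max(reference)


def fold_of_means(reference, samples):
    """Ratio of the mean sample value to the mean reference value."""
    return mean(samples) / mean(reference)


all_elevated = all_above(lung_normal, lung_tumor)
fold = fold_of_means(lung_normal, lung_tumor)
```

With these values the lowest tumor sample (2.48) still exceeds the highest normal sample (0.66), consistent with the text. The same helpers could be applied to the breast or ovary rows, where only a subset of tumors would pass.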

[5924] Expression of 32225 is very high in the fetal adrenal gland and liver, as well as in astrocytes and the brain. 32225 is also expressed in endothelial cells and cells cultured from an endometrial cancer, a Wilms tumor, and several liver metastases. The expression of 32225 was marginal in colon tissue, but elevated in a subset of colon tumors. As with ovary tumors (see Table 6), a subset of brain tumors showed an increase in 32225 expression, while a distinct subset displayed a decrease in 32225 expression relative to the levels observed in normal brain tissue samples.

Example 26 Tissue Distribution of 32225 mRNA by Northern Analysis

[5925] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 32225 cDNA (SEQ ID NO:33) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 27 Recombinant Expression of 32225 in Bacterial Cells

[5926] In this example, 32225 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 32225 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-32225 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.
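When the gel is read, it can help to have a rough expected size for the fusion polypeptide. The sketch below is an estimate under stated assumptions, not a value given in the source: it uses the common rule of thumb of about 110 Da per amino acid residue and takes the GST moiety as roughly 26 kDa. Both constants and the function names are assumptions introduced for illustration. The residue count of 342 is the length stated for the 32225 protein (SEQ ID NO:34).

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average residue mass
GST_TAG_MASS_KDA = 26.0      # assumed approximate mass of the GST moiety


def approx_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def expected_fusion_mass_kda(n_residues):
    """Rough mass in kDa of a GST fusion carrying n_residues of partner."""
    return GST_TAG_MASS_KDA + approx_mass_kda(n_residues)


# 342 residues is the stated length of the 32225 protein.
partner_kda = approx_mass_kda(342)
fusion_kda = expected_fusion_mass_kda(342)
```

The estimate ignores any linker residues and the actual amino acid composition, so the migration observed on the gel may differ from it.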

Example 28 Expression of Recombinant 32225 Protein in COS Cells

[5927] To express the 32225 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 32225 protein, with an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment, is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5928] To construct the plasmid, the 32225 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 32225 coding sequence starting from the initiation codon; the 3′ primer contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 32225 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 32225 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
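The primer layout described above can be sketched in a few lines. This is an illustration under assumptions, not the primers actually used. The HindIII (AAGCTT) and XhoI (CTCGAG) sites are illustrative choices not taken from the source, the HA tag is given as one commonly used coding sequence for the peptide YPYDVPDYA, and the function names are hypothetical. The coding-sequence termini are copied from SEQ ID NO:33. A practical design would also add extra 5′ bases for efficient digestion and confirm that the chosen sites do not occur within the insert.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds, site_5p, site_3p, tag, n=20):
    """Return (forward, reverse) primers for a C-terminally tagged insert.

    cds must start at the initiation codon and end with the stop codon.
    """
    coding, stop = cds[:-3], cds[-3:]
    # 5' primer: restriction site, then the first n coding nucleotides.
    forward = site_5p + cds[:n]
    # Sense-strand layout at the 3' end: last n coding nucleotides,
    # tag, stop codon, second restriction site. The primer is its
    # reverse complement.
    sense_end = coding[-n:] + tag + stop + site_3p
    return forward, reverse_complement(sense_end)


HINDIII = "AAGCTT"                      # illustrative site (assumption)
XHOI = "CTCGAG"                         # illustrative site (assumption)
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA

# Stand-in for the full 32225 coding sequence: only its two ends matter here.
FIRST_20 = "ATGGCCGATGATCTGGAGCA"
LAST_20 = "AGATCTGCGACTCAGTTGAT"
cds_ends = FIRST_20 + LAST_20 + "TGA"

forward_primer, reverse_primer = design_primers(cds_ends, HINDIII, XHOI, HA_TAG)
```

Using two different sites, as the text recommends, fixes the orientation of the insert because each end of the digested fragment can ligate to only one end of the digested vector.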

[5929] COS cells are subsequently transfected with the 32225-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 32225 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5930] Alternatively, DNA containing the 32225 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 32225 polypeptide is detected by radiolabelling and immunoprecipitation using a 32225 specific monoclonal antibody.

Examples for 47508

Example 29 Identification and Characterization of Human 47508 cDNA

[5931] The human 47508 nucleic acid sequence is recited as follows: 18 CGGAGCTTCCGAACCAGGCGGGATTCCACCGGGTATTTGCCTGCGGAGGCGGGA (SEQ ID NO: 41) CTTCGGGCTTGATGGGCGTTGGGGGTGGCCTTCCTGCGGGCAGGCTCTCTGTGTC GCAACACTGGCGGGGCGGGCCAAATCGGCCAGAGCTCTGCCCCCAGAGGACGC GGCTAAGCCCGGGGGCGTGTCCTGGGCTGGCCCCACCCGCGCCCCGCCCCGCCC CGCCCGGTCGCGGAGCTGCGGCCAGCTTTGGGAGGGCCGGCCCCGGGATGCTAC ACACAACCCAGCTGTACCAGCATGTGCCAGAGACACGCTGGCCAATCGTGTACT CGCCGCGCTACAACATCACCTTCATGGGCCTGGAGAAGCTGCATCCCTTTGATGC CGGAAAATGGGGCAAAGTGATCAATTTCCTAAAAGAAGAGAAGCTTCTGTCTGA CAGCATGCTGGTGGAGGCGCGGGAGGCCTCGGAGGAGGACCTGCTGGTGGTGCA CACGAGGCGCTATCTTAATGAGCTCAAGTGGTCCTTTGCTGTTGCTACCATCACA GAAATCCCCCCCGTTATCTTCCTCCCCAACTTCCTTGTGCAGAGGAAGGTGCTGA GGCCCCTTCGGACCCAGACAGGAGGAACCATAATGGCGGGGAAGCTGGCTGTGG AGCGAGGCTGGGCCATCAACGTGGGGGGTGGCTTCCACCACTGCTCCAGCGACC GTGGCGGGGGCTTCTGTGCCTATGCGGACATCACGCTCGCCATCAAGTTTCTGTT TGAGCGTGTGGAGGGCATCTCCAGGGCTACCATCATTGATCTTGATGCCCATCAG GGCAATGGGCATGAGCGAGACTTCATGGACGACAAGCGTGTGTACATCATGGAT GTCTACAACCGCCACATCTACCCAGGGGACCGCTTTGCCAAGCAGGCCATCAGG CGGAAGGTGGAGCTGGAGTGGGGCACAGAGGATGATGAGTACCTGGATAAGGT GGAGAGGAACATCAAGAAATCCCTCCAGGAGCACCTGCCCGACGTGGTGGTATA CAATGCAGGCACCGACATCCTCGAGGGGGACCGCCTTGGGGGGCTGTCCATCAG CCCAGCGGGCATCGTGAAGCGGGATGAGCTGGTGTTCCGGATGGTCCGTGGCCG CCGGGTGCCCATCCTTATGGTGACCTCAGGCGGGTACCAGAAGCGCACAGCCCG CATCATTGCTGACTCCATACTTAATCTGTTTGGCCTGGGGCTCATTGGGCCTGAG TCACCCAGCGTCTCCGCACAGAACTCAGACACACCGCTGCTTCCCCCTGCAGTGC CCTGACCCTTGCTGCCCTGCCTGTCACGTGGCCCTGCCTATCCGCCCCTTAGTGCT TTTTGTTTTCTAACCTCATGGGGTGGTGGAGGCAGCCTTCAGTGAGCATGGAGGG GCAGGGCCATCCCTGGCTGGGGCCTGGAGCTGGCCCTTCCTCTACTTTTCCCTGC TGGAAGCCAGAAGGGCTTGAGGCCTCTATGGGTGGGGGCAGAAGGCAGAGCCT GTGTCCCAGGGGGACCCACACGAAGTCACCAGCCCATAGGTCCAGGGAGGCAG GCAGG.

[5932] The human 47508 sequence (FIG. 21; SEQ ID NO:41) is approximately 1579 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1242 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:41; SEQ ID NO:43). The coding sequence encodes a 413 amino acid protein (SEQ ID NO:42), which is recited as follows (SEQ ID NO: 42): MGVGGGLPAGRLSVSQHWRGGPNRPELCPQRTRLSPGACPGLAPPAPRPAPPGRGA AASFGRAGPGMLHTTQLYQHVPETRWPIVYSPRYNITFMGLEKLHPFDAGKWGKVI NFLKEEKLLSDSMLVEAREASEEDLLVVHTRRYLNELKWSFAVATITEIPPVIFLPNFL VQRKVLRPLRTQTGGTIMAGKLAVERGWAINVGGGFHHCSSDRGGGFCAYADITLA IKFLFERVEGISRATIIDLDAHQGNGHERDFMDDKRVYIMDVYNRHIYPGDRFAKQA IRRKVELEWGTEDDEYLDKVERNIKKSLQEHLPDVVVYNAGTDILEGDRLGGLSISP AGIVKRDELVFRMVRGRRVPILMVTSGGYQKRTARIIADSILNLFGLGLIGPESPSVSA QNSDTPLLPPAVP.

Example 30 Tissue Distribution of 47508 mRNA by TaqMan Analysis

[5933] Endogenous human 47508 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample: the more template present initially, the earlier the signal appears. This relationship provides a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[5934] To determine the level of 47508 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 8-11. 47508 mRNA was detected in all tissues and cell lines analyzed (Tables 8-11), including normal and cancerous tissues of the breast, ovary, lung, and colon (Table 8).

TABLE 8
Tissue                                                    Relative Expression
PIT 400 Breast Normal                                     1.56
PIT 372 Breast Normal                                     1.99
CHT 558 Breast Normal                                     0.47
CLN 168 Breast Tumor: IDC                                 11.20
MDA 304 Breast Tumor: moderately differentiated IDC       0.71
NDR 57 Breast Tumor: poorly differentiated IDC            4.19
NDR 132 Breast Tumor: IDC/ILC                             10.10
CHT 562 Breast Tumor: IDC                                 20.69
NDR 12 Breast Tumor                                       19.17
PIT 208 Ovary Normal                                      7.04
CHT 620 Ovary Normal                                      5.98
CLN 03 Ovary Tumor                                        2.78
CLN 17 Ovary Tumor                                        9.42
MDA 25 Ovary Tumor                                        18.52
MDA 216 Ovary Tumor                                       6.26
CLN 012 Ovary Tumor                                       13.51
MDA 185 Lung Normal                                       0.76
CLN 930 Lung Normal                                       2.04
MDA 183 Lung Normal                                       0.51
MPI 215 Lung Tumor: SmC                                   2.21
MDA 259 Lung Tumor: PDNSCCL                               2.87
CHT 832 Lung Tumor: PDNSCCL                               3.09
CHT 911 Lung Tumor: SCC                                   3.52
MDA 262 Lung Tumor: SCC                                   10.97
CHT 211 Lung Tumor: adenocarcinoma                        2.72
MDA 253 Lung Tumor: PDNSCCL                               0.46
NHBE                                                      27.68
CHT 396 Colon Normal                                      21.87
CHT 523 Colon Normal                                      2.17
CHT 452 Colon Normal                                      0.55
CHT 382 Colon Tumor: moderately differentiated            6.09
CHT 528 Colon Tumor: moderately differentiated            4.65
CLN 609 Colon Tumor                                       4.91
CHT 372 Colon Tumor: poorly/moderately differentiated     3.27
NDR 217 Colon-Liver Metastasis                            3.38
NDR 100 Colon-Liver Metastasis                            7.26
PIT 260 Liver Normal (female)                             0.21
ONC 102 Hemangioma                                        0.67
A24 HMVEC Arrested                                        3.01
C48 HMVEC Proliferating                                   3.97

[5935] As shown in Table 8, 47508 is expressed in normal breast, ovary, lung, colon, liver, and endothelial cells, as well as in tumors of the breast, ovary, lung, colon, and liver (metastases originating from the colon). In most breast tumors analyzed, 47508 expression is elevated relative to that observed in normal breast tissue. In addition, a subset of ovary and lung tumors display an increase in 47508 expression, and all of the colon tumor samples display an increase in 47508 expression relative to the majority of normal colon tissue samples analyzed. Abbreviations used in Table 8 include: IDC (invasive ductal carcinoma); ILC (invasive lobular carcinoma); SmC; PDNSCCL (poorly differentiated non-small cell carcinoma of the lung); SCC (squamous cell carcinoma); and HMVEC (human microvascular endothelial cells).

TABLE 9
Cell Line                       Relative Expression
MCF10MS                         16.35
MCF10A                          10.27
MCF10AT.c11                     12.47
MCF10AT.c13                     12.74
MCF10AT1                        6.92
MCF10AT3B                       10.64
MCF10CA1a.c11                   6.90
MCF10CA1a.c11 Agar              16.63
MCF10A.m25 Plastic              18.65
MCF10CA Agar                    10.13
MCF10CA Plastic                 4.41
MCF3B Agar                      23.60
MCF3B Plastic                   14.28
MCF10A EGF 0 hr                 6.30
MCF10A EGF 0.5 hr               6.78
MCF10A EGF 1 hr                 5.96
MCF10A EGF 2 hr                 5.43
MCF10A EGF 4 hr                 7.21
MCF10A EGF 8 hr                 5.66
MCF10A IGF1A 0 hr               18.84
MCF10A IGF1A 0.5 hr             25.30
MCF10A IGF1A 1 hr               19.51
MCF10A IGF1A 3 hr               32.69
MCF10A IGF1A 24 hr              51.30
MCF10AT3B.c15 Plastic           15.15
MCF10AT3B.c16 Plastic           23.52
MCF10AT3B.c13 Plastic           22.17
MCF10AT3B.c11 Plastic           26.10
MCF10AT3B.c14 Plastic           11.88
MCF10AT3B.c12 Plastic           25.03
MCF10AT3B.c15 Agar              29.36
MCF10AT3B.c16 Agar              35.28
MCF-7                           116.23
ZR-75                           40.11
T47D                            54.22
MDA-231                         11.76
MDA-435                         10.42
SkBr3                           49.89
Hs578Bst                        6.17
Hs578T                          9.29

[5936] As shown in Table 9, 47508 mRNA is expressed in many different cell lines derived from breast tumors, and under many different culture conditions. In some cell lines, e.g., the MCF10CA, MCF3B, MCF10AT3B.c15, and MCF10AT3B.c16 cell lines, expression of 47508 is elevated in cells that are plated on agar, as compared to cells that are plated on plastic. Cells that are plated on agar tend to have a more transformed phenotype, so the increase in 47508 expression in breast tumor cells plated on agar is consistent with the increase in 47508 expression observed in many tumors (see Table 8). In addition, MCF10A cells display an increase in 47508 expression in response to Insulin-like Growth Factor (IGF) 1A after 3 hours, with an even larger increase in 47508 expression after 24 hours of exposure to IGF1A.

TABLE 10
Cell Line                       Relative Expression
H460 +p16 24 hr                 1.15
H460 +p16 48 hr                 0.72
H460 +p16 72 hr                 0.41
H460 +p16 96 hr                 0.28
H460 −p16 24 hr                 0.32
H460 −p16 48 hr                 0.48
H460 −p16 72 hr                 0.23
H460 −p16 96 hr                 0.44
H460 −p16 24 hr                 0.29
H460 −p16 48 hr                 0.47
H460 −p16 72 hr                 0.29
H460 −p16 96 hr                 0.45
H460 +p16 48 hr                 0.34
H460 +p16 72 hr                 0.44
H460 +p16 96 hr                 0.22
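The agar-versus-plastic comparison drawn from Table 9 can be made explicit by computing a ratio for each cell line that was measured under both conditions. The sketch below is illustrative. The values are copied from Table 9, while the dictionary layout and function name are assumptions made for this example.

```python
# (agar, plastic) relative expression pairs copied from Table 9.
paired_values = {
    "MCF10CA": (10.13, 4.41),
    "MCF3B": (23.60, 14.28),
    "MCF10AT3B.c15": (29.36, 15.15),
    "MCF10AT3B.c16": (35.28, 23.52),
}


def agar_to_plastic_ratios(pairs):
    """Map each cell line to its agar/plastic expression ratio."""
    return {line: agar / plastic for line, (agar, plastic) in pairs.items()}


ratios = agar_to_plastic_ratios(paired_values)
all_higher_on_agar = all(ratio > 1.0 for ratio in ratios.values())
```

A ratio above 1 for every listed line corresponds to the elevation on agar described in the text. The ratios range from about 1.5 to about 2.3 for these four lines.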

[5937] As shown in Table 10, H460 large cell lung carcinoma cells express 47508 in both the presence and absence of the tumor suppressor gene, p16.

TABLE 11
Cell Line                       Relative Expression
NHBE                            43.59
A549 (BA)                       36.78
H460 (LCLC)                     1.91
H23 (AC)                        36.02
H522 (AC)                       62.28
H125 (AC/SCC)                   56.52
H520 (SCC)                      26.01
H69 (SCLC)                      3.25
H345 (SCLC)                     42.25
H460 INCX 24 hr                 1.91
H460 p16 24 hr                  2.91
H460 INCX 48 hr                 1.65
H460 p16 48 hr                  2.20
H460 INCX Stable Plas           3.75
H460 p16 Stable Plas            3.40
H460 NA-Agar                    1.19
H460 Incx stable Agar           0.38
H460 p16 stable Agar            0.90
H125 Incx 96 hr                 43.74
H125 p53 96 hr                  36.52
H345 Mock 144 hr                66.06
H345 Gluc 144 hr                42.54
H345 VIP 144 hr                 80.21

[5938] As shown in Table 11, a number of different lung cell lines express 47508, with neither the p16 nor the p53 tumor suppressor gene having a strong impact on 47508 expression under the conditions examined. Expression of 47508 is lowest in the H460 large cell lung carcinoma cells. Abbreviations used in Table 11 include: NHBE (primary normal human bronchial epithelial cells); LCLC (large cell lung carcinoma); AC (adenocarcinoma); SCC (squamous cell carcinoma); SCLC (small cell lung carcinoma); INCX; Gluc; and VIP.

Example 31 Tissue Distribution of 47508 mRNA by Northern Analysis

[5939] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 47508 cDNA (SEQ ID NO:41) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 32 Recombinant Expression of 47508 in Bacterial Cells

[5940] In this example, 47508 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 47508 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-47508 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.

Example 33 Expression of Recombinant 47508 Protein in COS Cells

[5941] To express the 47508 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 47508 protein, with an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment, is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5942] To construct the plasmid, the 47508 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 47508 coding sequence starting from the initiation codon; the 3′ primer contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 47508 coding sequence. The PCR-amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 47508 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5943] COS cells are subsequently transfected with the 47508-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 47508 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA-specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA-specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5944] Alternatively, DNA containing the 47508 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 47508 polypeptide is detected by radiolabelling and immunoprecipitation using a 47508 specific monoclonal antibody.

Examples for 56939

Example 34 Identification and Characterization of Human 56939 cDNA

[5945] The human 56939 nucleic acid sequence is recited as follows: 24 GCTCCAAGATGTCAGCAACGCTGATCCTGGAGCCCCCAGGCCGCTGCTGCTGGA (SEQ ID NO: 48) ACGAGCCGGTGCGCATTGCCGTGCGCGGCCTGGCCCCGGAGCAGCGGGTTACGC TGCGCGCGTCCCTGCGCGACGAGAAGGGCGCGCTCTTCCGGGCCCACGCGCGCT ACTGCGCCGACGCCTGCGGCGAGCTGGACCTGGAGCGCGCACCCGCGCTGGGCG GCAGCTTCGCGGGACTCGAGCCCATGGGGCTGCTCTGGGCCCTGGAACCCGAGA AGCCTTTTTGGCGCTTCCTGAAGCGGGACGTACAGATTCCTTTTGTCGTGGAGTT GGAGGTGCTGGACGGCCACGACCCCGAGCCTGGACGGCTGCTGTGCCAGGCGCA GCACGAGCGCCACTTCCTCCCGCCAGGGGTGCGGCGCCAGTCGGTGCGAGCGGG CCGGGTGCGCGCCACGCTCTTCCTGCCGCCAGGACCTGGACCCTTCCCAGGGATC ATTGACATCTTTGGTATTGGAGGGGGCCTCTTGGAATATCGAGCCAGCCTCCTTG CTGGCCATGGCTTTGCCACGTTGGCTCTAGCTTATTATAACTTTGAAGATCTCCCC AATAACATGGACAACATATCCCTGGAGTACTTCGAAGAAGCCGTATGCTACATG CTTCAACATCCCCAGGTAAAAGGCCCAGGCATTGGGCTTTTGGGCATTTCTCTAG GAGCTGATATTTGTCTCTCAATGGCCTCATTCTTGAAGAATGTCTCAGCCACAGT TTCCATCAATGGATCTGGGATCAGTGGGAACACAGCCATCAACTATAAGCACAG TAGCATTCCACCATTGGGCTATGACCTGAGGAGAATCAAGGTAGCTTTCTCAGGC CTCGTGGACATCGTGGATATAAGGAATGCTCTCGTAGGAGGGTACAAGAACCCC AGCATGATTCCAATAGAGAAGGCCCAGGGGCCCATCCTGCTCATTGTTGGTCAG GATGACCATAACTGGAGAAGTGAGTTGTATGCCCAAACAGTCTCTGAACGGTTA CAGGCCCATGGAAAGGAAAAACCCCAGATCATCTGTTACCCTGGGACTGGGCAT TACATCGAGCCTCCTTACTTCCCCCTGTGCCCAGCTTCCCTTCACAGATTACTGAA CAAACATGTTATATGGGGTGGGGAGCCCAGGGCTCATTCTAAGGCCCAGGAAGA TGCCTGGAAGCAAATTCTAGCCTTCTTCTGCAAACACCTGGGAGGTACCCAGAA AACAGCTGTCCCTAAATTGTAATGCATTTGTCTGTTGTTGACATGAGAGATTCAA GATCAGATTCTAGTGTTCAGTAACCCTATGTGAATCAGATGTCTCCTGGATAACA TTAAAGCCATGTCTTTGTCATTAAAAAAA.

[5946] The human 56939 sequence (SEQ ID NO:48) is approximately 1391 nucleotides long and includes an initiation codon (ATG) and a termination codon (TAA), which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1266 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:48; SEQ ID NO:50). The coding sequence encodes a 421 amino acid protein (SEQ ID NO:49), which is recited as follows: MSATLILEPPGRCCWNEPVRIAVRGLAPEQRVTLRASLRDEKGALFRAHARYCADA CGELDLERAPALGGSFAGLEPMGLLWALEPEKPFWRFLKRDVQIPFVVELEVLDGH DPEPGRLLCQAQHERHFLPPGVRRQSVRAGRVRATLFLPPGPGPFPGIIDIFGIGGGLL EYRASLLAGHGFATLALAYYNFEDLPNNMDNISLEYFEEAVCYMLQHPQVKGPGIG LLGISLGADICLSMASFLKNVSATVSINGSGISGNTAINYKHSSIPPLGYDLRRIKVAFS GLVDIVDIRNALVGGYKNPSMIPIEKAQGPILLIVGQDDHNWRSELYAQTVSERLQA HGKEKPQIICYPGTGHYIEPPYFPLCPASLHRLLNKHVIWGGEPRAHSKAQEDAWKQ ILAFFCKHLGGTQKTAVPKL (SEQ ID NO: 49).
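The relationship between the coding sequence length and the protein length stated above can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code; the function names `translate` and `expected_protein_length` are hypothetical helpers introduced here for illustration, and only short fragments from the start and end of the 56939 coding region are used.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()  # drop whitespace left over from line wrapping
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":  # termination codon
            break
        protein.append(residue)
    return "".join(protein)


def expected_protein_length(cds_length_nt: int) -> int:
    """Residues encoded by a coding sequence whose length includes the stop codon."""
    return cds_length_nt // 3 - 1


# Fragments copied from SEQ ID NO:48: the first nine codons and the last five codons.
first_codons = "ATGTCAGCAACGCTGATCCTGGAGCCC"
last_codons = "GTCCCTAAATTGTAA"

print(translate(first_codons))
print(translate(last_codons))
print(expected_protein_length(1266))
```

Applied to the full coding region, the same routine would be expected to reproduce SEQ ID NO:49; a coding sequence of 1266 nucleotides corresponds to 422 codons, that is, 421 residues plus the termination codon.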

Example 35 Tissue Distribution of 56939 mRNA by TaqMan Analysis

[5947] Endogenous human 56939 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).
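The text does not state how threshold cycles were converted into the relative expression values reported in the tables. One commonly used approach is the comparative Ct calculation, in which the target signal is normalized to the housekeeping gene measured in the same sample. The sketch below assumes that approach and assumes ideal amplification (a doubling of product per cycle); the function name and the Ct values in the usage lines are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Comparative Ct estimate of target abundance relative to a reference gene.

    ct_target    -- threshold cycle of the gene of interest (e.g., 56939)
    ct_reference -- threshold cycle of the housekeeping gene (e.g., GAPDH)
    efficiency   -- fold amplification per cycle; 2.0 assumes perfect doubling
    """
    delta_ct = ct_target - ct_reference
    # Each additional cycle needed to reach threshold implies
    # `efficiency`-fold less starting template.
    return efficiency ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles after the reference.
print(relative_expression(ct_target=25.0, ct_reference=20.0))
# A sample with a lower target Ct contains more starting template.
print(relative_expression(ct_target=22.0, ct_reference=20.0))
```

A sample whose target crosses the threshold at the same cycle as the reference gives a value of 1; every additional cycle halves the estimate under the doubling assumption.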

[5948] To determine the level of 56939 in various human tissues, a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 12, 13, 14, and 15.

[5949] Table 12 below depicts the expression of 56939 mRNA in a panel of normal and tumor human tissues using TaqMan analysis. Elevated expression of 56939 was found in the following tissues: kidney, brain cortex, brain hypothalamus, dorsal root ganglion, colon tumor, and liver cells, including normal liver and liver fibrosis. Lower levels of expression were detected in normal and CHF heart, skeletal muscle, normal adipose tissue, pancreas, primary osteoclasts, normal skin, normal breast, breast tumor, normal ovary, ovary tumor, normal prostate, prostate tumor, prostate epithelial cells, normal colon, colon tumor, normal lung, lung tumor, and skin decubitus. Elevated levels of 56939 can be observed in many tumor samples as compared to their normal tissue counterparts.

TABLE 12
Tissue | Expression
Artery normal | 0
Vein normal | 0
Aortic smooth muscle cells (EARLY) | 0.186
Coronary smooth muscle cells (SMC) | 0
Static human umbilical vein endothelial cells (HUVEC) | 0
Shear HUVEC | 0
Heart normal | 0.3785
Heart (congestive heart failure) | 1.273
Kidney | 8.1867
Skeletal Muscle | 0.9064
Adipose normal | 2.4594
Pancreas | 1.6624
primary osteoblasts | 0.6636
Osteoclasts (differentiated) | 1.2425
Skin normal | 0.7973
Spinal cord normal | 0.2144
Brain Cortex normal | 28.8057
Brain Hypothalamus normal | 5.4766
Nerve | 0
DRG (Dorsal Root Ganglion) | 3.4065
Glial Cells (Astrocytes) | 0.6659
Glioblastoma | 0
Breast normal | 0.4824
Breast tumor | 1.0093
Ovary normal | 0.4071
Ovary Tumor | 0.7465
Prostate Normal | 0.2695
Prostate Tumor | 0.3932
Epithelial Cells (Prostate) | 0.6387
Colon normal | 0.1021
Colon Tumor | 2.5996
Lung normal | 0.1244
Lung tumor | 0.3694
Lung COPD | 0.1608
Colon IBD | 0.0226
Liver normal | 5.5147
Liver fibrosis | 17.9795
Dermal Cells- fibroblasts | 0.0231
Spleen normal | 0
Tonsil normal | 0.0107
Lymph node | 0
Small intestine | 0.2497
Skin-Decubitus | 0.7465
Synovium | 0
BM-MNC (Bone marrow mononuclear cells) | 0
Activated PB-MC | 0.0584
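The comparison between tumor samples and their normal counterparts can be made explicit by taking the ratio of the paired values from Table 12. The following is a minimal sketch; the dictionary layout and the function name are illustrative choices, and the numbers are copied from the table without any further normalization or statistical treatment.

```python
# (normal, tumor) relative expression values for 56939, copied from Table 12.
TABLE_12_PAIRS = {
    "Breast": (0.4824, 1.0093),
    "Ovary": (0.4071, 0.7465),
    "Prostate": (0.2695, 0.3932),
    "Colon": (0.1021, 2.5996),
    "Lung": (0.1244, 0.3694),
}


def tumor_to_normal_ratios(pairs: dict) -> dict:
    """Return the tumor/normal expression ratio for each tissue."""
    return {tissue: tumor / normal for tissue, (normal, tumor) in pairs.items()}


ratios = tumor_to_normal_ratios(TABLE_12_PAIRS)
# List tissues from the largest to the smallest tumor/normal ratio.
for tissue, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{tissue}: {ratio:.2f}-fold")
```

Each of the five pairs gives a ratio above 1, with colon showing the largest difference. Because each value comes from a single sample, the ratios are descriptive only.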

[5950] Table 13 below depicts the expression of 56939 mRNA in a panel of normal and diseased vessel tissue using TaqMan analysis. Elevated expression of 56939 was found in adipose tissue. Lower levels of expression were detected in normal carotid artery, normal muscular artery, diseased iliac artery, diseased tibial artery, diseased aorta, normal saphenous vein, diseased saphenous vein, normal vein, normal coronary artery, HUVEC, and HAEC.

TABLE 13
Tissue | Expression
Aortic SMC | 0.07
Human microvascular endothelial cells (HMVEC) | 0.03
Human/Adipose | 4.68
Human/Artery/Normal/Carotid | 0.29
Human/Artery/Normal/Carotid | 0.30
Human/Artery/Normal/Muscular | 0.15
Human/Artery/Diseased/Iliac | 0.12
Human/Artery/Diseased/Tibial | 0.08
Human/Aorta/Diseased | 0.20
Human/Vein/Normal/Saphenous | 0.05
Human/Vein/Normal/Saphenous | 0.08
Human/Vein/Normal/Saphenous | 0.07
Human/Vein/Normal/Saphenous | 0.16
Human/Vein/Diseased/Saphenous | 0.07
Human/Vein/Normal | 0.13
Human/Vein/Normal/Saphenous | 0.09
Human/Vein/Normal | 0.00
Mouse/Artery/Normal/Coronary | 0.18
Mouse/Artery/Normal/Coronary | 0.24
Mouse/Artery/Normal/Coronary | 0.24
Mouse/Artery/Normal/Coronary | 0.32
Mouse/Vein/Normal | 0.10
HUVEC Vehicle | 0.14
HUVECM | 0.08
Human amniotic epithelial cells (HAEC) Vehicle | 0.18
HAECM | 0.07

[5951] Table 14 below depicts the expression of 56939 mRNA in a panel of normal and diseased liver, heart, kidney, and skeletal muscle tissue, using TaqMan analysis. The panel also includes the expression of 56939 mRNA in biopsied monkey liver following feeding of the indicated diet (e.g., a standard chow diet (CHOW), diets containing polyunsaturated fatty acids (POLY), and diets containing both polyunsaturated fatty acids and cholesterol (CHOL)). Elevated expression of 56939 mRNA can be observed in normal kidney.

TABLE 14
Tissue | Expression
Normal Heart | 0.20
Normal Kidney | 1.45
Normal Skeletal Muscle | 0.25
Normal Liver | 0.17
Normal Liver | 0.15
Normal Liver | 0.31
Normal Liver | 0.06
Normal Liver | 0.80
Normal Liver | 0.35
Normal Liver | 0.03
Diseased Liver | 0.04
Diseased Liver | 0.15
Diseased Liver | 0.12
Diseased Liver | 0.69
Monkey Liver (Chow Diet) | 0.38
Monkey Liver (Poly Diet W/out CHOL) | 0.27
Monkey Liver (Poly Diet W/CHOL) | 0.28
Monkey Liver (Chow Diet) | 0.21
Monkey Liver (Sat Diet W/out CHOL) | 7.64
Monkey Liver (Sat Diet W/CHOL) | 0.20
CHOW DIET | 0.12
POLY DIET W/OUT CHOL | 0.09
POLY DIET W/CHOL | 0.11
CHOW DIET | 0.07
POLY DIET W/OUT CHOL | 0.05
POLY DIET W/CHOL | 0.06
CHOW DIET | 0.04
POLY DIET W/OUT CHOL | 0.14
POLY DIET W/CHOL | 0.12
CHOW DIET | 0.06
POLY DIET W/OUT CHOL | 0.09
POLY DIET W/CHOL | 0.07
CHOW DIET | 0.08
POLY DIET W/CHOL | 0.20
CHOW DIET | 0.10
POLY DIET W/OUT CHOL | 0.19
POLY DIET W/CHOL | 0.14
CHOW DIET | 0.07
POLY DIET W/OUT CHOL | 0.15
POLY DIET W/CHOL | 0.22
CHOW DIET | 0.28
POLY DIET W/OUT CHOL | 0.30
POLY DIET W/CHOL | 0.34
Liver | 1.62
Liver | 0.38
Liver | 0.08
Liver | 0.28

[5952] Table 15 below depicts the expression of 56939 mRNA in a panel of biopsied monkey livers following feeding of the indicated diet (i.e., a standard chow diet (CHOW), and diets containing polyunsaturated fatty acids (POLY) with and without cholesterol (CHOL), and diets containing monounsaturated fatty acids (MONO) with and without cholesterol (CHOL)), using TaqMan analysis.

TABLE 15
Tissue | Expression
CHOW DIET | 0.256
POLY DIET W/OUT CHOL | 0.178
POLY DIET W/CHOL | 0.297
CHOW DIET | 0.309
POLY DIET W/OUT CHOL | 0.283
POLY DIET W/CHOL | 0.064
CHOW DIET | 0.066
POLY DIET W/OUT CHOL | 0.202
POLY DIET W/CHOL | 0.253
CHOW DIET | 0.196
POLY DIET W/OUT CHOL | 0.315
POLY DIET W/CHOL | 0.062
CHOW DIET | 0.069
POLY DIET W/OUT CHOL | 0.115
POLY DIET W/CHOL | 0.352
CHOW DIET | 0.268
POLY DIET W/OUT CHOL | 0.378
POLY DIET W/CHOL | 0.090
CHOW DIET | 0.061
POLY DIET W/OUT CHOL | 0.133
POLY DIET W/CHOL | 0.147
CHOW DIET | 0.262
POLY DIET W/OUT CHOL | 0.262
POLY DIET W/CHOL | 0.078
CHOW DIET | 0.035
MONO DIET W/OUT CHOL | 0.079
MONO DIET W/CHOL | 0.187
CHOW DIET | 0.165
MONO DIET W/OUT CHOL | 0.132
MONO DIET W/CHOL | 0.053
CHOW DIET | 0.034
MONO DIET W/OUT CHOL | 0.050
MONO DIET W/CHOL | 0.063
CHOW DIET | 0.445
MONO DIET W/OUT CHOL | 0.321
MONO DIET W/CHOL | 0.046
CHOW DIET | 0.113
MONO DIET W/OUT CHOL | 0.111
MONO DIET W/CHOL | 0.116
CHOW DIET | 0.217
MONO DIET W/OUT CHOL | 0.094
MONO DIET W/CHOL | 0.050
CHOW DIET | 0.136
MONO DIET W/OUT CHOL | 0.176
MONO DIET W/CHOL | 0.266
CHOW DIET | 0.435
MONO DIET W/CHOL | 0.535

Example 36 Tissue Distribution of 56939 mRNA by Northern Analysis

[5953] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 56939 cDNA (SEQ ID NO:48) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 37 Recombinant Expression of 56939 in Bacterial Cells

[5954] In this example, 56939 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 56939 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-56939 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
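Before running the gel, it can be useful to have a rough expectation of where the fusion polypeptide should migrate. The sketch below is a back-of-the-envelope estimate only: it assumes an average residue mass of about 110 Da and a GST moiety of about 26 kDa, both rule-of-thumb values that are not taken from this document, and the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average mass per amino acid
GST_TAG_MASS_KDA = 26.0          # assumed approximate mass of the GST moiety


def estimate_mass_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def estimate_gst_fusion_mass_kda(n_residues: int) -> float:
    """Rough mass of a GST fusion carrying a protein of `n_residues` residues."""
    return GST_TAG_MASS_KDA + estimate_mass_kda(n_residues)


# 56939 is described as a 421 amino acid protein (SEQ ID NO:49).
print(round(estimate_mass_kda(421), 1))
print(round(estimate_gst_fusion_mass_kda(421), 1))
```

An exact value would require summing the residue masses of the actual sequence, and linker residues contributed by the expression vector are ignored here.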

Example 38 Expression of Recombinant 56939 Protein in COS Cells

[5955] To express the 56939 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 56939 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5956] To construct the plasmid, the 56939 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 56939 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 56939 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 56939 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
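The primer layout described above can be sketched in a few lines. This is an illustration under stated assumptions, not the primers actually used: the restriction sites (EcoRI, GAATTC, and XhoI, CTCGAG) are hypothetical choices that the text does not name, the HA tag is encoded with one possible nucleotide sequence for the epitope YPYDVPDYA, TAA is an arbitrary choice of stop codon, and the toy coding sequence simply joins the first and last codons of 56939. Any site chosen in practice would also need to be absent from the insert itself.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

# One possible nucleotide encoding of the HA epitope YPYDVPDYA (assumption).
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_primers(cds: str, site_5p: str, site_3p: str, tag: str, n: int = 20):
    """Build a forward and a reverse primer for tagging the 3' end of a coding sequence.

    The native stop codon is dropped so that the tag is read in frame,
    and a new stop codon (TAA) is placed after the tag.
    """
    cds = cds.upper()
    body = cds[:-3] if cds[-3:] in STOP_CODONS else cds
    # 5' primer: restriction site followed by the first n nucleotides of the CDS.
    forward = site_5p + body[:n]
    # 3' primer: reverse complement of (last n nt of CDS + tag + stop + site).
    sense_strand_end = body[-n:] + tag + "TAA" + site_3p
    reverse = reverse_complement(sense_strand_end)
    return forward, reverse


# Toy coding sequence: first nine and last five codons of 56939 joined together.
toy_cds = "ATGTCAGCAACGCTGATCCTGGAGCCCGTCCCTAAATTGTAA"
fwd, rev = design_primers(toy_cds, site_5p="GAATTC", site_3p="CTCGAG", tag=HA_TAG)
print("forward:", fwd)
print("reverse:", rev)
```

In practice a few extra nucleotides are usually added 5′ of each restriction site so that the enzyme cuts efficiently near the end of the PCR product; that detail is omitted here.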

[5957] COS cells are subsequently transfected with the 56939-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 56939 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.
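The RIPA buffer composition given above can be converted into pipetting volumes with the usual dilution relation (final concentration divided by stock concentration, times final volume). The sketch below assumes a set of stock solutions (5 M NaCl, 1 M Tris pH 7.5, and 10% stocks of NP-40, SDS and DOC); these stock concentrations are hypothetical and are not specified in the text.

```python
# Final RIPA composition as stated in the text (mM for salts, % for detergents).
RIPA_FINAL = {
    "NaCl": 150.0,        # mM
    "NP-40": 1.0,         # %
    "SDS": 0.1,           # %
    "DOC": 0.5,           # %
    "Tris pH 7.5": 50.0,  # mM
}

# Assumed stock solutions, in the same units as the matching final values.
STOCKS = {
    "NaCl": 5000.0,        # 5 M
    "NP-40": 10.0,         # 10 %
    "SDS": 10.0,           # 10 %
    "DOC": 10.0,           # 10 %
    "Tris pH 7.5": 1000.0, # 1 M
}


def stock_volumes_ml(final_volume_ml: float) -> dict:
    """Volume of each stock (mL) needed for `final_volume_ml` of buffer, plus water."""
    volumes = {
        name: final / STOCKS[name] * final_volume_ml
        for name, final in RIPA_FINAL.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for component, volume in stock_volumes_ml(100.0).items():
    print(f"{component}: {volume:.1f} mL")
```

Changing the entries in `STOCKS` adapts the calculation to whatever stock solutions are actually on hand.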

[5958] Alternatively, DNA containing the 56939 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 56939 polypeptide is detected by radiolabelling and immunoprecipitation using a 56939 specific monoclonal antibody.

Examples for 33410

Example 39 Identification and Characterization of Human 33410 cDNA

[5959] The human 33410 sequence (FIG. 26; SEQ ID NO:53), which is approximately 4667 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 2508 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:53; SEQ ID NO:55). The coding sequence encodes an 835 amino acid protein (SEQ ID NO:54).
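The text does not say how the coding sequence was predicted within the 4667 nucleotide cDNA. A simple and common heuristic is to take the longest open reading frame that starts with ATG and ends at the first in-frame stop codon. The sketch below assumes that heuristic and searches the forward strand only; the function name and the toy input are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Return the longest ATG-to-stop open reading frame on the forward strand.

    Every ATG is tried as a start; the frame is read in steps of three until
    the first in-frame stop codon. ORFs without a stop codon are ignored.
    """
    seq = "".join(seq.split()).upper()
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orf = seq[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                break
    return best


# Toy sequence with a short ORF followed by a longer one in another frame.
toy = "CCATGAAATGACCATGGCTGCTAAATAGGG"
orf = longest_orf(toy)
print(orf, len(orf) // 3 - 1, "residues")
```

As a consistency check on the stated figures, a 2508 nucleotide coding sequence corresponds to 836 codons, that is, 835 residues plus the termination codon. Note that the first ATG in a cDNA is not necessarily the initiation codon, which is why all candidate starts are compared.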

[5960] The location of the initiation and termination codons is indicated by the underline. The human 33410 nucleic acid sequence is recited as follows: 30 GGCACGAGGCAACTTGGTCTGAATTCCAGGTCACTAACCACTTGTCTCTTCTGTT (SEQ ID NO: 53) TCCCCATTCCTTTCTGTCTGCCCCATCCAATTTCCTTTGCCCTCTTCCACCTCTGTA TTTTTCTGTCTGTCCGTCTGTCTGTATCCTGCCTCCCTGCCCCTCTCGCTCCACCCC CCGCAGGTCGGGCCTGCCTTCACCTTCTCCCACTTCCTTCCCCTTCCCCACCCCGT GCCCCCTCCATGGAGAGGAACAGACCCCTTCTCTGTCCAGTCTAACCCAGGTCCC TCCCCAACCCCCTCCTCCCTCCTTTCCCCCCGCCCCTCCTCCCTCCTGGGGCGAGG GGGGCCTCCCTCCCTCTCCCCCCCTTCTCTCTCTCTCCGAGGGGGGGGGGTCCCA GGGAGGGAGGGGGGGTCCCCCGATCAGCATGTGGCTCCTGGCGCTGTGTCTGGT GGGGCTGGCGGGGGCTCAACGCGGGGGAGGGGGTCCCGGCGGCGGCGCCCCGG GCGGCCCCGGCCTGGGCCTCGGCAGCCTCGGCGAGGAGCGCTTCCCGGTGGTGA ACACGGCCTACGGGCGAGTGCGCGGTGTGCGGCGCGAGCTCAACAACGAGATCC TGGGCCCCGTCGTGCAGTTCTTGGGCGTGCCCTACGCCACGCCGCCCCTGGGCGC CCGCCGCTTCCAGCCGCCTGAGGCGCCCGCCTCGTGGCCCGGCGTGCGCAACGC CACCACCCTGCCGCCCGCCTGCCCGCAGAACCTGCACGGGGCGCTGCCCGCCAT CATGCTGCCTGTGTGGTTCACCGACAACTTGGAGGCGGCCGCCACCTACGTGCA GAACCAGAGCGAGGACTGCCTGTACCTCAACCTCTACGTGCCCACCGAGGACGG TCCGCTCACAAAAAAACGTGACGAGGCGACGCTCAATCCGCCAGACACAGATAT CCGTGACCCTGGGAAGAAGCCTGTGATGCTGTTTCTCCATGGCGGCTCCTACATG GAGGGGACCGGAAACATGTTCGATGGCTCAGTCCTGGCTGCCTATGGCAACGTC ATTGTAGCCACGCTCAACTACCGTCTTGGGGTGCTCGGTTTTCTCAGCACCGGGG ACCAGGCTGCAAAAGGCAACTATGGGCTCCTGGACCAGATCCAGGCCCTGCGCT GGCTCAGTGAAAACATCGCCCACTTTGGGGGCGACCCCGAGCGTATCACCATCT TTGGTTCCGGGGCAGGGGCCTCCTGCGTCAACCTTCTGATCCTCTCCCACCATTC AGAAGGGCTGTTCCAGAAGGCCATCGCCCAGAGTGGCACCGCCATTTCCAGCTG GTCTGTCAACTACCAGCCGCTCAAGTACACGCGGCTGCTGGCAGCCAAGGTGGG CTGTGACCGAGAGGACAGTGCTGAAGCTGTGGAGTGTCTGCGCCGGAAGCCCTC CCGGGAGCTGGTGGACCAGGACGTGCAGCCTGCCCGCTACCACATCGCCTTTGG GCCCGTGGTGGATGGCGACGTGGTCCCCGATGACCCTGAGATCCTCATGCAGCA GGGAGAATTCCTCAACTACGACATGCTCATCGGCGTCAACCAGGGAGAGGGCCT CAAGTTCGTGGAGGACTCTGCAGAGAGCGAGGACGGTGTGTCTGCCAGCGCCTT TGACTTCACTGTCTCCAACTTTGTGGACAACCTGTATGGCTACCCGGAAGGCAAG GATGTGCTTCGGGAGACCATCAAGTTTATGTACACAGACTGGGCCGACCGGGAC 
AATGGCGAAATGCGCCGCAAAACCCTGCTGGCGCTCTTTACTGACCACCAATGG GTGGCACCAGCTGTGGCCACTGCCAAGCTGCACGCCGACTACCAGTCTCCCGTCT ACTTTTACACCTTCTACCACCACTGCCAGGCGGAGGGCCGGCCTGAGTGGGCAG ATGCGGCGCACGGGGATGAACTGCCCTATGTCTTTGGCGTGCCCATGGTGGGTGC CACCGACCTCTTCCCCTGTAACTTCTCCAAGAATGACGTCATGCTCAGTGCCGTG GTCATGACCTACTGGACCAACTTCGCCAAGACTGGGGACCCCAACCAGCCGGTG CCGCAGGATACCAAGTTCATCCACACCAAGCCCAATCGCTTCGAGGAGGTGGTG TGGAGCAAATTCAACAGCAAGGAGAAGCAGTATCTGCACATAGGCCTGAAGCCA CGCGTGCGTGACAACTACCGCGCCAACAAGGTGGCCTTCTGGCTGGAGCTCGTG CCCCACCTGCACAACCTGCACACGGAGCTCTTCACCACCACCACGCGCCTGCCTC CCTACGCCACGCGCTGGCCGCCTCGTCCCCCCGCTGGCGCCCCGGGCACACGCC GGCCCCCGCCGCCTGCCACCCTGCCTCCCGAGCCCGAGCCCGAGCCCGGCCCAA GGGCCTATGACCGCTTCCCCGGGGACTCACGGGACTACTCCACGGAGCTGAGCG TCACCGTGGCCGTGGGTGCCTCCCTCCTCTTCCTCAACATCCTGGCCTTTGCTGCC CTCTACTACAAGCGGGACCGGCGGCAGGAGCTGCGGTGCAGGCGGCTTAGCCCA CCTGGCGGCTCAGGCTCTGGCGTGCCTGGTGGGGGCCCCCTGCTCCCCGCCGCGG GCCGTGAGCTGCCACCAGAGGAGGAGCTGGTGTCACTGCAGCTGAAGCGGGGTG GTGGCGTCGGGGCGGACCCTGCCGAGGCTCTGCGCCCTGCCTGCCCGCCCGACT ACACCCTGGCCCTGCGCCGGGCACCGGACGATGTGCCTCTCTTGGCCCCCGGGG CCCTGACCCTGCTGCCCAGTGGCCTGGGGCCACCGCCACCCCCACCGCCCCCCTC CCTTCATCCCTTCGGGCCCTTCCCCCCGCCCCCTCCCACCGCCACCAGCCACAAC AACACGCTACCCCACCCCCACTCCACCACTCGGGTATAGGGGGTGGGTGGGGAG GCCCTCCTCCCCGGCCCTCCCTGGCCCGGCCACTCCGAAGGCAGGGAGGAGGAC TTGGCAACTGGCTTTTCTCCTGTGGAGTCGTCACACGCCATCCAGCAGCGCTAAG GTGGACATGGGATTCCTCCCTGCGATGCGTGTCTTTCCCACGCAGAGAAGCCCCA GTCTCTTCTCTGGATCTGGGCCTTTGAACAACTGGGGGGCGTTTTCTCCCCCCCAT TGGGACACCAGTCTTCGGTGTGTGGAATGTGGTATTTTCCCGCGTGGAGGTGTGC TTTCTCACAACGGGGTGTGTTTTCCCATGTGCAGGGTGAGGTTTTTTTTTGCCACC CTGGACACATGTTGGCCCCCTCAAAGAATTTCTGTGGGGATTTGTACCCCAGAAT CCTGTTCCCCCATCCCTTCTCCCACCTCCTCCCCTCTCCCTCCCCCTGGAGACCCT GGAAGTGGTGTGTTCACATACAGTGACCCTTGGCCACCAGACCACAGAGGATGG AGCCTGGGAAGCAGCGAGGAAATCACAGCCCCCTCGCCCCTGCCTCCCTTGCCC CTACCCCGGCGAAGCATGTTCCCCCCGACGCCCCCCTTGGCACAAGTCAGATGA AGCACGTTCTGCCGGGGAGGCCCTCACCTTCCAGAGAGGACAGACACAGATTTC CTGCTGGGGGAGGGAGGAGTCCACGCATCCTGATGCTGCCTGGAAGCTTATTTTC 
CCGTGGCCAGGACGCATTTCTCTGAGTGGAAACAGGTTCTTGCATGTGGATGTGT GTTTCCCCAGGCAGACGGCCCCTCTCTTCCCAGCACTTCCCTGCCTCCCCCAGGC CTCAGGCCCAGCACCCAGTTCCTCCTCACATGGCAGGTGAGCACAGACTTCTAGT TGGCAGGAGCTGAGGAGGGTGAACAAACCCCGAGGGAGGCCCGGCCCTTGCTCC CGAGTTGGGGGGAGGGGGTGTGGCAACGTGCCCCCCGCAGAGGCCACGCATGTT TGACCAAAGCCCTCATTGTGGTCCGAGGACAGCCTTTTCCCCAGGCCTCAGAGCA TTGCTCATCCGTGCCAAACTGGGTAGGTGGATTTGAGCGGAAAGACTCCCAAAA TGTGCCAAGAATTTCCCAGTCCCAGGCAGGGCAGGGGAAACTAAGGGCAAGCA GGATACAGGGCGAGGGATGTGGCAGGTGAGGGGGCTCCCGCCTGTGCCCCTTCT CCTCACCATGTCTCCCCCACCCTGCCTCAGTTCTCCGTTCCCCTTCATCTCCGTCC CCCTCTTTGAAGCTGTCCCCATCTCAGTGTCAGACCAGCCTTCTCCTCATCTGACC ACCCTCCTCTGACCGACGCCCCCTCCTTGTCTGAAAGAAAGGAGCCTTGAATGGT GGAGGGAGGCAGTGGGGAGAAAGGTCTCACCGGACAGGTTGGGAGAATGAGGT CAGCGGTGCTGGGGAACAGATGGAGGGGGCAGTGGGGACAGGGCTTGGGCAGA CACCAGCAGGAATAATTTGAAATGTGTGAGGTGACTCCCCGGAGGGCCTTGGGC TTGGGCATTTGGGAAAAGAATGATGTCTGGAAGGGCTTAAGGGACACAGTGGAC GAGGGGAGAGTCCTCATCTGCTGGCATTTTGTGGGGTGTTAGTGCCAAACTTGAA TAGGGGCTGGGGTGCTGTCTTCCACTGACACCCAAATCCAGAATCCCTGGTCTTG AGTCCCAGAACTTTGCCTCTTGACTGTCCCTC.

[5961] The coding sequence encodes a 835 amino acid protein (SEQ ID NO:54) and has the following amino acid sequence: 31 MWLLALCLVGLAGAQRGGGGPGGGAPGGPGLGLGSLGEERFPVVNTAYGRVRGV (SEQ ID NO: 54) RRELNNEILGPVVQFLGVPYATPPLGARRFQPPEAPASWPGVRNATTLPPACPQNLH GALPAIMLPVWFTDNLEAAATYVQNQSEDCLYLNLYVPTEDGPLTKKRDEATLNPP DTDIRDPGKKLPVMLFLHGGSYMEGTGNMFDGSVLAAYGNVIVATLNYRLGVLGFL STGDQAAKGNYGLLDQIQALRWLSENIAHFGGDPERITIFGSGAGASCVNLLILSHHS EGLFQKAIAQSGTAISSWSVNYQPLKYTRLLAAKVGCDREDSAEAVECLRRKPSREI VDQDVQPARYHIAFGPVVDGDVVPDDPEILMQQGEFLNYDMLIGVNQGEGLKFVED SAESEDGVSASAFDFTVSNFVDNLYGYPEGKDVLRETIKFMYTDWADRDNGEMRR KTLLALFTDHQWVAPAVATAKLHADYQSPVYFYTFYHHCQAEGRPEWADAAHGD ELPYVFGVPMVGATDLFPCNFSKNDVMLSAVVMTYWTNFAKTGDPNQPVPQDTKF IHTKPNRFEEVVWSKFNSKEKQYLHIGLKPRVRDNYRANKVAFWLELVPHLHNLHT ELFTTTTRLPPYATRWPPRPPAGAPGTRRPPPPATLPPEPEPEPGPRAYDRFPGDSRDY STELSVTVAVGASLLFLNILAFAALYYKRDRRQELRCRRLSPPGGSGSGVPGGGPLLP AAGRELPPEEELVSLQLKRGGGVGADPAEALRPACPPDYTLALRRAPDDVPLLAPG ALTLLPSGLGPPPPPPPPSLHPFGPFPPPPPTATSHNNTLPHPHSTTRV.

Example 40 Tissue Distribution of 33410 mRNA by Large-Scale Tissue-Specific Library Sequencing and by Northern Blot Hybridization

[5962] Endogenous human 33410 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) that has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a way of quantitating the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe that has been labeled with a different fluorophore on the 5′ end (typically VIC) and is directed toward, for example, a housekeeping gene such as GAPDH.

[5963] To determine the level of 33410 in various human tissues, a primer/probe set was designed using Primer Express (Perkin-Elmer) software and primary cDNA sequence information. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction.

[5964] Tissues tested included the human tissues shown in Table 16 below, including ovary, kidney, spleen, lung, liver, and colon, among others. Expression was found primarily in normal brain cortex, with expression to a lesser extent in normal artery, normal heart, normal kidney, normal spinal cord, normal brain hypothalamus, and normal ovary (Table 16). TaqMan analysis of 33410 RNA expression was also performed on a human liver panel comprising the following tissues: heart, kidney, skeletal muscle, and several samples of normal and diseased liver. Elevated expression was detected in heart tissue.

TABLE 16 Expression Data
Tissue Type | Relative Expression
Artery normal | 2.4894
Aorta diseased | 0
Vein normal | 0
Coronary SMC | 0
HUVEC | 12.9581
Hemangioma | 0.94
Heart normal | 0.398
Heart CHF | 0
Kidney | 0.2608
Skeletal Muscle | 0
Adipose normal | 0
Pancreas | 0
primary osteoblasts | 0
Osteoclasts (diff) | 0
Skin normal | 0
Spinal cord normal | 1.3621
Brain Cortex normal | 68.8691
Brain Hypothalamus normal | 9.3553
Nerve | 0
DRG (Dorsal Root Ganglion) | 0
Breast normal | 0
Breast tumor | 0
Ovary normal | 1.4957
Ovary tumor | 0
Prostate normal | 0
Prostate tumor | 0
Salivary glands | 0
Colon normal | 0
Colon tumor | 0
Lung normal | 0
Lung tumor | 0
Lung COPD | 0
Colon IBD | 0
Liver normal | 0
Liver fibrosis | 0
Spleen normal | 0
Tonsil normal | 0
Lymph node normal | 0
Small intestine normal | 0
Macrophages | 0
Synovium | 0
BM-MNC | 0
Activated PBMC | 0
Neutrophils | 0
Megakaryocytes | 0
Erythroid | 0
positive control | 11.3592

[5965] Table 16 depicts the expression of 33410 RNA in a panel of normal and tumor human tissues detected using TaqMan analysis. Elevated expression was detected in normal artery, HUVEC, hemangioma, normal heart, kidney, normal spinal cord, normal brain cortex, brain hypothalamus and normal ovary.
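The list of expressing tissues can be recovered directly from Table 16 by filtering out samples with a value of zero and ranking the rest. The following is a minimal sketch; the dictionary holds the nine nonzero entries plus a few of the zero-valued samples for contrast, the positive control is left out, and the function name is an illustrative choice.

```python
# Relative expression of 33410, copied from Table 16 (subset of zero-valued rows).
TABLE_16 = {
    "Artery normal": 2.4894,
    "HUVEC": 12.9581,
    "Hemangioma": 0.94,
    "Heart normal": 0.398,
    "Kidney": 0.2608,
    "Spinal cord normal": 1.3621,
    "Brain Cortex normal": 68.8691,
    "Brain Hypothalamus normal": 9.3553,
    "Ovary normal": 1.4957,
    "Heart CHF": 0.0,
    "Liver normal": 0.0,
    "Colon tumor": 0.0,
    "Lung tumor": 0.0,
}


def expressing_tissues(table: dict) -> list:
    """Return (tissue, value) pairs with nonzero expression, highest first."""
    detected = [(tissue, value) for tissue, value in table.items() if value > 0]
    return sorted(detected, key=lambda item: item[1], reverse=True)


for tissue, value in expressing_tissues(TABLE_16):
    print(f"{tissue}: {value}")
```

Nine samples remain after filtering, matching the nine tissues named in the paragraph above, with normal brain cortex ranked first.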

[5966] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 33410 cDNA (SEQ ID NO:53) can be used. The DNA was radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 41 Recombinant Expression of 33410 in Bacterial Cells

[5967] In this example, 33410 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 33410 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-33410 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 42 Expression of Recombinant 33410 Protein in COS Cells

[5968] To express the 33410 gene in COS cells, the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 33410 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5969] To construct the plasmid, the 33410 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 33410 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 33410 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably, the two restriction sites chosen are different so that the 33410 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[5970] COS cells are subsequently transfected with the 33410-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 33410 polypeptide is detected by radiolabeling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5971] Alternatively, DNA containing the 33410 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 33410 polypeptide is detected by radiolabeling and immunoprecipitation using a 33410 specific monoclonal antibody.

Examples for 33521

Example 43 Identification and Characterization of Human 33521 cDNA

[5972] The human 33521 nucleic acid sequence is recited as follows: 33 GCACTGTGACAAGCTGCACGCTCTAGAGTCGACCCAGCAGGCTCTGTGTATGAA (SEQ ID NO: 61) TGACAAGGATACCTTCAGCCAGCTCATTCTGGATGAATGAATGATTACACTAAGT GTCCTCCACATTCCTCTGTGGGCTCACTTCATGGACTCACTTTGCGTGCTTGTTAA ATGTGCTGCGTTGCTCCCAAGACCATGTAAAGCCTACTGACCACTAACCTCCCTC ACAGCAGAAACTAGACGTCAGGTTAAAATGGGCAACTCCGACAGTCAGTACACC CTTCAAGGATCTAAAAATCATAGCAATACTATTACTGGTGCTAAGCAAATTCCTT GCTCCCTGAAAATACGTGGCGTTCATGCAAAAGAGGAAAAGTCATTGCATGGAT GGGGTCACGGGAGCAACGGAGCAGGTTACAAGTCCAGGTCCCTGGCCCGAAGCT GCCTTTCTCACTTTAAGAGTAACCAGCCTTACGCATCGAGACTCGGTGGCCCCAC ATGCAAGGTCTCCAGAGGTGTTGCCTACTCCACGCACAGGACAAATGCCCCAGG GAAGGATTTCCAGGGCATCAGTGCTGCTTTCTCAACTGAGAATGGCTTCCATTCT GTTGGCCACGAGCTGGCAGATAACCACATCACCTCCAGAGACTGCAACGGACAC CTTCTCAACTGCTACGGGAGGAATGAGAGCATTGCCTCCACCCCACCGGGCGAA GACCGCAAGAGCCCCCGAGTGCTCATCAAAACGCTGGGGAAGCCGGATGGGTGT TTAAGGGTCGAGTTCCACAATGGTGGCAACCCCAGCAAAGTGCCTGCAGAGGAC TGCAGTGAGCCGGTGCAGCTGCTGAGGTACTCACCTACCTTAGCATCGGAAACC TCCCCTGTGCCTGAAGCCAGGAGGGGGTCCAGCGCCGATTCCCTGCCCAGCCAT CGCCCCTCTCCCACGGACTCTCGCCTGCGGTCCAGCAAAGGCAGCTCCCTGAGTT CTGAGTCATCCTGGTACGACTCCCCTTGGGGCAATGCTGGAGAGCTGAGCGAGG CTGAGGGCTCCTTCCTGGCCCCCGGCATGCCTGACCCCAGTCTCCATGCCAGCTT CCCACCTGGCGATGCCAAAAAGCCTTTCAACCAAAGCTCTTCCCTCTCCTCCCTC CGGGAACTGTACAAAGATGCCAACCTGGGGAGCCTCTCCCCCTCAGGTATCCGC CTTTCTGATGAATACATGGGCACGCATGCCAGCCTGAGCAACCGTGTCTCTTTTG CTTCCGACATTGATGTGCCCTCCAGAGTGGCACACGGGGACCCCATCCAGTACA GTTCCTTCACTCTCCCCTGTCGGAAGCCCAAAGCCTTTGTTGAGGATACTGCGAA GAAGGACTCCCTCAAAGCCAGGATGCGACGGATCAGTGACTGGACGGGAAGCCT CTCAAGGAAGAAAAGGAAACTCCAGGAGCCGAGGTCCAAGGAGGGCAGTGACT ACTTTGACAGTCGCTCTGATGGACTGAATACAGATGTGCAGGGATCCTCCCAGGC ATCTGCTTTTCTGTGGTCAGGGGGCTCTACTCAGATCCTGTCTCAGAGAAGTGAA TCCACACATGCGATTGGCAGCGATCCCCTCCGGCAGAACATTTATGAGAATTTCA TGCGAGAGTTGGAAATGAGCAGGACCAACACTGAGAACATAGAAACATCTACA GAAACCGCCGAGTCCAGCAGCGAGTCACTCAGCTCTCTGGAACAGCTGGATCTG CTCTTCGAGAAGGAACAGGGGGTGGTCCGGAGGGCCGGGTGGCTCTTCTTCAAG CCCCTGGTCACTGTGCAGAAGGAAAGGAAGCTTGAGCTGGTGGCACGAAGGAAA 
TGGAAACAGTACTGGGTAACGCTGAAAGGATGCACGCTGCTGTTTTATGAGACC TATGGGAAGAATTCCATGGATCAGAGCAGTGCCCCTCGGTGTGCTCTGTTTGCAG AAGACAGCATAGTGCAGTCTGTTCCAGAGCATCCCAAGAAAGAAAATGTGTTCT GCCTCAGCAACTCCTTTGGAGATGTCTACCTTTTCCAGGCCACCAGCCAGACAGA TCTAGAAAACTGGGTCACTGCTGTACACTCTGCTTGTGCATCCCTTTTTGCAAAG AAGCATGGGAAAGAGGACACGCTGCGGCTGCTGAAGAACCAGACCAAAAACCT GCTTCAGAAGATAGACATGGACAGCAAGATGAAGAAGATGGCAGAGCTGCAGC TGTCCGTGGTGAGCGACCCAAAGAACAGGAAAGCCATAGAGAACCAGATCCAG CAATGGGAGCAGAATCTTGAGAAATTTCACATGGATCTGTTCAGGATGCGCTGCT ATCTGGCCAGCCTACAAGGTGGGGAGTTACCGAACCCAAAGAGTCTCCTTGCAG CCGCCAGCCGCCCCTCCAAGCTGGCCCTCGGCAGGCTGGGCATCTTGTCTGTTTC CTCTTTCCATGCTCTGGTATGTTCTAGAGATGACTCTGCTCTCCGGAAAAGGACA CTGTCACTGACCCAGCGAGGGAGAAACAAGAAGGGAATATTTTCTTCGTTAAAA GGGCTGGACACACTGGCCAGAAAAGGCAAGGAGAAGAGACCTTCTATAACTCA GGTCGATGAACTTCTGCATATATATGGTTCAACAGTGGACGGTGTTCCCCGAGAC AATGCATGGGAAATCCAGACTTATGTTCACTTTCAGGACAATCACGGAGTTACTG TAGGGATCAAGCCAGAGCACAGAGTAGAAGATATTTTGACTTTGGCATGCAAGA TGAGGCAGTTGGAACCCAGCCATTATGGCCTACAGCTTCGAAAATTAGTAGATG ACAATGTTGAGTATTGCATCCCTGCACCATATGAATATATGCAACAACAGGTTTA TGATGAAATAGAAGTCTTTCCACTAAATGTTTATGATGTGCAGCTCACGAAGACT GGGAGTGTGTGTGACTTTGGGTTTGCAGTTACAGCGCAGGTGGATGAGCGTCAG CATCTCAGCCGGATATTTATAAGCGACGTTCTTCCCGATGGCCTGGCGTATGGGG AAGGGCTGAGAAAGGGCAATGAGATCATGACCTTAAATGGGGAAGCTGTGTCTG ATCTTGACCTTAAGCAGATGGAGGCCCTGTTTTCTGAGAAGAGCGTCGGACTCAC TCTGATTGCCCGGCCTCCGGACACAAAAGCAACCCTGTGTACATCCTGGTCAGAC AGTGACCTGTTCTCCAGGGACCAGAAGAGTCTGCTGCCCCCTCCTAACCAGTCCC AACTGCTGGAGGAATTCCTGGATAACTTTAAAAAGAATACAGCCAATGATTTCA GCAACGTCCCTGATATCACAACAGGTCTGAAAAGGAGTCAGACAGATGGCACTC TGGATCAGGTTTCCCACAGGGAGAAAATGGAGCAGACATTCAGGAGTGCTGAGC AGATCACTGCACTGTGCAGGAGTTTTAACGACAGTCAGGCCAACGGCATGGAAG GACCGCGGGAGAATCAGGATCCTCCTCCGAGGCCTCTGGCCCGCCACCTGTCTG ATGCAGACCGCCTCCGCAAAGTCATCCAGGAGCTTGTGGACACAGAGAAGTCCT ACGTGAAGGATTTGAGCTGCCTCTTTGAATTATACTTGGAGCCACTTCAGAATGA GACCTTTCTTACCCAAGATGAGATGGAGTCACTTTTTGGAAGTTTGCCAGAGATG CTTGAGTTTCAGAAGGTGTTTCTGGAGACCCTGGAGGATGGGATTTCAGCATCAT CTGACTTTAACACCCTAGAAACCCCCTCACAGTTTAGAAAATTACTGTTTTCCCTT 
GGAGGCTCTTTCCTTTATTACGCGGACCACTTTAAACTGTACAGTGGATTCTGTG CTAACCATATCAAAGTACAGAAGGTTCTGGAGCGAGCTAAAACTGACAAAGCCT TCAAGGCTTTTCTGGATGCCCGGAACCCCACCAAGCAGCATTCCTCCACGCTGGA GTCCTACCTCATCAAGCCGGTTCAGAGAGTGCTCAAGTACCCGCTGCTGCTCAAG GAGCTGGTGTCCCTGACGGACCAGGAGAGCGAGGAGCACTACCACCTGACGGA AGCACTAAAGGCAATGGAGAAAGTAGCGAGCCACATCAATGAGATGCAGAAGA TCTATGAGGATTATGGGACCGTGTTTGACCAGCTAGTAGCTGAGCAGAGCGGAA CAGAGAAGGAGGTAACAGAACTTTCGATGGGAGAGCTTCTGATGCACTCTACGG TTTCCTGGTTGAATCCATTTCTGTCTCTAGGAAAAGCTAGAAAGGACCTTGAGCT CACAGTATTTGTTTTTAAGAGAGCCGTCATACTGGTTTATAAAGAAAACTGCAAA CTGAAAAAGAAATTGCCCTCGAATTCCCGGCCTGCACACAACTCTACTGACTTGG ACCCATTTAAATTCCGCTGGTTGATCCCCATCTCCGCGCTTCAAGTCAGACTGGG GAATCCAGCAGGGACAGAAAATAATTCCATATGGGAACTGATCCATACGAAGTC AGAAATAGAAGGACGGCCAGAAACCATCTTTCAGTTGTGTTGCAGTGACAGTGA AAGCAAAACCAACATTGTTAAGGTGATTCGTTCTATTCTGAGGGAAAACTTCAG GCGTCACATAAAGTGTGAATTACCACTGGAGAAAACGTGTAAGGATCGCCTGGT ACCTCTTAAGAACCGAGTTCCTGTTTCGGCCAAATTAGCTTCATCCAGGTCTTTA AAAGTCCTGAAGAATTCCTCCAGCAACGAGTGGACCGGTGAGACTGGCAAGGGA ACCTTGCTGGACTCTGACGAGGGCAGCTTGAGCAGCGGCACCCAGAGCAGCGGC TGCCCCACGGCTGAGGGCAGGCAGGACTCCAAGAGCACTTCTCCCGGGAAATAC CCACACCCCGGCTTGGCAGATTTTGCCGACAATCTCATCAAAGAGAGTGACATC CTGAGCGATGAAGATGATGACCACCGTCAGACTGTGAAGCAGGGCAGCCCTACT AAAGACATCGAAATTCAGTTCCAGAGACTGAGGATTTCCGAGGACCCAGACGTT CACCCCGAGGCTGAGCAGCAGCCTGGCCCGGAGTCGGGTGAGGGTCAGAAAGG AGGAGAGCAGCCCAAACTGGTCCGGGGGCACTTCTGCCCCATTAAACGAAAAGC CAACAGCACCAAGAGGGACAGAGGAACTTTGCTCAAGGCGCAGATCCGTCACCA GTCCCTTGACAGTCAGTCTGAAAATGCCACCATCGACCTAAATTCTGTTCTAGAG CGAGAATTCAGTGTCCAGAGTTTAACATCTGTTGTCAGTGAGGAGTGTTTTTATG AAACAGAGAGCCACGGAAAATCATAGTATGATTCAATCCAGATATGGGTTAAAT TCCTCATTTTACTTTTAAACTGGTGGTAAAGTGGAAATTGCAAAAAAAAAAAAAAA.

[5973] The human 33521 sequence (FIG. 30; SEQ ID NO:61) is approximately 5437 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAG) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 5106 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:61; SEQ ID NO:63). The coding sequence encodes a 1701 amino acid protein (SEQ ID NO:62), which is recited as follows: 34 MGNSDSQYTLQGSKNHSNTITGAKQIPCSLKIRGVHAKEEKSLHGWGHGSNGAGYK (SEQ ID NO: 62) SRSLARSCLSHFKSNQPYASRLGGPTCKVSRGVAYSTHRTNAPGKDFQGISAAFSTE NGFHSVGHELADNHITSRDCNGHLLNCYGRNESIASTPPGEDRKSPRVLIKTLGKPD GCLRVEFHNGGNPSKVPAEDCSEPVQLLRYSPTLASETSPVPEARRGSSADSLPSHRP SPTDSRLRSSKGSSLSSESSWYDSPWGNAGELSEAEGSFLAPGMPDPSLHASFPPGDA KKPFNQSSSLSSLRELYKDANLGSLSPSGIRLSDEYMGTHASLSNRVSFASDIDVPSR VAHGDPIQYSSFTLPCRKPKAFVEDTAKKDSLKARMRRISDWTGSLSRKKRKLQEPR SKEGSDYFDSRSDGLNTDVQGSSQASAFLWSGGSTQILSQRSESTHAIGSDPLRQNIY ENFMRELEMSRTNTENIETSTETAESSSESLSSLEQLDLLFEKEQGVVRRAGWLFFKP LVTVQKERKLELVARRKWKQYWVTLKGCTLLFYETYGKNSMDQSSAPRCALFAED SIVQSVPEHPKKENVFCLSNSFGDVYLFQATSQTDLENWVTAVHSACASLFAKKHG KEDTLRLLKNQTKNLLQKIDMDSKMKKMAELQLSVVSDPKNRKAIENQIQQWEQN LEKFHMDLFRMRCYLASLQGGELPNPKSLLAAASRPSKLALGRLGILSVSSFHALVC SRDDSALRKRTLSLTQRGRNKKGIFSSLKGLDTLARKGKEKRPSITQVDELLHIYGST VDGVPRDNAWEIQTYVHFQDNHGVTVGIKPEHRVEDILTLACKMRQLEPSHYGLQL RKLVDDNVEYCIPAPYEYMQQQVYDEIEVFPLNVYDVQLTKTGSVCDFGFAVTAQV DERQHLSRIFISDVLPDGLAYGEGLRKGNEIMTLNGEAVSDLDLKQMEALFSEKSVG LTLIARPPDTKATLCTSWSDSDLFSRDQKSLLPPPNQSQLLEEFLDNFKKNTANDFSN VPDITTGLKRSQTDGTLDQVSHREKMEQTFRSAEQITALCRSFNDSQANGMEGPREN QDPPPRPLARHLSDADRLRKVIQELVDTEKSYVKDLSCLFELYLEPLQNETFLTQDE MESLFGSLPEMLEFQKVFLETLEDGISASSDFNTLETPSQFRKLLFSLGGSFLYYADHF KLYSGFCANHIKVQKVLERAKTDKAFKAFLDARNPTKQHSSTLESYLIKPVQRVLKY PLLLKELVSLTDQESEEHYHLTEALKAMEKVASHINEMQKIYEDYGTVFDQLVAEQS GTEKEVTELSMGELLMHSTVSWLNPFLSLGKARKDLELTVFVFKRAVILVYKENCK 
LKKKLPSNSRPAHNSTDLDPFKFRWLIPISALQVRLGNPAGTENNSIWELIHTKSEIEG RPETIFQLCCSDSESKTNIVKVIRSILRENFRRHIKCELPLEKTCKDRLVPLKNRVPVSA KLASSRSLKVLKNSSSNEWTGETGKGTLLDSDEGSLSSGTQSSGCPTAEGRQDSKST SPGKYPHPGLADFADNLIKESDILSDEDDDHRQTVKQGSPTKDIEIQFQRLRISEDPDV HPEAEQQPGPESGEGQKGGEQPKLVRGHFCPIKRKANSTKRDRGTLLKAQIRHQSLD SQSENATIDLNSVLEREFSVQSLTSVVSEECFYETESHGKS.
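The stated coding-sequence length and protein length can be cross-checked with simple arithmetic. The following is a minimal sketch; the function name `cds_length` is hypothetical and not part of the disclosure. It assumes three nucleotides per codon and a single termination codon counted in the coding sequence, as the text describes.

```python
def cds_length(protein_length: int, include_stop: bool = True) -> int:
    """Number of nucleotides needed to encode a protein of the given length."""
    # One codon per amino acid, plus one codon for the stop signal if counted.
    codons = protein_length + (1 if include_stop else 0)
    return 3 * codons


# Values stated in the text for 33521: a 1701 amino acid protein and a
# coding sequence of about 5106 nucleotides including the termination codon.
print(cds_length(1701))  # 5106
```

The same check applies to any other coding sequence whose length is reported inclusive of its termination codon.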

Example 44 Tissue Distribution of 33521 mRNA by TaqMan Analysis

[5974] Endogenous human 33521 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).
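The text does not state how threshold cycles were converted into expression values. One widely used approach is the comparative Ct method, sketched below as an illustration only. The function and parameter names are hypothetical, the Ct values in the usage lines are invented, and the default efficiency of 2 assumes perfect doubling of product in each cycle.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator,
                        efficiency=2.0):
    """Comparative Ct estimate of target expression relative to a calibrator."""
    # Normalize the target gene to the housekeeping gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the test sample with the calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # A lower Ct means more starting template, hence the negative exponent.
    return efficiency ** (-delta_delta_ct)


# Invented values: the target crosses threshold 3 cycles earlier in the test
# sample than in the calibrator, with identical housekeeping-gene Ct values.
fold = relative_expression(24.0, 18.0, 27.0, 18.0)
print(fold)  # 8.0
```

Normalizing to the housekeeping gene first corresponds to the internal control described in the paragraph above.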

[5975] To determine the level of 33521 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 µg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 17. 33521 mRNA was detected in glioblastoma cells. As compared to their normal tissue counterparts, elevated levels of expression were observed in breast, colon, and lung tumors.

TABLE 17
Expression of 33521 in various tissues

Tissue              Expression
Glioblastoma        21.33
Glioblastoma        22.78
Glioblastoma        48.17
Glioblastoma        59.71
Brain Normal        11.67
Placenta Normal     5.58
Skin Normal         9.68
Adipose Normal      1.18
Breast Normal       3.89
Breast Tumor        100.78
Breast Tumor        5.62
Breast Tumor        23.34
Colon Normal        1.13
Colon Tumor         16.91
Colon Tumor         6.73
Colon Tumor         19.56
Liver Normal        1.20
Colon Metastasis    12.21
Colon Metastasis    9.99
Lung Normal         1.00
Lung Tumor          5.98
Lung Tumor          11.24
Lung Tumor          4.08
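The tumor-versus-normal comparison drawn from Table 17 can be made explicit with a short calculation. The sketch below is illustrative only: averaging the tumor samples and dividing by the single normal value is an assumption made here for the example, not a method described in the text. The numbers are copied from Table 17.

```python
from statistics import mean

# Relative expression values for 33521 copied from Table 17.
normal = {"Breast": 3.89, "Colon": 1.13, "Lung": 1.00}
tumor = {
    "Breast": [100.78, 5.62, 23.34],
    "Colon": [16.91, 6.73, 19.56],
    "Lung": [5.98, 11.24, 4.08],
}


def tumor_to_normal_ratio(tumor_values, normal_value):
    """Mean tumor expression divided by the normal-tissue expression."""
    return mean(tumor_values) / normal_value


ratios = {tissue: tumor_to_normal_ratio(tumor[tissue], normal[tissue])
          for tissue in normal}

for tissue, ratio in ratios.items():
    print(f"{tissue}: mean tumor / normal = {ratio:.1f}")
```

A ratio above 1 for a tissue is consistent with the elevated tumor expression reported for that tissue. With one normal sample per tissue, the ratios are descriptive and carry no statistical weight.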

Example 45 Tissue Distribution of 33521 mRNA by Northern Analysis

[5976] Northern blot hybridizations with various RNA samples can be performed under standard conditions, with washes under stringent conditions, i.e., 0.2× SSC at 65° C. A DNA probe corresponding to all or a portion of the 33521 cDNA (SEQ ID NO:61) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.
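The salt concentration implied by a 0.2× SSC wash can be worked out from the standard composition of SSC. The sketch below is illustrative; the 1× composition (150 mM sodium chloride, 15 mM sodium citrate) is general laboratory knowledge rather than something stated in the text, and the function name is hypothetical.

```python
# Standard 1x SSC composition in millimolar. This definition is an assumption
# brought in from general laboratory practice, not taken from the text.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold):
    """Millimolar salt concentrations of an SSC solution at the given strength."""
    return {salt: round(conc * fold, 3) for salt, conc in SSC_1X_MM.items()}


# Stringent wash named in the text: 0.2x SSC.
wash = ssc_composition(0.2)
print(wash)  # {'NaCl': 30.0, 'sodium citrate': 3.0}
```

Lower salt at a fixed temperature destabilizes mismatched duplexes, which is why the 0.2× wash at 65° C. is described as stringent.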

Example 46 Recombinant Expression of 33521 in Bacterial Cells

[5977] In this example, 33521 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 33521 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-33521 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
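When reading the gel, it can help to have a predicted molecular weight for comparison. The sketch below estimates one from an amino acid sequence using average residue masses. It is illustrative only: the function names are hypothetical, the GST contribution of roughly 26 kDa is an assumption typical of commonly used GST vectors (the text names no vector), and linker residues and post-translational modifications are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02
GST_TAG_KDA = 26.0  # assumed approximate mass of the GST moiety


def peptide_mass_da(sequence):
    """Approximate average mass of a linear peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def fusion_mass_kda(sequence, tag_kda=GST_TAG_KDA):
    """Approximate mass of a tag-plus-polypeptide fusion in kilodaltons."""
    return tag_kda + peptide_mass_da(sequence) / 1000.0


# N-terminal stretch of SEQ ID NO:62 as recited in the text; the full
# 1701-residue sequence would be passed the same way.
fragment = "MGNSDSQYTLQGSKNHSNTITGAKQIPCSLKIRGVHAKEEKSLHGWGHGSNGAGYK"
print(f"{fusion_mass_kda(fragment):.1f} kDa")
```

Migration on a polyacrylamide gel only approximates true mass, so a prediction of this kind serves as a rough guide.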

Example 47 Expression of Recombinant 33521 Protein in COS Cells

[5978] To express the 33521 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 33521 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[5979] To construct the plasmid, the 33521 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 33521 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 33521 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 33521 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
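The primer layout described above can be sketched in code. This is an illustration under stated assumptions, not the primers actually used: the function names are hypothetical, the EcoRI (GAATTC) and XhoI (CTCGAG) sites are arbitrary choices since the text names no enzymes, the HA tag is given as one common nucleotide encoding of YPYDVPDYA, the toy coding sequence is invented, and the native stop codon is assumed to be moved downstream of the tag so that the tag is read in-frame.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(cds, site_5, site_3, tag, anneal=20):
    """Return (forward, reverse) primers, both written 5' to 3'.

    `cds` must start with the initiation codon and end with a stop codon.
    """
    body, stop = cds[:-3], cds[-3:]
    # 5' primer: restriction site followed by the start of the coding sequence.
    forward = site_5 + cds[:anneal]
    # Sense-strand layout of the 3' end of the insert:
    # last coding nucleotides, then the tag, the stop codon and the second site.
    sense_tail = body[-anneal:] + tag + stop + site_3
    # The 3' primer is the complement of that layout, read 5' to 3'.
    reverse = reverse_complement(sense_tail)
    return forward, reverse


HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA
toy_cds = "ATG" + "GCTGAAACC" * 4 + "TAG"  # invented 42-nucleotide example

fwd, rev = design_primers(toy_cds, "GAATTC", "CTCGAG", HA_TAG)
print(fwd)
print(rev)
```

Using two different sites, as the text recommends, fixes the orientation of the insert. In practice a few extra nucleotides are often added 5′ of each site to help the enzyme cut near the fragment end; that detail is omitted here.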

[5980] COS cells are subsequently transfected with the 33521-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 33521 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[5981] Alternatively, DNA containing the 33521 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 33521 polypeptide is detected by radiolabelling and immunoprecipitation using a 33521 specific monoclonal antibody.

Examples of 23479, 48120, and 46689

Example 48 Identification and Characterization of Human 23479, 48120, or 46689 cDNA

[5982] The human 23479 nucleic acid sequence is recited as follows: 36 TGCTGTCGCTAGATTCAGATGATTCAAGTGAGGATCAAGTGGAAAATAGTAAAA (SEQ ID NO: 74) ATTCCTGGAGTTGCAAGTTTGTTGCTGCTGGAGGGCTTCAACAGTTATTAGAAAT TTTTAATTCTGGAATTCTAGAGCCTAAAGAGCAGGAATCATGGACTGTGTGGCAG CTAGACTGTCTTGCTTGCTTGCTGAAGTTAATATGCCAGTTTGCAGTAGATCCAT CCGATTTGGATTTAGCTTATCATGATGTCTTTGCCTGGTCTGGTATAGCGGAAAG CCATAGGAAAAGAACCTGGCCTGGCAAATCAAGGAAGGCTGCTGGTGATCATGC TAAGGGTCTTCATATACCACGATTAACAGAGGTATTTCTTGTTCTTGTCCAAGGA ACCAGTTTGATTCAGCGACTTATGTCTGTTGCTTATACGTATGATAATCTGGCTC CTAGAGTTTTAAAAGCTCAGTCTGATCACAGGTCTAGACATGAAGTTTCACATTA TTCAATGTGGCTCTTGGTGAGTTGGGCTCATTGCTGTTCTTTAGTGAAATCTAGCC TTGCTGATAGCGATCATTTACAAGATTGGCTAAAGAAATTGACTCTCCTTATTCC TGAGACTGCAGTTCGTCATGAATCATGCAGTGGTCTCTATAAGTTATCCCTGTCA GGGCTGGATGGAGGAGACTCAATCAATCGTTCTTTTCTGCTATTGGCTGCCTCAA CATTATTGAAATTTCTTCCTGATGCTCAAGCACTCAAACCTATTAGGATAGATGA TTATGAGGAAGAACCAATATTAAAACCAGGATGTAAAGAGTATTTTTGGTTGTTA TGCAAATTAGTTGACAACATACATATAAAGGACGCTAGTCAGACAACGCTCCTC GACTTAGATGCCTTGGCAAGACATTTGGCTGACTGTATTCGAAGTAGGGAGATCC TTGATCATCAGGATGGTAATGTAGAAGATGATGGGCTTACAGGACTCCTAAGGC TTGCAACAAGTGTTGTTAAACACAAACCACCCTTTAAATTTTCAAGGGAAGGAC AGGAATTTTTGAGAGATATCTTCAATCTCCTGTTTTTGTTGCCAAGTCTAAAGGA CCGACAACAGCCAAAGTGCAAATCACATTCTACAAGAGCTGCCGCTTACGATTT GTTAGTAGAGATGGTAAAGGGGTCTGTTGAGAACTACAGGCTAATACACAACTG GGTTATGGCACAACACATGCAGTCCCATGCACCTTATAAATGGGATTACTGGCCT CATGAAGATGTCCGTGCTGAATGTAGATTTGTTGGCCTTACTAACCTTGGAGCTA CTTGTTACTTAGCTTCTACTATTCAGCAACTTTATATGATACCTGAGGCAAGACA GGCTGTCTTCACTGCCAAGTATTCAGAGGATATGAAGCACAAGACCACTCTTCTG GAGCTTCAGAAAATGTTTACATATTTAATGGAGAGTGAATGCAAAGCATATAAT CCTAGACCTTTCTGTAAAACATACACCATGGATAAGCAGCCTCTGAATACTGGGG AACAGAAAGATATGACAGAGTTTTTTACTGATCTAATTACCAAAATCGAAGAAA TGTCTCCCGAACTGAAAAATACCGTCAAAAGTTTATTTGGAGGTGTAATTACAAA CAATGTTGTATCCTTGGATTGTGAACATGTTAGTCAAACTGCTGAAGAGTTTTAT ACTGTGAGGTGCCAAGTGGCTGATATGAAGAACATTTATGAATCTCTTGATGAA GTTACTATAAAAGACACTTTGGAAGGTGATAACATGTATACTTGTTCTCATTGTG GGAAGAAAGTACGAGCTGAAAAAAGGGCATGTTTTAAGAAATTGCCTCGCATTT 
TGAGTTTCAATACTATGAGATACACATTTAATATGGTCACGATGATGAAAGAGA AAGTGAATACACACTTTTCCTTCCCATTACGTTTGGACATGACGCCCTATACAGA AGATTTTCTTATGGGAAAGAGTGAGAGGAAAGAAGGTTTTAAAGAAGTCAGTGA TCATTCAAAAGACTCAGAGAGCTATGAATATGACTTGATAGGAGTGACTGTTCA CACAGGAACGGCAGATGGTGGACACTATTATAGCTTTATCAGAGATATAGTAAA TCCCCATGCTTATAAAAACAATAAATGGTATCTTTTTAATGATGCTGAGGTAAAA CCTTTTGATTCTGCTCAACTTGCATCTGAATGTTTTGGTGGAGAGATGACGACCA AGACCTATGATTCTGTTACAGATAAATTTATGGACTTCTCTTTTGAAAAGACACA CAGTGCATATATGCTGTTTTACAAACGCATGGAACCAGAGGAAGAAAATGGCAG AGAATACAAATTTGATGTTTCGTCAGAGTTACTAGAGTGGATTTGGCATGATAAC ATGCAGTTTCTTCAAGACAAAAACATTTTTGAACATACATATTTTGGATTTATGT GGCAATTGTGTAGTTGTATTCCCAGTACATTACCAGATCCTAAAGCTGTGTCCTT AATGACAGCAAAGTTAAGCACTTCCTTTGTCCTAGAGACATTTATTCATTCTAAA GAAAAGCCCACGATGCTTCAGTGGATTGAACTGTTGACGAAACAGTTTAATAAT AGTCAGGCAGCTTGTGAGTGGTTTTTAGATCGTATGGCTGATGACGACTGGTGGC CAATGCAGATACTAATTAAGTGCCCTAATCAAATTGTGAGACAGATGTTTCAGCG TTTGTGTATCCATGTGATTCAGAGGCTGAGACCTGTGCATGCTCATCTCTATTTGC AGCCAGGAATGGAAGATGGGTCAGATGATATGGATACCTCAGTAGAAGATATTG GTGGTCGTTCATGTGTCACTCGCTTTGTGAGAACCCTGTTATTAATTATGGAACA TGGTGTAAAACCTCACAGTAAACATCTTACAGAGTATTTTGCCTTCCTTTACGAA TTTGCAAAAATGGGTGAAGAAGAGAGCCAATTTTTGCTTTCATTGCAAGCTATAT CTACAATGGTACATTTTTACATGGGAACAAAAGGACCTGAAAATCCTCAAGTTG AAGTGTTATCAGAGGAAGAAGGGGAAGAAGAAGAGGAGGAAGAAGATATCCTC TCTCTGGCAGAAGAAAAATACAGGCCAGCTGCCCTTGAAAAGATGATAGCTTTA GTTGCTCTTTTGGTTGAACAGTCTCGATCAGAAAGGTGAAATGTTTCGAATTTAA AATGTTTAAAGCATGTTTGGTTTTATTATTTTTACATAATTGTTTACCACTAGTTT TTCCACTAGCTTTTTATTATATATGTTTAATTATGTAATTGTTATTCACTAGCTTTT ATTATATAAATCCTTTTAAATAATACTACTATTCATCAACTCTTGTGGCATAAGA ATTTCAGTTTTTTCTACCAAACTTTTACTTCATCTATGAGTCGTGTTAGAAATAGT CATTGAAAAAATATACAGTAAAATATCTAAAAAAAAAAAAAAAGG.

[5983] The human 23479 sequence (FIG. 33; SEQ ID NO:74) is approximately 3494 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 2805 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:74; SEQ ID NO:76). The coding sequence encodes a 934 amino acid protein (SEQ ID NO:75), which is recited as follows: 37 MSVAYTYDNLAPRVLKAQSDHRSRHEVSHYSMWLLVSWAHCCSLVKSSLADSDHL (SEQ ID NO: 75) QDWLKKLTLLIPETAVRHESCSGLYKLSLSGLDGGDSINRSFLLLAASTLLKFLPDAQ ALKPIRIDDYEEEPILKPGCKEYFWLLCKLVDNIHIKDASQTTLLDLDALARHLADCI RSREILDHQDGNVEDDGLTGLLRLATSVVKHKPPFKFSREGQEFLRDIFNLLFLLPSL KDRQQPKCKSHSTRAAAYDLLVEMVKGSVENYRLIHNWVMAQHMQSHAPYKWD YWPHEDVRAECRFVGLTNLGATCYLASTIQQLYMIPEARQAVFTAKYSEDMKHKTT LLELQKMFTYLMESECKAYNPRPFCKTYTMDKQPLNTGEQKDMTEFFTDLITKIEEM SPELKNTVKSLFGGVITNNVVSLDCEHVSQTAEEFYTVRCQVADMKNIYESLDEVTI KDTLEGDNMYTCSHCGKKVRAEKRACFKKLPRILSFNTMRYTFNMVTMMKEKVNT HFSFPLRLDMTPYTEDFLMGKSERKEGFKEVSDHSKDSESYEYDLIGVTVHTGTADG GHYYSFIRDIVNPHAYKNNKWYLFNDAEVKPFDSAQLASECFGGEMTTKTYDSVTD KFMDFSFEKTHSAYMLFYKRMEPEEENGREYKFDVSSELLEWIWHDNMQFLQDKNI FEHTYFGFMWQLCSCIPSTLPDPKAVSLMTAKLSTSFVLETFIHSKEKPTMLQWIELL TKQFNNSQAACEWFLDRMADDDWWPMQILIKCPNQIVRQMFQRLCIHVIQRLRPVH AHLYLQPGMEDGSDDMDTSVEDIGGRSCVTRFVRTLLLIMEHGVKPHSKHLTEYFA FLYEFAKMGEEESQFLLSLQAISTMVHFYMGTKGPENPQVEVLSEEEGEEEEEEEDIL SLAEEKYRPAALEKMIALVALLVEQSRSER.

[5984] The human 48120 nucleic acid sequence is recited as follows: 38 CCACGCGTCCGGCCTAGTCCTGAGAGGCTGGGCCGGCGGCGGCTGCGGCGGGAG (SEQ ID NO:77) ACCGGTGACCCGCGGCTGGGCGCCTCGGCCATGACTGCGGAGCTGCAGCAGGAC GACGCGGCCGGCGCGGCAGACGGCCACGGCTCGAGCTGCCAAATGCTGTTAAAT CAACTGAGAGAAATCACAGGCATTCAGGACCCTTCCTTTCTCCATGAAGCTCTGA AGGCCAGTAATGGTGACATTACTCAGGCAGTCAGCCTTCTCACTGATGAGAGAG TTAAGGAGCCCAGTCAAGACACTGTTGCTACAGAACCATCTGAAGTAGAGGGGA GTGCTGCCAACAAGGAAGTATTAGCAAAAGTTATAGACCTTACTCATGATAACA AAGATGATCTTCAGGCTGCCATTGCTTTGAGTCTACTGGAGTCTCCCAAAATTCA AGCTGATGGAAGAGATCTTAACAGGATGCATGAAGCAACCTCTGCAGAAACTAA ACGCTCAAAGAGAAAACGCTGTGAAGTCTGGGGAGAAAACCCCAATCCCAATG ACTGGAGGAGAGTTGATGGTTGGCCAGTTGGGCTGAAAAATGTTGGCAATACAT GTTGGTTTAGTGCTGTTATTCAGTCTCTCTTTCAATTGCCTGAATTTCGAAGACTT GTTCTCAGTTATAGTCTGCCACAAAATGTACTTGAAAATTGTCGAAGTCATACAG AAAAGAGAAATATCATGTTTATGCAAGAGCTTCAGTATTTGTTTGCTCTAATGAT GGGATCAAATAGAAAATTTGTAGACCCGTCTGCAGCCCTGGATCTATTAAAGGG AGCATTCCGATCATCTGAGGAACAGCAGCAAGATGTGAGTGAATTCACACACAA GCTCCTGGATTGGCTAGAGGACGCATTCCAGCTAGCTGTTAATGTTAACAGTCCC AGGAACAAATCTGAAAATCCAATGGTGCAGCTGTTCTATGGTACTTTCCTGACTG AAGGGGTTCGTGAAGGAAAACCCTTTTGTAACAATGAGACCTTCGGCCAGTATC CTCTTCAGGTAAACGGTTATCGCAACTTAGACGAGTGTTTGGAAGGGGCCATGGT GGAGGGTGATGTTGAGCTTCTTCCCTCCGATCACTCGGTGAAGTATGGACAAGA GCGTTGGTTTACAAAGCTACCTCCAGTGTTGACCTTTGAACTCTCAAGATTTGAG TTTAATCAGTCCCTTGGGCAGCCAGAGAAAATTCACAATAAGCTGGAATTTCCTC AGATTATTTATATGGACAGGTACATGTACAGGAGCAAGGAGCTTATTCGAAATA AGAGAGAGTGTATTCGAAAGTTGAAGGAGGAAATAAAAATTCTGCAGCAAAAA TTGGAAAGGTATGTGAAATATGGCTCAGGCCCAGCTCGGTTCCCGCTCCCGGAC ATGCTGAAATATGTTATTGAATTTGCTAGTACAAAACCTGCCTCAGAAAGCTGTC CACCTGAAAGTGACACACATATGACATTACCACTTTCTTCAGTGCACTGCTCGGT TTCTGACCAGACATCCAAGGAAAGTACAAGTACAGAAAGCTCTTCTCAGGATGT TGAAAGTACCTTTTCTTCTCCTGAAGATTCTTTACCCAAGTCTAAACCACTGACAT CTTCTCGGTCTTCCATGGAAATGCCTTCACAGCCAGCTCCACGAACAGTCACAGA TGAGGAGATAAATTTTGTTAAGACCTGTCTTCAGAGATGGAGGAGTGAGATTGA ACAAGATATACAAGATTTAAAGACTTGTATTGCAAGTACTACTCAGACTATTGAA CAGATGTACTGCGATCCTCTCCTTCGTCAGGTGCCTTATCGCTTGCATGCAGTTCT 
TGTTCATGAAGGACAAGCAAATGCTGGACACTATTGGGCCTATATCTATAATCAA CCCCGACAGAGCTGGCTCAAGTACAATGACATCTCTGTTACTGAATCTTCCTGGG AAGAAGTTGAAAGAGATTCCTATGGAGGCCTGAGAAATGTTAGTGCTTACTGTC TGATGTACATTAATGACAAACTACCCTACTTCAATGCAGAGGCAGCCCCAACTG AATCAGATCAAATGTCAGAAGTGGAAGCCCTATCTGTGGAACTCAAGCATTACA TTCAGGAGGATAACTGGCGGTTTGAGCAGGAAGTAGAGGAGTGGGAAGAAGAG CAGTCTTGCAAAATCCCTCAAATGGAGTCCTCCACCAACTCCTCATCACAGGACT ACTCTACATCACAAGAGCCTTCAGTAGCCTCTTCTCATGGGGTTCGCTGCTTGTC GTCTGAGCATGCTGTGATTGTAAAGGAGCAAACTGCCCAGGCTATTGCAAACAC AGCCCGTGCCTATGAGAAGAGCGGTGTAGAAGCGGCACTGAGTGAGGTTAAAGA AGCTGAACCCAAGAAGCCCATGCCCCAGGAAACAAACCTTGCAGAGCAGTCAG AACAGCCCCCAAAGGCTAATGATGCAGAGTCTACTGCCCAGCCTAATTCTGAGG TCTCTGAAGTCGAGATTCCCAGTGTGGGAAGGATTCTGGTTAGATCTGATGCAGA TGGATATGATGAGGAGGTGATGCTGAGCCCTGCCATGCAAGGGGTCATCCTGGC CATAGCTAAAGCCCGTCAGACCTTTGACCGAGATGGGTCTGAAGCAGGGCTGAT TAAGGCATTCCATGAAGAATACTCCAGGCTCTATCAGCTTGCCAAAGAGACCCC CACCTCTCACAGTGATCCTCGACTTCAGCATGTCCTTGTCTACTTTTTCCAAAATG AAGCACCCAAAAGGGTAGTAGAACGAACCCTTCTGGAACAGTTTGCAGATAAAA ATCTTAGCTATGATGAAAGATCAATCAGCATTATGAAGGTGGCTCAAGCGAAAC TGAAGGAAATTGGTCCAGATGACATGAATATGGAAGAGTACAAGAAGTGGCATG AAGATTATAGTTTGTTCCGAAAAGTGTCTGTGTATCTCCTAACAGGCCTAGAACT CTATCAAAAAGGAAAGTACCAAGAGGCACTTTCCTACCTGGTATATGCCTACCA GAGCAATGCTGCCCTGCTGATGAAGGGGCCCCGCCGGGGGGTCAAAGAATCCGT GATTGCTTTATACCGAAGAAAATGCCTTCTGGAGCTGAATGCCAAAGCAGCTTCT CTTTTTGAAACAAATGATGATCACTCCGTAACTGAGGGCATTAATGTGATGAATG AACTGATCATCCCCTGCATTCACCTTATCATTAATAATGACATTTCCAAGGATGA TCTGGATGCCATTGAGGTCATGAGAAACCATTGGTGCTCTTACCTTGGGCAAGAT ATTGCAGAAAATCTGCAGCTGTGCCTAGGGGAGTTTCTACCCAGACTTCTAGATC CTTCTGCAGAAATCATCGTCTTGAAAGAGCCTCCAACTATTCGACCCAATTCTCC CTATGACCTATGTAGCCGATTTGCAGCTGTCATGGAGTCAATTCAGGGAGTTTCA ACTGTGACAGTGAAATAAGCTCCCACATGTTCAAGGCCCATTCTGGTTCCTGGCT GCCTGCCTCTTGCACAGAAGTTCGTTGTCATAGTGCTCACCTTGGGAAAAGGATT AGGTGGGCACATAAGATTCCGATCAGACCCCAACCATGCTGCATGTGTAAAGAA GGATTGAAAATAAAATTGCACTTTTTAGGTACAAAATCATAAAAGCTGTTTCACT AGAAAAGGCAGAAAGCAGTGTATTAAGGTGTTGAATTACGCCAGAAGACCTGAA ATGCCTTGTACCTACAACAATGCTTAGGCTTTTCTAAGCCTCTTGCCACTTTTAAA 
ATTATCCTTCAGGCATAAATATTTTTGACAGCAGAATAGAAGAATGATTCATGAG AACCTGAACCAGATGAACAGCTACTAGTTATTTTATCAAATACAGATGACATTTA AAAATTCTTAACTACAAGAGATTAGAAATATAAACCTTGCCTGGCTCTTGCCAGG AGATAACAAAATGGGTTGCTGATGAACTGCACCCTTTTACATGTGGGTAGAATAT AAGCTCACATGGCAGTGAGATGTTGAAAAGTCAAAAGAGACCTGTCTCTCTCCTT TCTTTTCTATCTTTAAACCAGAAAACCTCATACTCAGTCCTCAGTGAAAGAAAGT AAAGTATTAAGGACTTTAGGCAGAAGAGCATTGTGTAACTTGACTGAAGATCAT CCATTAATAGTTATTAGGCATTTAGGTAAAATTTTCTAATACCTAAAAATTGTCA AAAACAGTCAATAGGGCTACTGCTGGCCCAAAGACCATTTAGGTCCACCTCCTCT TTTTTGCTCTTTTTTTTTTTTCTGTGACAGTTTCACTGTGTCGCCCAGGCTGGCGTT CAGTGGTGCAATCTCAGCTCACTGCAAACTCTGTCTCCTGGGCTCAAGTGATTCT CGTGCCTCAGCCTCCCGAATAGCTGGAATTACGGGCATGCACCACCACACCTGG CTAATTTTTGTATTTTTAATAGAGATGGGGTTTCACCATATTGGCCAGGCTGATCT CTAACTCCTGGCCTCAAGTGATCTATCTGCCTCCCTCAGCCTCCCAAAGTCTGGG ATTGCAGACAAGTCATCGTACCCGGCCTTCTTTTTTGCCCTTAAAAGTAAGGGAT GTGGGTTTGTACAAAAAAAAAAAAAAAAAAAAAAAAAAACCAGCATACATATG CAAAACTATATATATATGTATATGTAGAGAAAAATACTTCCCATTGATCATTTTT AAAAGGCTTCTGATTGGATATTGTGTTTTAACCAAATTTTAAAGATTAATGGAAT CATGAAAGGGAAAAAATTGATACAACTATGCAGATTTTATAAATGTGCAATAAA AGTATTTGTTTTACA.

[5985] The human 48120 sequence (FIG. 35; SEQ ID NO:77) is approximately 4873 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 3420 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:77; SEQ ID NO:79). The coding sequence encodes a 1139 amino acid protein (SEQ ID NO:78), which is recited as follows: 39 MTAELQQDDAAGAADGHGSSCQMLLNQLREITGIQDPSFLHEALKASNGDITQAVS (SEQ ID NO:78) LLTDERVKEPSQDTVATEPSEVEGSAANKEVLAKVIDLTHDNKDDLQAAIALSLLES PKIQADGRDLNRMHEATSAETKRSKRKRCEVWGENPNPNDWRRVDGWPVGLKNV GNTCWFSAVIQSLFQLPEFRRLVLSYSLPQNVLENCRSHTEKRNIMFMQELQYLFAL MMGSNRKFVDPSAALDLLKGAFRSSEEQQQDVSEFTHKLLDWLEDAFQLAVNVNS PRNKSENPMVQLFYGTFLTEGVREGKPFCNNETFGQYPLQVNGYRNLDECLEGAMV EGDVELLPSDHSVKYGQERWFTKLPPVLTFELSRFEFNQSLGQPEKIHNKLEFPQIIY MDRYMYRSKELIRNKRECIRKLKEEIKILQQKLERYVKYGSGPARFPLPDMLKYVIE FASTKPASESCPPESDTHMTLPLSSVHCSVSDQTSKESTSTESSSQDVESTFSSPEDSLP KSKPLTSSRSSMEMPSQPAPRTVTDEEINFVKTCLQRWRSEIEQDIQDLKTCIASTTQT IEQMYCDPLLRQVPYRLHAVLVHEGQANAGHYWAYIYNQPRQSWLKYNDISVTESS WEEVERDSYGGLRNVSAYCLMYINDKLPYFNAEAAPTESDQMSEVEALSVELKHYI QEDNWRFEQEVEEWEEEQSCKIPQMESSTNSSSQDYSTSQEPSVASSHGVRCLSSEH AVIVKEQTAQAIANTARAYEKSGVEAALSEVKEAEPKKPMPQETNLAEQSEQPPKA NDAESTAQPNSEVSEVEIPSVGRILVRSDADGYDEEVMLSPAMQGVILAIAKARQTF DRDGSEAGLIKAFHEEYSRLYQLAKETPTSHSDPRLQHVLVYFFQNEAPKRVVERTL LEQFADKNLSYDERSISIMKVAQAKLKEIGPDDMNMEEYKKWHEDYSLFRKVSVYL LTGLELYQKGKYQEALSYLVYAYQSNAALLMKGPRRGVKESVIALYRRKCLLELNA KAASLFETNDDHSVTEGINVMNELIIPCIHLIINNDISKDDLDAIEVMRNHWCSYLGQ DIAENLQLCLGEFLPRLLDPSAEIIVLKEPPTIRPNSPYDLCSRFAAVMESIQGVSTVTV K.

[5986] The human 46689 nucleic acid sequence is recited as follows: 40 CACTAGTAACGCCGCCATGTGCTGGAATTCGCCCTTCTCGGGAAGCGCGCCATTG (SEQ ID NO:80) TGTTGGTACCCGGGAATTCGCGGCCGCGTCGACGCCCGCCGGGGCTCTCCAGCTT CGCCATGCCGCCGTGGGGCGCCGCCCTCGCGCTCATCTTGGCCGTGCTCGCCCTT CTCGGCCTGCTCGGCCCGCGGCTCCGGGGACCCTGGGGGCGCGCCGTCGGAGAG AGGACCCTGCCGGGGGCCCAAGACCGAGACGACGGGGAGGAGGCGGACGGCGG AGGCCCGGCGGACCAGTTCAGCGACGGGCGCGAGCCACTGCCGGGAGGGTGCA GCCTTGTTTGCAAGCCGTCGGCCCTGGCCCAGTGCCTGCTGCGCGCCCTGCGGCG CTCAGAGGCGCTGGAGGCCGGCCCGCGCTCCTGGTTCTCCGGGCCCCACCTGCA GACCCTCTGCCACTTCGTCCTGCCCGTAGCGCCTGGGCCTGAGCTGGCCCGGGAG TACCTGCAGTTGGCGGACGATGGGCTAGTGGCCCTGGACTGGGTGGTAGGACCT TGTGTTCGGGGCCGCCGGATCACCAGCGCCGGGGGCCTTCCTGCGGTGCTTCTGG TGATCCCCAATGCGTGGGGTCGCCTCACCCGCAACGTGCTCGGCCTTTGCTTGCT CGCCCTGGAGCGCGGCTACTACCCGGTCATCTTCCATCGCCGCGGCCACCACGGT TGCCCACTGGTCAGCCCCCGGCTGCAGCCTTTCGGGGACCCGTCCGACCTCAAGG AGGCGGTCACATACATCCGCTTCCGACACCCGGCGGCGCCGCTGTTCGCGGTGA GCGAAGGCTCGGGCTCGGCGCTGCTCCTGTCCTACCTGGGCGAGTGCGGCTCCTC CAGCTACGTGACAGGCGCCGCCTGCATCTCGCCCGTGCTGCGCTGCCGAGAGTG GTTCGAGGCCGGCCTGCCCTGGCCCTACGAGCGGGGCTTTCTGCTCCACCAGAA GATCGCCCTCAGCAGGTATGCCACAGCCCTGGAGGACACTGTGGACACCAGCAG ACTGTTCAGGAGCCGTTCCCTTCGAGAGTTTGAGGAGGCTCTCTTCTGCCACACC AAAAGCTTCCCCATCAGCTGGGATGCCTACTGGGACCGCAACGACCCGCTCCGG GATGTCGATGAGGCAGCCGTGCCTGTGCTGTGTATCTGCAGTGCTGACGACCCCG TGTGTGGACCCCCAGACCACACTCTGACAACTGAACTCTTCCACAGCAACCCCTA CTTCTTCCTCCTGCTCAGTCGCCACGGAGGCCACTGTGGCTTCCTGCGCCAGGAG CCCTTGCCAGCCTGGAGCCATGAGGTCATCTTGGAGTCCTTCCGGGCCTTGACTG AGTTCTTCCGAACGGAGGAGAGGATTAAAGGGCTGAGCAGGCACAGAGCTTCCT TCCTTGGGGGCCGTCGTCGTGGGGGAGCCTTGCAGAGGCGGGAAGTCTCTTCCTC TTCCAACCTGGAGGAGATCTTTAACTGGAAGCGATCATACACAAGGTGAGAGAC CTGGCCTGAGAACCCCCAAGTCCTGCAAAGAAAAACAGAGCTGGGCAAGGGGG AGTCCTGGAAAGATGGGGCGGACTGAACAGAGGGAGCTCCAGCTCTGTGCTCCT CATTCAGTCCCTCTCTCTTAAATTGGTGCCTTGAAAGAGAAGGAACGTCCTGCGA GCCTGCACTCACTTCATCCTCAGCAGAACTCCTGCCTGGCCTCTGCTCAACATAT CCCTACTCATCCGGTCAGCAGCGGCGCGTTCCAGTCACTGTCACCTGTCACTGAC ATCACAAGCCAAAGGATAGCACTTTTTCAATCCATGGACTCAGGAGAAAATGCC 
CTCTTACTGGCAGTGGCTAGAGGGATGAGACGTTTGTGTATGTCACTGGGCAGTG ACCCCGATTCTCAAGCTGGAGCCATTTGATGTCATGAGGACAGGATGTTTGTGTC TCGGCCCCACTTCCCTCATTTGCTCTGTGGTTGTGGCGCCCTGCTTTGACCGAATG CTCTGGCAACTGCGGCAGCAGGCTTGTGTGTGTGAGAAGGGCGGCAGAGGCAGT GGGGCTGGCT.

[5987] The human 46689 sequence (FIG. 37; SEQ ID NO:80) is approximately 2082 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TGA) that are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence of about 1407 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:80; SEQ ID NO:82). The coding sequence encodes a 468 amino acid protein (SEQ ID NO:81), which is recited as follows: 41 MPPWGAALALILAVLALLGLLGPRLRGPWGRAVGERTLPGAQDRDDGEEADGGGP (SEQ ID NO:81) ADQFSDGREPLPGGCSLVCKPSALAQCLLRALRRSEALEAGPRSWFSGPHLQTLCHF VLPVAPGPELAREYLQLADDGLVALDWVVGPCVRGRRITSAGGLPAVLLVIPNAWG RLTRNVLGLCLLALERGYYPVIFHRRGHHGCPLVSPRLQPFGDPSDLKEAVTYIRFRH PAAPLFAVSEGSGSALLLSYLGECGSSSYVTGAACISPVLRCREWFEAGLPWPYERG FLLHQKIALSRYATALEDTVDTSRLFRSRSLREFEEALFCHTKSFPISWDAYWDRNDP LRDVDEAAVPVLCICSADDPVCGPPDHTLTTELFHSNPYFFLLLSRHGGHCGFLRQEP LPAWSHEVILESFRALTEFFRTEERIKGLSRHRASFLGGRRRGGALQRREVSSSSNLEE IFNWKRSYTR.

Example 49 Tissue Distribution of 23479, 48120, or 46689 mRNA by TaqMan Analysis

[5988] Endogenous human 23479, 48120, or 46689 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[5989] To determine the level of 23479 and 48120 in various human tissues, primer/probe sets were designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 µg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 19 and 20. 23479 mRNA was detected in coronary SMC, HUVEC, normal brain cortex, brain hypothalamus, lung tumor, and erythroid cells (Table 19). 48120 expression was found in heart, heart CHF, normal brain cortex, and lung tumor (Table 20).

[5990] To determine the level of 46689 in various human tissues a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 µg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 18-24. 46689 mRNA was detected in hematopoietic cells (e.g., bone-marrow mononuclear cells), lung, thymus, glial cells, kidney, and the brain (e.g., the cortex), as well as in numerous tumors from the lung, brain, breast, and ovary.

TABLE 19
Expression of 23479 mRNA

Tissue                                          Relative Expression
Artery normal                                   136.3135
Aorta diseased                                  23.0355
Vein normal                                     6.7776
Coronary Smooth Muscle Cells (SMC)              219.1514
Human Umbilical Vein Endothelial Cells (HUVEC)  829.3195
Hemangioma                                      36.3979
Heart normal                                    102.9489
Heart Congestive Heart Failure                  89.6222
Kidney                                          122.8526
Skeletal Muscle                                 170.755
Adipose normal                                  5.4861
Pancreas                                        37.6815
primary osteoblasts                             44.1942
Osteoclasts (differentiated)                    0
Skin normal                                     4.3796
Spinal cord normal                              37.4212
Brain Cortex normal                             675.9554
Brain Hypothalamus normal                       219.9123
Nerve                                           32.1286
Dorsal Root Ganglion                            54.7879
Breast normal                                   29.2585
Breast tumor                                    69.8304
Ovary normal                                    88.3883
Ovary Tumor                                     1.0761
Prostate Normal                                 6.7308
Prostate Tumor                                  54.7879
Salivary glands                                 3.2848
Colon normal                                    0.7689
Colon Tumor                                     27.3939
Lung normal                                     0.5929
Lung tumor                                      184.2837
Lung Chronic Obstructive Pulmonary Disease      1.3433
Colon Inflammatory Bowel Disease                0.2358
Liver normal                                    2.015
Liver fibrosis                                  11.2028
Spleen normal                                   2.7147
Tonsil normal                                   14.18
Lymph node normal                               10.0616
Small intestine normal                          3.0121
Skin-Decubitus                                  8.6086
Synovium                                        0.7026
Bone Marrow, Mononuclear Cells                  6.9441
Activated peripheral blood mononuclear cells    2.7241
Neutrophils                                     16.9215
Megakaryocytes                                  89.9333
Erythroid                                       424.8425

[5991] Expression of human 23479 mRNA was detected in many tissues. Prominent expression was detected in cardiovascular tissues (e.g., arteries, smooth muscle cells, endothelial cells, and the heart), kidney, skeletal muscle, brain (e.g., the cortex and hypothalamus), ovary, and blood cells (e.g., megakaryocytes and erythroid cells). Significantly, expression of human 23479 was elevated in lung, breast, prostate, and colon tumors, as compared to its expression in the appropriate normal tissues. In contrast, expression of human 23479 was decreased in ovary tumors as compared to its expression in normal ovary tissue.

TABLE 20
Expression of 48120 mRNA

Tissue                                          Relative Expression
Artery normal                                   5.9826
Aorta diseased                                  0.6858
Vein normal                                     0.398
Coronary Smooth Muscle Cells                    8.6986
Human Umbilical Vein Endothelial Cells          13.9848
Hemangioma                                      2.3388
Heart normal                                    34.5541
Heart congestive heart failure                  104.0248
Kidney                                          3.9334
Adipose normal                                  0
Pancreas                                        3.3422
primary osteoblasts                             0.2025
Osteoclasts (differentiated)                    0
Skin normal                                     0
Spinal cord normal                              0.4832
Brain Cortex normal                             39.83
Brain Hypothalamus normal                       6.8485
Nerve                                           1.8542
Dorsal Root Ganglion                            1.348
Breast normal                                   1.6142
Breast tumor                                    1.249
Ovary normal                                    5.962
Ovary Tumor                                     0.4832
Prostate Normal                                 1.9531
Prostate Tumor                                  4.4716
Salivary glands                                 0.231
Colon normal                                    0.3133
Colon Tumor                                     5.2082
Lung normal                                     0.1567
Lung tumor                                      49.8936
Lung Chronic Obstructive Pulmonary Disease      0.2247
Colon Inflammatory Bowel Disease                0.06
Liver normal                                    0.3898
Liver fibrosis                                  1.334
Spleen normal                                   1.5592
Tonsil normal                                   1.8931
Lymph node normal                               1.6367
Small intestine normal                          0.4325
Macrophages                                     0.0183
Synovium                                        0
Bone Marrow, Mononuclear Cells                  0.0363
Activated peripheral blood mononuclear cells    0.0754
Neutrophils                                     0.5628
Megakaryocytes                                  0.0407
Erythroid                                       0.5325
positive control                                59.1286
Skeletal Muscle                                 21.8685

[5992] Expression of human 48120 mRNA was detected in many tissues, including cardiovascular tissues (e.g., arteries, smooth muscle cells, endothelial cells, and the heart), kidney, pancreas, skeletal muscle, brain (e.g., the cortex and hypothalamus), breast, ovary, prostate, spleen, tonsil, and lymph node tissue. Significantly, expression of human 48120 was highly elevated in heart tissue from a patient that suffered from congestive heart failure. In addition, expression of human 48120 was elevated in lung, colon and prostate tumors, as compared to its expression in the appropriate normal tissues. In contrast, expression of human 48120 was decreased in ovary tumors as compared to its expression in normal ovary tissue.

TABLE 21. Expression of 46689 mRNA

Tissue | Relative Expression
Artery normal | 0
Vein normal | 0
Aortic SMC EARLY | 6.23
Aortic SMC LATE | 5.24
Static HUVEC | 7.13
Shear HUVEC | 7.88
Heart normal | 0.94
Heart CHF | 1.33
Kidney | 10.4
Skeletal Muscle | 1.19
Adipose normal | 7.46
Pancreas | 4.67
Primary Osteoblasts | 2
Osteoclasts (diff) | 0.16
Skin normal | 2.69
Spinal cord normal | 1.18
Brain Cortex normal | 8.07
Brain Hypothalamus normal | 3.97
Nerve | 3.44
DRG (Dorsal Root Ganglion) | 3.31
Glial Cells (Astrocytes) | 12.8
Glioblastoma | 2.65
Breast normal | 6.51
Breast tumor | 104.57
Ovary normal | 0.23
Ovary Tumor | 8.84
Prostate Normal | 5.04
Prostate Tumor | 0.47
Epithelial Cells (Prostate) | 0.36
Colon normal | 0.09
Colon Tumor | 2.83
Lung normal | 40.18
Lung tumor | 16.89
Lung COPD | 81.76
Colon IBD | 2.45
Liver normal | 2.26
Liver fibrosis | 4.01
Dermal Cells- fibroblasts | 6.25
Spleen normal | 6.4
Tonsil normal | 3.68
Lymph node | 0.03
Thymus normal | 22.6
Skin-Decubitus | 9.57
Synovium (rheumatoid Arthritis) | 49.63
BM-MNC | 47.94
Activated PBMC | 0.63

[5993] Expression of human 46689 mRNA was detected in bone marrow mononuclear cells (BM-MNC), lung, thymus, glial cells, kidney, brain cortex, human umbilical vein endothelial cells (HUVECs), aortic smooth muscle cells (SMCs), heart, skeletal muscle, adipose tissue, pancreas, osteoblasts and osteoclasts, skin, spinal cord, hypothalamus, dorsal root ganglia (DRG) and other nerve cells, breast, ovary, prostate, prostate epithelial cells, colon, liver, dermal fibroblasts, spleen, tonsil, lymph nodes, and activated peripheral blood mononuclear cells (PBMC). In addition, human 46689 is expressed in tissues, cells, or fluids associated with disease, including a breast tumor, lung tissue from a patient with chronic obstructive pulmonary disease, synovium from a patient with rheumatoid arthritis, a lung tumor, decubitus skin, an ovary tumor, liver tissue from a patient with liver fibrosis, a colon tumor, colon tissue from a patient with inflammatory bowel disease (IBD), a prostate tumor, and heart tissue from a patient who had congestive heart failure. Importantly, 46689 expression was elevated in the breast, ovary, and colon tumors as compared to the appropriate normal tissue.

TABLE 22. Expression of 46689 mRNA

Tissue | Relative Expression
Hemangioma | 0.0
Hemangioma | 0.9
Hemangioma | 0.5
Normal Kidney | 4.8
Renal Cell Carcinoma | 0.3
Wilms Tumor | 6.4
Wilms Tumor | 4.2
Skin | 0.0
Uterine Adenocarcinoma | 1.2
Neuroblastoma | 1.2
Fetal Adrenal | 0.4
Fetal Kidney | 4.8
Fetal Heart | 1.4
Normal Heart | 2.1
Cartilage | 1.9
Spinal cord | 4.2
lymphangioma | 2.8
Endometrial polyps | 0.0
Synovium (RA) | 0.4
Hyperkeratotic skin | 1.8

[5994] Expression of 46689 was detected in normal kidney, fetal kidney, fetal adrenal gland, normal heart, fetal heart, cartilage, and spinal cord tissues. In addition, expression of 46689 was observed in several samples associated with disease, including Wilms tumors, a lymphangioma, hyperkeratotic skin, a uterine adenocarcinoma, a neuroblastoma, hemangiomas, synovial fluid from a patient with rheumatoid arthritis (RA), and a renal cell carcinoma.

TABLE 23. Expression of 46689 mRNA

Tissue | Relative Expression
PIT 400 Breast N | 0.49
PIT 56 Breast N | 7.34
MDA 106 Breast T | 1.17
MDA 234 Breast T | 0.46
NDR 57 Breast T | 1.02
MDA 304 Breast T | 0.85
NDR 58 Breast T | 2.84
NDR 132 Breast T | 6.99
NDR 07 Breast T | 0.33
NDR 12 Breast T | 13.00
PIT 208 Ovary N | 1.81
CHT 620 Ovary N | 1.05
CHT 619 Ovary N | 1.79
CLN 03 Ovary T | 1.24
CLN 05 Ovary T | 3.57
CLN 17 Ovary T | 2.78
CLN 07 Ovary T | 0.43
CLN 08 Ovary T | 0.30
MDA 216 Ovary T | 0.52
CLN 012 Ovary T | 4.63
MDA 25 Ovary T | 5.86
MDA 183 Lung N | 0.08
CLN 930 Lung N | 0.59
MDA 185 Lung N | 0.51
CHT 816 Lung N | 0.21
MPI 215 Lung T--SmC | 3.51
MDA 259 Lung T-PDNSCCL | 2.14
CHT 832 Lung T-PDNSCCL | 5.05
MDA 253 Lung T-PDNSCCL | 3.96
CHT 814 Lung T-SCC | 5.90
CHT 911 Lung T-SCC | 16.01
CHT 726 Lung T-SCC | 0.91
CHT 845 Lung T-AC | 11.01

[5995] Expression of human 46689 was detected in normal breast, ovary, and lung tissue samples, as well as in breast, ovary, and lung tumors. Significantly, expression of human 46689 mRNA was elevated in 8/8 lung tumors analyzed, 4/7 ovary tumors analyzed, and 1/8 breast tumors analyzed, as compared to normal lung, ovary, and breast tissue samples, respectively. This indicates that human 46689 is a marker of tumor formation and/or growth, especially in the lung. Abbreviations used: N, normal; T, tumor; SmC, small cell carcinoma; PDNSCCL, poorly differentiated non-small cell carcinoma of the lung; SCC, squamous cell carcinoma; AC, adenocarcinoma.

TABLE 24. Expression of 46689 mRNA

Tissue | Relative Expression
Colon N | 0.33
Colon N | 4.51
Colon N | 2.66
Colon N | 23.63
Colon T | 0.30
Colon T | 0.52
Colon T | 6.18
Colon T | 1.88
Colon T | 0.20
Liver Met | 0.20
Liver Met | 0.14
Liver Met | 0.32
Liver Met | 0.18
Liver Nor | 0.10
Liver Nor | 0.27
Brain N | 0.04
Brain N | 0.05
Brain N | 0.11
Brain N | 0.05
Astrocytes | 0.74
Brain T | 1.85
Brain T | 0.91
Brain T | 0.35
Brain T | 0.30
HMVEC-Arrested | 0.13
HMVEC-Proliferating | 0.17
Placenta | 0.49
Fetal Adrenal | 0.14
Fetal Adrenal | 0.00
Fetal Liver | 0.11
Fetal Liver | 2.85
Wilms T | 0.07
Renal T | 0.00
Endometrial AC | 0.13

[5996] Expression of 46689 mRNA was detected in normal colon, liver, and brain tissue, as well as in human microvascular endothelial cells (HMVEC), fetal adrenal gland, and fetal liver. Expression of 46689 was also detected in colon tumors, liver metastases, brain tumors, a Wilms tumor, a renal tumor, and an endometrial adenocarcinoma (AC). Significantly, 4/4 brain tumors analyzed displayed elevated 46689 mRNA expression as compared to normal brain tissue, again suggesting that human 46689 is a marker for tumor formation and/or growth in some tissues. Abbreviations used: N, normal; T, tumor; Met, metastasis.

TABLE 25. Expression of 46689 mRNA

Tissue | Relative Expression
PIT 337 Colon N | 13.09
CHT 410 Colon N | 4.32
CHT 425 Colon N | 4.46
CHT 371 Colon N | 4.02
PIT 281 Colon N | 7.21
NDR 211 Colon N | 4.50
CHT 122 Adenomas | 6.50
CHT 887 Adenomas | 21.64
CHT 414 Colonic ACA-B | 9.29
CHT 841 Colonic ACA-B | 8.00
CHT 890 Colonic ACA-B | 3.09
CHT 910 Colonic ACA-B | 3.34
CHT 807 Colonic ACA-B | 3.75
CHT 382 Colonic ACA-B | 3.92
CHT 377 Colonic ACA-B | 3.04
CHT 520 Colonic ACA-C | 4.58
CHT 596 Colonic ACA-C | 4.04
CHT 907 Colonic ACA-C | 6.97
CHT 372 Colonic ACA-C | 11.05
NDR 210 Colonic ACA-C | 9.26
CHT 1365 Colonic ACA-C | 6.75
CLN 740 Liver N | 26.83
CLN 741 Liver N | 26.64
NDR 165 Liver N | 6.11
NDR 150 Liver N | 20.05
PIT 236 Liver N | 5.68
CHT 1878 Liver N | 21.27
CHT 077 Colon to Liver Met | 7.73
CHT 119 Colon to Liver Met | 21.64
CHT 131 Colon to Liver Met | 18.07
CHT 218 Colon to Liver Met | 14.73
CHT 739 Colon to Liver Met | 14.63
CHT 755 Colon to Liver Met | 18.52
CHT 215 Col Abdominal Met | 2.57

[5997] Expression of 46689 mRNA was detected in both normal colon and liver tissues, as well as in colon tumors and liver metastases. Of the colon tumors, 1/15 displayed an increase in 46689 expression relative to normal colon tissue. Abbreviations used: N, normal; ACA, adenocarcinoma; Met, metastasis.

TABLE 26. Expression of 46689 mRNA

Cell line | Relative Expression
MCF-7 Breast T | 53.47
ZR75 Breast T | 44.66
T47D Breast T | 49.38
MDA 231 Breast T | 19.71
MDA 435 Breast T | 7.81
SKBr3 Breast | 19.78
DLD 1 ColonT (stageC) | 113.83
HCT116 Colon T | 31.58
HT29 Colon T | 42.25
Colo 205 Colon T | 15.68
NCIH125 Lung T | 8.34
A549 Lung T | 55.36
NHBE Lung | 33.49
SKOV-3 ovary | 94.08
OVCAR-3 ovary | 10.82
293 Baby Kidney | 6.71
293T Baby Kidney | 19.92

[5998] Table 26 shows expression of 46689 mRNA in cell lines that are suitable for transplantation into mice as part of a xenografting experiment. All cell lines displayed high expression of human 46689, with the exception of MDA 435 Breast tumor cells, NCIH 125 Lung tumor cells, and 293 Baby Kidney cells. Expression of 46689 mRNA is particularly high in stage C DLD 1 Colon tumor cells and SKOV-3 Ovary cells.

TABLE 27. Expression of 46689 mRNA

Tissue | Relative Expression
RIP Angio | 9.0366
RIP Tumor | 13.9848
Xeno Parent 1 | 3.6955
Xeno Parent 2 | 0.0566
Xeno VEGF 1 | 0.1763
Xeno VEGF 2 | 0.251
Spleen | 3.0754
Heart | 0.4078
Kidney | 0.2536
Liver | 3.3422
VEGF 1 | 0.862
VEGF 2 | 1.3294
P1 | 0.4431
P2 | 0.2853

[5999] Cells expressing human 46689 were transplanted into mice and allowed to form tumors in vivo. The tumor cells were then isolated and analyzed for 46689 expression. RIP Angio shows 46689 expression in cells that were removed just after angiogenesis had begun to provide blood vessels to the tumor, while RIP Tumor shows 46689 expression in cells that were removed after angiogenesis had contributed to further tumor growth. Angiogenesis-dependent tumor growth was correlated with an increase in 46689 expression. The Xeno VEGF1, Xeno VEGF2, VEGF1, and VEGF2 values correspond to 46689 expression levels for cells transplanted into mice that overexpress either VEGF1 or VEGF2, growth factors involved in angiogenesis. The Xeno Parent 1, Xeno Parent 2, P1, and P2 values correspond to control mice that do not overexpress either VEGF1 or VEGF2.

Example 50 Tissue Distribution of 23479, 48120, or 46689 mRNA by Northern Analysis

[6000] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 23479, 48120, or 46689 cDNA (SEQ ID NO:74) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.

Example 51 Recombinant Expression of 23479, 48120, or 46689 in Bacterial Cells

[6001] In this example, 23479, 48120, or 46689 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 23479, 48120, or 46689 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-23479, 48120, or 46689 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.

Example 52 Expression of Recombinant 23479, 48120, or 46689 Protein in COS Cells

[6002] To express the 23479, 48120, or 46689 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 23479, 48120, or 46689 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6003] To construct the plasmid, the 23479, 48120, or 46689 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 23479, 48120, or 46689 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 23479, 48120, or 46689 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 23479, 48120, or 46689 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.

[6004] COS cells are subsequently transfected with the 23479, 48120, or 46689-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 23479, 48120, or 46689 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6005] Alternatively, DNA containing the 23479, 48120, or 46689 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 23479, 48120, or 46689 polypeptide is detected by radiolabelling and immunoprecipitation using a 23479, 48120, or 46689 specific monoclonal antibody.

Examples for 80091

Example 53 Identification and Characterization of Human 80091 cDNA

[6006] 51 ATGGTGGCTGATGCCTGTAATCCCAACAGTTTGGGAGACTGGGGAGGAAGATCA (SEQ ID NO:94) TTTGAGGCCAGGAGTTTGAGACCAGCCTGGGCTCAAGCAGTCCTGCCTCAGCCTC CCAAAGTGCTGGGATTACAGATGGGTCATCTTACTCTGGAAGACTATCAGATCTG GAGTGTGAAAAATGTTCTTGCCAATGAGTTTTTGAACCTCCTTTTCCAGGTGTGT CACATAGTTCTGGGGTTAAGACCAGCTACTCCGGAAGAAGAAGGACAAATTATT AGAGGATGGTTAGAACGAGAGAGCAGGTATGGTCTGCAAGCAGGACACAACTG GTTTATCATCTCCATGCAGTGGTGGCAACAGTGGAAAGAATATGTCAAATACGA TGCCAACCCTGTGGTAATTGAGCCATCATCTGTTTTGAATGGAGGAAAATACTCA TTTGGAACTGCAGCCCATCCTATGGAGCAGGTCGAAGATAGAATTGGAAGCAGC CTCAGTTACGTGAATACTACAGAAGAGAAATTTTCAGACAACATTTCTACTGCAT CTGAAGCCTCAGAAACTGCTGGCAGCGGCTTTCTGTATTCTGCCACACCAGGGGC AGATGTTTGCTTTGCTCGACAACATAACACTTCTGACAATAACAACCAGTGTTTG CTGGGAGCCAATGGGAATATTTTGTTGCACCTTAACCCTCAGAAACCAGGGGCT ATTGATAATCAGCCATTAGTAACTCAAGAACCAGTAAAGGCTACATCATTAACA CTAGAAGGAGGACGATTAAAACGAACTCCACAGCTGATTCATGGAAGAGACTAT GAAATGGTCCCAGAACCTGTGTGGAGAGCACTTTATCACTGGTATGGAGCAAAC CTGGCCTTACCTAGACCAGTTATCAAGAACAGCAAGACAGACATCCCAGAGCTG GAATTATTTCCCCGCTATCTTCTCTTCCTGAGACAGCAGCCTGCCACTCGGACAC AGCAGTCTAACATCTGGGTGAATATGGGAAATGTACCTTCTCCGAATGCACCTTT AAAGCGGGTATTAGCCTATACAGGCTGTTTTAGTCGAATGCAGACCATCAAGGA AATTCACGAATATCTATCTCAAAGACTGCGCATTAAAGAGGAAGATATGCGCCT GTGGCTATACAACAGTGAGAACTACCTTACTCTTCTGGATGATGAGGATCATAAA TTGGAATATTTGAAAATCCAGGATGAACAACACCTGGTAATTGAAGTTCGCAAC AAAGATATGAGTTGGCCTGAGGAGATGTCTTTTATAGCAAATAGTAGTAAAATA GATAGACACAAGGTTCCCACAGAAAAGGGAGCCACAGGTCTAAGCAATCTGGG AAACACATGCTTCATGAACTCAAGCATCCAGTGTGTTAGTAACACACAGCCACT GACACAGTATTTTATCTCAGGGAGACATCTTTATGAACTCAACAGGACAAATCCC ATTGGTATGAAGGGGCATATGGCTAAATGCTATGGTGATTTAGTGCAGGAACTTT GGAGTGGAACTCAGAAGAATGTTGCCCCATTAAAGCTTCGGTGGACCATAGCAA AATATGCTCCCAGGTTTAATGGGTTTCAGCAACAGGACTCCCAAGAACTTCTGGC TTTTCTCTTGGATGGTCTTCATGAAGATCTTAATCGAGTCCATGAAAAGCCATAT GTGGAACTGAAGGACAGTGATGGGCGACCAGACTGGGAAGTAGCTGCAGAGGC CTGGGACAACCATCTAAGAAGAAATAGATCAATTGTTGTGGATTTGTTCCATGGG CAGCTAAGATCTCAAGTAAAATGCAAGACATGTGGGCATATAAGTGTCCGATTT GACCCTTTCAATTTTTTGTCTTTGCCACTACCAATGGACAGTTATATGCACTTAGA 
AATAACAGTGATTAAGTTAGATGGTACTACCCCTGTACGGTATGGACTAAGACT GAATATGGATGAAAAGTACACAGGTTTAAAAAAACAGCTGAGTGATCTCTGTGG ACTTAATTCAGAACAAATCCTTCTAGCAGAAGTACATGGTTCCAACATAAAGAA CTTTCCTCAGGACAACCAAAAAGTACGACTCTCAGTGAGTGGATTTTTGTGTGCA TTTGAAATTCCTGTCCCTGTGTCTCCAATTTCAGCTTCTAGTCCAACACAGACAG ATTTCTCCTCTTCGCCATCTACAAATGAAATGTTCACCCTAACTACCAATGGGGA CCTACCCCGACCAATATTCATCCCCAATGGAATGCCAAACACTGTTGTGCCATGT GGAACTGAGAAGAACTTCACAAATGGAATGGTTAATGGTCACATGCCATCTCTT CCTGACAGCCCCTTTACAGGTTACATCATTGCAGTCCACCGAAAAATGATGAGG ACAGAACTGTATTTCCTGTCATCTCAGAAGAATCGCCCCAGCCTCTTTGGAATGC CATTGATTGTTCCATGTACTGTGCATACCCGGAAGAAAGACCTATATGATGCGGT TTGGATTCAAGTATCCCGGTTAGCGAGCCCACTCCCACCTCAGGAAGCTAGTAAT CATGCCCAGGATTGTGACGACAGTATGGGCTATCAATATCCATTCACTCTACGAG TTGTGCAGAAAGATGGGAACTCCTGTGCTTGGTGCCCATGGTATAGATTTTGCAG AGGCTGTAAAATTGATTGTGGGGAAGACAGAGCTTTCATTGGAAATGCCTATAT CGCTGTGGATTGGGATCCCACAGCCCTTCACCTTCGCTATCAAACATCCCAGGAA AGGGTTGTAGATGAGCATGAGAGTGTGGAGCAGAGTCGGCGAGCGCAAGCCGA GCCCATCAACCTGGACAGCTGTCTCCGTGCTTTCACCAGTGAGGAAGAGCTAGG GGAAAATGAGATGTACTACTGTTCCAAGTGTAAGACCCACTGCTTAGCAACAAA GAAGCTGGATCTCTGGAGGCTTCCACCCATCCTGATTATTCACCTTAAGCGATTT CAATTTGTAAATGGTCGGTGGATAAAATCACAGAAAATTGTCAAATTTCCTCGGG AAAGTTTTGATCCAAGTGCTTTTTTGGTACCAAGAGACCCGGCTCTCTGCCAGCA TAAACCACTCACACCCCAGGGGGATGAGCTCTCTGAGCCCAGGATTCTGGCAAG GGAGGTGAAGAAAGTGGATGCGCAGAGTTCGGCTGGGGAAGAGGACGTGCTCC TGAGCAAAAGCCCATCCTCACTCAGCGCTAACATCATCAGCAGCCCGAAAGGTT CTCCTTCTTCATCAAGAAAAAGTGGAACCAGCTGTCCCTCCAGCAAAAACAGCA GCCCTAATAGCAGCCCACGGACTTTGGGGAGGAGCAAAGGGAGGCTCCGGCTGC CCCAGATTGGCAGCAAAAATAAACTGTCAAGTAGTAAAGAGAACTTGGATGCCA GCAAAGAAAATGGGGCTGGGCAGATATGTGAGCTGGCTGACGCCTTGAGTCGAG GGCATGTGCTGGGGGGCAGCCAACCAGAGTTGGTCACTCCTCAGGACCATGAGG TAGCTTTGGCCAATGGATTCCTTTATGAGCATGAAGCATGTGGCAATGGCTACAG CAATGGTCAGCTTGGAAACCACAGTGAAGAAGACAGCACTGATGACCAAAGAG AAGATACTCGTATTAAGCCTATTTATAATCTATATGCAATTTCGTGCCATTCAGG AATTCTGGGTGGGGGCCATTACGTCACTTATGCCAAAAACCCAAACTGCAAGTG GTACTGTTACAATGACAGCAGCTGTAAGGAACTTCACCCGGATGAAATTGACAC CGACTCTGCCTACATTCTTTTCTATGAGCAGCAGGGGATAGACTATGCACAATTT 
CTGCCAAAGACTGATGGCAAAAAGATGGCAGACACAAGCAGTATGGATGAAGA CTTTGAGTCTGATTACAAAAAGTACTGTGTGTTACAGTAA.

[6007] The human 80091 sequence (SEQ ID NO:94) is approximately 3954 nucleotides long. The nucleic acid sequence includes an initiation codon (ATG) and a termination codon (TAA) which are underscored above. The region between and inclusive of the initiation codon and the termination codon is a methionine-initiated coding sequence, including the termination codon. The coding sequence encodes a 1317 amino acid protein (SEQ ID NO:95), which is recited as follows: 52 MVADACNPNSLGDWGGRSFEARSLRPAWAQAVLPQPPKVLGLQMGHLTLEDYQI (SEQ ID NO:95) WSVKNVLANEFLNLLFQVCHIVLGLRPATPEEEGQIIRGWLERESRYGLQAGHNWFII SMQWWQQWKEYVKYDANPVVIEPSSVLNGGKYSFGTAAHPMEQVEDRIGSSLSYV NTTEEKFSDNISTASEASETAGSGFLYSATPGADVCFARQHNTSDNNNQCLLGANGN ILLHLNPQKPGAIDNQPLVTQEPVKATSLTLEGGRLKRTPQLIHGRDYEMVPEPVWR ALYHWYGANLALPRPVIKNSKTDIPELELFPRYLLFLRQQPATRTQQSNIWVNMGNV PSPNAPLKRVLAYTGCFSRMQTIKEIHEYLSQRLRIKEEDMRLWLYNSENYLTLLDD EDHKLEYLKIQDEQHLVIEVRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNL GNTCFMNSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGDLVQELW SGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFLLDGLHEDLNRVHEKPYVE LKDSDGRPDWEVAAEAWDNHLRRNRSIVVDLFHGQLRSQVKCKTCGHISVRFDPFN FLSLPLPMDSYMHLEITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQI LLAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSPTQTDFSSSPSTNEMF TLTTNGDLPRPIFIPNGMPNTVVPCGTEKNFTNGMVNGHMPSLPDSPFTGYIIAVHRK MMRTELYFLSSQKNRPSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEAS NHAQDCDDSMGYQYPFTLRVVQKDGNSCAWCPWYRFCRGCKIDCGEDRAFIGNA YIAVDWDPTALHLRYQTSQERVVDEHESVEQSRRAQAEPINLDSCLRAFTSEEELGE NEMYYCSKCKTHCLATKKLDLWRLPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDP SAFLVPRDPALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKSPSSLS ANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRSKGRLRLPQIGSKNKLSSSKEN LDASKENGAGQICELADALSRGHVLGGSQPELVTPQDHEVALANGFLYEHEACGNG YSNGQLGNHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKNPNCKWY CYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPKTDGKKMADTSSMDEDFESD YKKYCVLQ.
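The relationship between the stated coding-sequence length and the stated protein length can be checked with simple arithmetic. The following is a minimal sketch, not part of the original disclosure; the function name is hypothetical, and it assumes only the standard genetic-code convention that each amino acid is encoded by three nucleotides and that the termination codon is not translated.

```python
def protein_length_from_cds(cds_nucleotides: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    cds_nucleotides: length of the coding sequence in nucleotides.
    includes_stop:   True if that length counts the termination codon.
    """
    if cds_nucleotides % 3 != 0:
        raise ValueError("a coding sequence must be a whole number of codons")
    codons = cds_nucleotides // 3
    # The termination codon occupies one codon but adds no amino acid.
    return codons - 1 if includes_stop else codons


# 80091: 3954 nucleotides from ATG through TAA, inclusive.
print(protein_length_from_cds(3954))  # 1317
```

A 3954-nucleotide coding sequence that includes its termination codon contains 1318 codons and therefore encodes 1317 amino acids, in agreement with SEQ ID NO:95.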

Example 54 Tissue Distribution of 80091 mRNA by TaqMan Analysis

[6008] Endogenous human 80091 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).
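The use of a housekeeping gene as an internal control can be illustrated with a short calculation. This is a minimal sketch only: the source does not state how the relative expression values in the tables were computed, so the comparative threshold-cycle (delta-Ct) formula, the assumption of perfect doubling per cycle, the function name, and the Ct values below are all illustrative assumptions.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Target abundance relative to a reference gene by the delta-Ct method.

    ct_target:    threshold cycle of the gene of interest (hypothetical input).
    ct_reference: threshold cycle of the housekeeping gene, e.g. GAPDH.
    efficiency:   fold amplification per cycle; 2.0 assumes perfect doubling.
    """
    # A later threshold cycle means less starting template, so the
    # exponent is reference minus target.
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values, not taken from the source.
sample_a = relative_expression(ct_target=25.0, ct_reference=20.0)
sample_b = relative_expression(ct_target=22.0, ct_reference=20.0)
print(sample_a)             # 0.03125
print(sample_b)             # 0.25
print(sample_b / sample_a)  # 8.0
```

Under these assumptions, a target that crosses the threshold three cycles earlier (relative to the same reference) corresponds to eight times as much starting template.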

[6009] To determine the level of 80091 in various human tissues, a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Table 28. The highest levels of 80091 mRNA expression were observed in erythroid cells, followed by brain cortex and HUVEC.

TABLE 28. Expression of 80091

Tissue Type | Relative Expression
Artery normal | 38.6068
Aorta diseased | 10.5253
Vein normal | 4.1433
Coronary SMC | 48.5293
HUVEC | 122.4275
Hemangioma | 8.5492
Heart normal | 31.4674
Heart CHF | 32.5771
Kidney | 35.4026
Skeletal Muscle | 60.1622
Adipose normal | 1.5646
Pancreas | 6.2584
Primary osteoblasts | 27.9695
Osteoclasts (diff) | 0.2302
Skin normal | 11.4382
Spinal cord normal | 29.5643
Brain Cortex normal | 219.9123
Brain Hypothalamus normal | 59.1286
Nerve | 30.3955
DRG (Dorsal Root Ganglion) | 60.7909
Breast normal | 11.5978
Breast tumor | 10.8587
Ovary normal | 11.9239
Ovary Tumor | 2.2436
Prostate Normal | 0.9017
Prostate Tumor | 10.8587
Salivary glands | 0.4556
Colon normal | 0.509
Colon Tumor | 7.7049
Lung normal | 1.1374
Lung tumor | 22.5614
Lung COPD | 0.98
Colon IBD | 0.1187
Liver normal | 6.6843
Liver fibrosis | 17.4576
Spleen normal | 1.6595
Tonsil normal | 1.0358
Lymph node normal | 2.5154
Small intestine normal | 1.0987
Skin-Decubitus | 5.5435
Synovium | 0.8298
BM-MNC | 21.6423
Activated PBMC | 1.8414
Neutrophils | 10.8212
Megakaryocytes | 38.0753
Erythroid | 461.6912
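The ranking stated for Table 28 can be reproduced directly from the tabulated values. The sketch below is illustrative only: it copies a subset of the Table 28 rows (including the six highest values in the table), and the helper names are hypothetical.

```python
# Subset of Table 28 (relative 80091 mRNA expression); values copied from the table.
TABLE_28_SUBSET = {
    "Artery normal": 38.6068,
    "HUVEC": 122.4275,
    "Skeletal Muscle": 60.1622,
    "Brain Cortex normal": 219.9123,
    "Brain Hypothalamus normal": 59.1286,
    "DRG (Dorsal Root Ganglion)": 60.7909,
    "Lung normal": 1.1374,
    "Lung tumor": 22.5614,
    "Megakaryocytes": 38.0753,
    "Erythroid": 461.6912,
}


def top_tissues(expression, n=3):
    """Return the n (tissue, value) pairs with the highest relative expression."""
    return sorted(expression.items(), key=lambda item: item[1], reverse=True)[:n]


def fold_change(expression, case, control):
    """Ratio of relative expression in one sample to that in another."""
    return expression[case] / expression[control]


for tissue, value in top_tissues(TABLE_28_SUBSET):
    print(f"{tissue}: {value}")
# Erythroid: 461.6912
# Brain Cortex normal: 219.9123
# HUVEC: 122.4275

print(round(fold_change(TABLE_28_SUBSET, "Lung tumor", "Lung normal"), 1))
```

The same `fold_change` helper can be applied to any tumor and normal pair in the table; such ratios compare single samples and are not a statistical test.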

Example 55 Tissue Distribution of 80091 mRNA by Northern Analysis

[6010] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 80091 cDNA (SEQ ID NO:94) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.

Example 56 Recombinant Expression of 80091 in Bacterial Cells

[6011] In this example, 80091 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 80091 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-80091 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
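When the purified fusion polypeptide is examined by polyacrylamide gel electrophoresis, it helps to have a rough expected size in hand. The sketch below is an estimate under stated assumptions, not a value from the source: it assumes the full-length 1317-residue 80091 protein is fused to GST, uses the common rule of thumb of about 110 Da per amino acid residue, and takes the GST moiety as roughly 26 kDa. The names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)
GST_TAG_MASS_KDA = 26.0          # approximate mass of the GST moiety (assumption)


def estimate_fusion_mass_kda(n_residues: int,
                             tag_mass_kda: float = GST_TAG_MASS_KDA) -> float:
    """Rough expected mass, in kDa, of a tagged fusion polypeptide."""
    protein_mass_kda = n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0
    return protein_mass_kda + tag_mass_kda


# Full-length 80091 (1317 amino acids, SEQ ID NO:95) fused to GST.
print(round(estimate_fusion_mass_kda(1317), 1))  # 170.9
```

The apparent molecular weight observed on a gel can differ from such an estimate because of amino acid composition and post-translational effects, so the figure is only a guide for choosing gel percentage and size markers.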

Example 57 Expression of Recombinant 80091 Protein in COS Cells

[6012] To express the 80091 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 80091 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6013] To construct the plasmid, the 80091 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 80091 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag, and the last 20 nucleotides of the 80091 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 80091 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
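The primer layout described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the primers actually used: the KpnI and XhoI sites are arbitrary choices, the HA tag coding sequence shown is one common encoding of the YPYDVPDYA epitope, the native termination codon is assumed to be omitted so that the tag is read in-frame before the added stop codon, and the toy coding sequence joins only the first 21 and last 24 coding nucleotides of SEQ ID NO:94 with its TAA codon. All function and constant names are hypothetical.

```python
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # encodes YPYDVPDYA (assumed encoding)
KPNI_SITE = "GGTACC"                    # illustrative 5' restriction site
XHOI_SITE = "CTCGAG"                    # illustrative 3' restriction site
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def design_primers(cds, site_5=KPNI_SITE, site_3=XHOI_SITE,
                   tag=HA_TAG, anneal_len=20, stop="TAA"):
    """Return (forward, reverse) primers, each written 5' to 3'."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    body = cds[:-3]  # drop the native stop so the tag stays in-frame
    forward = site_5 + cds[:anneal_len]
    # Sense-strand layout of the 3' end: gene, tag, stop codon, restriction site.
    sense_3_end = body[-anneal_len:] + tag + stop + site_3
    # The 3' primer anneals to the sense strand, so it is the reverse complement.
    return forward, reverse_complement(sense_3_end)


# Toy coding sequence built from the two ends of SEQ ID NO:94.
TOY_CDS = "ATGGTGGCTGATGCCTGTAAT" + "TACAAAAAGTACTGTGTGTTACAG" + "TAA"
forward_primer, reverse_primer = design_primers(TOY_CDS)
print(forward_primer)  # GGTACCATGGTGGCTGATGCCTGTAA
print(reverse_primer)
```

In practice a few extra bases are often added 5′ of each restriction site to aid digestion, and the chosen enzymes must not cut within the insert; neither point is handled by this sketch.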

[6014] COS cells are subsequently transfected with the 80091-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 80091 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6015] Alternatively, DNA containing the 80091 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 80091 polypeptide is detected by radiolabelling and immunoprecipitation using an 80091 specific monoclonal antibody.

Examples for 46508

Example 58 Identification and Characterization of Human 46508 cDNA

[6016] The human 46508 sequence (FIG. 42; SEQ ID NO:101), which is approximately 1180 nucleotides long, including untranslated regions, contains a predicted methionine-initiated coding sequence of about 684 nucleotides, including the termination codon (nucleotides indicated as “coding” of SEQ ID NO:101; SEQ ID NO:103). The coding sequence encodes a 227 amino acid protein (SEQ ID NO:102).

Example 59 Tissue Distribution of 46508 mRNA by TaqMan Analysis

[6017] Endogenous human 46508 gene expression was determined using the Perkin-Elmer/ABI 7700 Sequence Detection System, which employs TaqMan technology. Briefly, TaqMan technology relies on standard RT-PCR with the addition of a third gene-specific oligonucleotide (referred to as a probe) which has a fluorescent dye coupled to its 5′ end (typically 6-FAM) and a quenching dye at the 3′ end (typically TAMRA). When the fluorescently tagged oligonucleotide is intact, the fluorescent signal from the 5′ dye is quenched. As PCR proceeds, the 5′ to 3′ nucleolytic activity of Taq polymerase digests the labeled probe, producing a free nucleotide labeled with 6-FAM, which is now detected as a fluorescent signal. The PCR cycle at which fluorescence is first released and detected is inversely related to the logarithm of the starting amount of the gene of interest in the test sample, thus providing a quantitative measure of the initial template concentration. Samples can be internally controlled by the addition of a second set of primers/probe specific for a housekeeping gene such as GAPDH, which has been labeled with a different fluorophore on the 5′ end (typically VIC).

[6018] To determine the level of 46508 in various human tissues, a primer/probe set was designed. Total RNA was prepared from a series of human tissues using an RNeasy kit from Qiagen. First strand cDNA was prepared from 1 μg total RNA using an oligo-dT primer and Superscript II reverse transcriptase (Gibco/BRL). cDNA obtained from approximately 50 ng total RNA was used per TaqMan reaction. Tissues tested include the human tissues and several cell lines shown in Tables 29, 30 and 31.

[6019] Table 29 below shows expression of 46508 mRNA in various normal and diseased tissues, detected using TaqMan analysis. The highest transcriptional expression of 46508 was noted in the HUVEC cell line, with moderate to high expression found in normal pancreas, brain, hypothalamus, skeletal muscle, DRG (dorsal root ganglion) and skin. Moderate expression was also noted in normal and tumor pairs of breast, ovarian, prostate, colon and lung tissue, along with fibrotic liver, normal heart and diseased heart (CHF, congestive heart failure) samples.

TABLE 29. Tissue Distribution of 46508 mRNA by TaqMan Analysis

Tissue Type | Expression
Artery normal | 8.6685
Aorta diseased | 5.2992
Vein normal | 1.7725
Coronary SMC | 17.1577
HUVEC | 49.0365
Hemangioma | 5.0834
Heart normal | 8.2009
Heart CHF | 7.1393
Kidney | 9.4204
Skeletal Muscle | 16.6308
Adipose normal | 3.9608
Pancreas | 20.3335
primary osteoblasts | 2.7431
Osteoclasts (diff) | 0.8955
Skin normal | 13.0031
Spinal cord normal | 5.1365
Brain Cortex normal | 24.2647
Brain Hypothalamus normal | 25.2951
Nerve | 7.8942
DRG (Dorsal Root Ganglion) | 16.5159
Breast normal | 6.5016
Breast tumor | 11.4382
Ovary normal | 10.0965
Ovary Tumor | 6.7542
Prostate Normal | 10.273
Prostate Tumor | 11.0485
Salivary glands | 2.1822
Colon normal | 3.5327
Colon Tumor | 14.9885
Lung normal | 4.9273
Lung tumor | 7.3655
Lung COPD | 5.0658
Colon IBD | 3.14
Liver normal | 9.3229
Liver fibrosis | 8.6385
Spleen normal | 2.4129
Tonsil normal | 2.83
Lymph node normal | 5.3176
Small intestine normal | 2.0573
Macrophages | 1.3526
Synovium | 2.3388
BM-MNC | 0.2563
Activated PBMC | 1.5538
Neutrophils | 1.5755
Megakaryocytes | 1.57
Erythroid | 15.3566
positive control | 21.1969

[6020] Table 30 below also shows expression of 46508 mRNA in various normal and diseased tissues, detected using TaqMan analysis. Table 30 shows the high transcriptional expression of 46508 in tumor samples compared to normal organ matched controls. High transcriptional expression was noted in 5/5 primary ovarian tumors, 4/4 primary colon tumors, 2/2 colon to liver metastases, 3/6 primary lung tumors and proliferating HMVEC cells when compared to arrested HMVEC cells.

TABLE 30
Expression of 46508 mRNA in Normal and Cancerous Tissues

Tissue Type                          Expression
PIT 400 Breast Normal                28.36
PIT 372 Breast Normal                36.40
CHT 1228 Breast Normal               8.88
MDA 304 Breast Tumor: MD-IDC         8.34
CHT 2002 Breast Tumor: IDC           3.55
MDA 236-Breast Tumor: PD-IDC         3.21
CHT 562 Breast Tumor: IDC            13.51
NDR 138 Breast Tumor ILC (LG)        25.30
CHT 1841 Lymph node (Breast met)     7.19
PIT 58 Lung (Breast met)             12.22
CHT 620 Ovary Normal                 14.83
CHT 619 Ovary Normal                 7.09
CLN 012 Ovary Tumor                  40.53
CLN 07 Ovary Tumor                   28.26
CLN 17 Ovary Tumor                   94.40
MDA 25 Ovary Tumor                   80.21
CLN 08 Ovary Tumor                   28.07
PIT 298 Lung Normal                  2.14
MDA 185 Lung Normal                  6.05
CLN 930 Lung Normal                  12.13
MPI 215 Lung Tumor --SmC             9.69
MDA 259 Lung Tumor -PDNSCCL          13.94
CHT 832 Lung Tumor -PDNSCCL          8.64
MDA 262 Lung Tumor -SCC              68.87
CHT 793 Lung Tumor -ACA              20.83
CHT 331 Lung Tumor -ACA              8.14
CHT 405 Colon Normal                 2.50
CHT 1685 Colon Normal                2.10
CHT 371 Colon Normal                 0.95
CHT 382 Colon Tumor: MD              69.11
CHT 528 Colon Tumor: MD              54.79
CLN 609 Colon Tumor                  15.25
NDR 210 Colon Tumor: MD-PD           121.16
CHT 340 Colon-Liver Met              15.25
CHT 1637 Colon-Liver Met             10.82
PIT 260 Liver N (female)             1.46
CHT 1653 Cervix Squamous CC          19.51
CHT 569 Cervix Squamous CC           1.31
A24 HMVEC-Arrested                   11.56
C48 HMVEC-Proliferating              31.80
Pooled Hemangiomas                   1.16
HCT116N22 Normoxic                   76.42
HCT116H22 Hypoxic                    68.39
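By way of illustration only, the tumor-versus-normal tallies stated for Table 30 can be checked with a short script. The specification does not state the criterion used to call a tumor sample "high"; the sketch below assumes, purely for illustration, that a tumor counts as elevated when its value exceeds the highest matched normal sample, and that colon to liver metastases are compared against the single normal liver sample. The names NORMAL, TUMOR, count_elevated and summarize are hypothetical.

```python
# Expression values transcribed from Table 30 (relative units).
NORMAL = {
    "ovary": [14.83, 7.09],
    "lung": [2.14, 6.05, 12.13],
    "colon": [2.50, 2.10, 0.95],
    "liver": [1.46],
}

# Each tumor group is paired with the normal tissue it is compared against.
# Comparing colon-liver metastases against normal liver is an assumption.
TUMOR = {
    "ovary": ("ovary", [40.53, 28.26, 94.40, 80.21, 28.07]),
    "lung": ("lung", [9.69, 13.94, 8.64, 68.87, 20.83, 8.14]),
    "colon": ("colon", [69.11, 54.79, 15.25, 121.16]),
    "colon-liver met": ("liver", [15.25, 10.82]),
}


def count_elevated(tumor_values, normal_values):
    """Return (elevated, total) using the assumed criterion:
    a tumor is elevated if it exceeds the highest matched normal value."""
    ceiling = max(normal_values)
    elevated = sum(value > ceiling for value in tumor_values)
    return elevated, len(tumor_values)


def summarize():
    """Apply the assumed criterion to every tumor group in Table 30."""
    return {
        name: count_elevated(values, NORMAL[reference])
        for name, (reference, values) in TUMOR.items()
    }


if __name__ == "__main__":
    for name, (elevated, total) in summarize().items():
        print(f"{name}: {elevated}/{total}")
```

Under this assumed criterion the counts come out as 5/5 ovarian, 3/6 lung, 4/4 colon and 2/2 colon to liver metastases, which agrees with the tallies given in the text; a different threshold could give different counts.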

[6021] Table 31 indicates that 46508 mRNA is highly expressed in several ovarian cell lines including SKOV3/Var, A2780, MDA2774 and ES-2. The table allows comparisons between two normal ovarian surface epithelium samples (MDA 127 Normal Ovary and MDA 224 Normal Ovary) and two ovarian ascites (MDA 124 Ovarian Ascites and MDA 126 Ovarian Ascites) samples. Expression of 46508 mRNA is upregulated in one of the ascites samples. The table also shows an experiment where the ovarian cancer cell line, HEY, was serum starved for 24 hours. Time points were taken at 0, 1, 3, 6, 9 and 12 hours after the addition of 10% serum (HEY 0 hr, HEY 1 hr, HEY 3 hr, HEY 6 hr, HEY 9 hr, and HEY 12 hr, respectively). Since cMyc protein is highly upregulated at 1 hour after addition of serum and phosphorylated at 6 hours, the experiment is a good model for identifying targets that are downstream of cMyc. These data indicate that 46508 mRNA may be regulated in a manner similar to cMyc since the expression increases from 1 to 9 hours after the addition of serum.

[6022] Also shown are data involving the ovarian cancer cell lines SKOV3 and SKOV3/Variant. These cell lines were grown in three different cellular environments: on plastic, in soft agar, and as a subcutaneous tumor in nude mice (all cells grown in 10% serum). The plastic sample was used as the “control” in each experiment. The SKOV3/Var cell line is a variant of the parental cell line SKOV3 which is resistant to cisplatin. These data indicate that 46508 mRNA is upregulated in environments that may be more similar to the tumor in vivo (the soft agar and subcutaneous tumor) compared to growth on plastic.

TABLE 31
Expression in Various Ovarian Cells

Tissue Type                   Expression
SKOV-3 No GF                  40.67
SKOV-3 EGF ′15                41.52
SKOV-3 EGF ′30                46.23
SKOV-3 EGF ′60                38.88
SKOV-3 Hrg ′15                36.02
SKOV-3 Hrg ′30                41.67
SKOV-3 Hrg ′60                48.87
SKOV-3 Serum ′30              54.98
SKOV-3var No GF               151.25
SKOV-3var EGF ′15             140.63
SKOV-3var EGF ′30             126.74
SKOV-3var EGF ′60             125.43
SKOV-3var Hrg ′15             141.61
SKOV-3var Hrg ′30             170.76
SKOV-3var Hrg ′60             140.15
SKOV-3var Serum ′30           190.78
HEY Plastic                   50.94
HEY Soft Agar                 22.96
SKOV-3                        35.65
SKOV-3var                     122.00
A2780                         159.87
A2780-ADR                     60.79
OVCAR-3                       54.98
OVCAR-4                       59.75
MDA2774                       123.28
DOV13                         36.52
Caov-3                        18.14
ES-2                          101.18
HEY 0 hr                      55.17
HEY 1 hr                      62.50
HEY 3 hr                      74.84
HEY 6 hr                      69.59
HEY 9 hr                      77.75
HEY 12 hr                     68.63
SKOV-3 SubQ Tumor             18.01
SKOV-3 Variant Plastic        152.30
SKOV-3 Var SubQ Tumor         9.39
MDA 127 Normal Ovary          11.44
MDA 224 Normal Ovary          17.10
MDA 124 Ovarian Ascites       18.84
MDA 126 Ovarian Ascites       41.67
HEY                           63.81
SKOV-3 Plastic                72.80
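As an illustrative aid, the HEY serum time course reported in Table 31 can be expressed as fold change relative to the 0 hour sample. This normalization is not described in the specification and is assumed here only to make the trend easier to read; the names HEY_TIME_COURSE and fold_change are hypothetical.

```python
# HEY serum-stimulation time course, transcribed from Table 31.
# Keys are hours after addition of 10% serum; values are relative expression.
HEY_TIME_COURSE = {0: 55.17, 1: 62.50, 3: 74.84, 6: 69.59, 9: 77.75, 12: 68.63}


def fold_change(series, baseline_key=0):
    """Divide every value by the baseline value (assumed normalization)
    and round to two decimal places."""
    baseline = series[baseline_key]
    return {key: round(value / baseline, 2) for key, value in series.items()}


if __name__ == "__main__":
    changes = fold_change(HEY_TIME_COURSE)
    for hours, ratio in changes.items():
        print(f"HEY {hours} hr: {ratio:.2f}x of the 0 hr value")
    peak = max(changes, key=changes.get)
    print(f"Highest value at {peak} hr")
```

On these values the highest ratio falls at the 9 hour time point, roughly 1.4 times the 0 hour value, which is a modest change; the sketch makes no claim about statistical significance.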

Example 60 Tissue Distribution of 46508 mRNA by Northern Analysis

[6023] Northern blot hybridizations with various RNA samples can be performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 46508 cDNA (SEQ ID NO:101) can be used. The DNA can be radioactively labeled with 32P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) can be probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to the manufacturer's recommendations.

Example 61 Recombinant Expression of 46508 in Bacterial Cells

[6024] In this example, 46508 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 46508 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-46508 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. The molecular weight of the resultant fusion polypeptide is determined by polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates.

Example 62 Expression of Recombinant 46508 Protein in COS Cells

[6025] To express the 46508 gene in COS cells (e.g., COS-7 cells, CV-1 origin SV40 cells; Gluzman (1981) Cell 23:175-182), the pcDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 46508 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

[6026] To construct the plasmid, the 46508 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 46508 coding sequence starting from the initiation codon; the 3′ end sequence contains complementary sequences to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 46508 coding sequence. The PCR amplified fragment and the pcDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 46508 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
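The primer layout described above can be sketched, for illustration only, as follows. The coding sequence used here is an arbitrary placeholder and is not the 46508 sequence; the choice of HindIII (AAGCTT) and XhoI (CTCGAG) sites, the particular HA tag codons, the TAA stop codon, and the removal of any native stop codon before the tag is appended are all assumptions made for the example. The function names are hypothetical, and practical considerations such as extra 5′ bases flanking the restriction sites or melting temperature are not addressed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds, site_5, site_3, tag, stop="TAA", anneal=20):
    """Build a forward and a reverse primer with the layout described above.

    Forward: 5' restriction site + first `anneal` nt from the initiation codon.
    Reverse: reverse complement of (last `anneal` nt + tag + stop + 3' site).
    """
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with the initiation codon ATG")
    # Assumption: drop a native stop codon so the tag is read in-frame.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    if len(cds) < anneal:
        raise ValueError("coding sequence is shorter than the annealing length")
    forward = site_5 + cds[:anneal]
    sense_3_end = cds[-anneal:] + tag + stop + site_3
    return forward, reverse_complement(sense_3_end)


# Hypothetical inputs for illustration; none are taken from the specification.
PLACEHOLDER_CDS = "ATGGCTGAACGTCTGAAAGACCTGTTCGGTATCCACGAACCGTGGAACTAA"
HINDIII = "AAGCTT"                      # assumed 5' restriction site
XHOI = "CTCGAG"                         # assumed 3' restriction site
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"  # one possible coding of YPYDVPDYA

forward_primer, reverse_primer = design_primers(
    PLACEHOLDER_CDS, HINDIII, XHOI, HA_TAG
)

if __name__ == "__main__":
    print("5' primer:", forward_primer)
    print("3' primer:", reverse_primer)
```

Using two different restriction sites, as the text prefers, is what fixes the orientation of the insert; the sketch simply takes the two sites as separate arguments so that either the same or different sites can be tried.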

[6027] COS cells are subsequently transfected with the 46508-pcDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. The expression of the 46508 polypeptide is detected by radiolabelling (35S-methionine or 35S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with 35S-methionine (or 35S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

[6028] Alternatively, DNA containing the 46508 coding sequence is cloned directly into the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 46508 polypeptide is detected by radiolabelling and immunoprecipitation using a 46508 specific monoclonal antibody.

[6029] Equivalents

[6030] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Claims

1. An isolated nucleic acid molecule selected from the group consisting of:

a) a nucleic acid comprising the nucleotide sequence of SEQ ID NO:1, 3, 4, 6, 10, 12, 16, 18, 22, 24, 27, 29, 33, 35, 41, 43, 48, 50, 53, 55, 61, 63, 74, 76, 77, 79, 80, 82, 94, 101 or 103; and

b) a nucleic acid molecule which encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102.

2. The nucleic acid molecule of claim 1, further comprising vector nucleic acid sequences.

3. The nucleic acid molecule of claim 1, further comprising nucleic acid sequences encoding a heterologous polypeptide.

4. A host cell which contains the nucleic acid molecule of claim 1.

5. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102.

6. The polypeptide of claim 5 further comprising heterologous amino acid sequences.

7. An antibody or antigen-binding fragment thereof that selectively binds to a polypeptide of claim 5.

8. A method for producing a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 11, 17, 23, 28, 34, 42, 49, 54, 62, 75, 78, 81, 95 or 102, the method comprising culturing the host cell of claim 4 under conditions in which the nucleic acid molecule is expressed.

9. A method for detecting the presence of a polypeptide of claim 5 in a sample, comprising:

a) contacting the sample with a compound which selectively binds to the polypeptide; and

b) determining whether the compound binds to the polypeptide in the sample.

10. The method of claim 9, wherein the compound which binds to the polypeptide is an antibody.

11. A kit comprising a compound which selectively binds to a polypeptide of claim 5 and instructions for use.

12. A method for detecting the presence of a nucleic acid molecule of claim 1 in a sample, comprising the steps of:

a) contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the nucleic acid molecule; and

b) determining whether the nucleic acid probe or primer binds to a nucleic acid molecule in the sample.

13. The method of claim 12, wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe.

14. A kit comprising a compound which selectively hybridizes to a nucleic acid molecule of claim 1 and instructions for use.

15. A method for identifying a compound which binds to a polypeptide of claim 5 comprising the steps of:

a) contacting a polypeptide, or a cell expressing a polypeptide of claim 5 with a test compound; and

b) determining whether the polypeptide binds to the test compound.

16. A method for modulating the activity of a polypeptide of claim 5, comprising contacting a polypeptide or a cell expressing a polypeptide of claim 5 with a compound which binds to the polypeptide in a sufficient concentration to modulate the activity of the polypeptide.

17. A method of inhibiting aberrant activity of a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell, comprising contacting a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell with a compound that modulates the activity or expression of a polypeptide of claim 5, in an amount which is effective to reduce or inhibit the aberrant activity of the cell.

18. The method of claim 17, wherein the compound is selected from the group consisting of a peptide, a phosphopeptide, a small organic molecule, and an antibody.

19. A method of treating or preventing a disorder characterized by aberrant activity of a 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell, in a subject, comprising:

administering to the subject an effective amount of a compound that modulates the activity or expression of a nucleic acid molecule of claim 1, such that the aberrant activity of the 26443, 46873, 61833, 26493, 58224, 46980, 32225, 47508, 56939, 33410, 33521, 23479, 48120, 46689, 80091, or 46508-expressing cell is reduced or inhibited.