Method for generation of sequence sampled maps of complex genomes

The present invention relates to a rapid and powerful "sequence sampled mapping" method for sequencing complex genomes. The method is applicable to genomic DNA, preferably mammalian chromosomes, and in a preferred embodiment employs a "bottom-up" mapping strategy that allows multiple cosmid clones to be analyzed simultaneously for the detection of overlaps. The sequence sampled mapping method is useful first for the completion of high-density sequence-based maps and ultimately for the complete sequencing of genomic DNA directly from cosmid clones.

Claims

1. A method for sequencing complex genomes, said method comprising:

(1) sequencing at least 100 nucleotides from the end of each member of a library of cosmid clones,
wherein said cosmid clones are prepared by inserting genomic DNA fragments into cosmid vectors, and
wherein the cosmid vectors include sequences of nucleotides that flank at least one end of the inserted DNA, and that serve as transcription initiation sites for the synthesis of nucleic acids specific to the ends of the inserted DNA,
(2) determining the relative spatial relationship between the cosmid clones, and
(3) assembling a sequence sampled map by correlating the end-specific nucleotide sequence information with the relative spatial relationship between the cosmids.
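
Step (3) of claim 1 can be illustrated with a minimal Python sketch. It assumes that each cosmid already has an approximate map position and insert length from step (2); the `Cosmid` class, its field names, and the kilobase coordinates are hypothetical and are not taken from the patent. The sketch only places each end read at the map coordinate of the insert end it came from.

```python
from dataclasses import dataclass

@dataclass
class Cosmid:
    name: str
    start_kb: float   # assumed map position of the left insert end
    length_kb: float  # assumed insert length
    left_end: str     # sequence read from the left insert end
    right_end: str    # sequence read from the right insert end

def sequence_sampled_map(cosmids):
    """Return (position_kb, clone, end, sequence) tuples ordered along the map."""
    samples = []
    for c in cosmids:
        samples.append((c.start_kb, c.name, "left", c.left_end))
        samples.append((c.start_kb + c.length_kb, c.name, "right", c.right_end))
    # Sort by map position so the end reads sample the region left to right.
    return sorted(samples, key=lambda s: s[0])

if __name__ == "__main__":
    # Placeholder reads of 100 nucleotides each; real reads would come from sequencing.
    library = [
        Cosmid("c2", 25.0, 40.0, "ACGT" * 25, "TTGA" * 25),
        Cosmid("c1", 0.0, 40.0, "GGCA" * 25, "CATG" * 25),
    ]
    for position, clone, end, seq in sequence_sampled_map(library):
        print(f"{position:6.1f} kb  {clone} {end:5s}  {seq[:12]}...")
```

With overlapping cosmids the end reads interleave along the map, so the sampling density depends on how deeply the region is covered by clones.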

2. A method according to claim 1 wherein the relative spatial relationship between the cosmids has been determined prior to sequencing the end-specific nucleotides of each member of a library of cosmid clones.

3. A method according to claim 1 wherein the relative spatial relationship between the cosmids is determined by the cosmid multiplex analysis method.

4. A method according to claim 1 wherein the relative spatial relationship between the cosmids is determined by restriction-fragment-length mapping of the cosmids.

5. A method according to claim 1 wherein at least 250 nucleotides are sequenced from the end of the cosmid clones.

6. A method according to claim 1 wherein said cosmid clones are generated in cosmid vectors allowing for the synthesis of end-specific nucleic acid sequences directly from at least one end of DNA fragments inserted therein.

7. A method according to claim 6 wherein said cosmid vectors comprise at least one promoter specific for a bacteriophage RNA polymerase and a cloning site allowing for the insertion of DNA fragments, said promoter being positioned operatively for transcription of a DNA fragment inserted into said cloning site.

8. A method according to claim 7 wherein said cosmid vectors comprise two oppositely oriented promoters, each of which is specific for a bacteriophage RNA polymerase, positioned on two sides of said cloning site, operatively for transcription of a DNA fragment inserted into said cloning site.

9. A method according to claim 8 wherein each of said bacteriophage RNA polymerase-specific promoters is selected from the group consisting of promoters specific for bacteriophage T7 RNA polymerase, and promoters specific for bacteriophage T3 RNA polymerase.

10. A method according to claim 9 wherein said cosmid vector is selected from the group consisting of pWE8, pWE10, pWE15, and pWE16.

11. A method according to claim 6 wherein said cosmid vectors comprise at least two cos sites.

12. A method according to claim 11 wherein said cos sites are separated by unique restriction sites.

13. A method according to claim 12 wherein said cosmid vector is selected from the group consisting of sCOS-1, sCOS-2, sCOS-4, and derivatives thereof.

14. A method for sequencing complex genomes, said method comprising:

(1) preparing a genomic library of cosmid clones by inserting DNA fragments from said genome into cosmid vectors, wherein the cosmid vectors include sequences of nucleotides that flank at least one end of the inserted DNA, and that serve as transcription initiation sites for the synthesis of end-specific probes,
(2) arranging the cosmid clones, whereby each clone may be identified and replicas of said arrangement may be reproduced,
(3) pooling portions of said cosmid clones and synthesizing pools of mixed end-specific probes from the DNA inserts that have been prepared from said pooled clones, wherein each pool contains fewer than all of the cosmid clones in the library, but all of the cosmid clones in the library are included in at least one pool,
(4) hybridizing each pool of probes to a replica of said arranged cosmid clones and identifying the cosmid clones in each replica that hybridize to the probes, wherein said identified clones include the pooled cosmid clones and cosmid clones that contain DNA inserts that overlap with the DNA inserts in the pooled clones,
(5) identifying the cosmid clones from among those identified in step (4) that hybridize to two or more pools of probes, thereby identifying groups of cosmid clones that include overlapping DNA,
(6) assembling contigs from said groups, and
(7) sequencing the fragment ends of the DNA inserts of each of the overlapping cosmid clones.
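
Steps (5) and (6) of claim 14 can be sketched as a grouping problem: once pairs of overlapping clones are known, clones connected through any chain of overlaps belong to the same contig. The Python sketch below uses a union-find structure for this; the function name `assemble_contigs` and the clone names are hypothetical, and the technique is a generic illustration rather than the procedure used by the inventors. It groups clones but does not order them within a contig.

```python
def assemble_contigs(overlap_pairs):
    """Group clones into contigs given an iterable of (clone_a, clone_b) overlaps."""
    parent = {}

    def find(clone):
        # Register unseen clones as their own root, then walk up with path halving.
        parent.setdefault(clone, clone)
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in overlap_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for clone in list(parent):
        groups.setdefault(find(clone), set()).add(clone)
    # Sorted output makes the result deterministic.
    return sorted((sorted(members) for members in groups.values()), key=lambda g: g[0])

if __name__ == "__main__":
    pairs = [("c1", "c2"), ("c2", "c3"), ("c7", "c9")]
    print(assemble_contigs(pairs))
```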

15. A method according to claim 14 wherein cross-hybridizing clones are identified by comparing the data sets obtained from two groups of cosmid clones containing at least one common clone, and repeating the pairwise comparison with other groups of clones containing at least one common clone.

16. A method according to claim 14 wherein said cosmid clones are pooled according to the rows and columns of a two-dimensional matrix, and said mixed end-specific probes are hybridized to a replica of the entire matrix.
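
The row and column pooling of claim 16 can be illustrated with a short Python sketch. It assumes that a clone outside row i and column j which hybridizes to both the row-i probe pool and the column-j probe pool is a candidate for overlapping the clone at position (i, j). The function name `infer_overlaps` and the input layout are hypothetical. Overlaps between two clones that share a row or a column are not resolved by this simplified version.

```python
from itertools import product

def infer_overlaps(grid, row_hits, col_hits):
    """Map each pooled clone to the clones that may overlap it.

    grid     -- list of rows, each a list of clone names (rectangular matrix)
    row_hits -- row_hits[i] is the set of clones lighting up with the row-i pool
    col_hits -- col_hits[j] is the set of clones lighting up with the column-j pool
    """
    overlaps = {}
    for i, j in product(range(len(grid)), range(len(grid[0]))):
        row_members = set(grid[i])
        col_members = {row[j] for row in grid}
        # Pool members always hybridize to their own probes, so exclude them.
        candidates = (row_hits[i] & col_hits[j]) - row_members - col_members
        if candidates:
            overlaps[grid[i][j]] = candidates
    return overlaps

if __name__ == "__main__":
    # Hypothetical 2 x 2 matrix in which clone A overlaps clone D.
    grid = [["A", "B"], ["C", "D"]]
    row_hits = [{"A", "B", "D"}, {"C", "D", "A"}]
    col_hits = [{"A", "C", "D"}, {"B", "D", "A"}]
    print(infer_overlaps(grid, row_hits, col_hits))
```

For an n by n matrix this needs 2n probe pools instead of n squared individual probes, which is the practical appeal of pooling by rows and columns.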

17. A method according to claim 14 wherein said cosmid clones are pooled according to the planes intersecting with a three-dimensional matrix, and said mixed end-specific probes are hybridized to a replica of the entire matrix.
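
The plane-based pooling of claim 17 can be sketched in Python as follows. The sketch assumes the clones sit at integer coordinates of a rectangular three-dimensional matrix and that one pool is formed per axis-aligned plane; the function name `plane_pools` and the coordinate scheme are hypothetical. Each clone then falls into exactly three pools, which satisfies the requirement of claim 14 that every clone appear in at least one pool while no pool contains the whole library.

```python
from itertools import product

def plane_pools(dim_x, dim_y, dim_z):
    """Return {(axis, index): [coordinates]} with one pool per axis-aligned plane."""
    pools = {}
    for x in range(dim_x):
        pools[("x", x)] = [(x, y, z) for y, z in product(range(dim_y), range(dim_z))]
    for y in range(dim_y):
        pools[("y", y)] = [(x, y, z) for x, z in product(range(dim_x), range(dim_z))]
    for z in range(dim_z):
        pools[("z", z)] = [(x, y, z) for x, y in product(range(dim_x), range(dim_y))]
    return pools

if __name__ == "__main__":
    pools = plane_pools(2, 3, 4)
    # 2 + 3 + 4 planes cover the 24 matrix positions.
    print(len(pools), "pools for", 2 * 3 * 4, "clones")
```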

18. A method according to claim 14 wherein said cosmid clones are generated in cosmid vectors allowing for the synthesis of end-specific RNA sequences directly from at least one end of DNA fragments inserted therein.

19. A method according to claim 18 wherein said cosmid vectors comprise at least one promoter specific for a bacteriophage RNA polymerase and a cloning site allowing for the insertion of DNA fragments, said promoter being positioned operatively for transcription of a DNA fragment inserted into said cloning site.

20. A method according to claim 14 wherein said cosmid vectors comprise two oppositely oriented promoters, each of which is specific for a bacteriophage RNA polymerase, positioned on two sides of said cloning site, operatively for transcription of a DNA fragment inserted into said cloning site.

References Cited
U.S. Patent Documents
5219726 June 15, 1993 Evans
Other references
  • Bellanne-Chantelot, et al., "Mapping The Whole Human Genome By Fingerprinting Yeast Artificial Chromosomes," Cell, 70:1059-1068 (1992).
  • Bates, et al., "Double cos Site Vectors: Simplified Cosmid Cloning," Gene, 26:137-146 (1983).
  • Chumakov, et al., "Continuum Of Overlapping Clones Spanning The Entire Human Chromosome 21q," Nature, 359:380-387 (1992).
  • Coulson, et al., "Toward A Physical Map Of The Genome Of The Nematode Caenorhabditis elegans," Proc. Natl. Acad. Sci. (USA), 83:7821-7825 (1986).
  • Cox, et al., "Radiation Hybrid Mapping: A Somatic Cell Genetic Method For Constructing High-Resolution Maps Of Mammalian Chromosomes," Science, 250:245-250 (1990).
  • Daniels, et al., "Analysis Of The Escherichia coli Genome: DNA Sequence Of The Region From 84.5 to 86.5 Minutes," Science, 257:771-778 (1992).
  • Delattre, et al., "Mapping Of Human Chromosome 22 with A Panel Of Somatic Cell Hybrids," Genomics, 9:721-727 (1991).
  • Ehrich, et al., "A Family Of Cosmid Vectors With The Multi-Copy R6K Replication Origin," Gene, 57:229-237 (1987).
  • Evans, et al., "Physical Mapping Of Complex Genomes By Cosmid Multiplex Analysis," Proc. Natl. Acad. Sci. (USA), 86:5030-5034 (1989).
  • Foote, et al., "The Human Y Chromosome: Overlapping DNA Clones Spanning The Euchromatic Region," Science, 258:60-66 (1992).
  • Heding, et al., "The Generation Of Ordered Sets Of Cosmid DNA Clones From Human Chromosome Region 11p," Genomics, 13:89-94 (1992).
  • Hermanson, et al., "Cosmid Linking Clones Localized To The Long Arm Of Human Chromosome 11," Genomics, 13:134-143 (1992).
  • Hori, et al., "A High-Resolution Cytogenetic Map Of 168 Cosmid DNA Markers For Human Chromosome 11," Genomics, 13:129-133 (1992).
  • Kohara, et al., "The Physical Map Of The Whole E. coli Chromosome: Application Of A New Strategy For Rapid Analysis And Sorting Of A Large Genomic Library," Cell, 50:495-508 (1987).
  • Lander, et al., "Genomic Mapping By Fingerprinting Random Clones: A Mathematical Analysis," Genomics, 2:231-239 (1988).
  • Lichter, et al., "Rapid Detection Of Human Chromosome 21 Aberrations By in situ Hybridization," Proc. Natl. Acad. Sci. (USA), 85:9664-9668 (1988).
  • Martin-Gallardo, et al., "Automated DNA Sequencing And Analysis Of 106 Kilobases From Human Chromosome 19q13.3," Nat. Genet., 1:34-39 (1992).
  • Olson, et al., "Random-Clone Strategy For Genomic Restriction Mapping In Yeast," Proc. Natl. Acad. Sci. (USA), 83:7826-7830 (1986).
  • Oliver, et al., "The Complete DNA Sequence Of Yeast Chromosome III," Nature, 357:38-46 (1992).
  • Olson, et al., "A Common Language For Physical Mapping Of The Human Genome," Science, 245:1434-1435 (1989).
  • Poustka, et al., "Jumping Libraries And Linking Libraries: The Next Generation Of Molecular Tools in Mammalian Genetics," Trends Genetics, 2:174-179 (1986).
  • Saiki, et al., "Primer-Directed Enzymatic Amplification Of DNA With A Thermostable DNA Polymerase," Science, 239:487-491 (1988).
  • Sulston, et al., "The C. elegans Genome Sequencing Project: A Beginning," Nature, 356:37-41 (1992).
  • Tanigami, et al., "Mapping Of 262 DNA Markers Into 24 Intervals On Human Chromosome 11," Am. J. Hum. Genet., 50:56-64 (1992).
  • Wahl, et al., "Northern and Southern Blots," Methods in Enzymology, 152:572-581 (1987).
  • Wahl, et al., "Cosmid Vectors For Rapid Genomic Walking, Restriction Mapping, and Gene Transfer," Proc. Natl. Acad. Sci. (USA), 84:2160-2164 (1987).
  • Wilson, et al., "Nucleotide Sequence Analysis of 95 kb Near The 3' End Of The Murine T-Cell Receptor α/δ Chain Locus: Strategy And Methodology," Genomics, 13:1198-1208 (1992).
  • Voss, et al., Nucleic Acids Res., 18(4):1066 (1990).
  • Palca, Nature, 325:651 (1987).
  • Evans, et al., Gene, 79:9-20 (1989).
Patent History
Patent number: 5851760
Type: Grant
Filed: Sep 7, 1993
Date of Patent: Dec 22, 1998
Assignee: The Salk Institute for Biological Studies (La Jolla, CA)
Inventors: Glen A. Evans (San Marcos, CA), Michael W. Smith (San Diego, CA)
Primary Examiner: Kenneth R. Horlick
Law Firm: Stephen F. Reiter Gray Cary Ware & Freidenrich
Application Number: 8/117,952
Classifications
Current U.S. Class: 435/6; Using Fungi (435/911)
International Classification: C12Q 1/68; C12P 19/34