Patents by Inventor Yuan-Jyue Chen

Yuan-Jyue Chen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220282243
    Abstract: This disclosure provides techniques and systems for efficient random access to digital data encoded in oligonucleotides (e.g., DNA). Random access to DNA-encoded data is provided by amplification using polymerase chain reaction (PCR) and primer pairs that selectively amplify only the oligonucleotides encoding a desired set of digital data. Multiple separate random-access requests are prepared for multiplex DNA sequencing by generating copy-normalized amplification products. Copy-normalized amplification products are efficiently created by performing multiple singleplex PCR reactions in parallel and measuring the quantity of oligonucleotides in each reaction. The PCR reactions are performed in parallel through the use of multiple isolated reaction volumes such as water-in-oil microdroplets or individual wells on a plate.
    Type: Application
    Filed: May 20, 2022
    Publication date: September 8, 2022
    Inventors: Yuan-Jyue CHEN, Bichlien NGUYEN, Karin STRAUSS
  • Patent number: 11365411
    Abstract: This disclosure provides techniques and systems for efficient random access to digital data encoded in oligonucleotides (e.g., DNA). Random access to DNA-encoded data is provided by amplification using polymerase chain reaction (PCR) and primer pairs that selectively amplify only the oligonucleotides encoding a desired set of digital data. Multiple separate random-access requests are prepared for multiplex DNA sequencing by generating copy-normalized amplification products. Copy-normalized amplification products are efficiently created by performing multiple singleplex PCR reactions in parallel and measuring the quantity of oligonucleotides in each reaction. The PCR reactions are performed in parallel through the use of multiple isolated reaction volumes such as water-in-oil microdroplets or individual wells on a plate.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: June 21, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yuan-Jyue Chen, Bichlien Nguyen, Karin Strauss
  • Publication number: 20220179891
    Abstract: In some embodiments, techniques are provided for conducting similarity-based searches using DNA. In some embodiments, sets of features that represent stored data sets are encoded in DNA sequences such that a hybridization yield between a molecule having a given stored DNA sequence and a molecule having a reverse complement of a DNA sequence that encodes a set of features that represent a query data set reflects an amount of similarity between the set of features that represent the query data set and the set of features encoded in the given stored DNA sequence. In some embodiments, machine learning techniques are used to determine the DNA sequence encoding. In some embodiments, machine learning techniques are used to predict hybridization yields between DNA molecules.
    Type: Application
    Filed: April 9, 2020
    Publication date: June 9, 2022
    Applicants: University of Washington, Microsoft Technology Licensing, LLC
    Inventors: Luis Ceze, Karin Strauss, Georg Seelig, Callie Bee, Yuan-Jyue Chen
  • Publication number: 20210332412
    Abstract: This disclosure describes a technique for performing random access in a pool of polynucleotides by using one unique primer and one homopolymer primer to selectively amplify some but not all of the polynucleotides in the pool. The polynucleotides are synthesized by a template independent polymerase such as terminal deoxynucleotide transferase (TdT) rather than by phosphoramidite synthesis. Enzymatic synthesis efficiently creates homopolymer sequences through unregulated synthesis. Use of one homopolymer primer instead of two unique primers decreases the complexity, time, and cost of synthesizing the polynucleotides. Use of a unique primer provides a sequence that can be varied to uniquely identify multiple different groups of polynucleotides. This enables random access by polymerase chain reaction (PCR) amplification while still benefitting from the efficiency of homopolymer synthesis. The polynucleotides may include payload regions that use a sequence of nucleotides to encode digital data.
    Type: Application
    Filed: April 24, 2020
    Publication date: October 28, 2021
    Inventors: Yuan-Jyue CHEN, Bichlien NGUYEN
  • Publication number: 20210222160
    Abstract: This disclosure provides techniques and systems for efficient random access to digital data encoded in oligonucleotides (e.g., DNA). Random access to DNA-encoded data is provided by amplification using polymerase chain reaction (PCR) and primer pairs that selectively amplify only the oligonucleotides encoding a desired set of digital data. Multiple separate random-access requests are prepared for multiplex DNA sequencing by generating copy-normalized amplification products. Copy-normalized amplification products are efficiently created by performing multiple singleplex PCR reactions in parallel and measuring the quantity of oligonucleotides in each reaction. The PCR reactions are performed in parallel through the use of multiple isolated reaction volumes such as water-in-oil microdroplets or individual wells on a plate.
    Type: Application
    Filed: January 21, 2020
    Publication date: July 22, 2021
    Inventors: Yuan-Jyue Chen, Bichlien Nguyen, Karin Strauss
  • Publication number: 20210205775
    Abstract: Substrates for solid-phase synthesis are reused by freeing synthesized polymers without removing the linkers that hold the polymers to the substrate. The linkers may be made of oligonucleotides or polypeptides. In an implementation, the polymers are released by cleavage of the linkers and then the truncated linkers are regenerated by adding back the portion that was removed. In an implementation, molecular bonds between the linkers and the polymers are cleaved releasing the polymers while leaving the linkers available for reuse without regeneration. In an implementation, single-stranded oligonucleotide linkers are hybridized to complementary strands that hold the polymers to the substrate with double-stranded oligonucleotide complexes. The double-stranded oligonucleotide complexes are denatured releasing the polymers while leaving the original linkers attached to the substrate. The polymers that are synthesized with these techniques may be the same or different type of molecules than the linkers.
    Type: Application
    Filed: January 6, 2020
    Publication date: July 8, 2021
    Inventors: Bichlien NGUYEN, Yuan-Jyue CHEN, Jake SMITH
  • Publication number: 20210155923
    Abstract: Electrically controlled hybridization is used to selectively assemble oligonucleotides on the surface of a microelectrode array. Controlled activation of individual electrodes in the microelectrode array attracts oligonucleotides in solution to specific regions of the array where they hybridize to other oligonucleotides anchored on the array. The oligonucleotides that hybridize may provide locations for subsequent oligonucleotides to hybridize. The active electrodes and the oligonucleotides in solution may be varied during each round of synthesis. This allows for multiple oligonucleotides each with different and specific sequences to be created in parallel. This is accomplished without the use of phosphoramidite chemical synthesis or template-independent DNA polymerase enzymatic synthesis. Oligonucleotides created with these techniques may be used to encode digital data. Fully assembled oligonucleotides may be separated from the array and sequenced, stored, or otherwise processed.
    Type: Application
    Filed: November 27, 2019
    Publication date: May 27, 2021
    Inventors: Yuan-Jyue CHEN, Bichlien NGUYEN, Jake SMITH, Karin STRAUSS
  • Patent number: 10930370
    Abstract: Artificial polynucleotides may have different characteristics than natural polynucleotides so conventional base-calling algorithms may make incorrect base calls. However, because artificial polynucleotides are typically designed to have certain characteristics, the known characteristics of the artificial polynucleotide can be used to modify the base-calling algorithm. This disclosure describes polynucleotide sequencers adapted to sequence artificial polynucleotides by modifying a base-calling algorithm of the polynucleotide sequencer according to known characteristics of the artificial polynucleotides. The base-calling algorithm analyzes raw data generated by a polynucleotide sequencer and identifies which nucleotide base occupies a given position on a polynucleotide strand.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: February 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Karin Strauss, Siena Dumas Ang, Luis Ceze, Yuan-Jyue Chen, Hsing-Yeh Parker, Bichlien Nguyen, Robert Carlson
  • Publication number: 20210050073
    Abstract: Multiplex similarity search can be performed in a DNA data storage context. The described technologies can support a plurality of different DNA data storage queries in a single query run. A linking strand can be used to connect a query to its matching data element. After the query finds a matching data element, a result strand can be sequenced to reveal the matching data element as well as which of the queries resulted in the match. Thus, in a multiplex similarity search scenario, a plurality of result strands from a single query run can be correlated to a plurality of different queries. Also, the result strand can be of significantly longer length than both the unmatched data strands and the unmatched query strands. Therefore, filtering based on length can provide more accurate results.
    Type: Application
    Filed: August 14, 2019
    Publication date: February 18, 2021
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yuan-Jyue Chen, Karin Strauss
  • Patent number: 10793897
    Abstract: This disclosure describes techniques to improve the accuracy of random access of data stored in polynucleotide sequence data storage systems. Primers used in polynucleotide sequence replication and amplification can be scored against a number of criteria that indicate the fitness of sequences of nucleotides to function as primers. Primers having scores that indicate a particular fitness to function as primers can be added to a specific group of primers. The primers from the group of primers can be used in amplification and replication of polynucleotide sequences that encode digital data. Additionally, an amount of overlap between primer targets and payloads encoding digital data can be determined. Minimizing the amount of overlap between primer targets and payloads can improve the efficiency of polynucleotide replication and amplification. The bits of the digital data can be randomized to minimize the amount of overlap between payloads encoding the digital data and primer targets.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: October 6, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuan-Jyue Chen, Luis H. Ceze, Sergey Yekhanin, Siena Dumas Ang, Karin Strauss
  • Patent number: 10787699
    Abstract: This disclosure describes techniques to improve the accuracy of random access of data stored in polynucleotide sequence data storage systems. Primers used in polynucleotide sequence replication and amplification can be scored against a number of criteria that indicate the fitness of sequences of nucleotides to function as primers. Primers having scores that indicate a particular fitness to function as primers can be added to a specific group of primers. The primers from the group of primers can be used in amplification and replication of polynucleotide sequences that encode digital data. Additionally, an amount of overlap between primer targets and payloads encoding digital data can be determined. Minimizing the amount of overlap between primer targets and payloads can improve the efficiency of polynucleotide replication and amplification. The bits of the digital data can be randomized to minimize the amount of overlap between payloads encoding the digital data and primer targets.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: September 29, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuan-Jyue Chen, Karin Strauss, Luis H. Ceze, Siena Dumas Ang, Sergey Yekhanin
  • Patent number: 10774379
    Abstract: This disclosure describes frameworks and techniques related to the random access of digital data encoded by polynucleotides. Digital data of a data file can be encoded as a series of nucleotides and one or more polynucleotide sequences can be generated that encode the digital data for the data file. The bits of the digital data can be segmented to produce multiple polynucleotide sequences that encode the bits of the digital data with each polynucleotide sequence encoding an individual segment of the digital data. The individual segments can be grouped together and associated with a group identifier. Each data file can be associated with a number of group identifiers and the number of segments in each group can be within a specified range. Primers corresponding to the group identifiers can be used to selectively access the polynucleotides that encode the digital data of a data file.
    Type: Grant
    Filed: March 15, 2017
    Date of Patent: September 15, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuan-Jyue Chen, Karin Strauss, Luis H. Ceze, Lee Organick
  • Patent number: 10689684
    Abstract: This disclosure describes techniques to improve the sequencing of polynucleotides by decreasing the likelihood of errors occurring during a sequencing calibration process. In implementations, regions of polynucleotides that are used for the calibration process can be modified to reduce a number of polynucleotides that have a same nucleotide at one or more positions of the calibration regions. In some cases, the calibration regions can be modified by adding a sequence to the polynucleotides that replaces the original calibration regions. Also, the calibration regions can be modified by rearranging the nucleotides at the different positions of the calibration regions. Additionally, the calibration regions can be modified by adding sequences of varying length to the polynucleotides being sequenced to produce polynucleotides having varying length with different calibration regions.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: June 23, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuan-Jyue Chen, Karin Strauss, Luis H. Ceze, Lee Organick, Randolph Lopez, Georg Seelig
  • Publication number: 20200004926
    Abstract: This disclosure describes an efficient method to copy all polynucleotides encoding digital data of digital files in a polynucleotide storage container while maintaining random access capabilities over a collection of files or data items in the container. The disclosure further describes a process whereby random-access and sequencing of the polynucleotides are combined in a single step.
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Inventors: Karin Strauss, Yuan-Jyue Chen
  • Publication number: 20190376120
    Abstract: Techniques for random access of particular DNA strands from a mixture of DNA strands are described. DNA strands that encode pieces of the same digital file are labeled with the same identification sequence. The identification sequence is used to selectively separate DNA strands that contain portions of the same digital file from other DNA strands. A DNA staple positions DNA strands with the identification sequence adjacent to sequencing adaptors. DNA ligase joins the molecules to create a longer molecule with the region encoding the digital file flanked by sequencing adaptors. DNA strands that include sequencing adaptors are sequenced and the sequence data is available for further analysis. DNA strands without the identification sequence are not joined to sequencing adaptors, and thus, are not sequenced. As a result, the sequencing data produced by the DNA sequencer comes from those DNA strands that included the identification sequence.
    Type: Application
    Filed: October 30, 2017
    Publication date: December 12, 2019
    Inventors: Karin STRAUSS, Yuan-Jyue CHEN
  • Publication number: 20190358604
    Abstract: A system includes a synthesizer unit having a fluid input to receive fluids and a communication input to receive commands to synthesize data-encoded DNA sequences and cleave the DNA. A first flexible chemistry reaction chamber module may be fluidically coupled to the synthesizer unit to receive the data-encoded DNA sequences and amplify the sequences. A deposition unit may be fluidically coupled to the first flexible chemistry reaction chamber module to receive the amplified DNA sequences and encapsulate the amplified DNA sequences into one or more wells in a storage plate for storage and retrieval to and from a plate storage unit. Retrieved DNA may be processed and read by further units.
    Type: Application
    Filed: May 22, 2018
    Publication date: November 28, 2019
    Inventors: Bichlien H. Nguyen, Douglas P. Kelley, Karin Strauss, Robert Carlson, Hsing-Yeh Parker, John Mulligan, Luis H. Ceze, Yuan-Jyue Chen, Douglas Carmean
  • Publication number: 20180265921
    Abstract: This disclosure describes frameworks and techniques related to the random access of digital data encoded by polynucleotides. Digital data of a data file can be encoded as a series of nucleotides and one or more polynucleotide sequences can be generated that encode the digital data for the data file. The bits of the digital data can be segmented to produce multiple polynucleotide sequences that encode the bits of the digital data with each polynucleotide sequence encoding an individual segment of the digital data. The individual segments can be grouped together and associated with a group identifier. Each data file can be associated with a number of group identifiers and the number of segments in each group can be within a specified range. Primers corresponding to the group identifiers can be used to selectively access the polynucleotides that encode the digital data of a data file.
    Type: Application
    Filed: March 15, 2017
    Publication date: September 20, 2018
    Inventors: Yuan-Jyue Chen, Karin Strauss, Luis H. Ceze, Lee Organick
  • Publication number: 20180253528
    Abstract: Artificial polynucleotides may have different characteristics than natural polynucleotides so conventional base-calling algorithms may make incorrect base calls. However, because artificial polynucleotides are typically designed to have certain characteristics, the known characteristics of the artificial polynucleotide can be used to modify the base-calling algorithm. This disclosure describes polynucleotide sequencers adapted to sequence artificial polynucleotides by modifying a base-calling algorithm of the polynucleotide sequencer according to known characteristics of the artificial polynucleotides. The base-calling algorithm analyzes raw data generated by a polynucleotide sequencer and identifies which nucleotide base occupies a given position on a polynucleotide strand.
    Type: Application
    Filed: May 26, 2017
    Publication date: September 6, 2018
    Inventors: Karin Strauss, Siena Dumas Ang, Luis Ceze, Yuan-Jyue Chen, Hsing-Yeh Parker, Bichlien Nguyen, Robert Carlson
  • Publication number: 20180230509
    Abstract: This disclosure describes techniques to improve the sequencing of polynucleotides by decreasing the likelihood of errors occurring during a sequencing calibration process. In implementations, regions of polynucleotides that are used for the calibration process can be modified to reduce a number of polynucleotides that have a same nucleotide at one or more positions of the calibration regions. In some cases, the calibration regions can be modified by adding a sequence to the polynucleotides that replaces the original calibration regions. Also, the calibration regions can be modified by rearranging the nucleotides at the different positions of the calibration regions. Additionally, the calibration regions can be modified by adding sequences of varying length to the polynucleotides being sequenced to produce polynucleotides having varying length with different calibration regions.
    Type: Application
    Filed: February 14, 2017
    Publication date: August 16, 2018
    Inventors: Yuan-Jyue Chen, Karin Strauss, Luis H. Ceze, Lee Organick, Randolph Lopez, Georg Seelig
  • Publication number: 20180223340
    Abstract: This disclosure describes techniques to improve the accuracy of random access of data stored in polynucleotide sequence data storage systems. Primers used in polynucleotide sequence replication and amplification can be scored against a number of criteria that indicate the fitness of sequences of nucleotides to function as primers. Primers having scores that indicate a particular fitness to function as primers can be added to a specific group of primers. The primers from the group of primers can be used in amplification and replication of polynucleotide sequences that encode digital data. Additionally, an amount of overlap between primer targets and payloads encoding digital data can be determined. Minimizing the amount of overlap between primer targets and payloads can improve the efficiency of polynucleotide replication and amplification. The bits of the digital data can be randomized to minimize the amount of overlap between payloads encoding the digital data and primer targets.
    Type: Application
    Filed: February 8, 2017
    Publication date: August 9, 2018
    Inventors: Yuan-Jyue Chen, Luis H. Ceze, Sergey Yekhanin, Siena Dumas Ang, Karin Strauss
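
Several of the abstracts above describe scoring candidate primers against fitness criteria and keeping only those that score well enough. The abstracts do not list the criteria, so the following is only a minimal sketch under stated assumptions: the three checks (GC content between 45% and 55%, no homopolymer run longer than 3 bases, a G or C at the 3' end), the thresholds, and the function names are all hypothetical illustrations, not the method claimed in any of the patents.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (seq must be non-empty)."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base (seq must be non-empty)."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def score_primer(seq: str) -> int:
    """Count how many of the three assumed criteria a candidate satisfies."""
    score = 0
    if 0.45 <= gc_fraction(seq) <= 0.55:  # assumed GC-content window
        score += 1
    if longest_homopolymer(seq) <= 3:     # assumed homopolymer limit
        score += 1
    if seq[-1] in "GC":                   # assumed 3'-end preference
        score += 1
    return score


def select_primers(candidates, min_score=3):
    """Keep candidates whose score meets the (assumed) minimum, in input order."""
    return [seq for seq in candidates if score_primer(seq) >= min_score]


if __name__ == "__main__":
    candidates = [
        "ACGTACGTACGTACGTACTG",  # balanced GC, no long runs, ends in G
        "AAAAACGTACGTACGTACGT",  # low GC, run of five A's, ends in T
    ]
    for seq in candidates:
        print(seq, score_primer(seq))
    print(select_primers(candidates))
```

A real primer library would likely add further checks, such as melting temperature, secondary structure, and similarity to payload sequences; those are omitted here to keep the sketch short.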