Patents by Inventor Daniel Gillick

Daniel Gillick has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10579733
    Abstract: A method for identifying codemixed text includes receiving codemixed text and segmenting the codemixed text into a plurality of tokens. Each token includes at least one character and is delineated from any adjacent tokens by a space. For each token of the codemixed text, the method also includes extracting features from the token and predicting a probability distribution over possible languages for the token using a language identifier model configured to receive the extracted features from the token as feature inputs. The method also includes assigning a language to each token of the codemixed text by executing a greedy search on the probability distribution over the possible languages predicted for each respective token.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: March 3, 2020
    Assignee: Google LLC
    Inventors: Jason Riesa, Daniel Gillick, Yuan Zhang, Anton Bakalov, Jason Baldridge, David Weiss
  • Publication number: 20190347323
    Abstract: A method for identifying codemixed text includes receiving codemixed text and segmenting the codemixed text into a plurality of tokens. Each token includes at least one character and is delineated from any adjacent tokens by a space. For each token of the codemixed text, the method also includes extracting features from the token and predicting a probability distribution over possible languages for the token using a language identifier model configured to receive the extracted features from the token as feature inputs. The method also includes assigning a language to each token of the codemixed text by executing a greedy search on the probability distribution over the possible languages predicted for each respective token.
    Type: Application
    Filed: May 10, 2018
    Publication date: November 14, 2019
    Applicant: Google LLC
    Inventors: Jason Riesa, Daniel Gillick, Yuan Zhang, Anton Bakalov, Jason Baldridge, David Weiss
  • Patent number: 9619457
    Abstract: A computer-implemented technique can include obtaining a training corpus including pairs of (i) documents and (ii) corresponding abstracts. The technique can include identifying a set of entity mentions in each abstract and each corresponding document based on their respective part-of-speech (POS) tags and dependency parses. The technique can include clustering the sets of entity mentions referring to a same underlying entity to obtain clusters for each document and each corresponding abstract. The technique can include aligning specific abstract entity mentions to corresponding document entity mentions to obtain a set of aligned abstract and document entities. The technique can include labeling the set of aligned entities as salient and unaligned entities as non-salient to generate a labeled corpus. The technique can also include training features of a classifier using the labeled corpus to obtain a trained classifier.
    Type: Grant
    Filed: July 16, 2014
    Date of Patent: April 11, 2017
    Assignee: GOOGLE INC.
    Inventors: Daniel Gillick, Amarnag Subramanya
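
The per-token pipeline described in the codemixed-text abstracts (segment on spaces, extract features, predict a distribution over languages, assign a language greedily) can be illustrated with a minimal sketch. It is not the patented implementation. The character-bigram features, the tiny word lists, the add-one smoothing, and every function name are hypothetical stand-ins for a trained language identifier model, and "greedy search" is read here as a per-token argmax, which is an assumption about the abstract's wording.

```python
from collections import Counter

# Hypothetical toy word lists standing in for a trained model's training data.
TOY_WORDS = {
    "en": ["the", "movie", "was", "good"],
    "es": ["muy", "bueno", "pero", "la"],
}


def segment(text):
    """Split the text into tokens delineated by whitespace."""
    return text.split()


def extract_features(token, n=2):
    """Character n-grams with boundary markers (an assumed feature set)."""
    padded = f"^{token.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


# Per-language feature counts: a toy substitute for learned model weights.
FEATURE_COUNTS = {
    lang: Counter(f for word in words for f in extract_features(word))
    for lang, words in TOY_WORDS.items()
}


def predict_distribution(features):
    """Return a probability distribution over languages for one token."""
    # Add-one smoothing keeps every language's probability above zero.
    scores = {
        lang: 1 + sum(counts[f] for f in features)
        for lang, counts in FEATURE_COUNTS.items()
    }
    total = sum(scores.values())
    return {lang: score / total for lang, score in scores.items()}


def greedy_assign(distributions):
    """Pick the most probable language for each token independently."""
    return [max(dist, key=dist.get) for dist in distributions]


def identify(text):
    """Run the full pipeline and pair each token with its assigned language."""
    tokens = segment(text)
    distributions = [predict_distribution(extract_features(t)) for t in tokens]
    return list(zip(tokens, greedy_assign(distributions)))


if __name__ == "__main__":
    for token, lang in identify("the movie was muy bueno"):
        print(f"{token}\t{lang}")
```

Because each token is scored on its own, a sentence that switches languages mid-stream receives a separate label per word rather than one label for the whole input.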