Patents by Inventor Rishita Rajal Anubhai

Rishita Rajal Anubhai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11741168
    Abstract: Techniques for multi-label document classification are described. Clustering is used to cluster labels in a set. A machine learning model including a multi-label classifier for each cluster is created, the multi-label classifier for a given cluster to classify a document with one or more of the labels in the cluster.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Sravan Babu Bodapati, Rishita Rajal Anubhai, Yahor Pushkin
  • Patent number: 11734937
    Abstract: Techniques for creating a text classifier machine learning (ML) model are described. According to some embodiments, a language processing service finetunes a language ML model on unlabeled documents of a user, and then trains that finetuned language ML model on labeled documents of the user to be a text classifier that is customized for that user’s domain, e.g., the user’s documents. Additionally, the finetuned language ML model may be trained on labeled documents of the user, for prediction objectives for unlabeled data, before being trained as the text classifier.
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: August 22, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Yahor Pushkin, Sravan Babu Bodapati, Rishita Rajal Anubhai, Dimitrios Soulios, Yaser Al-Onaizan
  • Patent number: 11657307
    Abstract: Techniques for data lake-based text generation and data augmentation for machine learning training are described. A user-provided dataset including documents and corresponding label information can be automatically supplemented by creating additional high-quality document samples, with labels, via a large repository of documents in a data lake. Documents from the data lake may be identified as being semantically similar to the user-provided documents but different enough to allow a resulting model to learn from the variation in these documents. New documents can be generated from user-provided document samples or data lake sample documents by identifying and replacing slots within the samples and rewriting adjunct tokens.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: May 23, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Sravan Babu Bodapati, Rishita Rajal Anubhai, Georgiana Dinu, Yaser Al-Onaizan
  • Patent number: 11531846
    Abstract: Techniques for extending sensitive data tagging without reannotating training data are described. A method for extending sensitive data tagging without reannotating training data may include hosting a plurality of models at a model endpoint in a machine learning service, each model trained to identify a different sensitive data type in a transcript of content, adding a new model to the model endpoint, the new model trained to identify a new sensitive data entity in the transcript of content, identifying sensitive entities in the transcript by each of the plurality of models and the new model, merging inference responses generated by each of the plurality of models and the new model using at least one inference policy, and returning a merged inference response identifying a plurality of sensitive entities in the transcript.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: December 20, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Sravan Babu Bodapati, Rishita Rajal Anubhai, Pu Paul Zhao, Katrin Kirchhoff
  • Publication number: 20220100967
    Abstract: Methods, systems, and computer-readable media for lifecycle management for customized natural language processing are disclosed. A natural language processing (NLP) customization service determines a task definition associated with an NLP model based (at least in part) on user input. The task definition comprises an indication of one or more tasks to be implemented using the NLP model and one or more requirements associated with use of the NLP model. The service determines the NLP model based (at least in part) on the task definition. The service trains the NLP model. The NLP model is used to perform inference for a plurality of input documents. The inference outputs a plurality of predictions based (at least in part) on the input documents. Inference data is collected based (at least in part) on the inference. The service generates a retrained NLP model based (at least in part) on the inference data.
    Type: Application
    Filed: September 30, 2020
    Publication date: March 31, 2022
    Applicant: Amazon Technologies, Inc.
    Inventors: Yahor Pushkin, Rishita Rajal Anubhai, Sameer Karnik, Sunil Mallya Kasaragod, Abhinav Goyal, Yaser Al-Onaizan, Ashish Singh, Ashish Khare
  • Publication number: 20220100963
    Abstract: Methods, systems, and computer-readable media for event extraction from documents with co-reference are disclosed. An event extraction service identifies one or more trigger groups in a document comprising text. An individual one of the trigger groups comprises one or more textual references to an occurrence of an event. The one or more trigger groups are associated with one or more semantic roles for entities. The event extraction service identifies one or more entity groups in the document. An individual one of the entity groups comprises one or more textual references to a real-world object. The event extraction service assigns one or more of the entity groups to one or more of the semantic roles. The event extraction service generates an output indicating the one or more trigger groups and one or more entity groups assigned to the semantic roles.
    Type: Application
    Filed: September 30, 2020
    Publication date: March 31, 2022
    Applicant: Amazon Technologies, Inc.
    Inventors: Rishita Rajal Anubhai, Yahor Pushkin, Graham Vintcent Horwood, Yinxiao Zhang, Ravindra Manjunatha, Jie Ma, Alessandra Brusadin, Jonathan Steuck, Shuai Wang, Sameer Karnik, Miguel Ballesteros Martinez, Sunil Mallya Kasaragod, Yaser Al-Onaizan
  • Patent number: 11227009
    Abstract: Techniques are described for a de-obfuscation framework that utilizes image recognition of text. A word input by a user is received by the de-obfuscation service. Visual feature data associated with an image corresponding to each character of the word is generated. Word embeddings are generated using the visual feature data and each character of the word using a character encoder layer. Feature vectors are generated from the word embedding by combining the generated word embeddings and a provided word embedding using a second neural network. The generated feature vector is classified. Potential text obfuscation is detected from the classified generated feature vector using a lexicon to determine de-obfuscated text closest to the user text.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: January 18, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Rishita Rajal Anubhai, Sravan Babu Bodapati
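
The final step in the abstract for patent 11227009, using a lexicon to find the de-obfuscated text closest to the user's text, can be loosely illustrated with a small Python sketch. This is not the patented method: the abstract describes visual features and neural networks, whereas the sketch swaps in a hand-written look-alike character table and plain string similarity from the standard library. The names `LOOKALIKES` and `deobfuscate`, the substitution table, and the 0.8 cutoff are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical look-alike substitutions chosen for this sketch only.
# The abstract describes deriving visual features from character images instead.
LOOKALIKES = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a",
    "5": "s", "@": "a", "$": "s", "!": "i",
})


def deobfuscate(word, lexicon, cutoff=0.8):
    """Return the lexicon entry closest to `word`, or None if nothing is close.

    Closeness here is difflib's sequence similarity ratio, an assumed stand-in
    for the classifier and lexicon lookup mentioned in the abstract.
    """
    # Lowercase the input and map look-alike characters to plain letters.
    normalized = word.lower().translate(LOOKALIKES)
    # Keep only the single best lexicon match at or above the cutoff.
    matches = get_close_matches(normalized, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    lexicon = ["free", "money", "offer"]
    for candidate in ["fr33", "m0n3y", "xyz"]:
        print(candidate, "->", deobfuscate(candidate, lexicon))
```

A fixed substitution table only catches obfuscations someone has already listed, which is presumably why the abstract relies on visual similarity between characters rather than an enumerated mapping.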