Patents by Inventor Gioacchino Tangari

Gioacchino Tangari has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240232187
    Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
    Type: Application
    Filed: May 22, 2023
    Publication date: July 11, 2024
    Applicant: Oracle International Corporation
    Inventors: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
  • Publication number: 20240169161
    Abstract: Obtaining collections of sentences in different languages that are usable for training models in various applications of artificial intelligence is provided. A method is provided that obtains, from a text corpus, webpages in a plurality of languages, each of the webpages corresponding to a URL; obtains annotations for each of the webpages based on its URL, to obtain annotated data entries corresponding to the webpages, each of the annotated data entries including a classification label corresponding to a sub-topic of one of a plurality of topics, where each of the plurality of topics includes a corresponding plurality of sub-topics; filters the annotated data entries to obtain topic-specific content in a target language based on the classification labels, the topic-specific content corresponding to one or more sub-topics; performs post-processing on the topic-specific content to obtain result data; and outputs the result data for the topic.
    Type: Application
    Filed: August 21, 2023
    Publication date: May 23, 2024
    Applicant: Oracle International Corporation
    Inventors: Paria Jamshid Lou, Gioacchino Tangari, Jason Black, Bhagya Gayathri Hettige, Xu Zhong, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240134850
    Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
    Type: Application
    Filed: May 21, 2023
    Publication date: April 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
  • Publication number: 20240062011
    Abstract: Techniques are disclosed herein for using named entity recognition to resolve entity expression while transforming natural language to a meaning representation language. In one aspect, a method includes accessing natural language text; predicting, by a first machine learning model, a class label for a token in the natural language text; predicting, by a second machine learning model, operators for a meaning representation language and a value or value span for each attribute of the operators; in response to determining that the value or value span for a particular attribute matches the class label, converting a portion of the natural language text for the value or value span into a resolved format; and outputting syntax for the meaning representation language. The syntax comprises the operators with the portion of the natural language text for the value or value span in the resolved format.
    Type: Application
    Filed: July 13, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Aashna Devang Kanuga, Cong Duy Vu Hoang, Mark Edward Johnson, Vasisht Raghavendra, Yuanxu Wu, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Shubham Pawankumar Shah, Vanshika Sridharan, Zikai Li, Diego Andres Cornejo Barra, Stephen Andrew McRitchie, Christopher Mark Broadbent, Vishal Vishnoi, Srinivasa Phani Kumar Gadde, Poorya Zaremoodi, Thanh Long Duong, Bhagya Gayathri Hettige, Tuyen Quang Pham, Arash Shamaei, Thanh Tien Vu, Yakupitiyage Don Thanuja Samodhye Dharmasiri
  • Publication number: 20240062044
    Abstract: Techniques are disclosed herein for addressing catastrophic forgetting and over-generalization while training a model to transform natural language to a logical form such as a meaning representation language. The techniques include accessing training data comprising natural language examples, augmenting the training data to generate expanded training data, training a machine learning model on the expanded training data, and providing the trained machine learning model. The augmenting includes (i) generating contrastive examples by revising natural language of examples identified to have caused regression during training of a machine learning model with the training data, (ii) generating alternative examples by modifying operators of examples identified within the training data that belong to a concept that exhibits bias, or (iii) a combination of (i) and (ii).
    Type: Application
    Filed: August 18, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Shivashankar Subramanian, Dalu Guo, Gioacchino Tangari, Nitika Mathur, Cong Duy Vu Hoang, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240061833
    Abstract: Techniques are disclosed for augmenting training data for training a machine learning model to generate database queries. Training data comprising a first training example comprising a first natural language utterance, a logical form for the first natural language utterance, and associated first metadata is obtained. From the first training example, a template utterance is generated. A second natural language utterance is generated by filling slots in the template utterance based on a database schema and database values. Updated metadata is produced based on the first metadata and the second natural language utterance. A second training example is generated, comprising the second natural language utterance, the logical form for the first natural language utterance, and the updated metadata. The training data is augmented by adding the second training example. A machine learning model is trained to generate a database query comprising the database operation using the augmented training data set.
    Type: Application
    Filed: July 5, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Nitika Mathur, Philip Arthur, Cong Duy Vu Hoang, Aashna Devang Kanuga, Steve Wai-Chun Siu, Syed Najam Abbas Zaidi, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240061835
    Abstract: Systems and methods fine-tune a pretrained machine learning model. For a model having multiple layers, an initial set of configurations is identified, each configuration establishing layers to be frozen and layers to be fine-tuned. A configuration that is optimized with respect to one or more parameters is selected, establishing a set of fine-tuning layers and a set of frozen layers. An input for the model is provided to a remote system. An output of the set of frozen layers of the model, given the provided input, is received back and locally stored. The set of fine-tuning layers of the model is loaded from the remote system. The model is fine-tuned by retrieving the locally stored output of the set of frozen layers, and updating weights associated with the set of fine-tuning layers of the machine learning model.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Shivashankar Subramanian, Gioacchino Tangari, Thanh Tien Vu, Cong Duy Vu Hoang, Poorya Zaremoodi, Dalu Guo, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240061834
    Abstract: Systems and methods identify whether an input utterance is suitable for providing to a machine learning model configured to generate a query for a database. Techniques include generating an input string by concatenating a natural language utterance with a database schema representation for a database; providing the input string to a first machine learning model; based on the input string, generating, by the first machine learning model, a score indicating whether the natural language utterance is translatable to a database query for the database and should be routed to a second machine learning model, the second machine learning model configured to generate a query for the database based on the natural language utterance; comparing the score to a threshold value; and responsive to determining that the score exceeds the threshold value, providing the natural language utterance or the input string to the second machine learning model.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Poorya Zaremoodi, Philip Arthur, Nitika Mathur, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240062108
    Abstract: Techniques are disclosed herein for training and deploying a named entity recognition model. The techniques include implementing a nested labeling scheme for named entities within the training data and then training a machine learning model on the training data. The techniques further include extracting an entity hierarchy for a predicted class based on a hierarchical template associated with a composite label, where the predicted class is representative of multiple named entity classes comprising at least a parent class and a child class associated with the composite label. The techniques further include increasing the volume of training data via data mining for sequence tags in a language corpus and then training a machine learning model on the training data.
    Type: Application
    Filed: May 25, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Tuyen Quang Pham, Bhagya Hettige, Gioacchino Tangari, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Thanh Long Duong
  • Publication number: 20240061832
    Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.
    Type: Application
    Filed: June 14, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Cong Duy Vu Hoang, Stephen Andrew McRitchie, Mark Edward Johnson, Shivashankar Subramanian, Aashna Devang Kanuga, Nitika Mathur, Gioacchino Tangari, Steve Wai-Chun Siu, Poorya Zaremoodi, Vasisht Raghavendra, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Mark Broadbent, Philip Arthur, Syed Najam Abbas Zaidi
  • Publication number: 20240062021
    Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
    Type: Application
    Filed: February 9, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Mark Edward Johnson, Poorya Zaremoodi, Nitika Mathur, Aashna Devang Kanuga, Thanh Long Duong
  • Publication number: 20230186026
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
  • Publication number: 20230186025
    Abstract: Techniques are disclosed for preprocessing data assets to be used in a natural language to logical form model based on scalable search and content-based schema linking. In one particular aspect, a method includes accessing an utterance, classifying named entities within the utterance into predefined classes, searching value lists within a database schema using tokens from the utterance to identify and output value matches including: (i) any value within the value lists that matches a token from the utterance and (ii) any attribute associated with a matching value, generating a data structure by organizing and storing: (i) each of the named entities and an assigned class for each of the named entities, (ii) each of the value matches and the token matching each of the value matches, and (iii) the utterance, in a predefined format for the data structure, and outputting the data structure.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Jae Min John, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Shivashankar Subramanian, Cong Duy Vu Hoang, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Nitika Mathur, Aashna Devang Kanuga, Philip Arthur, Gioacchino Tangari, Steve Wai-Chun Siu
  • Publication number: 20230186161
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
  • Publication number: 20230185834
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
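
The template-based synthesis mentioned in the abstract of publication 20230185834 can be illustrated with a minimal Python sketch. It is not the method claimed in the application: it uses toy paired templates (far simpler than a synchronous context-free grammar), and the names TEMPLATES, SLOT_VALUES, slots_in, and synthesize, the SQL-like strings standing in for logical forms, and the sample data are all hypothetical. The only idea it tries to show is that an utterance and its logical form are generated together from shared slots, then combined with original examples.

```python
import itertools
import string

# Hypothetical paired templates: each utterance pattern shares its slot names
# with a logical form pattern, so both sides are always filled together.
TEMPLATES = [
    ("show the {column} of every {table}", "SELECT {column} FROM {table}"),
    ("how many {table} rows are there", "SELECT COUNT(*) FROM {table}"),
]

# Hypothetical slot values; a real system would likely draw these from a schema.
SLOT_VALUES = {
    "table": ["employee", "department"],
    "column": ["name", "budget"],
}


def slots_in(template):
    """Return the slot names that appear in a format-style template."""
    return [field for _, field, _, _ in string.Formatter().parse(template) if field]


def synthesize(templates, slot_values):
    """Fill every template pair with every combination of slot values."""
    examples = []
    for utterance_tpl, logical_tpl in templates:
        slots = sorted(set(slots_in(utterance_tpl)) | set(slots_in(logical_tpl)))
        for combo in itertools.product(*(slot_values[s] for s in slots)):
            binding = dict(zip(slots, combo))
            examples.append(
                (utterance_tpl.format(**binding), logical_tpl.format(**binding))
            )
    return examples


# Hypothetical original training data, combined with the synthetic examples.
original = [("list all employees", "SELECT * FROM employee")]
synthetic = synthesize(TEMPLATES, SLOT_VALUES)
combined = original + synthetic
```

Because the same binding fills both sides of a pair, each synthetic utterance stays aligned with its logical form by construction. Keeping that alignment is the reason for using paired templates in this sketch.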