Patents by Inventor Ariel Gedaliah Kobren
Ariel Gedaliah Kobren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12242568
Abstract: Techniques are disclosed for augmenting data sets used for training machine learning models and for generating predictions by trained machine learning models. These techniques may increase a number and diversity of examples within an initial training dataset of sentences by extracting a subset of words from the existing training dataset of sentences. The techniques may conserve scarce sample data in few-shot situations by training a data generation model using general data obtained from a general data source.
Type: Grant
Filed: September 6, 2022
Date of Patent: March 4, 2025
Assignee: Oracle International Corporation
Inventors: Ariel Gedaliah Kobren, Swetasudha Panda, Michael Louis Wick, Qinlan Shen, Jason Anthony Peck
-
Publication number: 20240184998
Abstract: A target set of texts, for training and/or evaluating a text classification model, is augmented using insertions into a base text within the original target set. In an embodiment, an expanded text, including the base text and an insertion word, must satisfy one or more inclusion criteria in order to be added to the target set. The inclusion criteria may require that the expanded text constitutes a successful attack on the classification model, the expanded text has a satisfactory perplexity score, and/or the expanded text is verified as being valid. In an embodiment, if a number of expanded texts added into the target set is below a threshold number, insertions are made into an expanded text (which was generated based on the base text). Inclusion criteria are evaluated against the doubly-expanded text to determine whether to add the doubly-expanded text to the target set.
Type: Application
Filed: February 8, 2024
Publication date: June 6, 2024
Applicant: Oracle International Corporation
Inventors: Naveen Jafer Nizar, Ariel Gedaliah Kobren
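The inclusion-criteria loop described in this abstract can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the function names (`insertions`, `augment`), the caller-supplied callables `classify`, `perplexity`, and `is_valid`, the reading of "successful attack" as "the classifier's label changes", and the choice to run the second round over every singly-expanded text are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def insertions(text: str, words: Iterable[str]) -> List[str]:
    """Return every text formed by inserting one word at one position."""
    tokens = text.split()
    expanded = []
    for word in words:
        for pos in range(len(tokens) + 1):
            expanded.append(" ".join(tokens[:pos] + [word] + tokens[pos:]))
    return expanded


def augment(
    base_text: str,
    words: List[str],
    classify: Callable[[str], str],      # hypothetical classifier under test
    perplexity: Callable[[str], float],  # hypothetical fluency score, lower is better
    is_valid: Callable[[str], bool],     # hypothetical validity check
    max_perplexity: float,
    threshold: int,
) -> List[str]:
    """Collect expanded texts that satisfy all inclusion criteria."""
    base_label = classify(base_text)

    def passes(text: str) -> bool:
        # Assumption: an attack "succeeds" when the predicted label changes.
        return (
            classify(text) != base_label
            and perplexity(text) <= max_perplexity
            and is_valid(text)
        )

    singly = insertions(base_text, words)
    # dict.fromkeys removes duplicates while preserving order.
    accepted = list(dict.fromkeys(t for t in singly if passes(t)))

    # Too few additions: insert again into the singly-expanded texts and
    # evaluate the same criteria against the doubly-expanded texts.
    if len(accepted) < threshold:
        for expanded in singly:
            for doubly in insertions(expanded, words):
                if doubly not in accepted and passes(doubly):
                    accepted.append(doubly)
    return accepted
```

In practice `classify` would wrap the model being evaluated and `perplexity` would come from a language model; both are left as parameters here so the control flow stays visible.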
-
Patent number: 11934795
Abstract: A target set of texts, for training and/or evaluating a text classification model, is augmented using insertions into a base text within the original target set. In an embodiment, an expanded text, including the base text and an insertion word, must satisfy one or more inclusion criteria in order to be added to the target set. The inclusion criteria may require that the expanded text constitutes a successful attack on the classification model, the expanded text has a satisfactory perplexity score, and/or the expanded text is verified as being valid. In an embodiment, if a number of expanded texts added into the target set is below a threshold number, insertions are made into an expanded text (which was generated based on the base text). Inclusion criteria are evaluated against the doubly-expanded text to determine whether to add the doubly-expanded text to the target set.
Type: Grant
Filed: August 3, 2021
Date of Patent: March 19, 2024
Assignee: Oracle International Corporation
Inventors: Naveen Jafer Nizar, Ariel Gedaliah Kobren
-
Publication number: 20230401286
Abstract: Techniques are disclosed for augmenting data sets used for training machine learning models and for generating predictions by trained machine learning models. These techniques may increase a number and diversity of examples within an initial training dataset of sentences by extracting a subset of words from the existing training dataset of sentences. The techniques may conserve scarce sample data in few-shot situations by training a data generation model using general data obtained from a general data source.
Type: Application
Filed: September 6, 2022
Publication date: December 14, 2023
Applicant: Oracle International Corporation
Inventors: Ariel Gedaliah Kobren, Swetasudha Panda, Michael Louis Wick, Qinlan Shen, Jason Anthony Peck
-
Publication number: 20230401285
Abstract: Techniques are disclosed for augmenting data sets used for training machine learning models and for generating predictions by trained machine learning models. The techniques generate synthesized data from sample data and train a machine learning model using the synthesized data to augment a sample data set. Embodiments selectively partition the sample data set and synthesized data into a training data and a validation data, which are used to generate and select machine learning models.
Type: Application
Filed: September 6, 2022
Publication date: December 14, 2023
Applicant: Oracle International Corporation
Inventors: Ariel Gedaliah Kobren, Swetasudha Panda, Michael Louis Wick, Qinlan Shen, Jason Anthony Peck
-
Publication number: 20230368015
Abstract: Techniques are described herein for training and applying machine learning models. The techniques include implementing an entropy-based loss function for training high-capacity machine learning models, such as deep neural networks, with anti-modeling. The entropy-based loss function may cause the model to have high entropy on negative data, helping prevent the model from becoming confidently wrong about the negative data while reducing the likelihood of generalizing from disfavored signals.
Type: Application
Filed: September 8, 2022
Publication date: November 16, 2023
Applicant: Oracle International Corporation
Inventors: Michael Louis Wick, Ariel Gedaliah Kobren, Swetasudha Panda
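One way to picture a loss that rewards high entropy on negative data is the sketch below. It is a guess at the general shape only, not the loss claimed in the application: the function name `anti_modeling_loss`, the `weight` parameter, and the specific combination (mean cross-entropy on positive examples minus weighted mean predictive entropy on negative examples) are assumptions made for illustration. Both input lists are assumed to be non-empty.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def anti_modeling_loss(
    pos_logits: Sequence[Sequence[float]],
    pos_labels: Sequence[int],
    neg_logits: Sequence[Sequence[float]],
    weight: float = 1.0,  # hypothetical trade-off between the two terms
) -> float:
    """Cross-entropy on positive data minus weighted entropy on negative data."""
    # Standard supervised term: fit the labels on the positive examples.
    ce = sum(
        -math.log(softmax(logits)[label])
        for logits, label in zip(pos_logits, pos_labels)
    ) / len(pos_logits)
    # Entropy term: subtracting it means that minimizing the loss pushes
    # predictions on negative examples toward the uniform distribution.
    ent = sum(entropy(softmax(logits)) for logits in neg_logits) / len(neg_logits)
    return ce - weight * ent
```

With the positive term held fixed, a model that is uncertain on the negative examples gets a lower loss than one that is confident on them, which is the behavior the abstract describes.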
-
Publication number: 20230032208
Abstract: Techniques are disclosed for augmenting data sets used for training machine learning models and for generating predictions by trained machine learning models. These techniques may increase a number (and diversity) of examples within an initial training dataset of sentences by extracting a subset of words from the existing training dataset of sentences. The extracted subset includes no stopwords and fewer content words than found in the initial training dataset. The remaining words may be re-ordered. Using the extracted and re-ordered subset of words, the dataset generation model produces a second set of sentences that are different from the first set. The second set of sentences may be used to increase a number of examples in classes with few examples.
Type: Application
Filed: July 30, 2021
Publication date: February 2, 2023
Applicant: Oracle International Corporation
Inventors: Ariel Gedaliah Kobren, Naveen Jafer Nizar, Michael Louis Wick, Swetasudha Panda
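The word-extraction step in this abstract (drop stopwords, keep fewer content words, re-order) might look roughly like the Python sketch below. It covers only the preparation of the word subset, not the dataset generation model that consumes it. The tiny `STOPWORDS` set, the `keep_fraction` parameter, the regular-expression tokenizer, and the function name `extract_keywords` are assumptions introduced for the example.

```python
import random
import re
from typing import List, Optional

# Small illustrative stopword list (an assumption; a real system would
# likely use a much larger one).
STOPWORDS = {"a", "an", "and", "in", "is", "near", "of", "on", "over", "the", "to"}


def extract_keywords(
    sentence: str,
    keep_fraction: float = 0.5,  # hypothetical share of content words to keep
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return a shuffled subset of the sentence's content words."""
    rng = rng or random.Random()
    tokens = re.findall(r"[a-z']+", sentence.lower())
    # Drop stopwords so only content words remain.
    content = [t for t in tokens if t not in STOPWORDS]
    if not content:
        return []
    # Keep fewer content words than the sentence had; a sentence with a
    # single content word keeps that one word.
    k = max(1, int(len(content) * keep_fraction))
    # Random.sample returns the chosen items in selection order, which
    # also serves as the re-ordering step.
    return rng.sample(content, k)


keywords = extract_keywords(
    "The quick brown fox jumps over the lazy dog near the river",
    rng=random.Random(0),
)
```

The resulting `keywords` list would then be handed to a text generator as a prompt or constraint, so that the generated sentence differs from the original while staying on topic.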
-
Publication number: 20220245362
Abstract: A target set of texts, for training and/or evaluating a text classification model, is augmented using insertions into a base text within the original target set. In an embodiment, an expanded text, including the base text and an insertion word, must satisfy one or more inclusion criteria in order to be added to the target set. The inclusion criteria may require that the expanded text constitutes a successful attack on the classification model, the expanded text has a satisfactory perplexity score, and/or the expanded text is verified as being valid. In an embodiment, if a number of expanded texts added into the target set is below a threshold number, insertions are made into an expanded text (which was generated based on the base text). Inclusion criteria are evaluated against the doubly-expanded text to determine whether to add the doubly-expanded text to the target set.
Type: Application
Filed: August 3, 2021
Publication date: August 4, 2022
Applicant: Oracle International Corporation
Inventors: Naveen Jafer Nizar, Ariel Gedaliah Kobren
-
Publication number: 20220051134
Abstract: Techniques are described for identifying successful adversarial attacks for a black box reading comprehension model using an extracted white box reading comprehension model. The system trains a white box reading comprehension model that behaves similar to the black box reading comprehension model using the set of queries and corresponding responses from the black box reading comprehension model as training data. The system tests adversarial attacks, involving modified informational content for execution of queries, against the trained white box reading comprehension model. Queries used for successful attacks on the white box model may be applied to the black box model itself as part of a black box improvement process.
Type: Application
Filed: December 9, 2020
Publication date: February 17, 2022
Applicant: Oracle International Corporation
Inventors: Naveen Jafer Nizar, Ariel Gedaliah Kobren
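The extract-then-attack workflow in this abstract can be outlined as a short Python sketch. It shows only the control flow and is not the system described in the application: the type aliases, the function name `find_transferable_attacks`, the caller-supplied `train_surrogate` routine, and the definition of a successful attack as "the model's answer differs from the expected answer" are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Query = Tuple[str, str]            # (context passage, question)
Attack = Tuple[str, str, str]      # (modified context, question, expected answer)
Model = Callable[[str, str], str]  # maps (context, question) to an answer
Example = Tuple[Query, str]        # a query paired with the black box's answer


def find_transferable_attacks(
    black_box: Model,
    queries: List[Query],
    train_surrogate: Callable[[List[Example]], Model],  # hypothetical trainer
    attacks: List[Attack],
) -> List[Attack]:
    """Find attacks on a surrogate model that also succeed on the black box."""
    # 1. Query the black box and record its responses as training data.
    training_data = [((ctx, q), black_box(ctx, q)) for ctx, q in queries]
    # 2. Train a white box model that imitates the black box.
    white_box = train_surrogate(training_data)
    # 3. Test the candidate attacks against the white box model.
    white_box_hits = [a for a in attacks if white_box(a[0], a[1]) != a[2]]
    # 4. Replay the successful ones against the black box itself.
    return [a for a in white_box_hits if black_box(a[0], a[1]) != a[2]]
```

The benefit of step 3 is that the white box model can be probed freely (for example with gradient-based attack search), while the black box is only queried in steps 1 and 4.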