Patents by Inventor Jean-Baptiste Frederic George Tristan

Jean-Baptiste Frederic George Tristan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Similarity Analysis Using Enhanced MinHash

Publication number: 20240168934

Abstract: A first set and a second set are identified as operands for a set operation of a similarity analysis task iteration. Using respective minimum hash information arrays and contributor count arrays of the two sets, a minimum hash information array and contributor count array of a derived set resulting from the set operation is generated. An entry in the contributor count array of the derived set indicates the number of child sets of the derived set that meet a criterion with respect to a corresponding entry in the minimum hash information array of the derived set. The generated minimum hash information array and the contributor count array are stored as part of input for a subsequent iteration. After a termination criterion of the task is met, output of the task is stored.

Type: Application

Filed: January 29, 2024

Publication date: May 23, 2024

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Swetasudha Panda
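
The abstract above describes combining per-set minimum hash arrays and contributor count arrays when a set operation produces a derived set. The following is a minimal sketch of one possible reading for the union case, not the patented method itself. It assumes that the "criterion" is that a child set attains the derived set's minimum hash value at that position. The function names, the BLAKE2b-based hash family, and the 64-position signature size are all hypothetical choices made for illustration.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """Seeded 64-bit hash; one seed per signature position (illustrative choice)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=64):
    """Standard MinHash: for each hash function keep the minimum value over the set."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def leaf_sketch(items, num_hashes=64):
    """A leaf set contributes to each of its own minima exactly once."""
    return minhash_signature(items, num_hashes), [1] * num_hashes


def union_sketch(a, b):
    """Combine two (signature, counts) pairs into the sketch of their union.

    Assumption: the count at a position is the number of child sets whose
    minimum equals the derived minimum at that position.
    """
    sig_a, cnt_a = a
    sig_b, cnt_b = b
    sig, cnt = [], []
    for ha, ca, hb, cb in zip(sig_a, cnt_a, sig_b, cnt_b):
        if ha < hb:
            sig.append(ha)
            cnt.append(ca)
        elif hb < ha:
            sig.append(hb)
            cnt.append(cb)
        else:  # both children attain the same minimum
            sig.append(ha)
            cnt.append(ca + cb)
    return sig, cnt


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    return sum(1 for x, y in zip(sig_a, sig_b) if x == y) / len(sig_a)
```

Because the element-wise minimum of two signatures equals the signature of the union, the derived sketch can be fed into a later iteration without revisiting the original sets.
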
Control system for learning to rank fairness

Patent number: 11948102

Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.

Type: Grant

Filed: August 12, 2022

Date of Patent: April 2, 2024

Assignee: Oracle International Corporation

Inventors: Jean-Baptiste Frederic George Tristan, Michael Louis Wick, Swetasudha Panda
Similarity analysis using enhanced MinHash

Patent number: 11921687

Abstract: A first set and a second set are identified as operands for a set operation of a similarity analysis task iteration. Using respective minimum hash information arrays and contributor count arrays of the two sets, a minimum hash information array and contributor count array of a derived set resulting from the set operation is generated. An entry in the contributor count array of the derived set indicates the number of child sets of the derived set that meet a criterion with respect to a corresponding entry in the minimum hash information array of the derived set. The generated minimum hash information array and the contributor count array are stored as part of input for a subsequent iteration. After a termination criterion of the task is met, output of the task is stored.

Type: Grant

Filed: June 10, 2019

Date of Patent: March 5, 2024

Assignee: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Swetasudha Panda
Enforcing Fairness on Unlabeled Data to Improve Modeling Performance

Publication number: 20230394371

Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.

Type: Application

Filed: August 22, 2023

Publication date: December 7, 2023

Inventors: Michael Louis Wick, Swetasudha Panda, Jean-Baptiste Frederic George Tristan
Enforcing fairness on unlabeled data to improve modeling performance

Patent number: 11775863

Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.

Type: Grant

Filed: February 4, 2020

Date of Patent: October 3, 2023

Assignee: Oracle International Corporation

Inventors: Michael Louis Wick, Swetasudha Panda, Jean-Baptiste Frederic George Tristan
Control System for Learning to Rank Fairness

Publication number: 20220382768

Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.

Type: Application

Filed: August 12, 2022

Publication date: December 1, 2022

Inventors: Jean-Baptiste Frederic George Tristan, Michael Louis Wick, Swetasudha Panda
Evaluating language models using negative data

Patent number: 11488579

Abstract: A method of evaluating a language model using negative data may include accessing a first language model that is trained using a first training corpus, and accessing a second language model. The second language model may be configured to generate outputs that are less grammatical than outputs generated by the first language model. The method may also include training the second language model using a second training corpus, and generating output text from the second language model. The method may further include testing the first language model using the output text from the second language model.

Type: Grant

Filed: June 2, 2020

Date of Patent: November 1, 2022

Assignee: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Jason Peck
Control system for learning to rank fairness

Patent number: 11416500

Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.

Type: Grant

Filed: February 4, 2020

Date of Patent: August 16, 2022

Assignee: Oracle International Corporation

Inventors: Jean-Baptiste Frederic George Tristan, Michael Louis Wick, Swetasudha Panda
Online Post-Processing In Rankings For Constrained Utility Maximization

Publication number: 20220050848

Abstract: Online post-processing may be performed for rankings generated with constrained utility maximization. A stream of data items may be received. A batch of data items from the stream may be ranked according to a ranking model trained to rank data items in a descending order of relevance. The batch of data items may be associated with a current time step. A re-ranking model may be applied to generate a re-ranking of the batch of data items according to a re-ranking policy that considers the current batch and previous batches with regard to a ranking constraint. The re-ranked items may then be sent to an application.

Type: Application

Filed: July 6, 2021

Publication date: February 17, 2022

Inventors: Swetasudha Panda, Ariel Kobren, Jean-Baptiste Frederic George Tristan, Michael Louis Wick
Evaluating Language Models Using Negative Data

Publication number: 20210375262

Abstract: A method of evaluating a language model using negative data may include accessing a first language model that is trained using a first training corpus, and accessing a second language model. The second language model may be configured to generate outputs that are less grammatical than outputs generated by the first language model. The method may also include training the second language model using a second training corpus, and generating output text from the second language model. The method may further include testing the first language model using the output text from the second language model.

Type: Application

Filed: June 2, 2020

Publication date: December 2, 2021

Applicant: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Jason Peck
Removing Undesirable Signals from Language Models Using Negative Data

Publication number: 20210374361

Abstract: A method for training a language model using negative data may include accessing a first training corpus comprising positive training data and accessing a second training corpus comprising negative training data. The method may further include training a first language model using at least the first training corpus, the second training corpus, and a maximum likelihood function. The maximum likelihood function may maximize the likelihood of the first language model predicting the positive training data while minimizing the likelihood of the first language model predicting the negative training data.

Type: Application

Filed: June 2, 2020

Publication date: December 2, 2021

Applicant: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Adam Craig Pocock, Katherine Silverstein
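
The abstract above describes a likelihood function that rewards the model for predicting positive training data and penalizes it for predicting negative training data. As a rough illustration only, one simple form such an objective could take is a difference of log-likelihoods. The function name and the weighting parameter `alpha` are hypothetical and do not come from the publication, which does not state the exact formula in its abstract.

```python
import math


def negative_data_objective(pos_probs, neg_probs, alpha=1.0):
    """Toy objective: log-likelihood of positive data minus a weighted
    log-likelihood of negative data (an assumed form, for illustration).

    pos_probs: probabilities the model assigns to positive examples.
    neg_probs: probabilities the model assigns to negative examples.
    alpha:     hypothetical weight on the negative-data penalty.
    """
    pos_term = sum(math.log(p) for p in pos_probs)
    neg_term = sum(math.log(p) for p in neg_probs)
    # Larger is better: high probability on positives, low on negatives.
    return pos_term - alpha * neg_term
```

Under this sketch, a model that assigns lower probability to the negative examples scores higher, all else being equal.
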
Enhanced Techniques For Bias Analysis

Publication number: 20210374582

Abstract: A fairness metric of decisions pertaining to a plurality of candidates indicated in a data set is estimated. Using a Hamiltonian Monte Carlo sampling algorithm, sample sets corresponding to random variables of a null model and an alternate model are obtained. A respective kernel density estimator is fitted on at least some sample sets, and importance sampling is implemented on additional samples generated using the kernel density estimators. The estimated fairness metric is provided via one or more programmatic interfaces.

Type: Application

Filed: June 26, 2020

Publication date: December 2, 2021

Inventors: Jean-Baptiste Frederic George Tristan, Michael Louis Wick, Stephen J. Green
Systems and methods for scalable hierarchical coreference

Patent number: 11017151

Abstract: A scalable hierarchical coreference method that employs a homomorphic compression scheme that supports addition and partial subtraction to more efficiently represent the data and the evolving intermediate results of probabilistic inference. The method may encode the features underlying conditional random field models of coreference resolution so that cosine similarities can be efficiently computed. The method may be applied to compressing features and intermediate inference results for conditional random fields. The method may allow compressed representations to be added and subtracted in a way that preserves the cosine similarities.

Type: Grant

Filed: March 27, 2020

Date of Patent: May 25, 2021

Assignee: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Stephen Joseph Green
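
The abstract above refers to compressed representations that can be added and subtracted while preserving cosine similarities. The sketch below uses a stand-in technique, a Gaussian random projection, which is not necessarily the compression scheme claimed in the patent. It is shown only because it is linear, so compressed vectors can be added or subtracted directly, and it approximately preserves cosine similarity. All names and dimensions are assumptions made for illustration.

```python
import math
import random


def make_projection(dim_in, dim_out, seed=0):
    """Fixed random Gaussian matrix shared by all vectors (stand-in scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]


def compress(matrix, vec):
    """Project a feature vector into the compressed space."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def add(u, v):
    """Element-wise sum; valid on raw or compressed vectors."""
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Element-wise difference; valid on raw or compressed vectors."""
    return [a - b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Because the projection is linear, compressing two feature vectors and then adding them gives the same result, up to floating-point rounding, as adding them first and then compressing, so merged entities never need to be recompressed from raw features.
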
Bias parameters for topic modeling

Patent number: 10990763

Abstract: Systems and methods are disclosed to improve a topic modeling system that tunes a topic model for a set of topics from a corpus of documents, by allowing users to pre-inform the tuning process with bias parameters for desired associations in the topic model. In embodiments, the topic model may be a Latent Dirichlet Allocation (LDA) model. In embodiments, the bias parameter may indicate a fixed association where a particular word in a particular document is associated with a particular topic. In embodiments, the bias parameter may specify a weight value that biases the inference process with regard to a particular association. Advantageously, the disclosed features allow users to specify a small number of parameters to steer the tuning process towards a set of desired topics. As a result, the topic model may be generated more quickly and with more useful topics.

Type: Grant

Filed: May 9, 2019

Date of Patent: April 27, 2021

Assignee: Oracle International Corporation

Inventors: Daniel Peterson, Jean-Baptiste Frederic George Tristan, Robert James Oberbreckling
Similarity Analysis Using Enhanced MinHash

Publication number: 20200387743

Abstract: A first set and a second set are identified as operands for a set operation of a similarity analysis task iteration. Using respective minimum hash information arrays and contributor count arrays of the two sets, a minimum hash information array and contributor count array of a derived set resulting from the set operation is generated. An entry in the contributor count array of the derived set indicates the number of child sets of the derived set that meet a criterion with respect to a corresponding entry in the minimum hash information array of the derived set. The generated minimum hash information array and the contributor count array are stored as part of input for a subsequent iteration. After a termination criterion of the task is met, output of the task is stored.

Type: Application

Filed: June 10, 2019

Publication date: December 10, 2020

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Swetasudha Panda
Control System for Learning to Rank Fairness

Publication number: 20200372035

Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.

Type: Application

Filed: February 4, 2020

Publication date: November 26, 2020

Inventors: Jean-Baptiste Frederic George Tristan, Michael Louis Wick, Swetasudha Panda
Enforcing Fairness on Unlabeled Data to Improve Modeling Performance

Publication number: 20200372406

Abstract: Fairness of a trained classifier may be ensured by generating a data set for training, the data set generated using input data points of a feature space including multiple dimensions and according to different parameters including an amount of label bias, a control for discrepancy between rarity of features, and an amount of selection bias. Unlabeled data points of the input data comprising unobserved ground truths are labeled according to the amount of label bias and the input data sampled according to the amount of selection bias and the control for the discrepancy between the rarity of features. The classifier is then trained using the sampled and labeled data points as well as additional unlabeled data points. The trained classifier is then usable to determine unbiased classifications of one or more labels for one or more other data sets.

Type: Application

Filed: February 4, 2020

Publication date: November 26, 2020

Inventors: Michael Louis Wick, Swetasudha Panda, Jean-Baptiste Frederic George Tristan
Bayesian Test of Demographic Parity for Learning to Rank

Publication number: 20200372290

Abstract: A Bayesian test of demographic parity for learning to rank may be applied to determine ranking modifications. A fairness control system receiving a ranking of items may apply Bayes factors to determine a likelihood of bias for the ranking. These Bayes factors may include a factor for determining bias in each item and a factor for determining bias in the ranking of the items. An indicator of bias may be generated using the applied Bayes factors and the fairness control system may modify the ranking if the determined likelihood of bias satisfies modification criteria for the ranking.

Type: Application

Filed: February 4, 2020

Publication date: November 26, 2020

Inventors: Jean-Baptiste Frederic George Tristan, Pallika Haridas Kanani, Michael Louis Wick, Swetasudha Panda, Haniyeh Mahmoudian
Bias Parameters for Topic Modeling

Publication number: 20200279019

Abstract: Systems and methods are disclosed to improve a topic modeling system that tunes a topic model for a set of topics from a corpus of documents, by allowing users to pre-inform the tuning process with bias parameters for desired associations in the topic model. In embodiments, the topic model may be a Latent Dirichlet Allocation (LDA) model. In embodiments, the bias parameter may indicate a fixed association where a particular word in a particular document is associated with a particular topic. In embodiments, the bias parameter may specify a weight value that biases the inference process with regard to a particular association. Advantageously, the disclosed features allow users to specify a small number of parameters to steer the tuning process towards a set of desired topics. As a result, the topic model may be generated more quickly and with more useful topics.

Type: Application

Filed: May 9, 2019

Publication date: September 3, 2020

Inventors: Daniel Peterson, Jean-Baptiste Frederic George Tristan, Robert James Oberbreckling
Systems and Methods for Scalable Hierarchical Coreference

Publication number: 20200226318

Abstract: A scalable hierarchical coreference method that employs a homomorphic compression scheme that supports addition and partial subtraction to more efficiently represent the data and the evolving intermediate results of probabilistic inference. The method may encode the features underlying conditional random field models of coreference resolution so that cosine similarities can be efficiently computed. The method may be applied to compressing features and intermediate inference results for conditional random fields. The method may allow compressed representations to be added and subtracted in a way that preserves the cosine similarities.

Type: Application

Filed: March 27, 2020

Publication date: July 16, 2020

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Stephen Joseph Green
