Patents by Inventor Jean-Baptiste Tristan
Jean-Baptiste Tristan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11636308
Abstract: According to embodiments, a recurrent neural network (RNN) is equipped with a set data structure whose operations are differentiable, which data structure can be used to store information for a long period of time. This differentiable set data structure can “remember” an event in the sequence of sequential data that may impact another event much later in the sequence, thereby allowing the RNN to classify the sequence based on many kinds of long dependencies. An RNN that is equipped with the differentiable set data structure can be properly trained with backpropagation and gradient descent optimizations. According to embodiments, a differentiable set data structure can be used to store and retrieve information with a simple set-like interface. According to further embodiments, the RNN can be extended to support several add operations, which can make the differentiable set data structure behave like a Bloom filter.
Type: Grant
Filed: October 31, 2016
Date of Patent: April 25, 2023
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Michael Wick, Manzil Zaheer
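The shape of such a structure can be sketched as follows. This is a minimal, hypothetical illustration, not the patented design: memory is a vector of soft bit occupancies in [0, 1], `add` sets bits with a probabilistic OR, and `query` takes a soft AND over several hash positions, so every operation is differentiable; all names here are the sketch's own.

```python
import numpy as np

def soft_hash(x, num_buckets, seed):
    # Soft "hash": a softmax over scores from a fixed random projection,
    # giving a differentiable, nearly one-hot bucket assignment.
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(num_buckets, x.shape[0]))
    s = W @ x
    e = np.exp(s - s.max())
    return e / e.sum()

class DifferentiableSet:
    def __init__(self, num_buckets=32, num_hashes=3):
        self.num_buckets = num_buckets
        self.seeds = range(num_hashes)
        self.m = np.zeros(num_buckets)      # soft occupancy of each bucket

    def add(self, x):
        for seed in self.seeds:
            a = soft_hash(x, self.num_buckets, seed)
            self.m = 1.0 - (1.0 - self.m) * (1.0 - a)   # soft OR keeps m in [0, 1]

    def query(self, x):
        # Soft AND across the hash positions, as in a Bloom filter lookup
        return float(np.prod([self.m @ soft_hash(x, self.num_buckets, seed)
                              for seed in self.seeds]))
```

Before any `add` the query is exactly zero; after adding an item its query score becomes positive, and because every step is smooth, gradients flow through both operations.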
-
Patent number: 11521069
Abstract: Embodiments employ an inference method for neural networks that enforces deterministic constraints on outputs without performing post-processing or expensive discrete search over the feasible space. Instead, for each input, the continuous weights are nudged until the network's unconstrained inference procedure generates an output that satisfies the constraints. This is achieved by expressing the hard constraints as an optimization problem over the continuous weights and employing backpropagation to change the weights of the network. Embodiments optimize over the energy of the violating outputs; since the weights directly determine the output through the energy, embodiments are able to manipulate the unconstrained inference procedure to produce outputs that conform to global constraints.
Type: Grant
Filed: March 6, 2017
Date of Patent: December 6, 2022
Assignee: Oracle International Corporation
Inventors: Michael Wick, Jean-Baptiste Tristan, Jay Yoon Lee
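A toy illustration of the weight-nudging idea (not the patented system): a linear "network" y = Wx whose outputs must satisfy the hard constraint sum(y) = 1. The constraint is expressed as an energy over the continuous weights, and gradient descent (the gradient is hand-derived here rather than obtained by automatic differentiation) nudges the weights until the unconstrained output satisfies it.

```python
import numpy as np

def nudge_weights(W, x, lr=0.01, tol=1e-6, max_steps=50000):
    for _ in range(max_steps):
        violation = (W @ x).sum() - 1.0       # energy is violation**2
        if abs(violation) < tol:
            break
        # d/dW of (sum(W x) - 1)**2  =  2 * violation * outer(ones, x)
        W = W - lr * 2.0 * violation * np.outer(np.ones(W.shape[0]), x)
    return W

rng = np.random.default_rng(0)
W, x = rng.normal(size=(3, 4)), rng.normal(size=4)
W_nudged = nudge_weights(W, x)
```

For this quadratic energy the violation shrinks geometrically, so the loop converges to an output whose components sum to one for this particular input, which is exactly the test-time behavior the abstract describes.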
-
Patent number: 11263541
Abstract: Systems and methods are disclosed to build and execute a decision system based on multiple machine learned decision models. In embodiments, the decision system performs a hashing technique to reduce relevant features of the input data into a feature vector for each decision model. The feature vector reduces the dimensionality of the feature universe of the input data, and its use allows the decision models to be trained and executed using fewer computing resources. In embodiments, the decision system implements an ensembled decision model that makes decisions based on a combination function that combines the decision results of the individual models in the ensemble. The decision models employ different hashing techniques to hash the input features differently, so that errors caused by the feature hashing of individual models are reduced in the aggregate.
Type: Grant
Filed: September 27, 2017
Date of Patent: March 1, 2022
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Adam Pocock, Michael Wick, Guy Steele
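A sketch of the hashing trick with a different hash per ensemble member (illustrative names only, not the patented system): each model reduces the raw feature dict to a fixed-size vector using its own seed, so collisions land in different buckets for different models and tend to cancel when the combination function averages their scores.

```python
import hashlib
import numpy as np

def hash_features(features, dim, seed):
    """Hash {feature_name: value} into a dim-sized vector."""
    v = np.zeros(dim)
    for name, value in features.items():
        digest = hashlib.blake2b(f"{seed}:{name}".encode(), digest_size=8)
        h = int(digest.hexdigest(), 16)
        sign = 1.0 if h & 1 == 0 else -1.0   # signed hashing reduces collision bias
        v[(h >> 1) % dim] += sign * value
    return v

def ensemble_predict(features, weight_vectors, dim):
    # Combination function: average the per-model scores; each model's
    # index in the list doubles as its hash seed.
    scores = [w @ hash_features(features, dim, seed)
              for seed, w in enumerate(weight_vectors)]
    return float(np.mean(scores))
```

Because the feature names are hashed, the models never need an explicit feature dictionary, which is what lets the feature vector stay much smaller than the feature universe.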
-
Data-parallel parameter estimation of the Latent Dirichlet allocation model by greedy Gibbs sampling
Patent number: 10860829
Abstract: A novel data-parallel algorithm is presented for topic modeling on highly-parallel hardware architectures. The algorithm is a Markov-Chain Monte Carlo algorithm used to estimate the parameters of the LDA topic model. This algorithm is based on a highly parallel partially-collapsed Gibbs sampler, but replaces a stochastic step that draws from a distribution with an optimization step that computes the mean of the distribution directly and deterministically. This algorithm is correct, it is statistically performant, and it is faster than state-of-the-art algorithms because it can exploit the massive parallelism of a highly-parallel architecture, such as a GPU. Furthermore, the partially-collapsed Gibbs sampler converges about as fast as the collapsed Gibbs sampler and identifies solutions that are as good as, or even better than, those of the collapsed Gibbs sampler.
Type: Grant
Filed: January 16, 2015
Date of Patent: December 8, 2020
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Guy Steele
-
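The core substitution can be sketched for one topic-word update (this is only the substituted step, not the full sampler): an uncollapsed Gibbs step would draw the topic-word rows from their Dirichlet posterior, while the greedy step takes the posterior mean directly and deterministically.

```python
import numpy as np

def stochastic_step(word_topic_counts, beta, rng):
    # phi_k ~ Dirichlet(counts_k + beta): one random draw per topic row
    return np.array([rng.dirichlet(row + beta) for row in word_topic_counts])

def greedy_step(word_topic_counts, beta):
    # Mean of the same Dirichlet posterior: (counts + beta) / row total
    a = word_topic_counts + beta
    return a / a.sum(axis=1, keepdims=True)
```

The mean is a pure, deterministic arithmetic operation over the count matrix, which is why it maps so well onto a GPU: every row can be normalized independently with no per-thread random state.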
Patent number: 10496929
Abstract: The present invention relates to a probabilistic programming compiler that (a) generates data-parallel inference code to sample from probability distributions in models provided to the compiler; and (b) utilizes a modular framework to allow addition and removal of inference algorithm information based on which the compiler generates the inference code. For a given model, the described compiler can generate inference code that implements any one or more of the inference algorithms that are available to the compiler. The modular compiler framework utilizes an intermediate representation (IR) that symbolically represents features of probability distributions. The compiler then uses the IR as a basis for emitting inference code to sample from the one or more probability distributions represented in the IR.
Type: Grant
Filed: June 26, 2014
Date of Patent: December 3, 2019
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Guy L. Steele, Jr., Daniel E. Huang, Joseph Tassarotti
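A deliberately tiny illustration of the modular idea: a symbolic IR node describing a probability distribution, and a backend that emits sampling code from the IR. All names here are hypothetical; the compiler's actual IR and backends are far richer than this sketch.

```python
from dataclasses import dataclass

@dataclass
class DistNode:
    name: str       # variable to sample
    dist: str       # distribution family
    params: tuple   # parameter expressions

def emit_sampler(ir_nodes):
    """Emit Python source for a forward sampler over the IR nodes."""
    lines = ["import random", "", "def sample():", "    env = {}"]
    for node in ir_nodes:
        if node.dist == "normal":
            mu, sigma = node.params
            lines.append(f"    env['{node.name}'] = random.gauss({mu}, {sigma})")
        else:
            raise NotImplementedError(node.dist)
    lines.append("    return env")
    return "\n".join(lines)

source = emit_sampler([DistNode("x", "normal", (0.0, 1.0))])
```

Swapping in a different `emit_*` backend for the same IR is the modular point: the model description stays fixed while the emitted inference strategy changes.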
-
Patent number: 10394872
Abstract: Herein is described an unsupervised learning method to discover topics and reduce the dimensionality of documents by designing and simulating a stochastic cellular automaton. A key formula that appears in many inference methods for LDA is used as the local update rule of the cellular automaton. Approximate counters may be used to represent counter values being tracked by the inference algorithms. Also, sparsity may be used to reduce the amount of computation needed for sampling a topic for particular words in the corpus being analyzed.
Type: Grant
Filed: November 4, 2015
Date of Patent: August 27, 2019
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Stephen J. Green, Guy L. Steele, Jr., Manzil Zaheer
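The "key formula" in question is the familiar LDA full conditional, sketched here with the usual textbook symbols (the symbols are conventions of this sketch, not quoted from the patent): p(z = k) ∝ (n_dk + α)(n_kw + β) / (n_k + Vβ), where n_dk counts topic k in document d, n_kw counts word w in topic k, and n_k is topic k's total count over a vocabulary of size V.

```python
import numpy as np

def topic_distribution(n_dk, n_kw, n_k, alpha, beta, V):
    # Local update rule: unnormalized probability of each topic k,
    # then normalize to a proper distribution over topics.
    p = (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
    return p / p.sum()
```

Used as a cellular-automaton update rule, every cell (word occurrence) evaluates this formula against the current counts in parallel.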
-
Publication number: 20190114319
Abstract: Embodiments make novel use of random data structures to facilitate streaming inference for a Latent Dirichlet Allocation (LDA) model. Utilizing random data structures facilitates streaming inference by entirely avoiding the need for pre-computation, which is generally an obstacle to many current “streaming” variants of LDA as described above. Specifically, streaming inference—based on an inference algorithm such as Stochastic Cellular Automata (SCA), Gibbs sampling, and/or Stochastic Expectation Maximization (SEM)—is implemented using a count-min sketch to track sufficient statistics for the inference procedure. Use of a count-min sketch avoids the need to know the vocabulary size V a priori. Also, use of a count-min sketch directly enables feature hashing, which addresses the problem of effectively encoding words into indices without the need of pre-computation. Approximate counters are also used within the count-min sketch to avoid bit overflow issues with the counts in the sketch.
Type: Application
Filed: March 23, 2018
Publication date: April 18, 2019
Inventors: Jean-Baptiste Tristan, Michael Wick, Stephen Green
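A minimal count-min sketch over a stream of words looks like the following (illustrative; the publication additionally replaces each integer cell with an approximate counter to bound cell width, which is not shown here). Because tokens are hashed into a fixed-width table, no vocabulary size needs to be known in advance.

```python
import hashlib

class CountMinSketch:
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, derived from the row index
        for row in range(self.depth):
            digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
            yield row, int(digest.hexdigest(), 16) % self.width

    def add(self, item, count=1):
        # Any hashable token can arrive: this is what enables feature
        # hashing with no pre-computed word-to-index encoding.
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Min across rows: collisions only inflate counts, never deflate
        return min(self.table[row][col] for row, col in self._cells(item))
```

The sufficient statistics an LDA inference step needs (per-word, per-topic counts) can then be read back with `estimate`, at the cost of a bounded overestimate from hash collisions.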
-
Publication number: 20190095805
Abstract: Systems and methods are disclosed to build and execute a decision system based on multiple machine learned decision models. In embodiments, the decision system performs a hashing technique to reduce relevant features of the input data into a feature vector for each decision model. The feature vector reduces the dimensionality of the feature universe of the input data, and its use allows the decision models to be trained and executed using fewer computing resources. In embodiments, the decision system implements an ensembled decision model that makes decisions based on a combination function that combines the decision results of the individual models in the ensemble. The decision models employ different hashing techniques to hash the input features differently, so that errors caused by the feature hashing of individual models are reduced in the aggregate.
Type: Application
Filed: September 27, 2017
Publication date: March 28, 2019
Inventors: Jean-Baptiste Tristan, Adam Pocock, Michael Wick, Guy Steele
-
Patent number: 10157346
Abstract: An efficient parallel Gibbs sampler using butterfly-patterned partial sums is provided. Instead of building and searching a complete prefix sums table, an alternative “butterfly patterned partial sums table” is described that integrates a lightweight transposition and partial sums operation. Accordingly, the usual full matrix transposition and full prefix sums table building operations can be omitted in favor of building the butterfly-patterned partial sums table, which requires less computational and communication effort. This butterfly-patterned partial sums table is used by a modified binary search phase that calculates the needed prefix-sum table values on-the-fly using the butterfly-patterned partial sums table. Transposed memory access is also provided while avoiding the full matrix transform, providing significant performance benefits for highly parallel architectures, such as graphics processing units (GPUs) where 1-stride or sequential memory accesses are important for optimization.
Type: Grant
Filed: May 15, 2015
Date of Patent: December 18, 2018
Assignee: Oracle International Corporation
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
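For contrast, here is the conventional scheme the patent streamlines: materialize a complete prefix sums table over the unnormalized category weights, then binary-search a uniform draw. The butterfly-patterned table computes the needed prefix-sum values on the fly with less memory traffic; the butterfly layout itself is not reproduced in this sketch.

```python
import bisect
import random

def sample_categorical(weights, rng):
    # Build the complete prefix sums table (the step the patent avoids)
    prefix, total = [], 0.0
    for w in weights:
        total += w
        prefix.append(total)
    u = rng.random() * total               # uniform draw in [0, total)
    return bisect.bisect_left(prefix, u)   # binary search phase
```

Inside a Gibbs sampler this is executed once per token per iteration, so shrinking the table-building and transposition work, as the butterfly-patterned variant does, pays off across billions of draws.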
-
Patent number: 10147044
Abstract: Herein is described a data-parallel algorithm for topic modeling in which the memory requirements are streamlined for implementation on a highly-parallel architecture, such as a GPU. Specifically, approximate counters are used in a large mixture model or clustering algorithm (e.g., an uncollapsed Gibbs sampler) to decrease memory usage over what is required when conventional counters are used. The decreased memory usage of the approximate counters allows a highly-parallel architecture with limited memory to process more computations for the large mixture model more efficiently. Embodiments describe binary Morris approximate counters, general Morris approximate counters, and Csűrös approximate counters in the context of an uncollapsed Gibbs sampler, and, more specifically, for a Greedy Gibbs sampler.
Type: Grant
Filed: August 6, 2015
Date of Patent: December 4, 2018
Assignee: Oracle International Corporation
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
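The textbook binary Morris counter, sketched below, shows why memory shrinks so dramatically (this is the classic construction, not the patent's exact variants): only a small exponent c is stored, so counts up to roughly 2^c fit in the handful of bits needed to hold c itself.

```python
import random

class BinaryMorrisCounter:
    def __init__(self, rng):
        self.c = 0          # only this small exponent is ever stored
        self.rng = rng

    def increment(self):
        # Advance the exponent with probability 2**-c, so larger counts
        # are recorded ever more coarsely (and ever more cheaply).
        if self.rng.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self):
        return 2 ** self.c - 1   # unbiased estimator of the true count
```

In a Gibbs sampler over millions of count cells, replacing 32-bit integers with a few-bit exponent like this is what lets the whole model fit in limited GPU memory.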
-
Patent number: 10140281
Abstract: Herein is described a data-parallel algorithm for topic modeling on a distributed system in which memory and communication bandwidth requirements are streamlined for distributed implementation. According to embodiments, a distributed LDA Gibbs sampling algorithm shares approximate counter values amongst the nodes of a distributed system. These approximate counter values are repeatedly aggregated and then shared again to perform the distributed LDA Gibbs sampling. In order to maintain the shared counter values as approximate counter values of sixteen bits or less, approximate counter values are summed to produce aggregate approximate counter values. These small aggregate approximate counter values are shared between the nodes of the distributed system. As such, the addition of various types of approximate counters is described herein.
Type: Grant
Filed: August 7, 2015
Date of Patent: November 27, 2018
Assignee: Oracle International Corporation
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
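One hypothetical way to sum Morris-style values from several nodes while keeping the shared value small is to decode each exponent to its estimate, add the estimates, and re-encode the total as an exponent again. The patent defines precise addition rules for several approximate-counter types; the rounding scheme below only illustrates the decode-sum-encode shape of such an aggregation.

```python
import math

def decode(c):
    # Count estimate represented by exponent c (binary Morris convention)
    return 2 ** c - 1

def encode(n):
    # Smallest exponent whose estimate is at least n
    return max(0, math.ceil(math.log2(n + 1)))

def aggregate(exponents):
    # Sum the per-node estimates, then compress back to one small exponent
    return encode(sum(decode(c) for c in exponents))

print(aggregate([3, 3]))   # two nodes each estimating 7: 7 + 7 = 14 -> exponent 4
```

The aggregate stays a few bits wide no matter how many nodes contribute, which is the bandwidth property the abstract is after; the price is that re-encoding rounds the sum up to the next representable estimate.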
-
Publication number: 20180121792
Abstract: According to embodiments, a recurrent neural network (RNN) is equipped with a set data structure whose operations are differentiable, which data structure can be used to store information for a long period of time. This differentiable set data structure can “remember” an event in the sequence of sequential data that may impact another event much later in the sequence, thereby allowing the RNN to classify the sequence based on many kinds of long dependencies. An RNN that is equipped with the differentiable set data structure can be properly trained with backpropagation and gradient descent optimizations. According to embodiments, a differentiable set data structure can be used to store and retrieve information with a simple set-like interface. According to further embodiments, the RNN can be extended to support several add operations, which can make the differentiable set data structure behave like a Bloom filter.
Type: Application
Filed: October 31, 2016
Publication date: May 3, 2018
Inventors: Jean-Baptiste Tristan, Michael Wick, Manzil Zaheer
-
Publication number: 20180121807
Abstract: Embodiments employ an inference method for neural networks that enforces deterministic constraints on outputs without performing post-processing or expensive discrete search over the feasible space. Instead, for each input, the continuous weights are nudged until the network's unconstrained inference procedure generates an output that satisfies the constraints. This is achieved by expressing the hard constraints as an optimization problem over the continuous weights and employing backpropagation to change the weights of the network. Embodiments optimize over the energy of the violating outputs; since the weights directly determine the output through the energy, embodiments are able to manipulate the unconstrained inference procedure to produce outputs that conform to global constraints.
Type: Application
Filed: March 6, 2017
Publication date: May 3, 2018
Inventors: Michael Wick, Jean-Baptiste Tristan, Jay Yoon Lee
-
Patent number: 9767416
Abstract: Herein is described a data-parallel and sparse algorithm for topic modeling. This algorithm is based on a highly parallel algorithm for a Greedy Gibbs sampler. The Greedy Gibbs sampler is a Markov-Chain Monte Carlo algorithm that estimates topics, in an unsupervised fashion, by estimating the parameters of the topic model Latent Dirichlet Allocation (LDA). The Greedy Gibbs sampler is a data-parallel algorithm for topic modeling, and is configured to be implemented on a highly-parallel architecture, such as a GPU. The Greedy Gibbs sampler is modified to take advantage of data sparsity while maintaining the parallelism. Furthermore, in an embodiment, implementation of the Greedy Gibbs sampler uses both densely-represented and sparsely-represented matrices to reduce the amount of computation while maintaining fast accesses to memory for implementation on a GPU.
Type: Grant
Filed: June 30, 2015
Date of Patent: September 19, 2017
Assignee: Oracle International Corporation
Inventors: Jean-Baptiste Tristan, Guy L. Steele, Jr., Joseph Tassarotti
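The dense-plus-sparse mix can be illustrated in miniature (illustrative only; the sizes and names are this sketch's, not the patent's): a document usually touches only a few topics, so its topic counts fit in a small dict, while the topic-word matrix is shared by all documents and kept dense for fast contiguous reads.

```python
import numpy as np

K, V = 64, 1000
topic_word = np.zeros((K, V), dtype=np.int32)   # dense: shared, read constantly
doc_topics = {}                                  # sparse: {topic: count} per doc

def assign(word, topic):
    # Record one token assignment in both count structures
    doc_topics[topic] = doc_topics.get(topic, 0) + 1
    topic_word[topic, word] += 1

assign(word=17, topic=3)
assign(word=42, topic=3)
```

Iterating only over the handful of keys in `doc_topics`, instead of all K topics, is where the sparsity saving in the per-document computation comes from.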
-
Publication number: 20170039265
Abstract: Herein is described a data-parallel algorithm for topic modeling on a distributed system in which memory and communication bandwidth requirements are streamlined for distributed implementation. According to embodiments, a distributed LDA Gibbs sampling algorithm shares approximate counter values amongst the nodes of a distributed system. These approximate counter values are repeatedly aggregated and then shared again to perform the distributed LDA Gibbs sampling. In order to maintain the shared counter values as approximate counter values of sixteen bits or less, approximate counter values are summed to produce aggregate approximate counter values. These small aggregate approximate counter values are shared between the nodes of the distributed system. As such, the addition of various types of approximate counters is described herein.
Type: Application
Filed: August 7, 2015
Publication date: February 9, 2017
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
-
Publication number: 20160350411
Abstract: Herein is described an unsupervised learning method to discover topics and reduce the dimensionality of documents by designing and simulating a stochastic cellular automaton. A key formula that appears in many inference methods for LDA is used as the local update rule of the cellular automaton. Approximate counters may be used to represent counter values being tracked by the inference algorithms. Also, sparsity may be used to reduce the amount of computation needed for sampling a topic for particular words in the corpus being analyzed.
Type: Application
Filed: November 4, 2015
Publication date: December 1, 2016
Inventors: Jean-Baptiste Tristan, Stephen J. Green, Guy L. Steele, Jr., Manzil Zaheer
-
Publication number: 20160224544
Abstract: Herein is described a data-parallel and sparse algorithm for topic modeling. This algorithm is based on a highly parallel algorithm for a Greedy Gibbs sampler. The Greedy Gibbs sampler is a Markov-Chain Monte Carlo algorithm that estimates topics, in an unsupervised fashion, by estimating the parameters of the topic model Latent Dirichlet Allocation (LDA). The Greedy Gibbs sampler is a data-parallel algorithm for topic modeling, and is configured to be implemented on a highly-parallel architecture, such as a GPU. The Greedy Gibbs sampler is modified to take advantage of data sparsity while maintaining the parallelism. Furthermore, in an embodiment, implementation of the Greedy Gibbs sampler uses both densely-represented and sparsely-represented matrices to reduce the amount of computation while maintaining fast accesses to memory for implementation on a GPU.
Type: Application
Filed: June 30, 2015
Publication date: August 4, 2016
Inventors: Jean-Baptiste Tristan, Guy L. Steele, Jr., Joseph Tassarotti
-
Publication number: 20160224900
Abstract: Herein is described a data-parallel algorithm for topic modeling in which the memory requirements are streamlined for implementation on a highly-parallel architecture, such as a GPU. Specifically, approximate counters are used in a large mixture model or clustering algorithm (e.g., an uncollapsed Gibbs sampler) to decrease memory usage over what is required when conventional counters are used. The decreased memory usage of the approximate counters allows a highly-parallel architecture with limited memory to process more computations for the large mixture model more efficiently. Embodiments describe binary Morris approximate counters, general Morris approximate counters, and Csűrös approximate counters in the context of an uncollapsed Gibbs sampler, and, more specifically, for a Greedy Gibbs sampler.
Type: Application
Filed: August 6, 2015
Publication date: August 4, 2016
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
-
Publication number: 20160224902
Abstract: An efficient parallel Gibbs sampler using butterfly-patterned partial sums is provided. Instead of building and searching a complete prefix sums table, an alternative “butterfly patterned partial sums table” is described that integrates a lightweight transposition and partial sums operation. Accordingly, the usual full matrix transposition and full prefix sums table building operations can be omitted in favor of building the butterfly-patterned partial sums table, which requires less computational and communication effort. This butterfly-patterned partial sums table is used by a modified binary search phase that calculates the needed prefix-sum table values on-the-fly using the butterfly-patterned partial sums table. Transposed memory access is also provided while avoiding the full matrix transform, providing significant performance benefits for highly parallel architectures, such as graphics processing units (GPUs) where 1-stride or sequential memory accesses are important for optimization.
Type: Application
Filed: May 15, 2015
Publication date: August 4, 2016
Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
-
Data-parallel parameter estimation of the Latent Dirichlet allocation model by greedy Gibbs sampling
Publication number: 20160210718
Abstract: A novel data-parallel algorithm is presented for topic modeling on highly-parallel hardware architectures. The algorithm is a Markov-Chain Monte Carlo algorithm used to estimate the parameters of the LDA topic model. This algorithm is based on a highly parallel partially-collapsed Gibbs sampler, but replaces a stochastic step that draws from a distribution with an optimization step that computes the mean of the distribution directly and deterministically. This algorithm is correct, it is statistically performant, and it is faster than state-of-the-art algorithms because it can exploit the massive parallelism of a highly-parallel architecture, such as a GPU. Furthermore, the partially-collapsed Gibbs sampler converges about as fast as the collapsed Gibbs sampler and identifies solutions that are as good as, or even better than, those of the collapsed Gibbs sampler.
Type: Application
Filed: January 16, 2015
Publication date: July 21, 2016
Inventors: Jean-Baptiste Tristan, Guy Steele