Patents by Inventor Guy L. Steele, Jr.

Guy L. Steele, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11392350
    Abstract: Embodiments comprise construction of a collection of pseudorandom number generators (PRNGs), with either a known or unknown cardinality, using unique brine values that comprise a salt value for the collection and also different index values for each PRNG for the collection. The additive parameters of such PRNGs are based on the respective brine values of the PRNGs, thereby ensuring that the PRNGs in the collection have different state cycles. Embodiments make it likely that PRNGs from different collections have distinct additive parameters by choosing a pseudorandom salt value for each collection. According to embodiments, a stream of generators in a collection is created by a spliterator that carries a salt value for the collection and combines the salt value with index values for the generators to produce brined additive parameters for the PRNGs in the stream. According to embodiments, such a stream may be executed by multiple threads in parallel.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: July 19, 2022
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventor: Guy L. Steele, Jr.
  • Patent number: 10922052
    Abstract: A method and apparatus is provided for generating pseudorandom numbers in a way that is deterministic (i.e., repeatable), that passes statistical tests, can have multiple instances of objects generating pseudorandom numbers at the same time. Also, the collection of pseudorandom numbers generated by multiple instances have the same statistical properties as numbers generated by a single instance (i.e., randomness). Embodiments described herein generate pseudorandom values by using a plurality of subsidiary linear congruential generators and combining their outputs nonlinearly. According to embodiments, after their outputs have been combined, a mixing function is applied. Embodiments include an on-demand split method in the style of the SplitMix algorithm.
    Type: Grant
    Filed: October 12, 2015
    Date of Patent: February 16, 2021
    Assignee: Oracle International Corporation
    Inventor: Guy L. Steele, Jr.
  • Publication number: 20200401378
    Abstract: Embodiments comprise construction of a collection of pseudorandom number generators (PRNGs), with either a known or unknown cardinality, using unique brine values that comprise a salt value for the collection and also different index values for each PRNG for the collection. The additive parameters of such PRNGs are based on the respective brine values of the PRNGs, thereby ensuring that the PRNGs in the collection have different state cycles. Embodiments make it likely that PRNGs from different collections have distinct additive parameters by choosing a pseudorandom salt value for each collection. According to embodiments, a stream of generators in a collection is created by a spliterator that carries a salt value for the collection and combines the salt value with index values for the generators to produce brined additive parameters for the PRNGs in the stream. According to embodiments, such a stream may be executed by multiple threads in parallel.
    Type: Application
    Filed: August 22, 2019
    Publication date: December 24, 2020
    Inventor: Guy L. Steele, Jr.
  • Patent number: 10545865
    Abstract: A lookup circuit evaluates hash functions that map keys to addresses in lookup tables. The circuit may include multiple hash function sub-circuits, each of which applies a respective hash function to an input key value, producing a hash value. Each hash function sub-circuit may multiply bit vectors representing key values by a sparse bit matrix and may add a constant bit vector to the results. The hash function sub-circuits may be constructed using odd-parity circuits that accept as inputs subsets of the bits of the bit vectors representing the key values. The sparse bit matrices may be chosen or generated so that there are at least twice as many 0-bits per row as 1-bits or there is an upper bound on the number of 1-bits per row. Using sparse bit matrices in the hash function sub-circuits may allow the lookup circuit to perform lookup operations with very low latency.
    Type: Grant
    Filed: May 16, 2016
    Date of Patent: January 28, 2020
    Assignee: Oracle International Corporation
    Inventors: Guy L. Steele, Jr., David R. Chase
  • Patent number: 10503716
    Abstract: A lookup circuit evaluates hash functions that map keys to addresses in lookup tables. The circuit includes multiple hash function sub-circuits, each of which applies a respective hash function to an input key, producing a hash value. Candidate pairs of hash functions to be implemented by the hash function sub-circuits may be generated and tested for suitability in hashing a particular collection of keys. The suitability testing may include computing hash value bit vectors by applying each hash function in a candidate pair to a given key, and determining (using a modified union-find type operation that organizes objects in each set as a directed graph whose root points to itself) whether the resulting hash value bit vectors belong to the same set. The union-find type operation may include a limited distance-from-root test, path compression, or exception handling for special cases, but not a rank test.
    Type: Grant
    Filed: October 31, 2013
    Date of Patent: December 10, 2019
    Assignee: Oracle International Corporation
    Inventors: David R. Chase, Guy L. Steele, Jr.
  • Patent number: 10496929
    Abstract: The present invention relates to a probabilistic programming compiler that (a) generates data-parallel inference code to sample from probability distributions in models provided to the compiler; and (b) utilizes a modular framework to allow addition and removal of inference algorithm information based on which the compiler generates the inference code. For a given model, the described compiler can generate inference code that implements any one or more of the inference algorithms that are available to the compiler. The modular compiler framework utilizes an intermediate representation (IR) that symbolically represents features of probability distributions. The compiler then uses the IR as a basis for emitting inference code to sample from the one or more probability distributions represented in the IR.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: December 3, 2019
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Jean-Baptiste Tristan, Guy L. Steele, Jr., Daniel E. Huang, Joseph Tassarotti
  • Patent number: 10394872
    Abstract: Herein is described an unsupervised learning method to discover topics and reduce the dimensionality of documents by designing and simulating a stochastic cellular automaton. A key formula that appears in many inference methods for LDA is used as the local update rule of the cellular automaton. Approximate counters may be used to represent counter values being tracked by the inference algorithms. Also, sparsity may be used to reduce the amount of computation needed for sampling a topic for particular words in the corpus being analyzed.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: August 27, 2019
    Assignee: Oracle International Corporation
    Inventors: Jean-Baptiste Tristan, Stephen J. Green, Guy L. Steele, Jr., Manzil Zaheer
  • Patent number: 10289454
    Abstract: A system may perform work stealing using a dynamically configurable separation between stealable and non-stealable work items. The work items may be held in a double-ended queue (deque), and the value of a variable (index) may indicate the position of the last stealable work item or the first non-stealable work item in the deque. A thread may steal a work item only from the portion of another thread's deque that holds stealable items. The owner of a deque may add work items to the deque and may modify the number or percentage of stealable work items, the number or percentage of non-stealable work items, and/or the ratio between stealable and non-stealable work items in the deque during execution. For example, the owner may convert stealable work items to non-stealable work items, or vice versa, in response to changing conditions and/or according to various work-stealing policies.
    Type: Grant
    Filed: April 18, 2016
    Date of Patent: May 14, 2019
    Assignee: Oracle International Corporation
    Inventors: Yosef Lev, Guy L. Steele, Jr.
  • Patent number: 10157346
    Abstract: An efficient parallel Gibbs sampler using butterfly-patterned partial sums is provided. Instead of building and searching a complete prefix sums table, an alternative “butterfly patterned partial sums table” is described that integrates a lightweight transposition and partial sums operation. Accordingly, the usual full matrix transposition and full prefix sums table building operations can be omitted in favor of building the butterfly-patterned partial sums table, which requires less computational and communication effort. This butterfly-patterned partial sums table is used by a modified binary search phase that calculates the needed prefix-sum table values on-the-fly using the butterfly-patterned partial sums table. Transposed memory access is also provided while avoiding the full matrix transform, providing significant performance benefits for highly parallel architectures, such as graphics processing units (GPUs) where 1-stride or sequential memory accesses are important for optimization.
    Type: Grant
    Filed: May 15, 2015
    Date of Patent: December 18, 2018
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
  • Patent number: 10147044
    Abstract: Herein is described a data-parallel algorithm for topic modeling in which the memory requirements are streamlined for implementation on a highly-parallel architecture, such as a GPU. Specifically, approximate counters are used in a large mixture model or clustering algorithm (e.g., an uncollapsed Gibbs sampler) to decrease memory usage over what is required when conventional counters are used. The decreased memory usage of the approximate counters allows a highly-parallel architecture with limited memory to process more computations for the large mixture model more efficiently. Embodiments describe binary Morris approximate counters, general Morris approximate counters, and Csrös approximate counters in the context of an uncollapsed Gibbs sampler, and, more specifically, for a Greedy Gibbs sampler.
    Type: Grant
    Filed: August 6, 2015
    Date of Patent: December 4, 2018
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
  • Patent number: 10140281
    Abstract: Herein is described a data-parallel algorithm for topic modeling on a distributed system in which memory and communication bandwidth requirements are streamlined for distributed implementation. According to embodiments, a distributed LDA Gibbs sampling algorithm shares approximate counter values amongst the nodes of a distributed system. These approximate counter values are repeatedly aggregated and then shared again to perform the distributed LDA Gibbs sampling. In order to maintain the shared counter values as approximate counter values of sixteen bits or less, approximate counter values are summed to produce aggregate approximate counter values. These small aggregate approximate counter values are shared between the nodes of the distributed system. As such, the addition of various types of approximate counters is described herein.
    Type: Grant
    Filed: August 7, 2015
    Date of Patent: November 27, 2018
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Guy L. Steele, Jr., Jean-Baptiste Tristan
  • Patent number: 9767416
    Abstract: Herein is described a data-parallel and sparse algorithm for topic modeling. This algorithm is based on a highly parallel algorithm for a Greedy Gibbs sampler. The Greedy Gibbs sampler is a Markov-Chain Monte Carlo algorithm that estimates topics, in an unsupervised fashion, by estimating the parameters of the topic model Latent Dirichlet Allocation (LDA). The Greedy Gibbs sampler is a data-parallel algorithm for topic modeling, and is configured to be implemented on a highly-parallel architecture, such as a GPU. The Greedy Gibbs sampler is modified to take advantage of data sparsity while maintaining the parallelism. Furthermore, in an embodiment, implementation of the Greedy Gibbs sampler uses both densely-represented and sparsely-represented matrices to reduce the amount of computation while maintaining fast accesses to memory for implementation on a GPU.
    Type: Grant
    Filed: June 30, 2015
    Date of Patent: September 19, 2017
    Assignee: Oracle International Corporation
    Inventors: Jean-Baptiste Tristan, Guy L. Steele, Jr., Joseph Tassarotti
  • Publication number: 20170102922
    Abstract: A method and apparatus is provided for generating pseudorandom numbers in a way that is deterministic (i.e., repeatable), that passes statistical tests, can have multiple instances of objects generating pseudorandom numbers at the same time. Also, it is desirable that the collection of pseudorandom numbers generated by multiple instances have the same statistical properties as numbers generated by a single instance (i.e., randomness). Embodiments described herein generate pseudorandom values by using a plurality of subsidiary linear congruential generators and combining their outputs nonlinearly. According to embodiments, after their outputs have been combined, a mixing function is applied. Embodiments include an on-demand split method in the style of the SplitMix algorithm.
    Type: Application
    Filed: October 12, 2015
    Publication date: April 13, 2017
    Inventor: Guy L. Steele, JR.
  • Publication number: 20170039265
    Abstract: Herein is described a data-parallel algorithm for topic modeling on a distributed system in which memory and communication bandwidth requirements are streamlined for distributed implementation. According to embodiments, a distributed LDA Gibbs sampling algorithm shares approximate counter values amongst the nodes of a distributed system. These approximate counter values are repeatedly aggregated and then shared again to perform the distributed LDA Gibbs sampling. In order to maintain the shared counter values as approximate counter values of sixteen bits or less, approximate counter values are summed to produce aggregate approximate counter values. These small aggregate approximate counter values are shared between the nodes of the distributed system. As such, the addition of various types of approximate counters is described herein.
    Type: Application
    Filed: August 7, 2015
    Publication date: February 9, 2017
    Inventors: Guy L. Steele, JR., Jean-Baptiste Tristan
  • Publication number: 20160350411
    Abstract: Herein is described an unsupervised learning method to discover topics and reduce the dimensionality of documents by designing and simulating a stochastic cellular automaton. A key formula that appears in many inference methods for LDA is used as the local update rule of the cellular automaton. Approximate counters may be used to represent counter values being tracked by the inference algorithms. Also, sparsity may be used to reduce the amount of computation needed for sampling a topic for particular words in the corpus being analyzed.
    Type: Application
    Filed: November 4, 2015
    Publication date: December 1, 2016
    Inventors: Jean-Baptiste Tristan, Stephen J. Green, Guy L. Steele, JR., Manzil Zaheer
  • Publication number: 20160259724
    Abstract: A lookup circuit evaluates hash functions that map keys to addresses in lookup tables. The circuit may include multiple hash function sub-circuits, each of which applies a respective hash function to an input key value, producing a hash value. Each hash function sub-circuit may multiply bit vectors representing key values by a sparse bit matrix and may add a constant bit vector to the results. The hash function sub-circuits may be constructed using odd-parity circuits that accept as inputs subsets of the bits of the bit vectors representing the key values. The sparse bit matrices may be chosen or generated so that there are at least twice as many 0-bits per row as 1-bits or there is an upper bound on the number of 1-bits per row. Using sparse bit matrices in the hash function sub-circuits may allow the lookup circuit to perform lookup operations with very low latency.
    Type: Application
    Filed: May 16, 2016
    Publication date: September 8, 2016
    Inventors: Guy L. Steele, JR., David R. Chase
  • Publication number: 20160232035
    Abstract: A system may perform work stealing using a dynamically configurable separation between stealable and non-stealable work items. The work items may be held in a double-ended queue (deque), and the value of a variable (index) may indicate the position of the last stealable work item or the first non-stealable work item in the deque. A thread may steal a work item only from the portion of another thread's deque that holds stealable items. The owner of a deque may add work items to the deque and may modify the number or percentage of stealable work items, the number or percentage of non-stealable work items, and/or the ratio between stealable and non-stealable work items in the deque during execution. For example, the owner may convert stealable work items to non-stealable work items, or vice versa, in response to changing conditions and/or according to various work-stealing policies.
    Type: Application
    Filed: April 18, 2016
    Publication date: August 11, 2016
    Inventors: Yosef Lev, Guy L. Steele, JR.
  • Publication number: 20160224902
    Abstract: An efficient parallel Gibbs sampler using butterfly-patterned partial sums is provided. Instead of building and searching a complete prefix sums table, an alternative “butterfly patterned partial sums table” is described that integrates a lightweight transposition and partial sums operation. Accordingly, the usual full matrix transposition and full prefix sums table building operations can be omitted in favor of building the butterfly-patterned partial sums table, which requires less computational and communication effort. This butterfly-patterned partial sums table is used by a modified binary search phase that calculates the needed prefix-sum table values on-the-fly using the butterfly-patterned partial sums table. Transposed memory access is also provided while avoiding the full matrix transform, providing significant performance benefits for highly parallel architectures, such as graphics processing units (GPUs) where 1-stride or sequential memory accesses are important for optimization.
    Type: Application
    Filed: May 15, 2015
    Publication date: August 4, 2016
    Inventors: GUY L. STEELE, JR., JEAN-BAPTISTE TRISTAN
  • Publication number: 20160224544
    Abstract: Herein is described a data-parallel and sparse algorithm for topic modeling. This algorithm is based on a highly parallel algorithm for a Greedy Gibbs sampler. The Greedy Gibbs sampler is a Markov-Chain Monte Carlo algorithm that estimates topics, in an unsupervised fashion, by estimating the parameters of the topic model Latent Dirichlet Allocation (LDA). The Greedy Gibbs sampler is a data-parallel algorithm for topic modeling, and is configured to be implemented on a highly-parallel architecture, such as a GPU. The Greedy Gibbs sampler is modified to take advantage of data sparsity while maintaining the parallelism. Furthermore, in an embodiment, implementation of the Greedy Gibbs sampler uses both densely-represented and sparsely-represented matrices to reduce the amount of computation while maintaining fast accesses to memory for implementation on a GPU.
    Type: Application
    Filed: June 30, 2015
    Publication date: August 4, 2016
    Inventors: Jean-Baptiste Tristan, Guy L. Steele, JR., Joseph Tassarotti
  • Publication number: 20160224900
    Abstract: Herein is described a data-parallel algorithm for topic modeling in which the memory requirements are streamlined for implementation on a highly-parallel architecture, such as a GPU. Specifically, approximate counters are used in a large mixture model or clustering algorithm (e.g., an uncollapsed Gibbs sampler) to decrease memory usage over what is required when conventional counters are used. The decreased memory usage of the approximate counters allows a highly-parallel architecture with limited memory to process more computations for the large mixture model more efficiently. Embodiments describe binary Morris approximate counters, general Morris approximate counters, and Csrös approximate counters in the context of an uncollapsed Gibbs sampler, and, more specifically, for a Greedy Gibbs sampler.
    Type: Application
    Filed: August 6, 2015
    Publication date: August 4, 2016
    Inventors: Guy L. Steele, JR., Jean-Baptiste Tristan