Patents by Inventor John Clifton Woolley, JR.

John Clifton Woolley, JR. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230222619
    Abstract: Apparatuses, systems, and techniques to indicate contextual information to be used by available logical processors. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to indicate a first set of contextual information to be used by a first subset of available processors.
    Type: Application
    Filed: January 13, 2022
    Publication date: July 13, 2023
    Inventors: David Anthony Fontaine, Maciej Marcin Piechotka, Kyrylo Perelygin, Lukasz Krystian Ligowski, Ashutosh Jain, Jitendra Pratap Singh Chauhan, Jaydeep Marathe, Magnus Strengert, Xiaonan Tian, Sebastian Piotr Jodlowski, John Clifton Woolley, JR.
  • Publication number: 20190220731
    Abstract: In one embodiment of the present invention, a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch based on one or more start addresses and one or more offsets. Subsequently, the pipeline copies data from the source locations to the image tile. The pipeline then performs matrix multiplication operations between the image tile and a filter tile to generate a contribution of the image tile to an output matrix. To optimize the amount of memory used, the pipeline creates each image tile in shared memory as needed. Further, to optimize the throughput of the matrix multiplication operations, the values of the offsets are precomputed by a convolution preprocessor.
    Type: Application
    Filed: March 26, 2019
    Publication date: July 18, 2019
    Inventors: John Clifton WOOLLEY, JR., John TRAN
  • Patent number: 10255547
    Abstract: In one embodiment of the present invention, a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch based on one or more start addresses and one or more offsets. Subsequently, the pipeline copies data from the source locations to the image tile. The pipeline then performs matrix multiplication operations between the image tile and a filter tile to generate a contribution of the image tile to an output matrix. To optimize the amount of memory used, the pipeline creates each image tile in shared memory as needed. Further, to optimize the throughput of the matrix multiplication operations, the values of the offsets are precomputed by a convolution preprocessor.
    Type: Grant
    Filed: November 25, 2015
    Date of Patent: April 9, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: John Clifton Woolley, Jr., John Tran
  • Patent number: 9804826
    Abstract: System and method for pseudo-random number generation based on a recursion with significantly increased multithreaded parallelism. A single pseudo-random generator program is assigned with multiple threads to process in parallel. N state elements indexed incrementally are arranged into a matrix comprising x rows, where a respective adjacent pair of state elements in a same column are related by g=(M+j)mod N, wherein j and g represent indexes of the pair of state elements. x can be determined through an modular manipulative inverse of M and N. The matrix can be divided into sections with each section having a number of columns, and each thread is assigned with a section. In this manner, the majority of the requisite interactions among the state elements occur without expensive inter-thread communications, and further each thread may only need to communicate with a single other thread for a small number of times.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: October 31, 2017
    Assignee: Nvidia Corporation
    Inventors: Przemyslaw Tredak, John Clifton Woolley, Jr.
  • Publication number: 20160162402
    Abstract: In one embodiment of the present invention, a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch based on one or more start addresses and one or more offsets. Subsequently, the pipeline copies data from the source locations to the image tile. The pipeline then performs matrix multiplication operations between the image tile and a filter tile to generate a contribution of the image tile to an output matrix. To optimize the amount of memory used, the pipeline creates each image tile in shared memory as needed. Further, to optimize the throughput of the matrix multiplication operations, the values of the offsets are precomputed by a convolution preprocessor.
    Type: Application
    Filed: November 25, 2015
    Publication date: June 9, 2016
    Inventors: John Clifton WOOLLEY, JR., John TRAN
  • Publication number: 20160162262
    Abstract: System and method for pseudo-random number generation based on a recursion with significantly increased multithreaded parallelism. A single pseudo-random generator program is assigned with multiple threads to process in parallel. N state elements indexed incrementally are arranged into a matrix comprising x rows, where a respective adjacent pair of state elements in a same column are related by g=(M+j) mod N, wherein j and g represent indexes of the pair of state elements. x can be determined through an modular manipulative inverse of M and N. The matrix can be divided into sections with each section having a number of columns, and each thread is assigned with a section. In this manner, the majority of the requisite interactions among the state elements occur without expensive inter-thread communications, and further each thread may only need to communicate with a single other thread for a small number of times.
    Type: Application
    Filed: December 5, 2014
    Publication date: June 9, 2016
    Inventors: Przemyslaw Tredak, John Clifton Woolley, JR.