Patents by Inventor Allen H. Rush

Allen H. Rush has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11803385
    Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer received a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: October 31, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Arun Vaidyanathan Ananthanarayan, Michael Mantor, Allen H. Rush
  • Patent number: 11769041
    Abstract: Systems, apparatuses, and methods for implementing a low latency long short-term memory (LSTM) machine learning engine using sequence interleaving techniques are disclosed. A computing system includes at least a host processing unit, a machine learning engine, and a memory. The host processing unit detects a plurality of sequences which will be processed by the machine learning engine. The host processing unit interleaves the sequences into data blocks and stores the data blocks in the memory. When the machine learning engine receives a given data block, the machine learning engine performs, in parallel, a plurality of matrix multiplication operations on the plurality of sequences in the given data block and a plurality of coefficients. Then, the outputs of the matrix multiplication operations are coupled to one or more LSTM layers.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 26, 2023
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Sateesh Lagudu, Lei Zhang, Allen H. Rush
  • Publication number: 20230289191
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.
    Type: Application
    Filed: March 30, 2023
    Publication date: September 14, 2023
    Inventors: Sateesh LAGUDU, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari, Maxim V. Kazakov
  • Publication number: 20230244751
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: April 7, 2023
    Publication date: August 3, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Patent number: 11640444
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: May 2, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Patent number: 11635967
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: April 25, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari, Maxim V. Kazakov
  • Patent number: 11579514
    Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: February 14, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Allen H. Rush, Hui Zhou
  • Publication number: 20220319096
    Abstract: An apparatus includes a processor and a collision detection unit operatively coupled to the processor. The collision detection unit is configured to process, using a machine learning model, one or more parameters associated with a ray cast in virtual environment comprising an object. The machine learning model is configured to approximate a mesh representing the object. The collision detection unit is further configured to determine if the ray collides with the object based on processing the one or more parameters. In response to determining if the ray collides with the object, the collision detection unit is configured to generate collision data associated with the ray and the object.
    Type: Application
    Filed: March 30, 2021
    Publication date: October 6, 2022
    Inventors: Thomas Daniel PERRY, Gabor SINES, Mehdi SAEEDI, Allen H. RUSH
  • Patent number: 11409840
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: August 9, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari
  • Patent number: 11379942
    Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: July 5, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Allen H. Rush, Hui Zhou
  • Publication number: 20220197973
    Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.
    Type: Application
    Filed: December 17, 2020
    Publication date: June 23, 2022
    Inventors: Sateesh LAGUDU, Allen H. RUSH, Michael MANTOR
  • Publication number: 20220197655
    Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer received a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
    Type: Application
    Filed: December 10, 2021
    Publication date: June 23, 2022
    Inventors: Sateesh LAGUDU, Arun Vaidyanathan ANANTHANARAYAN, Michael Mantor, Allen H. Rush
  • Publication number: 20220100813
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Sateesh LAGUDU, Allen H. RUSH, Michael MANTOR, Arun Vaidyanathan ANANTHANARAYAN, Prasad NAGABHUSHANAMGARI
  • Publication number: 20220100528
    Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that broadcast sets of the parameter values to mutually exclusive subsets of the rows and columns of the processor element arrays. In some cases, the array processor includes single-instruction-multiple-data (SIMD) units including subsets of the processor element arrays in corresponding rows, workgroup processors (WGPs) including subsets of the SIMD units, and a memory fabric configured to interconnect with an external memory that stores the parameter values. The memory interfaces broadcast the parameter values to the SIMD units that include the processor element arrays in rows associated with the memory interfaces and columns of processor element arrays that are implemented across the SIMD units in the WGPs. The memory interfaces access the parameter values from the external memory via the memory fabric.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Sateesh LAGUDU, Allen H. RUSH, Michael MANTOR, Arun Vaidyanathan ANANTHANARAYAN, Prasad NAGABHUSHANAMGARI, Maxim V. KAZAKOV
  • Patent number: 11200060
    Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer received a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: December 14, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Sateesh Lagudu, Arun Vaidyanathan Ananthanarayan, Michael Mantor, Allen H. Rush
  • Publication number: 20210209192
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: March 22, 2021
    Publication date: July 8, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20210149276
    Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
    Type: Application
    Filed: December 23, 2020
    Publication date: May 20, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Allen H. Rush, Hui Zhou
  • Patent number: 10956536
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 23, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Patent number: 10902087
    Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: January 26, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Patent number: 10884319
    Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
    Type: Grant
    Filed: October 23, 2017
    Date of Patent: January 5, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Allen H. Rush, Hui Zhou