Patents by Inventor Rexford Alan Hill

Rexford Alan Hill has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240118902
    Abstract: An aspect of the disclosure relates to a data processing system, including: an input medium configured to include a first set of blocks of data including a first set of blocks of compressed data and a first set of metadata, respectively; an output medium configured to include a first set of blocks of decompressed data each having a predetermined number of decompressed elements; and a set of single instruction multiple data (SIMD) processors configured to: access the first set of blocks of data from the input medium, respectively; decompress the first set of blocks of compressed data to generate the first set of blocks of decompressed data based on the first set of metadata, respectively; and provide the first set of blocks of decompressed data to the output medium, respectively.
    Type: Application
    Filed: June 22, 2023
    Publication date: April 11, 2024
    Inventors: Eric Wayne MAHURIN, Erich PLONDKE, Hitesh Kumar GUPTA, Colin Beaton VERRILLI, Rexford Alan HILL
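A minimal sketch of the per-block, metadata-guided decompression the abstract describes, under the assumption that each compressed block holds only non-zero values and its metadata is a bitmask marking their positions. The names (`decompress_block`, `BLOCK_SIZE`) and the bitmask scheme are illustrative, not taken from the patent; each SIMD processor would handle one block in parallel, modeled here sequentially.

```python
BLOCK_SIZE = 8  # predetermined number of decompressed elements per block

def decompress_block(compressed, bitmask):
    """Expand the non-zero values in `compressed` into a full block of
    BLOCK_SIZE elements, placing them where `bitmask` has a 1 bit."""
    out = [0] * BLOCK_SIZE
    it = iter(compressed)
    for i in range(BLOCK_SIZE):
        if (bitmask >> i) & 1:
            out[i] = next(it)
    return out

def decompress_all(blocks):
    # One (compressed data, metadata) pair per block, as in the abstract.
    return [decompress_block(c, m) for c, m in blocks]

blocks = [([5, 7], 0b00010010), ([9], 0b10000000)]
print(decompress_all(blocks))  # non-zeros land at the bitmask positions
```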
  • Publication number: 20230185532
    Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
    Type: Application
    Filed: February 2, 2023
    Publication date: June 15, 2023
    Inventors: Rexford Alan HILL, Aaron Douglass LAMB, Michael GOLDFARB, Amin ANSARI, Christopher LOTT
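A toy sketch of the sparsity idea in the abstract: pack the non-zero activations (with their column indices) into a narrower structure, then multiply only those against the weight rows they select. All function names here are hypothetical; the patent concerns hardware exploitation of sparsity, not this code.

```python
def compress_activations(row):
    """Return (values, indices) for the non-zero entries of one
    activation row -- a compressed tensor with fewer columns."""
    pairs = [(v, j) for j, v in enumerate(row) if v != 0]
    values = [v for v, _ in pairs]
    indices = [j for _, j in pairs]
    return values, indices

def sparse_matmul_row(values, indices, weights):
    """Multiply a compressed activation row by the weight matrix
    `weights` (a list of rows), skipping zero activations entirely."""
    n_out = len(weights[0])
    out = [0] * n_out
    for v, j in zip(values, indices):
        for k in range(n_out):
            out[k] += v * weights[j][k]
    return out

act = [0, 3, 0, 2]                      # sparse activation row
w = [[1, 1], [2, 0], [5, 5], [0, 4]]    # weight tensor
vals, idx = compress_activations(act)
print(sparse_matmul_row(vals, idx, w))  # matches a dense matmul of act @ w
```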
  • Patent number: 11669747
    Abstract: A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during converting of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range constrained, normalized floating-point output value.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: June 6, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Rexford Alan Hill, Eric Wayne Mahurin, Aaron Douglass Lamb, Albert Danysh, Erich Plondke, David Hoyle
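A toy numeric sketch of the combined-shift idea in the abstract: the initial shift scales a Qm.n fixed-point integer to a float, the additional shift compresses its dynamic range, and performing both together is a single division by 2**(frac_bits + extra_shift). The parameter names are assumptions for illustration, not terms from the patent.

```python
def fixed_to_float(x, frac_bits, extra_shift=0):
    """Convert fixed-point integer `x` (with `frac_bits` fractional
    bits) to float, applying `extra_shift` additional right-shifts in
    the same step to constrain the dynamic range of the result."""
    return x / (1 << (frac_bits + extra_shift))

# Initial shift only: Q4.4 value 0b1011_0000 scales to 11.0
print(fixed_to_float(0b10110000, 4))                 # 11.0
# Initial + additional shift performed together: range shrunk by 2**2
print(fixed_to_float(0b10110000, 4, extra_shift=2))  # 2.75
```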
  • Patent number: 11487998
    Abstract: In one embodiment, a depth-first deep convolutional network (DCN) has a first convolutional layer with a first first-layer kernel adapted to convolve a first input, and a second convolutional layer with a first second-layer kernel adapted to convolve a second-layer input. A method for the DCN includes initiating convolution in the first convolutional layer of the first input tensor with the first first-layer kernel to generate a value strip for the second input tensor and, prior to completion of the convolution in the first convolutional layer, initiating convolution in the second convolutional layer of the second input with the first second-layer kernel to generate a value strip for a third layer.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: November 1, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Rexford Alan Hill, Sruthikesh Surineni, Adrienne Milner, Vito Bica
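A simplified 1-D sketch of the depth-first scheduling the abstract describes: as soon as layer 1 has produced enough outputs (a "value strip"), layer 2 begins convolving that strip rather than waiting for layer 1 to finish its full pass. The kernel sizes and one-element strip granularity are assumptions; the patent addresses 2-D DCNs in hardware.

```python
def conv1d_valid(x, k):
    """Plain valid-mode 1-D convolution (correlation form), used as the
    layer-by-layer reference."""
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n))
            for i in range(len(x) - n + 1)]

def depth_first_two_layers(x, k1, k2):
    """Interleave two layers: emit a layer-2 value as soon as the strip
    of layer-1 outputs it depends on is available."""
    layer1, layer2 = [], []
    for i in range(len(x) - len(k1) + 1):
        # Layer 1 produces one more output element (a minimal strip).
        layer1.append(sum(x[i + j] * k1[j] for j in range(len(k1))))
        # Layer 2 fires before layer 1 has completed its pass.
        if len(layer1) >= len(k2):
            start = len(layer1) - len(k2)
            layer2.append(sum(layer1[start + j] * k2[j]
                              for j in range(len(k2))))
    return layer2

x, k1, k2 = [1, 2, 3, 4, 5], [1, 1], [1, -1]
# Depth-first interleaving matches running the layers to completion:
assert depth_first_two_layers(x, k1, k2) == conv1d_valid(conv1d_valid(x, k1), k2)
```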
  • Publication number: 20210182684
    Abstract: A method performed by a computing device includes determining a partition for depth-first processing by a multi-layer artificial neural network (ANN) of the computing device, which comprises a processor, on-chip memory, and off-chip memory. The partition is determined based on an amount of on-chip memory used by the partition, an available amount of on-chip memory, and a size of a write back to the off-chip memory. The method also includes processing an input at the device, via the multi-layer ANN, using the depth-first processing in accordance with the partition.
    Type: Application
    Filed: December 14, 2020
    Publication date: June 17, 2021
    Inventors: Piero ZAPPI, Jin Won LEE, Christopher LOTT, Rexford Alan HILL
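A greedy sketch of the partitioning criterion in the abstract: extend a partition with consecutive layers while their working set fits the available on-chip memory, and close the partition (incurring an off-chip write back) when it would overflow. The additive cost model below is a deliberate simplification assumed for illustration.

```python
def partition_layers(layer_mem, onchip_budget):
    """layer_mem: per-layer on-chip working-set sizes.
    Returns (partitions, writebacks), where each partition is a list of
    layer indices and writebacks counts spills to off-chip memory."""
    partitions, current, used, writebacks = [], [], 0, 0
    for i, m in enumerate(layer_mem):
        if current and used + m > onchip_budget:
            partitions.append(current)
            writebacks += 1      # partition boundary -> off-chip write back
            current, used = [], 0
        current.append(i)
        used += m
    if current:
        partitions.append(current)
    return partitions, writebacks

parts, spills = partition_layers([40, 30, 50, 20, 20], onchip_budget=80)
print(parts, spills)  # layers grouped to fit an 80-unit on-chip budget
```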
  • Patent number: 11029745
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level of the array, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: June 8, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
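A toy sketch of the transition control the abstract describes: rather than switching the whole array between idle and active at once, compute elements are enabled or disabled in bounded steps so the instantaneous current change stays limited. The per-cycle ramp step is an illustrative assumption.

```python
def ramp_activity(current_active, target_active, max_step):
    """Yield the sequence of array activity levels while moving from
    `current_active` toward `target_active`, changing by at most
    `max_step` computing elements per cycle."""
    level = current_active
    while level != target_active:
        delta = max(-max_step, min(max_step, target_active - level))
        level += delta
        yield level

# Idle-to-active transition on a 64-element array, 16 elements per cycle:
print(list(ramp_activity(0, 64, 16)))   # [16, 32, 48, 64]
# Active-to-idle transition, ramped down the same way:
print(list(ramp_activity(64, 0, 16)))   # [48, 32, 16, 0]
```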
  • Patent number: 11010313
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory (TCM), and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: May 18, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan, Rexford Alan Hill
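A hypothetical software analogy of the architecture in the abstract: each "processing element" works only in its own tightly-coupled memory (TCM) with no coherency protocol, results move by explicit DMA-style copies, and a global synchronization manager (modeled here as a barrier) orders the phases. Nothing below is the actual hardware design; it only illustrates explicit synchronization replacing coherency.

```python
import threading

N_PE = 4
tcms = [{"in": i + 1, "out": None} for i in range(N_PE)]  # per-PE TCM
shared = [None] * N_PE          # memory system, written by explicit copies
sync = threading.Barrier(N_PE)  # stand-in for the global sync manager

def pe(idx):
    tcms[idx]["out"] = tcms[idx]["in"] ** 2  # compute purely in the TCM
    sync.wait()                     # sync point: every PE has computed
    shared[idx] = tcms[idx]["out"]  # explicit DMA-style copy out
    sync.wait()                     # sync point: all copies complete

threads = [threading.Thread(target=pe, args=(i,)) for i in range(N_PE)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)  # [1, 4, 9, 16]
```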
  • Publication number: 20200394500
    Abstract: In one embodiment, a depth-first deep convolutional network (DCN) has a first convolutional layer with a first first-layer kernel adapted to convolve a first input, and a second convolutional layer with a first second-layer kernel adapted to convolve a second-layer input. A method for the DCN includes initiating convolution in the first convolutional layer of the first input tensor with the first first-layer kernel to generate a value strip for the second input tensor and, prior to completion of the convolution in the first convolutional layer, initiating convolution in the second convolutional layer of the second input with the first second-layer kernel to generate a value strip for a third layer.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Rexford Alan Hill, Sruthikesh Surineni, Adrienne Milner, Vito Bica
  • Publication number: 20200134475
    Abstract: A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during converting of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range constrained, normalized floating-point output value.
    Type: Application
    Filed: October 29, 2019
    Publication date: April 30, 2020
    Inventors: Rexford Alan HILL, Eric Wayne MAHURIN, Aaron Douglass LAMB, Albert DANYSH, Erich PLONDKE, David HOYLE
  • Patent number: 10621690
    Abstract: A computing device may allocate a plurality of blocks in the memory, wherein each of the plurality of blocks is of a uniform fixed size in the memory. The computing device may further store a plurality of bandwidth-compressed graphics data into the respective plurality of blocks in the memory, wherein one or more of the plurality of bandwidth-compressed graphics data each has a size that is smaller than the fixed size. The computing device may further store data associated with the plurality of bandwidth-compressed graphics data into unused space of one or more of the plurality of blocks that contains the respective one or more of the plurality of bandwidth-compressed graphics data.
    Type: Grant
    Filed: September 17, 2015
    Date of Patent: April 14, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Rexford Alan Hill, Shambhoo Khandelwal
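An illustrative byte-level sketch of the storage layout in the abstract: every block has the same fixed size, bandwidth-compressed data smaller than the block leaves unused tail space, and associated data (here, just the compressed length) is tucked into that tail. The layout details are assumptions for demonstration, not the claimed scheme.

```python
BLOCK_SIZE = 16  # uniform fixed block size in memory

def pack_block(compressed: bytes) -> bytes:
    """Place `compressed` at the start of a fixed-size block and store
    its length in the final byte of the otherwise-unused tail space."""
    assert len(compressed) < BLOCK_SIZE  # must leave room for the length
    pad = BLOCK_SIZE - len(compressed) - 1
    return compressed + b"\x00" * pad + bytes([len(compressed)])

def unpack_block(block: bytes) -> bytes:
    """Recover the compressed payload using the length stored in the
    unused space of the block."""
    return block[: block[-1]]

blk = pack_block(b"\x12\x34\x56")
assert len(blk) == BLOCK_SIZE            # block stays uniformly sized
assert unpack_block(blk) == b"\x12\x34\x56"
```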
  • Publication number: 20200073830
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory (TCM), and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Application
    Filed: August 29, 2019
    Publication date: March 5, 2020
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Rexford Alan HILL
  • Publication number: 20200073470
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level of the array, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
    Type: Application
    Filed: November 8, 2018
    Publication date: March 5, 2020
    Inventors: Kyle ERNEWEIN, Jason Edward PODAIMA, Francisco PEREZ, John DANIELS, Alex MILER, Jeffrey GEMAR, Rexford Alan HILL, Haoping XU
  • Publication number: 20170083997
    Abstract: A computing device may allocate a plurality of blocks in the memory, wherein each of the plurality of blocks is of a uniform fixed size in the memory. The computing device may further store a plurality of bandwidth-compressed graphics data into the respective plurality of blocks in the memory, wherein one or more of the plurality of bandwidth-compressed graphics data each has a size that is smaller than the fixed size. The computing device may further store data associated with the plurality of bandwidth-compressed graphics data into unused space of one or more of the plurality of blocks that contains the respective one or more of the plurality of bandwidth-compressed graphics data.
    Type: Application
    Filed: September 17, 2015
    Publication date: March 23, 2017
    Inventors: Andrew Evan Gruber, Rexford Alan Hill, Shambhoo Khandelwal