Patents by Inventor Rexford Alan Hill

Rexford Alan Hill has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240118902
    Abstract: An aspect of the disclosure relates to a data processing system, including: an input medium configured to include a first set of blocks of data including a first set of blocks of compressed data and a first set of metadata, respectively; an output medium configured to include a first set of blocks of decompressed data each having a predetermined number of decompressed elements; and a set of single instruction multiple data (SIMD) processors configured to: access the first set of blocks of data from the input medium, respectively; decompress the first set of blocks of compressed data to generate the first set of blocks of decompressed data based on the first set of metadata, respectively; and provide the first set of blocks of decompressed data to the output medium, respectively.
    Type: Application
    Filed: June 22, 2023
    Publication date: April 11, 2024
    Inventors: Eric Wayne MAHURIN, Erich PLONDKE, Hitesh Kumar GUPTA, Colin Beaton VERRILLI, Rexford Alan HILL
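A minimal sketch of the per-block, metadata-guided decompression the abstract describes, under the assumption that each compressed block holds only non-zero values and its metadata is a bitmask marking their positions. The names (`decompress_block`, `BLOCK_SIZE`) and the bitmask scheme are illustrative, not taken from the patent; each SIMD processor would handle one block in parallel, modeled here sequentially.

```python
BLOCK_SIZE = 8  # predetermined number of decompressed elements per block

def decompress_block(compressed, bitmask):
    """Expand the non-zero values in `compressed` into a full block of
    BLOCK_SIZE elements, placing them where `bitmask` has a 1 bit."""
    out = [0] * BLOCK_SIZE
    it = iter(compressed)
    for i in range(BLOCK_SIZE):
        if (bitmask >> i) & 1:
            out[i] = next(it)
    return out

def decompress_all(blocks):
    # One (compressed data, metadata) pair per block, as in the abstract.
    return [decompress_block(c, m) for c, m in blocks]

blocks = [([5, 7], 0b00010010), ([9], 0b10000000)]
print(decompress_all(blocks))  # non-zeros land at the bitmask positions
```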
  • Publication number: 20230185532
    Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
    Type: Application
    Filed: February 2, 2023
    Publication date: June 15, 2023
    Inventors: Rexford Alan HILL, Aaron Douglass LAMB, Michael GOLDFARB, Amin ANSARI, Christopher LOTT
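A toy sketch of the sparsity idea in the abstract: pack the non-zero activations (with their column indices) into a narrower structure, then multiply only those against the weight rows they select. All function names here are hypothetical; the patent concerns hardware exploitation of sparsity, not this code.

```python
def compress_activations(row):
    """Return (values, indices) for the non-zero entries of one
    activation row -- a compressed tensor with fewer columns."""
    pairs = [(v, j) for j, v in enumerate(row) if v != 0]
    values = [v for v, _ in pairs]
    indices = [j for _, j in pairs]
    return values, indices

def sparse_matmul_row(values, indices, weights):
    """Multiply a compressed activation row by the weight matrix
    `weights` (a list of rows), skipping zero activations entirely."""
    n_out = len(weights[0])
    out = [0] * n_out
    for v, j in zip(values, indices):
        for k in range(n_out):
            out[k] += v * weights[j][k]
    return out

act = [0, 3, 0, 2]                      # sparse activation row
w = [[1, 1], [2, 0], [5, 5], [0, 4]]    # weight tensor
vals, idx = compress_activations(act)
print(sparse_matmul_row(vals, idx, w))  # matches a dense matmul of act @ w
```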
  • Patent number: 11669747
    Abstract: A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during converting of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range constrained, normalized floating-point output value.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: June 6, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Rexford Alan Hill, Eric Wayne Mahurin, Aaron Douglass Lamb, Albert Danysh, Erich Plondke, David Hoyle
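A toy numeric sketch of the combined-shift idea in the abstract: the initial shift scales a Qm.n fixed-point integer to a float, the additional shift compresses its dynamic range, and performing both together is a single division by 2**(frac_bits + extra_shift). The parameter names are assumptions for illustration, not terms from the patent.

```python
def fixed_to_float(x, frac_bits, extra_shift=0):
    """Convert fixed-point integer `x` (with `frac_bits` fractional
    bits) to float, applying `extra_shift` additional right-shifts in
    the same step to constrain the dynamic range of the result."""
    return x / (1 << (frac_bits + extra_shift))

# Initial shift only: Q4.4 value 0b1011_0000 scales to 11.0
print(fixed_to_float(0b10110000, 4))                 # 11.0
# Initial + additional shift performed together: range shrunk by 2**2
print(fixed_to_float(0b10110000, 4, extra_shift=2))  # 2.75
```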
  • Patent number: 11487998
    Abstract: In one embodiment, a depth-first deep convolutional network (DCN) has a first convolutional layer with a first first-layer kernel adapted to convolve a first input, and a second convolutional layer with a first second-layer kernel adapted to convolve a second-layer input. A method for the DCN includes initiating convolution in the first convolutional layer of the first input tensor with the first first-layer kernel to generate a value strip for the second input tensor and, prior to completion of the convolution in the first convolutional layer, initiating convolution in the second convolutional layer of the second input with the first second-layer kernel to generate a value strip for a third layer.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: November 1, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Rexford Alan Hill, Sruthikesh Surineni, Adrienne Milner, Vito Bica
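A simplified 1-D sketch of the depth-first scheduling the abstract describes: as soon as layer 1 has produced enough outputs (a "value strip"), layer 2 begins convolving that strip rather than waiting for layer 1 to finish its full pass. The kernel sizes and one-element strip granularity are assumptions; the patent addresses 2-D DCNs in hardware.

```python
def conv1d_valid(x, k):
    """Plain valid-mode 1-D convolution (correlation form), used as the
    layer-by-layer reference."""
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n))
            for i in range(len(x) - n + 1)]

def depth_first_two_layers(x, k1, k2):
    """Interleave two layers: emit a layer-2 value as soon as the strip
    of layer-1 outputs it depends on is available."""
    layer1, layer2 = [], []
    for i in range(len(x) - len(k1) + 1):
        # Layer 1 produces one more output element (a minimal strip).
        layer1.append(sum(x[i + j] * k1[j] for j in range(len(k1))))
        # Layer 2 fires before layer 1 has completed its pass.
        if len(layer1) >= len(k2):
            start = len(layer1) - len(k2)
            layer2.append(sum(layer1[start + j] * k2[j]
                              for j in range(len(k2))))
    return layer2

x, k1, k2 = [1, 2, 3, 4, 5], [1, 1], [1, -1]
# Depth-first interleaving matches running the layers to completion:
assert depth_first_two_layers(x, k1, k2) == conv1d_valid(conv1d_valid(x, k1), k2)
```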
  • Publication number: 20210182684
    Abstract: A method performed by a computing device includes determining a partition for depth-first processing by a multi-layer artificial neural network (ANN) of the computing device, which comprises a processor, on-chip memory, and off-chip memory. The partition is determined based on an amount of on-chip memory used by the partition, an available amount of on-chip memory, and a size of a write back to the off-chip memory. The method also includes processing an input at the device, via the multi-layer ANN, using the depth-first processing in accordance with the partition.
    Type: Application
    Filed: December 14, 2020
    Publication date: June 17, 2021
    Inventors: Piero ZAPPI, Jin Won LEE, Christopher LOTT, Rexford Alan HILL
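A greedy sketch of the partitioning criterion in the abstract: extend a partition with consecutive layers while their working set fits the available on-chip memory, and close the partition (incurring an off-chip write back) when it would overflow. The additive cost model below is a deliberate simplification assumed for illustration.

```python
def partition_layers(layer_mem, onchip_budget):
    """layer_mem: per-layer on-chip working-set sizes.
    Returns (partitions, writebacks), where each partition is a list of
    layer indices and writebacks counts spills to off-chip memory."""
    partitions, current, used, writebacks = [], [], 0, 0
    for i, m in enumerate(layer_mem):
        if current and used + m > onchip_budget:
            partitions.append(current)
            writebacks += 1      # partition boundary -> off-chip write back
            current, used = [], 0
        current.append(i)
        used += m
    if current:
        partitions.append(current)
    return partitions, writebacks

parts, spills = partition_layers([40, 30, 50, 20, 20], onchip_budget=80)
print(parts, spills)  # layers grouped to fit an 80-unit on-chip budget
```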
  • Patent number: 11029745
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level of the array, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
    Type: Grant
    Filed: November 8, 2018
    Date of Patent: June 8, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Kyle Ernewein, Jason Edward Podaima, Francisco Perez, John Daniels, Alex Miler, Jeffrey Gemar, Rexford Alan Hill, Haoping Xu
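A toy sketch of the transition control the abstract describes: rather than switching the whole array between idle and active at once, compute elements are enabled or disabled in bounded steps so the instantaneous current change stays limited. The per-cycle ramp step is an illustrative assumption.

```python
def ramp_activity(current_active, target_active, max_step):
    """Yield the sequence of array activity levels while moving from
    `current_active` toward `target_active`, changing by at most
    `max_step` computing elements per cycle."""
    level = current_active
    while level != target_active:
        delta = max(-max_step, min(max_step, target_active - level))
        level += delta
        yield level

# Idle-to-active transition on a 64-element array, 16 elements per cycle:
print(list(ramp_activity(0, 64, 16)))   # [16, 32, 48, 64]
# Active-to-idle transition, ramped down the same way:
print(list(ramp_activity(64, 0, 16)))   # [48, 32, 16, 0]
```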
  • Patent number: 11010313
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory (TCM), and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: May 18, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Colin Beaton Verrilli, Natarajan Vaidhyanathan, Rexford Alan Hill
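A hypothetical software analogy of the architecture in the abstract: each "processing element" works only in its own tightly-coupled memory (TCM) with no coherency protocol, results move by explicit DMA-style copies, and a global synchronization manager (modeled here as a barrier) orders the phases. Nothing below is the actual hardware design; it only illustrates explicit synchronization replacing coherency.

```python
import threading

N_PE = 4
tcms = [{"in": i + 1, "out": None} for i in range(N_PE)]  # per-PE TCM
shared = [None] * N_PE          # memory system, written by explicit copies
sync = threading.Barrier(N_PE)  # stand-in for the global sync manager

def pe(idx):
    tcms[idx]["out"] = tcms[idx]["in"] ** 2  # compute purely in the TCM
    sync.wait()                     # sync point: every PE has computed
    shared[idx] = tcms[idx]["out"]  # explicit DMA-style copy out
    sync.wait()                     # sync point: all copies complete

threads = [threading.Thread(target=pe, args=(i,)) for i in range(N_PE)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)  # [1, 4, 9, 16]
```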
  • Publication number: 20200394500
    Abstract: In one embodiment, a depth-first deep convolutional network (DCN) has a first convolutional layer with a first first-layer kernel adapted to convolve a first input, and a second convolutional layer with a first second-layer kernel adapted to convolve a second-layer input. A method for the DCN includes initiating convolution in the first convolutional layer of the first input tensor with the first first-layer kernel to generate a value strip for the second input tensor and, prior to completion of the convolution in the first convolutional layer, initiating convolution in the second convolutional layer of the second input with the first second-layer kernel to generate a value strip for a third layer.
    Type: Application
    Filed: June 17, 2019
    Publication date: December 17, 2020
    Inventors: Rexford Alan Hill, Sruthikesh Surineni, Adrienne Milner, Vito Bica
  • Publication number: 20200134475
    Abstract: A method of constraining data represented in a deep neural network is described. The method includes determining an initial shifting specified to convert a fixed-point input value to a floating-point output value. The method also includes determining an additional shifting specified to constrain a dynamic range during converting of the fixed-point input value to the floating-point output value. The method further includes performing both the initial shifting and the additional shifting together to form a dynamic range constrained, normalized floating-point output value.
    Type: Application
    Filed: October 29, 2019
    Publication date: April 30, 2020
    Inventors: Rexford Alan HILL, Eric Wayne MAHURIN, Aaron Douglass LAMB, Albert DANYSH, Erich PLONDKE, David HOYLE
  • Patent number: 10621690
    Abstract: A computing device may allocate a plurality of blocks in the memory, wherein each of the plurality of blocks is of a uniform fixed size in the memory. The computing device may further store a plurality of bandwidth-compressed graphics data into the respective plurality of blocks in the memory, wherein one or more of the plurality of bandwidth-compressed graphics data each has a size that is smaller than the fixed size. The computing device may further store data associated with the plurality of bandwidth-compressed graphics data into unused space of one or more of the plurality of blocks that contains the respective one or more of the plurality of bandwidth-compressed graphics data.
    Type: Grant
    Filed: September 17, 2015
    Date of Patent: April 14, 2020
    Assignee: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Rexford Alan Hill, Shambhoo Khandelwal
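An illustrative byte-level sketch of the storage layout in the abstract: every block has the same fixed size, bandwidth-compressed data smaller than the block leaves unused tail space, and associated data (here, just the compressed length) is tucked into that tail. The layout details are assumptions for demonstration, not the claimed scheme.

```python
BLOCK_SIZE = 16  # uniform fixed block size in memory

def pack_block(compressed: bytes) -> bytes:
    """Place `compressed` at the start of a fixed-size block and store
    its length in the final byte of the otherwise-unused tail space."""
    assert len(compressed) < BLOCK_SIZE  # must leave room for the length
    pad = BLOCK_SIZE - len(compressed) - 1
    return compressed + b"\x00" * pad + bytes([len(compressed)])

def unpack_block(block: bytes) -> bytes:
    """Recover the compressed payload using the length stored in the
    unused space of the block."""
    return block[: block[-1]]

blk = pack_block(b"\x12\x34\x56")
assert len(blk) == BLOCK_SIZE            # block stays uniformly sized
assert unpack_block(blk) == b"\x12\x34\x56"
```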
  • Publication number: 20200073830
    Abstract: A method, apparatus, and system for an architecture for machine learning acceleration is presented. An apparatus includes a plurality of processing elements, each including a tightly-coupled memory (TCM), and a memory system coupled to the processing elements. A global synchronization manager is coupled to the plurality of the processing elements and to the memory system. The processing elements do not implement a coherency protocol with respect to the memory system. The processing elements implement direct memory access with respect to the memory system, and the global synchronization manager is configured to synchronize operations of the plurality of processing elements through the TCMs.
    Type: Application
    Filed: August 29, 2019
    Publication date: March 5, 2020
    Inventors: Colin Beaton VERRILLI, Natarajan VAIDHYANATHAN, Rexford Alan HILL
  • Publication number: 20200073470
    Abstract: Systems and methods are disclosed for controlling instantaneous current changes in parallel processors with arrays of parallel computing elements, such as neural processors. An exemplary method comprises monitoring the array of computing elements and determining a transition from a first activity level of the array to a second activity level of the array, such as an idle-to-active or active-to-idle transition. Once a transition is determined, the array is selectively controlled to minimize the instantaneous current change resulting from the transition from the first activity level to the second activity level.
    Type: Application
    Filed: November 8, 2018
    Publication date: March 5, 2020
    Inventors: Kyle ERNEWEIN, Jason Edward PODAIMA, Francisco PEREZ, John DANIELS, Alex MILER, Jeffrey GEMAR, Rexford Alan HILL, Haoping XU
  • Publication number: 20170083997
    Abstract: A computing device may allocate a plurality of blocks in the memory, wherein each of the plurality of blocks is of a uniform fixed size in the memory. The computing device may further store a plurality of bandwidth-compressed graphics data into the respective plurality of blocks in the memory, wherein one or more of the plurality of bandwidth-compressed graphics data each has a size that is smaller than the fixed size. The computing device may further store data associated with the plurality of bandwidth-compressed graphics data into unused space of one or more of the plurality of blocks that contains the respective one or more of the plurality of bandwidth-compressed graphics data.
    Type: Application
    Filed: September 17, 2015
    Publication date: March 23, 2017
    Inventors: Andrew Evan Gruber, Rexford Alan Hill, Shambhoo Khandelwal