Patents by Inventor Haishan Zhu

Haishan Zhu has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents that have been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240134564
    Abstract: Embodiments of the present disclosure include systems and methods for transposing matrices based on a multi-level crossbar. A system may include a memory configured to store a matrix comprising a plurality of elements arranged in a set of rows and a set of columns, and an input buffer configured to retrieve a subset of the plurality of elements from the memory, each element in the subset being retrieved from a different column of the matrix. The system may further include a multi-level crossbar configured to perform a transpose operation on the subset of elements, and an output buffer configured to receive the transposed subset and store each of its elements in the memory in a different column of the matrix.
    Type: Application
    Filed: October 19, 2022
    Publication date: April 25, 2024
    Inventors: Jinhang CHOI, Haishan ZHU, Yi LUO, Eric S. CHUNG
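    Illustrative sketch (not from the patent): one way a memory / input buffer / crossbar / output buffer pipeline can transpose a tile is to read one element from each column along a shifted diagonal, permute the lanes, and write each element back to a different column. The Python model below uses that diagonal scheme purely for illustration; the single permutation stands in for what hardware would build from multiple smaller crossbar levels, and all names are invented.
      import numpy as np

      def crossbar_route(lanes, perm):
          # Programmable crossbar: lane j is delivered to output port perm[j].
          out = [None] * len(lanes)
          for j, p in enumerate(perm):
              out[p] = lanes[j]
          return out

      def transpose_via_crossbar(mem):
          # mem[addr, bank]: one memory bank per column; one element is read from
          # each bank per step (input buffer), routed through the crossbar, then
          # written so every element lands in a different bank (output buffer).
          n = mem.shape[0]
          out = np.empty_like(mem)
          for s in range(n):
              lanes = [mem[(j + s) % n, j] for j in range(n)]                    # gather a diagonal
              routed = crossbar_route(lanes, [(j + s) % n for j in range(n)])    # rotate the lanes
              for bank in range(n):
                  out[(bank - s) % n, bank] = routed[bank]                       # scatter to banks
          return out

      a = np.arange(16).reshape(4, 4)
      assert (transpose_via_crossbar(a) == a.T).all()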
  • Publication number: 20240134683
    Abstract: A hardware retire circuit includes: one or more input queues, each queue corresponding to an input stream of tasks and being configured to store input task identifiers corresponding to tasks of the input stream; and processing logic configured to: receive a completed task event; determine whether a completed task queue identifier and a completed task identifier of the completed task event match an input task identifier of an input task at a head of an input queue having an input queue identifier corresponding to the completed task queue identifier; and in response to determining a match, pop the task at the head of the input queue and output a task retirement event corresponding to the input task.
    Type: Application
    Filed: October 20, 2022
    Publication date: April 25, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yi LUO, Jinwen XI, Xuan ZUO, Haishan ZHU, Eric Sen CHUNG
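    Illustrative sketch (not from the patent): a minimal software model of the retire logic described above, with one in-order queue of task identifiers per input stream; class and field names are invented for the example.
      from collections import deque

      class RetireUnit:
          def __init__(self, num_queues):
              # One in-order queue of input task identifiers per input stream.
              self.queues = [deque() for _ in range(num_queues)]

          def enqueue(self, queue_id, task_id):
              self.queues[queue_id].append(task_id)

          def on_completed(self, queue_id, task_id):
              # If the completed-task event matches the task at the head of the
              # named queue, pop it and emit a retirement event; otherwise the
              # (out-of-order) completion retires nothing for now.
              q = self.queues[queue_id]
              if q and q[0] == task_id:
                  return {"event": "retired", "queue": queue_id, "task": q.popleft()}
              return None

      ru = RetireUnit(num_queues=2)
      ru.enqueue(0, task_id=7)
      ru.enqueue(0, task_id=8)
      assert ru.on_completed(0, 8) is None        # not at the head yet
      assert ru.on_completed(0, 7)["task"] == 7   # head matches, retired in order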
  • Publication number: 20240127107
    Abstract: Embodiments of the present disclosure include techniques for machine language processing. In one embodiment, the present disclosure includes commands with data structures comprising fields describing multi-dimensional data and fields describing synchronization. Large volumes of data may be processed and automatically synchronized by execution of a single command.
    Type: Application
    Filed: October 14, 2022
    Publication date: April 18, 2024
    Inventors: Haishan ZHU, Eric S. CHUNG
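    Illustrative sketch (not from the patent): one way to picture a single command whose fields describe both a multi-dimensional block of data and its synchronization. The field names and the wait/signal semantics below are invented for the example.
      from dataclasses import dataclass, field

      @dataclass
      class TensorCommand:
          base: int        # starting address of the data
          shape: tuple     # extent of each dimension
          strides: tuple   # element stride of each dimension
          wait_on: list = field(default_factory=list)  # semaphores to wait for
          signal: list = field(default_factory=list)   # semaphores to set when done

      def execute(cmd, memory, semaphores):
          # Synchronization is described by the command itself: wait, work, signal.
          assert all(semaphores[s] for s in cmd.wait_on), "dependencies not ready"
          addrs = [cmd.base]
          for extent, stride in zip(cmd.shape, cmd.strides):
              addrs = [a + i * stride for a in addrs for i in range(extent)]
          data = [memory[a] for a in addrs]   # walk the whole N-D block
          for s in cmd.signal:
              semaphores[s] = True            # mark completion for dependents
          return data

      mem = list(range(100))
      sems = {"in_ready": True, "out_ready": False}
      cmd = TensorCommand(base=10, shape=(2, 3), strides=(3, 1),
                          wait_on=["in_ready"], signal=["out_ready"])
      print(execute(cmd, mem, sems))   # [10, 11, 12, 13, 14, 15]; sets out_ready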
  • Publication number: 20240126617
    Abstract: Embodiments of the present disclosure include techniques for machine language processing. In one embodiment, the present disclosure includes configuring functional modules on a machine learning processor to execute a plurality of machine learning (ML) operations during a plurality of time segments. During the time segments, a first portion of the ML operations executes serially, while at least one other ML operation executes during at least a majority of each time segment. Serial ML operations may be processed simultaneously with the at least one other ML operation.
    Type: Application
    Filed: October 14, 2022
    Publication date: April 18, 2024
    Inventors: Haishan ZHU, Preyas Janak SHAH, Tiyasa MITRA, Eric S. CHUNG
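    Illustrative sketch (not from the patent): a toy schedule in which each time segment runs a serial chain of ML operations while one long-running operation overlaps most of the same segment. Operation names and durations are invented.
      def build_schedule(num_segments):
          timeline, t = [], 0.0
          for seg in range(num_segments):
              start = t
              # Serial portion: these operations run back-to-back.
              for name, dur in [("matmul", 0.5), ("layernorm", 0.2), ("softmax", 0.3)]:
                  timeline.append((name, seg, t, t + dur))
                  t += dur
              # Overlapping operation: spans most of the same segment concurrently.
              timeline.append(("weight_prefetch", seg, start, t - 0.1))
          return timeline

      for op, seg, t0, t1 in build_schedule(2):
          print(f"segment {seg}: {op:16s} {t0:4.1f} -> {t1:4.1f}")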
  • Publication number: 20240086233
    Abstract: Embodiments of the present disclosure include systems and methods for providing a hierarchical programming model for AI hardware. A system includes a set of lower-level control threads. The system also includes a higher-level control thread configured to receive a command from a device, generate a set of commands based on the command, and provide the set of commands to a subset of the set of lower-level control threads. A lower-level control thread in the subset of the set of lower-level control threads is configured to instruct, based on a particular command in the set of commands, a subset of a plurality of processing threads to perform a set of operations.
    Type: Application
    Filed: September 9, 2022
    Publication date: March 14, 2024
    Inventors: Haishan ZHU, Eric S. CHUNG
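    Illustrative sketch (not from the patent): a minimal model of the hierarchy described above, in which a higher-level control thread expands one device command into per-worker commands for a subset of lower-level control threads. In real hardware each lower-level thread would in turn instruct its processing threads; all names are invented.
      import queue, threading

      def lower_level_control(worker_id, cmd_q, results):
          cmd = cmd_q.get()
          # Stand-in for instructing a subset of processing threads to run operations.
          results[worker_id] = f"worker {worker_id} runs {cmd}"

      def higher_level_control(device_command, num_workers):
          results, cmd_queues = {}, [queue.Queue() for _ in range(num_workers)]
          workers = [threading.Thread(target=lower_level_control,
                                      args=(i, cmd_queues[i], results))
                     for i in range(num_workers)]
          for w in workers:
              w.start()
          # Expand the single command into one sub-command per lower-level thread.
          for i, q in enumerate(cmd_queues):
              q.put(f"{device_command}[shard {i}/{num_workers}]")
          for w in workers:
              w.join()
          return results

      print(higher_level_control("matmul(A, B)", num_workers=4))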
  • Patent number: 11853897
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: December 26, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Taesik Na, Daniel Lo, Haishan Zhu, Eric Sen Chung
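    Illustrative sketch (not from the patent): the two ideas in the abstract rendered as NumPy, a block of values sharing one scale derived from its largest magnitude (the "bounding box") and stochastic rounding that rounds up with probability equal to the fractional part. Bit widths and function names are invented.
      import numpy as np

      def bounding_box_quantize(block, bits=8, rng=np.random.default_rng(0)):
          scale = float(np.max(np.abs(block))) or 1.0   # shared scale for the block
          levels = 2 ** (bits - 1) - 1
          scaled = block / scale * levels               # map the block into +/- levels
          floor = np.floor(scaled)
          frac = scaled - floor
          q = floor + (rng.random(block.shape) < frac)  # stochastic rounding
          return q.astype(np.int32), scale / levels

      def dequantize(q, step):
          return q * step

      w = np.random.default_rng(1).normal(size=1000).astype(np.float32)
      q, step = bounding_box_quantize(w)
      print("max abs error:", np.max(np.abs(dequantize(q, step) - w)))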
  • Patent number: 11704158
    Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: July 18, 2023
    Assignee: Google LLC
    Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil
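    Illustrative sketch (not from the patent): a toy control loop over the split described above, sampling memory-bandwidth usage for the high- and low-priority domains and resizing the low-priority domain to protect the high-priority work. Thresholds and the adjustment policy are invented.
      def adjust_domains(sample, low_prio_cores, min_cores=1, max_cores=8, bw_limit=0.8):
          # sample: fraction of memory bandwidth used by each domain this interval.
          total = sample["high"] + sample["low"]
          if total > bw_limit and sample["high"] > sample["low"]:
              # High-priority work is memory-starved: shrink the low-priority domain.
              return max(min_cores, low_prio_cores - 1)
          if total < 0.5 * bw_limit:
              # Plenty of headroom: let low-priority work use more cores again.
              return min(max_cores, low_prio_cores + 1)
          return low_prio_cores

      cores = 4
      for sample in [{"high": 0.6, "low": 0.3}, {"high": 0.2, "low": 0.1}]:
          cores = adjust_domains(sample, cores)
          print("low-priority cores ->", cores)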
  • Publication number: 20230110219
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Application
    Filed: December 9, 2022
    Publication date: April 13, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Taesik NA, Daniel LO, Haishan ZHU, Eric Sen CHUNG
  • Patent number: 11526761
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Grant
    Filed: August 24, 2019
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Taesik Na, Daniel Lo, Haishan Zhu, Eric Sen Chung
  • Publication number: 20210224129
    Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
    Type: Application
    Filed: January 29, 2021
    Publication date: July 22, 2021
    Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil
  • Publication number: 20210056423
    Abstract: Bounding box quantization can reduce the quantity of bits utilized to express numerical values prior to the multiplication of matrices comprised of such numerical values, thereby reducing both memory consumption and processor utilization. Stochastic rounding can provide sufficient precision to enable the storage of weight values in reduced-precision formats without having to separately store weight values in a full-precision format. Alternatively, other rounding mechanisms, such as round to nearest, can be utilized to exchange weight values in reduced-precision formats, while also storing weight values in full-precision formats for subsequent updating. To facilitate conversion, reduced-precision formats such as brain floating-point format can be utilized.
    Type: Application
    Filed: August 24, 2019
    Publication date: February 25, 2021
    Inventors: Taesik NA, Daniel LO, Haishan ZHU, Eric Sen CHUNG
  • Patent number: 10908964
    Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: February 2, 2021
    Assignee: Google LLC
    Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil
  • Publication number: 20200302283
    Abstract: The use of mixed precision values when training an artificial neural network (ANN) can increase performance while reducing cost. Certain portions and/or steps of an ANN may be selected to use higher or lower precision values when training. Additionally, or alternatively, early phases of training are accurate enough with lower levels of precision to quickly refine an ANN model, while higher levels of precision may be used to increase accuracy for later steps and epochs. Similarly, different gates of a long short-term memory (LSTM) may be supplied with values having different precisions.
    Type: Application
    Filed: March 18, 2019
    Publication date: September 24, 2020
    Inventors: Haishan ZHU, Taesik NA, Daniel LO, Eric S. CHUNG
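    Illustrative sketch (not from the patent): phase-dependent precision, where early training steps keep weights in a bfloat16-like truncated format and later steps keep full float32. The cast and the switch-over step are invented for illustration.
      import numpy as np

      def to_bfloat16_like(x):
          # Truncate float32 mantissas to 7 bits by zeroing the low 16 bits.
          bits = x.astype(np.float32).view(np.uint32) & np.uint32(0xFFFF0000)
          return bits.view(np.float32)

      def train_step(w, grad, step, switch_step=1000, lr=0.01):
          w = w - lr * grad
          # Early phase: low precision is accurate enough to refine the model quickly.
          # Later phase: full precision for the final accuracy.
          return to_bfloat16_like(w) if step < switch_step else w.astype(np.float32)

      w = np.random.default_rng(0).normal(size=4).astype(np.float32)
      print(train_step(w, grad=np.ones(4, np.float32), step=10))    # low precision
      print(train_step(w, grad=np.ones(4, np.float32), step=2000))  # full precision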
  • Publication number: 20200218982
    Abstract: A machine learning tool uses dithered quantization of parameters during training of a machine learning model such as a neural network. The machine learning tool receives training data and initializes certain parameters of the machine learning model (e.g., weights for connections between nodes of a neural network, biases for nodes). The machine learning tool trains the parameters in one or more iterations based on the training data. In particular, in a given iteration, the machine learning tool applies the machine learning model to at least some of the training data and, based at least in part on the results, determines parameter updates to the parameters. The machine learning tool updates the parameters using the parameter updates and a dithered quantizer function, which can add random values before a rounding or truncation operation.
    Type: Application
    Filed: January 4, 2019
    Publication date: July 9, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Thomas M. ANNAU, Haishan ZHU, Daniel LO, Eric S. CHUNG
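    Illustrative sketch (not from the patent): a dithered quantizer that adds a random value before rounding so quantization error behaves like unbiased noise across parameter updates. The step size and update rule are invented for illustration.
      import numpy as np

      def dithered_quantize(x, step=0.01, rng=np.random.default_rng(0)):
          dither = rng.uniform(-0.5, 0.5, size=x.shape)   # random value added before rounding
          return np.round(x / step + dither) * step

      def update_parameters(w, grad, lr=0.1):
          # Apply the parameter update, then store the result in quantized form.
          return dithered_quantize(w - lr * grad)

      w = np.zeros(5)
      g = np.full(5, 0.033)
      for _ in range(3):
          w = update_parameters(w, g)
      print(w)   # quantized parameters after three update iterations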
  • Publication number: 20190155658
    Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
    Type: Application
    Filed: November 21, 2018
    Publication date: May 23, 2019
    Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil