Patents by Inventor Liqun Cheng

Liqun Cheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Autonomous Warehouse-Scale Computers

Publication number: 20250181400

Abstract: The subject matter described herein provides systems and techniques to address the challenges of growing hardware and workload heterogeneity using a Warehouse-Scale Computer (WSC) design that improves the efficiency and utilization of WSCs. The WSC design may include an abstraction layer and an efficiency layer in the software stack of the WSC. The abstraction layer and the efficiency layer may be designed to improve job scheduling, simplify resource management, and drive hardware-software co-optimization using machine learning techniques and automation in order to customize the WSC for applications at scale. The abstraction layer may embrace platform/hardware and workload diversity through greater coordination between hardware and higher layers of the WSC software stack in the WSC design. The efficiency layer may employ machine learning techniques at scale to realize hardware/software co-optimizations as a part of the autonomous WSC design.

Type: Application

Filed: March 4, 2024

Publication date: June 5, 2025

Inventors: David Lo, Liqun Cheng, Parthasarathy Ranganathan, Sundar Jayakumar Dev

HARDWARE-OPTIMIZED NEURAL ARCHITECTURE SEARCH

Publication number: 20250077833

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.

Type: Application

Filed: August 30, 2024

Publication date: March 6, 2025

Inventors: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li

Hardware-optimized neural architecture search

Patent number: 12131244

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.

Type: Grant

Filed: September 30, 2020

Date of Patent: October 29, 2024

Assignee: Google LLC

Inventors: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li

Generating recommendations based on performance metrics for scheduling jobs on distributed computing devices

Patent number: 12056534

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.

Type: Grant

Filed: December 30, 2022

Date of Patent: August 6, 2024

Assignee: Google LLC

Inventors: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni

Heterogeneous ML Accelerator Cluster with Flexible System Resource Balance

Publication number: 20240231667

Abstract: Aspects of the disclosure are directed to a heterogeneous machine learning accelerator system with compute and memory nodes connected by high speed chip-to-chip interconnects. While existing remote/disaggregated memory may require memory expansion via remote processing units, aspects of the disclosure add memory nodes into machine learning accelerator clusters via the chip-to-chip interconnects without needing assistance from remote processing units to achieve higher performance, simpler software stack, and/or lower cost. The memory nodes may support prefetch and intelligent compression to enable the use of low cost memory without performance degradation.

Type: Application

Filed: January 10, 2023

Publication date: July 11, 2024

Inventors: Sheng Li, Sridhar Lakshmanamurthy, Norman Paul Jouppi, Martin Guy Dixon, Daniel Stodolsky, Quoc V. Le, Liqun Cheng, Erik Karl Norden, Parthasarathy Ranganathan

Asymmetric data communication for host-device interface

Patent number: 12026118

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for performing asymmetric data communication at a host-device interface of a system. The methods include identifying devices coupled to a host of the system and generating a system topology that identifies a connectivity of the devices and identifies bus lanes that enable data transfers at the system. The host determines that a first connection between the host and a first device of the multiple devices has an asymmetric bandwidth requirement. The host configures a set of bus lanes of a data bus connecting the first device and the host to allocate a different number of the bus lanes to data egress from the host than to data ingress to the host. The bus lanes are configured to allocate the differing number of bus lanes based on the asymmetric bandwidth requirement of the first connection.

Type: Grant

Filed: November 29, 2021

Date of Patent: July 2, 2024

Assignee: Google LLC

Inventors: Nishant Patil, Liqun Cheng

Autonomous warehouse-scale computers

Patent number: 11960936

Abstract: The subject matter described herein provides systems and techniques to address the challenges of growing hardware and workload heterogeneity using a Warehouse-Scale Computer (WSC) design that improves the efficiency and utilization of WSCs. The WSC design may include an abstraction layer and an efficiency layer in the software stack of the WSC. The abstraction layer and the efficiency layer may be designed to improve job scheduling, simplify resource management, and drive hardware-software co-optimization using machine learning techniques and automation in order to customize the WSC for applications at scale. The abstraction layer may embrace platform/hardware and workload diversity through greater coordination between hardware and higher layers of the WSC software stack in the WSC design. The efficiency layer may employ machine learning techniques at scale to realize hardware/software co-optimizations as a part of the autonomous WSC design.

Type: Grant

Filed: January 15, 2021

Date of Patent: April 16, 2024

Assignee: Google LLC

Inventors: David Lo, Liqun Cheng, Parthasarathy Ranganathan, Sundar Jayakumar Dev

OneShot Neural Architecture and Hardware Architecture Search

Publication number: 20240037373

Abstract: Aspects of the disclosure are directed to jointly searching machine learning model architectures and hardware architectures in a combined space of models, hardware, and mapping strategies. A search strategy is utilized where all models, hardware, and mappings are evaluated together at once via weight sharing and a supernetwork. A multi-objective reward function is utilized with objectives for quality, performance, power, and area.

Type: Application

Filed: July 28, 2022

Publication date: February 1, 2024

Inventors: Sheng Li, Norman Paul Jouppi, Garrett Axel Andersen, Quoc V. Le, Liqun Cheng, Parthasarathy Ranganathan

Hybrid and Hierarchical Multi-Trial and OneShot Neural Architecture Search on Datacenter Machine Learning Accelerators

Publication number: 20230297580

Abstract: According to various implementations, generally disclosed herein is a hybrid and hierarchical neural architecture search (NAS) approach. The approach includes performing a search space partitioning scheme to divide the search space into sub-search spaces. The approach further includes performing a first type of NAS, such as a Multi-trial NAS, to cover a search across the sub-search spaces. The approach also includes performing a second type of NAS, such as a One-Shot NAS, to cover each sub-search space. The approach further includes automatically stopping the second type of NAS based on one or more early stopping criteria.

Type: Application

Filed: April 15, 2022

Publication date: September 21, 2023

Inventors: Sheng Li, Garrett Axel Andersen, Norman Paul Jouppi, Quoc V. Le, Liqun Cheng, Parthasarathy Ranganathan, Julian Paul Grady, Yang Li, Martin Wicke, Yifeng Lu, Yun Ni, Kun Wang

Managing processing system efficiency

Patent number: 11704158

Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.

Type: Grant

Filed: January 29, 2021

Date of Patent: July 18, 2023

Assignee: Google LLC

Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil

RECOMMENDATIONS FOR SCHEDULING JOBS ON DISTRIBUTED COMPUTING DEVICES

Publication number: 20230222000

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.

Type: Application

Filed: December 30, 2022

Publication date: July 13, 2023

Inventors: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni

Hardware-Aware Progressive Training Of Machine Learning Models

Publication number: 20230108177

Abstract: Aspects of the disclosure provide for hardware-aware progressive training of machine learning models. A training system trains a model in accordance with a training process and different values specified in a training schedule for both hardware-level and model-level performance settings. Hardware-level performance settings can cause hardware features of computing resources used to train the model to be enabled, disabled, or modified at various points during training. Model-level performance settings can take on a variety of values to adjust characteristics of the machine learning model being trained or of the training process, during different stages of training. The training system can identify and apply complementary values of hardware- and model-level performance settings to generate training schedules that improve model training speed at earlier stages of training, while improving model quality at later stages of training.

Type: Application

Filed: August 31, 2022

Publication date: April 6, 2023

Inventors: Sheng Li, Mingxing Tan, Norman Paul Jouppi, Quoc V. Le, Liqun Cheng, Ruoming Pang, Parthasarathy Ranganathan

Recommendations for scheduling jobs on distributed computing devices

Patent number: 11544105

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.

Type: Grant

Filed: October 11, 2019

Date of Patent: January 3, 2023

Assignee: Google LLC

Inventors: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni

Neural Architecture Scaling For Hardware Accelerators

Publication number: 20220230048

Abstract: Methods, systems, and apparatus, including computer-readable media, for scaling neural network architectures on hardware accelerators. A method includes receiving training data and information specifying target computing resources, and performing using the training data, a neural architecture search over a search space to identify an architecture for a base neural network. A plurality of scaling parameter values for scaling the base neural network can be identified, which can include repeatedly selecting a plurality of candidate scaling parameter values, and determining a measure of performance for the base neural network scaled according to the plurality of candidate scaling parameter values, in accordance with a plurality of second objectives including a latency objective. An architecture for a scaled neural network can be determined using the architecture of the base neural network scaled according to the plurality of scaling parameter values.

Type: Application

Filed: February 12, 2021

Publication date: July 21, 2022

Inventors: Andrew Li, Sheng Li, Mingxing Tan, Ruoming Pang, Liqun Cheng, Quoc V. Le, Norman Paul Jouppi

Autonomous Warehouse-Scale Computers

Publication number: 20220229698

Abstract: The subject matter described herein provides systems and techniques to address the challenges of growing hardware and workload heterogeneity using a Warehouse-Scale Computer (WSC) design that improves the efficiency and utilization of WSCs. The WSC design may include an abstraction layer and an efficiency layer in the software stack of the WSC. The abstraction layer and the efficiency layer may be designed to improve job scheduling, simplify resource management, and drive hardware-software co-optimization using machine learning techniques and automation in order to customize the WSC for applications at scale. The abstraction layer may embrace platform/hardware and workload diversity through greater coordination between hardware and higher layers of the WSC software stack in the WSC design. The efficiency layer may employ machine learning techniques at scale to realize hardware/software co-optimizations as a part of the autonomous WSC design.

Type: Application

Filed: January 15, 2021

Publication date: July 21, 2022

Inventors: David Lo, Liqun Cheng, Parthasarathy Ranganathan, Sundar Jayakumar Dev

ASYMMETRIC DATA COMMUNICATION FOR HOST-DEVICE INTERFACE

Publication number: 20220083493

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for performing asymmetric data communication at a host-device interface of a system. The methods include identifying devices coupled to a host of the system and generating a system topology that identifies a connectivity of the devices and identifies bus lanes that enable data transfers at the system. The host determines that a first connection between the host and a first device of the multiple devices has an asymmetric bandwidth requirement. The host configures a set of bus lanes of a data bus connecting the first device and the host to allocate a different number of the bus lanes to data egress from the host than to data ingress to the host. The bus lanes are configured to allocate the differing number of bus lanes based on the asymmetric bandwidth requirement of the first connection.

Type: Application

Filed: November 29, 2021

Publication date: March 17, 2022

Inventors: Nishant Patil, Liqun Cheng

HARDWARE-OPTIMIZED NEURAL ARCHITECTURE SEARCH

Publication number: 20220019869

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining an architecture for a task neural network that is configured to perform a particular machine learning task on a target set of hardware resources. When deployed on a target set of hardware, such as a collection of datacenter accelerators, the task neural network may be capable of performing the particular machine learning task with enhanced accuracy and speed.

Type: Application

Filed: September 30, 2020

Publication date: January 20, 2022

Inventors: Sheng Li, Norman Paul Jouppi, Quoc V. Le, Mingxing Tan, Ruoming Pang, Liqun Cheng, Andrew Li

Asymmetric data communication for host-device interface

Patent number: 11188494

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for performing asymmetric data communication at a host-device interface of a system. The methods include identifying devices coupled to a host of the system and generating a system topology that identifies a connectivity of the devices and identifies bus lanes that enable data transfers at the system. The host determines that a first connection between the host and a first device of the multiple devices has an asymmetric bandwidth requirement. The host configures a set of bus lanes of a data bus connecting the first device and the host to allocate a different number of the bus lanes to data egress from the host than to data ingress to the host. The bus lanes are configured to allocate the differing number of bus lanes based on the asymmetric bandwidth requirement of the first connection.

Type: Grant

Filed: July 29, 2019

Date of Patent: November 30, 2021

Assignee: Google LLC

Inventors: Nishant Patil, Liqun Cheng

MANAGING PROCESSING SYSTEM EFFICIENCY

Publication number: 20210224129

Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.

Type: Application

Filed: January 29, 2021

Publication date: July 22, 2021

Inventors: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil

JOB SCHEDULING ON DISTRIBUTED COMPUTING DEVICES

Publication number: 20210073028

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented as a computational graph on a distributed computing network. A method includes: receiving data representing operations to be executed in order to perform a job on a plurality of hardware accelerators of a plurality of different accelerator types; generating, for the job and from at least the data representing the operations, features that represent a predicted performance for the job on hardware accelerators of the plurality of different accelerator types; generating, from the features, a respective predicted performance metric for the job for each of the plurality of different accelerator types according to a performance objective function; and providing, to a scheduling system, one or more recommendations for scheduling the job on one or more recommended types of hardware accelerators.

Type: Application

Filed: October 11, 2019

Publication date: March 11, 2021

Inventors: Sheng Li, Brian Zhang, Liqun Cheng, Norman Paul Jouppi, Yun Ni
