Patents by Inventor Yakun Shao

Yakun Shao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966835
    Abstract: A sparse convolutional neural network accelerator system that dynamically and efficiently identifies fine-grained parallelism in sparse convolution operations. The system determines matching pairs of non-zero input activations and weights from the compacted input activation and weight arrays utilizing a scalable, dynamic parallelism discovery unit (PDU) that performs a parallel search on the input activation array and the weight array to identify reducible input activation and weight pairs. (A simplified software sketch of this non-zero pairing step appears after the listing below.)
    Type: Grant
    Filed: January 23, 2019
    Date of Patent: April 23, 2024
    Assignee: NVIDIA Corp.
    Inventors: Ching-En Lee, Yakun Shao, Angshuman Parashar, Joel Emer, Stephen W. Keckler
  • Patent number: 11769040
    Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture implemented on a semiconductor package. The package includes multiple chips, each with a central processing element, a global memory buffer, and processing elements. Each processing element includes a weight buffer, an activation buffer, and multiply-accumulate units to combine, in parallel, the weight values and the activation values. (A simplified sketch of such a processing element's multiply-accumulate datapath appears after the listing below.)
    Type: Grant
    Filed: July 19, 2019
    Date of Patent: September 26, 2023
    Assignee: NVIDIA Corp.
    Inventors: Yakun Shao, Rangharajan Venkatesan, Nan Jiang, Brian Matthew Zimmer, Jason Clemons, Nathaniel Pinckney, Matthew R. Fojtik, William James Dally, Joel S. Emer, Stephen W. Keckler, Brucek Khailany
  • Publication number: 20220076110
    Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
    Type: Application
    Filed: November 19, 2021
    Publication date: March 10, 2022
    Applicant: NVIDIA Corp.
    Inventors: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
  • Patent number: 11270197
    Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
    Type: Grant
    Filed: November 4, 2019
    Date of Patent: March 8, 2022
    Assignee: NVIDIA Corp.
    Inventors: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
  • Publication number: 20200293867
    Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
    Type: Application
    Filed: November 4, 2019
    Publication date: September 17, 2020
    Applicant: NVIDIA Corp.
    Inventors: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
  • Publication number: 20200082246
    Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture implemented on a semiconductor package. The package includes multiple chips, each with a central processing element, a global memory buffer, and processing elements. Each processing element includes a weight buffer, an activation buffer, and multiply-accumulate units to combine, in parallel, the weight values and the activation values.
    Type: Application
    Filed: July 19, 2019
    Publication date: March 12, 2020
    Applicant: NVIDIA Corp.
    Inventors: Yakun Shao, Rangharajan Venkatesan, Nan Jiang, Brian Matthew Zimmer, Jason Clemons, Nathaniel Pinckney, Matthew R. Fojtik, William James Dally, Joel S. Emer, Stephen W. Keckler, Brucek Khailany
  • Publication number: 20190370645
    Abstract: A sparse convolutional neural network accelerator system that dynamically and efficiently identifies fine-grained parallelism in sparse convolution operations. The system determines matching pairs of non-zero input activations and weights from the compacted input activation and weight arrays utilizing a scalable, dynamic parallelism discovery unit (PDU) that performs a parallel search on the input activation array and the weight array to identify reducible input activation and weight pairs.
    Type: Application
    Filed: January 23, 2019
    Publication date: December 5, 2019
    Inventors: Ching-En Lee, Yakun Shao, Angshuman Parashar, Joel Emer, Stephen W. Keckler
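
The pairing step described in the sparse-accelerator entries above (patent 11966835 and publication 20190370645) can be illustrated in software. The sketch below is only an analogue, assuming a dot-product-style match on a shared index between compacted activation and weight arrays; the patented parallelism discovery unit performs this search in parallel in hardware, and its exact matching rule may differ.

```python
"""
Minimal software analogue of pairing non-zero activations with non-zero
weights from compacted (compressed) arrays, in the spirit of the sparse
convolution accelerator abstract above. This is an illustrative sketch,
not the patented hardware PDU: the real unit performs the search in
parallel in silicon, and the index-matching rule shown here (a shared
position index for a dot product) is an assumption.
"""

def compact(dense):
    """Compress a dense vector into (index, value) pairs for its non-zeros."""
    return [(i, v) for i, v in enumerate(dense) if v != 0]

def sparse_dot(compact_acts, compact_wts):
    """Find 'reducible' pairs -- entries sharing an index -- and reduce them.

    A hardware PDU would compare many index pairs per cycle; the dictionary
    lookup below is a sequential stand-in for that parallel search.
    """
    wt_by_index = {i: w for i, w in compact_wts}
    total = 0
    for i, a in compact_acts:
        if i in wt_by_index:          # matching pair of non-zeros found
            total += a * wt_by_index[i]
    return total

if __name__ == "__main__":
    activations = [0, 3, 0, 0, 2, 0, 1, 0]
    weights     = [1, 0, 0, 4, 5, 0, 2, 0]
    a_c, w_c = compact(activations), compact(weights)
    # Only indices 4 and 6 are non-zero in both arrays: 2*5 + 1*2 = 12
    print(sparse_dot(a_c, w_c))       # -> 12
```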
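
The tile-based entries above describe processing elements with local weight and activation buffers feeding vector multiply-accumulate units. The sketch below is a minimal software model of one such processing element, assuming weight-stationary scheduling and illustrative buffer sizes; it is not taken from the patent claims.

```python
"""
Simplified software model of one processing element (PE) from the
tile-based architecture described above: a local weight buffer, a local
activation buffer fed from a global buffer, and vector multiply-accumulate
(MAC) lanes. The scheduling and sizes are assumptions for illustration.
"""

class ProcessingElement:
    def __init__(self, weights):
        # Weights stay resident in the PE's weight buffer ("stationary"),
        # one row of weights per output handled by this PE.
        self.weight_buffer = [list(row) for row in weights]

    def vector_mac(self, activations):
        """Multiply-accumulate one activation vector against every weight row.

        In hardware the lanes operate in parallel; here each inner loop
        models one vector MAC lane.
        """
        outputs = []
        for row in self.weight_buffer:
            acc = 0
            for w, a in zip(row, activations):
                acc += w * a
            outputs.append(acc)
        return outputs

if __name__ == "__main__":
    # A global buffer streams activation tiles to the PE while the
    # weights remain stationary in the PE's local buffer.
    pe = ProcessingElement(weights=[[1, 2, 3], [4, 5, 6]])
    for activation_tile in ([1, 0, 2], [0, 1, 1]):
        print(pe.vector_mac(activation_tile))
    # -> [7, 16] then [5, 11]
```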