Patents by Inventor Yujie Hu

Yujie Hu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11106431
    Abstract: A computing device that implements a fast floating-point adder tree for neural network applications is disclosed. The fast floating-point adder tree comprises a data preparation module, a fast fixed-point Carry-Save Adder (CSA) tree, and a normalization module. The floating-point input data comprise a sign bit, an exponent part, and a fraction part. The data preparation module aligns the fraction parts of the input data and prepares the input data for subsequent processing. The fast adder uses a signed fixed-point CSA tree to quickly reduce a large number of fixed-point values to 2 output values and then uses a normal adder to add the 2 output values into one output value. The fast adder for a large number of operands is built from multiple levels of fast adders for a small number of operands. The output from the signed fixed-point Carry-Save Adder tree is converted to a selected floating-point format.
    Type: Grant
    Filed: February 22, 2020
    Date of Patent: August 31, 2021
    Assignee: DINOPLUSAI HOLDINGS LIMITED
    Inventors: Yutian Feng, Yujie Hu
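The carry-save reduction described in the abstract above can be sketched in software. This is a minimal illustrative model, not the patented circuit; the function names and the 32-bit width are assumptions. Each tree level applies 3:2 compressors until only two values remain, and a normal adder merges those two at the end.

```python
def csa_3to2(a, b, c, mask):
    """One 3:2 compressor: three operands in, (partial sum, shifted carry) out."""
    partial_sum = (a ^ b ^ c) & mask
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return partial_sum, carry

def csa_tree_sum(values, width=32):
    """Reduce many fixed-point operands to two via CSA tree levels,
    then merge the final two values with a normal adder."""
    mask = (1 << width) - 1
    ops = [v & mask for v in values]
    while len(ops) > 2:
        reduced = []
        # Compress each group of three operands into two.
        for i in range(0, len(ops) - len(ops) % 3, 3):
            reduced.extend(csa_3to2(ops[i], ops[i + 1], ops[i + 2], mask))
        reduced.extend(ops[len(ops) - len(ops) % 3:])  # pass leftovers down
        ops = reduced
    return sum(ops) & mask  # the "normal adder" stage
```

Because each 3:2 step preserves the sum modulo 2^width, the result matches ordinary two's-complement addition of all operands.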
  • Publication number: 20210264257
    Abstract: An AI (Artificial Intelligence) processor for Neural Network (NN) processing shared by multiple users is disclosed. The AI processor comprises a Multiplier Unit (MXU), a Scalar Computing Unit (SCU), a unified buffer coupled to the MXU and the SCU to store data, and control circuitry coupled to the MXU, the SCU, and the unified buffer. The MXU comprises a plurality of Processing Elements (PEs) responsible for computing matrix multiplications. The SCU, coupled to the output of the MXU, is responsible for computing the activation function. The control circuitry is configured to perform space-division and time-division NN processing for a plurality of users. At one time instance, at least one of the MXU and the SCU is shared by two or more users, and at least one user is using a part of the MXU while another user is using a part of the SCU.
    Type: Application
    Filed: February 28, 2019
    Publication date: August 26, 2021
    Inventors: Yujie HU, Xiaosong WANG, Tong WU, Steven SERTILLANGE
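The space-division sharing at a single time instance described above can be illustrated with a small scheduling sketch. This is purely illustrative — the unit names and the `assign_partitions` helper are assumptions, not the patent's design: each unit's elements are split among users, so one user can occupy part of the MXU while another occupies part of the SCU.

```python
def assign_partitions(units, demands):
    """Space-division sharing at one time instance: split each hardware
    unit's elements (e.g. the MXU's PEs) among the users requesting them."""
    schedule = {}
    for unit, total in units.items():
        next_free = 0
        for user, count in demands.get(unit, []):
            # Give this user a contiguous slice of the unit's elements.
            schedule[(unit, user)] = range(next_free, next_free + count)
            next_free += count
        assert next_free <= total, f"{unit} is oversubscribed"
    return schedule
```

Time division would then amount to recomputing such a schedule at each time instance.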
  • Publication number: 20210141697
    Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes multiple types of HEs (hardware elements) comprising one or more HEs configured to perform operations associated with multi-layer NN (neural network) processing, at least one spare HE, a data buffer to store correctly computed data from a previous layer of the multi-layer NN processing, and fault-tolerance (FT) control logic. The FT control logic is configured to: determine a fault in the current-layer NN processing associated with an HE; cause the correctly computed data from the previous layer of the multi-layer NN processing to be copied or moved to said at least one spare HE; and cause said at least one spare HE to perform the current-layer NN processing using the correctly computed data from the previous layer.
    Type: Application
    Filed: February 25, 2019
    Publication date: May 13, 2021
    Inventors: Chung Kuang CHIN, Yujie HU, Tong WU, Clifford GOLD, Yick Kei WONG, Xiaosong WANG, Steven SERTILLANGE, Zongwei ZHU
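The fault-tolerance flow in the abstract above reduces to a simple control-flow sketch. This is an illustrative software model under assumed names (`run_layers`, an exception as the fault signal), not the MAIP hardware: because the output of the previous layer is buffered, only the failed layer needs to be redone on the spare hardware element.

```python
def run_layers(layers, x, primary_he, spare_he):
    """Run multi-layer NN processing with one spare hardware element (HE).
    On a fault, redo only the failed layer on the spare HE, starting from
    the buffered (correctly computed) output of the previous layer."""
    buffered = x  # data buffer: last correctly computed layer output
    for layer in layers:
        try:
            buffered = primary_he(layer, buffered)
        except RuntimeError:                 # fault detected in current layer
            buffered = spare_he(layer, buffered)
    return buffered
```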
  • Publication number: 20200272417
    Abstract: A computing device that implements a fast floating-point adder tree for neural network applications is disclosed. The fast floating-point adder tree comprises a data preparation module, a fast fixed-point Carry-Save Adder (CSA) tree, and a normalization module. The floating-point input data comprise a sign bit, an exponent part, and a fraction part. The data preparation module aligns the fraction parts of the input data and prepares the input data for subsequent processing. The fast adder uses a signed fixed-point CSA tree to quickly reduce a large number of fixed-point values to 2 output values and then uses a normal adder to add the 2 output values into one output value. The fast adder for a large number of operands is built from multiple levels of fast adders for a small number of operands. The output from the signed fixed-point Carry-Save Adder tree is converted to a selected floating-point format.
    Type: Application
    Filed: February 22, 2020
    Publication date: August 27, 2020
    Inventors: Yutian Feng, Yujie Hu
  • Patent number: 10747631
    Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes an instruction buffer, processing circuitry, a data buffer, command circuitry, and communication circuitry. During operation, the instruction buffer stores a first hardware instruction and a second hardware instruction. The processing circuitry executes the first hardware instruction, which computes an intermediate stage of an AI model. The data buffer stores data generated from executing the first hardware instruction. The command circuitry determines that the second hardware instruction is a hardware-initiated store instruction for transferring the data from the data buffer. Based on the hardware-initiated store instruction, the communication circuitry transfers the data from the data buffer to a memory device of a computing system, which includes the mission-critical processor, via a communication interface.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: August 18, 2020
    Assignee: DINOPLUSAI HOLDINGS LIMITED
    Inventors: Yujie Hu, Tong Wu, Xiaosong Wang, Zongwei Zhu, Chung Kuang Chin, Clifford Gold, Steven Sertillange, Yick Kei Wong
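The hardware-initiated store described above can be modeled as a tiny instruction loop. This is a minimal sketch with an assumed instruction encoding (`("compute", operand)` and `("hw_store",)` tuples), not the actual MAIP instruction set: compute results accumulate in the data buffer until a hardware-initiated store drains them to host memory over the communication interface, with no software store issued.

```python
def run_instructions(instructions, execute):
    """Execute a stream of hardware instructions. A hardware-initiated
    store transfers the data buffer to host memory without software help."""
    data_buffer, host_memory = [], []
    for ins in instructions:
        if ins[0] == "compute":          # intermediate stage of the AI model
            data_buffer.append(execute(ins[1]))
        elif ins[0] == "hw_store":       # hardware-initiated store
            host_memory.extend(data_buffer)
            data_buffer.clear()
    return data_buffer, host_memory
```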
  • Publication number: 20190279083
    Abstract: A computing device for fast weighted-sum calculation in neural networks is disclosed. The computing device comprises an array of processing elements configured to accept an input array. Each processing element comprises a plurality of multipliers and multiple levels of accumulators. A set of weights associated with the inputs and a target output are provided to a target processing element to compute the weighted sum for the target output. The device according to the present invention reduces the computation time from M clock cycles to log2(M) clock cycles, where M is the size of the input array.
    Type: Application
    Filed: April 19, 2018
    Publication date: September 12, 2019
    Inventors: Cliff Gold, Tong Wu, Yujie Hu, Chung Kuang Chin, Xiaosong Wang, Yick Kei Wong
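The log2(M) figure above follows from a balanced reduction tree: all M multiplications happen in parallel, then each accumulator level halves the number of partial sums per clock. A minimal sketch (the function name and the power-of-two restriction are assumptions for the illustration):

```python
def weighted_sum_tree(inputs, weights):
    """Weighted sum via a balanced accumulator tree. All M products are
    formed in parallel; each subsequent level (one clock cycle) halves the
    number of partial sums, so M values need log2(M) accumulation cycles."""
    m = len(inputs)
    assert m == len(weights) and m & (m - 1) == 0, "M must be a power of two"
    level = [x * w for x, w in zip(inputs, weights)]  # parallel multipliers
    cycles = 0
    while len(level) > 1:
        # One accumulator level: add neighbors pairwise.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        cycles += 1
    return level[0], cycles
```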
  • Publication number: 20190227887
    Abstract: Embodiments described herein provide a mission-critical artificial intelligence (AI) processor (MAIP), which includes an instruction buffer, processing circuitry, a data buffer, command circuitry, and communication circuitry. During operation, the instruction buffer stores a first hardware instruction and a second hardware instruction. The processing circuitry executes the first hardware instruction, which computes an intermediate stage of an AI model. The data buffer stores data generated from executing the first hardware instruction. The command circuitry determines that the second hardware instruction is a hardware-initiated store instruction for transferring the data from the data buffer. Based on the hardware-initiated store instruction, the communication circuitry transfers the data from the data buffer to a memory device of a computing system, which includes the mission-critical processor, via a communication interface.
    Type: Application
    Filed: June 5, 2018
    Publication date: July 25, 2019
    Applicant: DinoplusAI Holdings Limited
    Inventors: Yujie Hu, Tong Wu, Xiaosong Wang, Zongwei Zhu, Chung Kuang Chin, Clifford Gold, Steven Sertillange, Yick Kei Wong
  • Publication number: 20080127162
    Abstract: A cross-platform configuration system manages configuration information for application software. In one embodiment, a process includes, but is not limited to, storing configuration information in a configuration file using a cross-platform markup language, the configuration information including configuration data associated with the operating environment and user data associated with the application, and configuring the application by accessing the configuration file without using a registry of an operating environment in which the application is running.
    Type: Application
    Filed: November 29, 2006
    Publication date: May 29, 2008
    Inventors: Kui Xu, Yujie Hu, Ting Wang
  • Patent number: D1065059
    Type: Grant
    Filed: November 18, 2022
    Date of Patent: March 4, 2025
    Inventor: Yujie Hu