Patents by Inventor Yuanwei Fang

Yuanwei Fang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Performing a top-k function using a binary heap tree

Patent number: 11954093

Abstract: Embodiments of the disclosure provide devices and methods for performing a top-k function. The device can include: a memory comprising a plurality of register files for storing the data elements, the plurality of register files comprising a parent register file and a first child register file associated with the parent register file, wherein the parent register file is associated with: first interface circuitry configured for reading a first parent data element from the parent register file and receiving a first child data element and a second child data element from the first child register file; and first comparison circuitry configured for updating the parent register file and the first child register file based on the first parent data element, the first child data element, and the second child data element according to a given principle.

Type: Grant

Filed: June 4, 2020

Date of Patent: April 9, 2024

Assignee: Alibaba Group Holding Limited

Inventors: Fei Sun, Shuangchen Li, Dimin Niu, Fei Xue, Yuanwei Fang
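The abstract above describes parent and child register files whose elements are compared and updated "according to a given principle." A rough software analogy, assuming that principle is the min-heap property, is sketched below. The function names `sift_down` and `top_k` are illustrative and do not come from the patent, and the sketch does not model the register files or the interface and comparison circuitry.

```python
def sift_down(heap, i):
    """Restore the min-heap property in the subtree rooted at index i."""
    n = len(heap)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        # Compare the parent with its two children (when they exist).
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            return
        # Swap the parent with the smaller child and continue downward.
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest


def top_k(values, k):
    """Return the k largest values in descending order (assumed min-heap rule)."""
    if k <= 0:
        return []
    heap = []
    for v in values:
        if len(heap) < k:
            heap.append(v)
            if len(heap) == k:
                # Heapify once, as soon as the heap holds k elements.
                for i in range(k // 2 - 1, -1, -1):
                    sift_down(heap, i)
        elif v > heap[0]:
            # The root is the smallest of the current top k; replace it.
            heap[0] = v
            sift_down(heap, 0)
    return sorted(heap, reverse=True)
```

Keeping only k elements in the heap means each incoming value costs at most one root replacement and one pass down the tree, which is the same parent-versus-two-children comparison the abstract assigns to the comparison circuitry.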
METHOD AND SYSTEM FOR PROGRAM SAMPLING USING NEURAL NETWORK

Publication number: 20230394300

Abstract: This application describes methods, systems, and apparatus, for neural network-based program sampling (NPS). An example device may obtain an assembly code of a program and an execution trace of the program, and divide the assembly code into a plurality of execution intervals. The device may construct a plurality of code graphs respectively corresponding to the plurality of execution intervals, and for each of the plurality of code graphs: generate a plurality of graph snapshots based on the code graph and the execution trace of the program; embed, by using a Graph Neural Network, the plurality of graph snapshots into a plurality of vectors; and aggregate the plurality of vectors into an execution embedding. The device may cluster the plurality of execution embeddings into a plurality of clusters and select representative execution intervals of the program based on the plurality of clusters for execution.

Type: Application

Filed: October 28, 2022

Publication date: December 7, 2023

Inventors: Yuanwei FANG, Jian CHEN, Yen-Kuang CHEN, Yuan XIE

Scalable system-in-package architectures

Patent number: 11704271

Abstract: A system-in-package architecture in accordance with aspects includes a logic die and one or more memory dice coupled together in a three-dimensional stack. The logic die can include one or more global building blocks and a plurality of local building blocks. The number of local building blocks can be scalable. The local building blocks can include a plurality of engines and memory controllers. The memory controllers can be configured to directly couple one or more of the engines to the one or more memory dice. The number and type of local building blocks, and the number and types of engines and memory controllers can be scalable.

Type: Grant

Filed: August 20, 2020

Date of Patent: July 18, 2023

Assignee: Alibaba Group Holding Limited

Inventors: Lide Duan, Wei Han, Yuhao Wang, Fei Xue, Yuanwei Fang, Hongzhong Zheng

Apparatuses and methods for map reduce

Patent number: 11500811

Abstract: The present disclosure relates to a method and an apparatus for map reduce. In some embodiments, an exemplary processing unit includes: a 2-dimensional (2D) processing element (PE) array comprising a plurality of PEs, each PE comprising a first input and a second input, the first inputs of the PEs in a linear array in a first dimension of the PE array being connected in series and the second inputs of the PEs in a linear array in a second dimension of the PE array being connected in parallel, each PE being configured to perform an operation on data from the first input or second input; and a plurality of reduce tree units, each reduce tree unit being coupled with the PEs in a linear array in the first dimension or the second dimension of the PE array and configured to perform a first reduction operation.

Type: Grant

Filed: June 12, 2020

Date of Patent: November 15, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Yuanwei Fang, Tae Meon Bae, Sicheng Li, Minghai Qin, Guanlin Wu, Yen-kuang Chen
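The map-reduce abstract above pairs a per-element operation in each processing element (PE) with reduce tree units attached to a linear array of PEs. A minimal software sketch of that pattern follows. The names `tree_reduce` and `map_reduce_rows` are hypothetical, the second input is assumed to be shared by every row, and the serial and parallel wiring of the inputs is not modeled.

```python
def tree_reduce(values, op):
    """Combine values pairwise, level by level, like a reduction tree."""
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    while len(level) > 1:
        # Combine neighbouring pairs to form the next level of the tree.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired element passes through unchanged
        level = nxt
    return level[0]


def map_reduce_rows(grid, shared, pe_op, reduce_op):
    """Apply pe_op(cell, shared_value) in every PE, then tree-reduce each row."""
    return [
        tree_reduce([pe_op(x, s) for x, s in zip(row, shared)], reduce_op)
        for row in grid
    ]


if __name__ == "__main__":
    from operator import add, mul

    # Each row is multiplied element-wise by the shared input, then summed.
    print(map_reduce_rows([[1, 2, 3], [4, 5, 6]], [1, 0, 2], mul, add))
```

With multiplication as the PE operation and addition as the reduction, each row computes a dot product, which is one common use of this kind of array.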
Method and system for processing video content

Patent number: 11445200

Abstract: Embodiments of the disclosure provide systems and methods for processing video content. The method can include: receiving raw video data of a video; determining a texture complexity for the video based on the raw video data; determining an encoding mode for the raw video data based on the texture complexity; and encoding the raw video data using the determined encoding mode.

Type: Grant

Filed: May 12, 2020

Date of Patent: September 13, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Minghai Qin, Guanlin Wu, Tae Meon Bae, Sicheng Li, Yuanwei Fang, Yen-kuang Chen
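The video-processing abstract above lists three steps: measure texture complexity from raw video data, pick an encoding mode from that measure, then encode. The sketch below illustrates only the first two steps. The complexity metric (mean absolute difference between adjacent pixels), the threshold, and the mode names are all assumptions made for illustration, since the abstract does not specify any of them.

```python
def texture_complexity(frame):
    """Mean absolute difference between adjacent pixels (assumed metric).

    `frame` is a list of rows of grayscale pixel values.
    """
    height, width = len(frame), len(frame[0])
    total, count = 0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # horizontal neighbour
                total += abs(frame[y][x] - frame[y][x + 1])
                count += 1
            if y + 1 < height:  # vertical neighbour
                total += abs(frame[y][x] - frame[y + 1][x])
                count += 1
    return total / count if count else 0.0


def choose_encoding_mode(frames, threshold=10.0):
    """Pick a hypothetical mode name from the average complexity of the frames."""
    score = sum(texture_complexity(f) for f in frames) / len(frames)
    return "high_detail" if score >= threshold else "low_detail"


if __name__ == "__main__":
    flat = [[50, 50], [50, 50]]
    checker = [[0, 255], [255, 0]]
    print(choose_encoding_mode([flat]), choose_encoding_mode([checker]))
```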
Bit-packed array processing using SIMD

Patent number: 11442729

Abstract: A method and system for processing a bit-packed array using one or more processors, including determining a data element size of the bit-packed array, determining a lane configuration of a single-instruction multiple-data (SIMD) unit for processing the bit-packed array based at least in part on the determined data element size, the lane configuration being determined from among a plurality of candidate lane configurations, each candidate lane configuration having a different number of vector register lanes and a corresponding bit capacity per vector register lane, configuring the SIMD unit according to the determined lane configuration, and loading one or more data elements into each vector register lane of the SIMD unit. SIMD instructions may be executed on the loaded one or more data elements of each vector register lane in parallel, and a result of the SIMD instruction may be stored in memory.

Type: Grant

Filed: October 26, 2020

Date of Patent: September 13, 2022

Assignee: Google LLC

Inventors: Junwhan Ahn, Jichuan Chang, Andrew McCormick, Yuanwei Fang, Yixin Luo
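The bit-packed array abstract above selects a lane configuration from several candidates based on the element size, then loads elements into the lanes. A plain-Python sketch of that selection follows. The 128-bit register width, the candidate lane widths, the least-significant-bit-first packing order, and all function names are assumptions. The sketch places one element in each lane, whereas the abstract allows one or more.

```python
REGISTER_BITS = 128                    # assumed vector register width
CANDIDATE_LANE_BITS = (8, 16, 32, 64)  # assumed candidate lane widths


def choose_lane_config(element_bits):
    """Return (number of lanes, bits per lane) for the narrowest lane that fits."""
    for lane_bits in CANDIDATE_LANE_BITS:
        if element_bits <= lane_bits:
            return REGISTER_BITS // lane_bits, lane_bits
    raise ValueError("element does not fit in any candidate lane")


def unpack(packed, element_bits, count):
    """Extract `count` elements from an integer that packs them LSB first."""
    mask = (1 << element_bits) - 1
    return [(packed >> (i * element_bits)) & mask for i in range(count)]


def load_lanes(packed, element_bits, count):
    """Group the unpacked elements into register-sized batches, one per lane."""
    num_lanes, _ = choose_lane_config(element_bits)
    elements = unpack(packed, element_bits, count)
    return [elements[i:i + num_lanes] for i in range(0, count, num_lanes)]


if __name__ == "__main__":
    values = [3, 1, 7, 2]  # 3-bit elements
    packed = sum(v << (3 * i) for i, v in enumerate(values))
    print(choose_lane_config(3), load_lanes(packed, 3, len(values)))
```

Choosing the narrowest lane that still holds an element maximizes the number of elements processed per instruction, which is presumably why the element size drives the choice of configuration.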
Method and system for compiler optimization based on artificial intelligence

Patent number: 11403090

Abstract: This application describes methods, systems, and apparatus, including computer programs encoded on computer storage media, of an AI-assisted compiler. An example method includes obtaining intermediate code and executable code generated by compiling a computer program with a compiler; determining a reward based on one or more traces obtained by executing the executable code in a runtime system; generating an embedding vector based on the intermediate code and the one or more traces to represent code execution states; determining, using a reinforcement learning agent, one or more optimization actions based on the embedding vector and the reward; and updating the compiler by applying the one or more optimization actions.

Type: Grant

Filed: December 8, 2020

Date of Patent: August 2, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Yuanwei Fang, Yen-kuang Chen

METHOD AND SYSTEM FOR MICROARCHITECTURE-AWARE PROGRAM SAMPLING

Publication number: 20220215241

Abstract: This application describes methods, systems, and apparatus, including computer programs encoded on computer storage media, for microarchitecture-aware program sampling. An exemplary method includes receiving one or more traces collected from one or more microarchitectures executing a computer program for evaluating hardware configurations; training a machine learning (ML) model with multi-task learning based on the one or more traces as one or more training tasks; generating a plurality of embedded vectors representing the computer program; and updating, based on the trained ML model, the plurality of embedded vectors.

Type: Application

Filed: January 5, 2021

Publication date: July 7, 2022

Inventors: Yuanwei FANG, Minghai QIN, Yen-kuang CHEN

METHOD AND SYSTEM FOR COMPILER OPTIMIZATION BASED ON ARTIFICIAL INTELLIGENCE

Publication number: 20220179635

Abstract: This application describes methods, systems, and apparatus, including computer programs encoded on computer storage media, of an AI-assisted compiler. An example method includes obtaining intermediate code and executable code generated by compiling a computer program with a compiler; determining a reward based on one or more traces obtained by executing the executable code in a runtime system; generating an embedding vector based on the intermediate code and the one or more traces to represent code execution states; determining, using a reinforcement learning agent, one or more optimization actions based on the embedding vector and the reward; and updating the compiler by applying the one or more optimization actions.

Type: Application

Filed: December 8, 2020

Publication date: June 9, 2022

Inventors: Yuanwei FANG, Yen-kuang CHEN

Bit-Packed Array Processing Using SIMD

Publication number: 20220129269

Abstract: A method and system for processing a bit-packed array using one or more processors, including determining a data element size of the bit-packed array, determining a lane configuration of a single-instruction multiple-data (SIMD) unit for processing the bit-packed array based at least in part on the determined data element size, the lane configuration being determined from among a plurality of candidate lane configurations, each candidate lane configuration having a different number of vector register lanes and a corresponding bit capacity per vector register lane, configuring the SIMD unit according to the determined lane configuration, and loading one or more data elements into each vector register lane of the SIMD unit. SIMD instructions may be executed on the loaded one or more data elements of each vector register lane in parallel, and a result of the SIMD instruction may be stored in memory.

Type: Application

Filed: October 26, 2020

Publication date: April 28, 2022

Applicant: Google LLC

Inventors: Junwhan Ahn, Jichuan Chang, Andrew McCormick, Yuanwei Fang, Yixin Luo

INTELLIGENT COMPUTING RESOURCES ALLOCATION FOR FEATURE NETWORK BASED ON FEATURE PROPAGATION

Publication number: 20220103831

Abstract: The present disclosure relates to a method for scheduling computation resources for generating feature maps for video. The method comprises determining runtime for generating feature maps of a reference picture and a predicted picture, determining available computation resources for generating the feature maps, and allocating, based on the runtime, one or more computation resources among the available computation resources for generating the feature maps such that the feature maps are generated at regular time intervals.

Type: Application

Filed: September 30, 2020

Publication date: March 31, 2022

Inventors: Sicheng Li, Yuanwei Fang, Minghai Qin, Yen-kuang Chen

Region of interest quality controllable video coding techniques

Patent number: 11277626

Abstract: Video coding techniques including differential bit rate or quality coding of one or more regions of interest and one or more non-regions of interest based on information including one or more of coordinates of the one or more regions of interest, a target complexity, residual encoder bit data, a requested quality, a difference between the current video data frame and a reconstructed video data frame, a target quality, a requested bit rate, frame target bit allocation and an as encoded bit rate.

Type: Grant

Filed: February 21, 2020

Date of Patent: March 15, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Guanlin Wu, Minghai Qin, Tae Meon Bae, Sicheng Li, Yuanwei Fang, Yen-Kuang Chen

USING TAGGED INSTRUCTION EXTENSION TO EXPRESS DEPENDENCY FOR MEMORY-BASED ACCELERATOR INSTRUCTIONS

Publication number: 20220058024

Abstract: A method of performing out-of-order execution in a processing system comprising a processing unit and one or more accelerators comprises dispatching a plurality of coarse-grained instructions, each instruction extended to comprise one or more tags, wherein each tag comprises dependency information for the respective instruction expressed at a coarse-grained level. The method also comprises translating the plurality of coarse-grained instructions into a plurality of fine-grained instructions, wherein the dependency information is translated into dependencies expressed at a fine-grained level. Further, the method comprises resolving the dependencies at the fine-grained level and scheduling the plurality of fine-grained instructions for execution across the one or more accelerators in the processing system.

Type: Application

Filed: August 18, 2020

Publication date: February 24, 2022

Inventors: Yuanwei FANG, Fei SUN, Fei XUE, Yuejian XIE, Yuhao WANG, Yen-Kuang CHEN

SCALABLE SYSTEM-IN-PACKAGE ARCHITECTURES

Publication number: 20220058150

Abstract: A system-in-package architecture in accordance with aspects includes a logic die and one or more memory dice coupled together in a three-dimensional stack. The logic die can include one or more global building blocks and a plurality of local building blocks. The number of local building blocks can be scalable. The local building blocks can include a plurality of engines and memory controllers. The memory controllers can be configured to directly couple one or more of the engines to the one or more memory dice. The number and type of local building blocks, and the number and types of engines and memory controllers can be scalable.

Type: Application

Filed: August 20, 2020

Publication date: February 24, 2022

Inventors: Lide DUAN, Wei HAN, Yuhao WANG, Fei XUE, Yuanwei FANG, Hongzhong ZHENG

SYSTEMS AND METHODS TO ENCODE REGIONS-OF-INTEREST BASED ON VIDEO CONTENT DETECTION

Publication number: 20220021888

Abstract: Video coding techniques including variable bitrate encoding based on regions-of-interest (ROIs) and the type of the video content, the type of sets of frames of the video content, the type of scenes of the video content, or the like.

Type: Application

Filed: July 16, 2020

Publication date: January 20, 2022

Inventors: Minghai QIN, Yen-kuang CHEN, Tae Meon BAE, Guanlin WU, Yuanwei FANG, Sicheng LI

APPARATUSES AND METHODS FOR MAP REDUCE

Publication number: 20210390076

Abstract: The present disclosure relates to a method and an apparatus for map reduce. In some embodiments, an exemplary processing unit includes: a 2-dimensional (2D) processing element (PE) array comprising a plurality of PEs, each PE comprising a first input and a second input, the first inputs of the PEs in a linear array in a first dimension of the PE array being connected in series and the second inputs of the PEs in a linear array in a second dimension of the PE array being connected in parallel, each PE being configured to perform an operation on data from the first input or second input; and a plurality of reduce tree units, each reduce tree unit being coupled with the PEs in a linear array in the first dimension or the second dimension of the PE array and configured to perform a first reduction operation.

Type: Application

Filed: June 12, 2020

Publication date: December 16, 2021

Inventors: Yuanwei FANG, Tae Meon BAE, Sicheng LI, Minghai QIN, Guanlin WU, Yen-kuang CHEN

SYSTEM AND METHOD FOR PERFORMING A TOP-K FUNCTION

Publication number: 20210382871

Abstract: Embodiments of the disclosure provide devices and methods for performing a top-k function. The device can include: a memory comprising a plurality of register files for storing the data elements, the plurality of register files comprising a parent register file and a first child register file associated with the parent register file, wherein the parent register file is associated with: first interface circuitry configured for reading a first parent data element from the parent register file and receiving a first child data element and a second child data element from the first child register file; and first comparison circuitry configured for updating the parent register file and the first child register file based on the first parent data element, the first child data element, and the second child data element according to a given principle.

Type: Application

Filed: June 4, 2020

Publication date: December 9, 2021

Inventors: Fei SUN, Shuangchen LI, Dimin NIU, Fei XUE, Yuanwei FANG

METHOD AND SYSTEM FOR PROCESSING VIDEO CONTENT

Publication number: 20210360258

Abstract: Embodiments of the disclosure provide systems and methods for processing video content. The method can include: receiving raw video data of a video; determining a texture complexity for the video based on the raw video data; determining an encoding mode for the raw video data based on the texture complexity; and encoding the raw video data using the determined encoding mode.

Type: Application

Filed: May 12, 2020

Publication date: November 18, 2021

Inventors: Minghai QIN, Guanlin WU, Tae Meon BAE, Sicheng LI, Yuanwei FANG, Yen-kuang CHEN

REGION OF INTEREST QUALITY CONTROLLABLE VIDEO CODING TECHNIQUES

Publication number: 20210266570

Abstract: Video coding techniques including differential bit rate or quality coding of one or more regions of interest and one or more non-regions of interest based on information including one or more of coordinates of the one or more regions of interest, a target complexity, residual encoder bit data, a requested quality, a difference between the current video data frame and a reconstructed video data frame, a target quality, a requested bit rate, frame target bit allocation and an as encoded bit rate.

Type: Application

Filed: February 21, 2020

Publication date: August 26, 2021

Inventors: Guanlin WU, Minghai QIN, Tae Meon BAE, Sicheng LI, Yuanwei FANG, Yen-Kuang CHEN

COORDINATED APPLICATION FIREWALL

Publication number: 20180124018

Abstract: Aspects may relate to a server comprising: an interface to receive a service request; and a processor coupled to the interface to receive the service request, the processor configured to: implement a firewall appliance for the service request; operate a first micro-security application to generate an anomaly alert for the service request; and operate a second micro-security application to receive the anomaly alert from the first micro-security application or from another server's micro-security application and to determine whether the service request corresponds to a non-benign behavior.

Type: Application

Filed: December 22, 2016

Publication date: May 3, 2018

Inventors: Gheorghe Cascaval, Hui Chao, Mihai Christodorescu, Drew Dean, Dinakar Khurjati, Shuhua Ge, Hilmi Gunes Kayacik, Arun Raman, Ahmet Salih Buyukkayhan, Yuanwei Fang