Patents by Inventor Anbang Yao

Anbang Yao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11669718
    Abstract: Methods and apparatus for discriminative semantic transfer and physics-inspired optimization in deep learning are disclosed. A computation training method for a convolutional neural network (CNN) includes receiving a sequence of training images in the CNN of a first stage to describe objects of a cluttered scene as a semantic segmentation mask. The semantic segmentation mask is received in a semantic segmentation network of a second stage to produce semantic features. Using weights from the first stage as feature extractors and weights from the second stage as classifiers, edges of the cluttered scene are identified using the semantic features.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Anbang Yao, Hao Zhao, Ming Lu, Yiwen Guo, Yurong Chen
  • Publication number: 20230154092
    Abstract: Techniques are disclosed for providing improved pose tracking of a subject using a 2D camera and generating a 3D image that recreates the pose of the subject. A 3D skeleton map is estimated from a 2D skeleton map of the subject using, for example, a neural network. A template 3D skeleton map is accessed or generated having bone segments that have lengths set using, for instance, anthropometry statistics based on a given height of the template 3D skeleton map. An improved 3D skeleton map is then produced by at least retargeting one or more of the plurality of bone segments of the estimated 3D skeleton map to more closely match the corresponding template bone segments of the template 3D skeleton map. The improved 3D skeleton map can then be animated in various ways (e.g., using various skins or graphics) to track corresponding movements of the subject.
    Type: Application
    Filed: April 23, 2020
    Publication date: May 18, 2023
    Inventors: Shandong Wang, Yangyuxuan Kang, Anbang Yao, Ming Lu, Yurong Chen
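The bone retargeting step above can be illustrated with a short numerical example. The sketch below (plain NumPy; the joint ordering, parent indices, and template lengths are invented for illustration) rescales each estimated bone to the corresponding template length while keeping its direction, which is the general idea the abstract describes rather than the patented implementation.

```python
import numpy as np

def retarget_skeleton(est_joints, parents, template_lengths):
    """Rescale each bone of an estimated 3D skeleton to a template bone length.

    est_joints:       (J, 3) estimated 3D joint positions, root first.
    parents:          length-J array; parents[j] is the parent joint of j (-1 for root).
    template_lengths: length-J array; template_lengths[j] is the desired length of the
                      bone from parents[j] to j (ignored for the root).
    Returns a (J, 3) skeleton whose bone directions match the estimate but whose
    bone lengths match the template.
    """
    out = np.zeros_like(est_joints)
    out[0] = est_joints[0]                      # keep the root position
    for j in range(1, len(est_joints)):
        p = parents[j]
        bone = est_joints[j] - est_joints[p]    # estimated bone vector
        direction = bone / (np.linalg.norm(bone) + 1e-8)
        out[j] = out[p] + direction * template_lengths[j]
    return out

# Toy 4-joint chain: root -> spine -> neck -> head (hypothetical values).
est = np.array([[0, 0, 0], [0, 0.9, 0], [0, 1.7, 0.1], [0, 2.0, 0.2]], dtype=float)
parents = np.array([-1, 0, 1, 2])
template = np.array([0.0, 1.0, 0.8, 0.3])       # e.g. from anthropometry tables
print(retarget_skeleton(est, parents, template))
```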
  • Patent number: 11640526
    Abstract: Methods and apparatus are disclosed for enhancing a neural network using binary tensor and scale factor pairs. For one example, a method of optimizing a trained convolutional neural network (CNN) includes initializing an approximation residue as a trained weight tensor for the trained CNN. A plurality of binary tensors and scale factor pairs are determined. The approximation residue is updated using the binary tensors and scale factor pairs.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: Yiwen Guo, Anbang Yao, Hao Zhao, Ming Lu, Yurong Chen
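A common greedy way to realize the binary tensor and scale factor approximation described above is sketched below: the residue starts as the trained weight tensor, each step fits a sign tensor and a scalar to it, and the residue is reduced by the fitted term. This is an illustrative NumPy scheme, not the claimed procedure, and the number of pairs is an arbitrary choice.

```python
import numpy as np

def binary_approximation(weights, num_pairs=3):
    """Greedily approximate a weight tensor with binary tensor / scale factor pairs.

    Initializes the approximation residue as the trained weight tensor, then
    repeatedly fits a {+1, -1} tensor and a scalar to the residue and subtracts
    the fitted term.  Returns the list of (scale, binary_tensor) pairs.
    """
    residue = weights.astype(np.float64).copy()
    pairs = []
    for _ in range(num_pairs):
        binary = np.sign(residue)
        binary[binary == 0] = 1.0               # keep the tensor strictly binary
        scale = np.mean(np.abs(residue))        # least-squares scale for sign(residue)
        residue -= scale * binary               # update the approximation residue
        pairs.append((scale, binary))
    return pairs

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pairs = binary_approximation(w, num_pairs=3)
approx = sum(s * b for s, b in pairs)
print("relative reconstruction error:", np.linalg.norm(w - approx) / np.linalg.norm(w))
```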
  • Patent number: 11635943
    Abstract: Described herein is hardware acceleration of random number generation for machine learning and deep learning applications. An apparatus includes a uniform random number generator (URNG) circuit to generate uniform random numbers and an adder circuit coupled to the URNG circuit. The adder circuit accelerates generation of Gaussian random numbers for machine learning.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: April 25, 2023
    Assignee: Intel Corporation
    Inventors: Yiwen Guo, Anbang Yao, Dongqi Cai, Libin Wang, Lin Xu, Ping Hu, Shandong Wang, Wenhua Cheng
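An adder can turn uniform samples into approximately Gaussian ones via the central limit theorem, e.g. summing twelve U(0, 1) draws and subtracting 6. The sketch below shows that software analogue; whether the patented circuit uses this exact construction is an assumption.

```python
import numpy as np

def gaussian_from_uniform(n_samples, n_terms=12, seed=0):
    """Approximate standard-normal samples by summing uniform samples.

    Summing n_terms independent U(0, 1) draws and subtracting n_terms / 2 yields a
    variable with mean 0 and variance n_terms / 12, so n_terms = 12 gives an
    approximately standard normal output using only uniform generation and addition.
    """
    rng = np.random.default_rng(seed)           # stands in for the URNG circuit
    u = rng.random((n_samples, n_terms))        # uniform random numbers
    return u.sum(axis=1) - n_terms / 2.0        # the "adder" stage

samples = gaussian_from_uniform(100_000)
print("mean ~ 0:", samples.mean(), " var ~ 1:", samples.var())
```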
  • Publication number: 20230114725
    Abstract: Methods and systems are disclosed using an execution pipeline on a multi-processor platform for deep learning network execution. In one example, a network workload analyzer receives a workload, analyzes a computation distribution of the workload, and groups the network nodes into groups. A network executor assigns each group to a processing core of the multi-core platform so that the respective processing core handles the computation tasks of the received workload for its group.
    Type: Application
    Filed: August 15, 2022
    Publication date: April 13, 2023
    Inventors: Liu Yang, Anbang Yao
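As a rough software analogue of grouping network nodes by computation cost and assigning groups to cores, the sketch below uses a greedy least-loaded-core heuristic. The per-node costs, node names, and the heuristic itself are assumptions; the publication does not commit to this particular policy.

```python
import heapq

def assign_groups_to_cores(node_costs, num_cores):
    """Greedy load balancing: place each node on the currently least-loaded core.

    node_costs: dict mapping node name -> estimated computation cost.
    Returns a list of (total_cost, [node names]) pairs, one per core.
    """
    heap = [(0.0, core) for core in range(num_cores)]   # (accumulated cost, core index)
    heapq.heapify(heap)
    groups = [[] for _ in range(num_cores)]
    loads = [0.0] * num_cores
    for node, cost in sorted(node_costs.items(), key=lambda kv: -kv[1]):
        load, core = heapq.heappop(heap)                 # least-loaded core so far
        groups[core].append(node)
        loads[core] = load + cost
        heapq.heappush(heap, (loads[core], core))
    return list(zip(loads, groups))

# Hypothetical per-node costs from profiling a small network.
costs = {"conv1": 8.0, "conv2": 6.5, "pool1": 1.0, "fc1": 3.0, "fc2": 2.0, "softmax": 0.5}
for load, nodes in assign_groups_to_cores(costs, num_cores=2):
    print(f"core load {load:4.1f}: {nodes}")
```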
  • Publication number: 20230093823
    Abstract: Methods, apparatus, systems, and articles of manufacture for modifying a machine learning model are disclosed. An example apparatus includes a supervised branch inserter to insert a supervised branch into a machine learning model at an identified location, a first cluster generator to generate a first cluster of the inserted supervised branch using a first clustering technique, a second cluster generator to generate a second cluster of the inserted supervised branch using a second clustering technique, the second clustering technique different from the first clustering technique, a cluster joiner to join the first cluster and the second cluster to form a clustering block, the clustering block appended to an end of the supervised branch, and a propagation strategy executor to execute a propagation training strategy to modify a parameter of the machine learning model.
    Type: Application
    Filed: December 18, 2019
    Publication date: March 30, 2023
    Inventors: Anbang Yao, Ping Hu, Yangyuxuan Kang, Yurong Chen
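The clustering block described above can be approximated in a few lines: cluster the same supervised-branch features with two different techniques and join the resulting assignments. The sketch assumes scikit-learn is available and picks k-means and agglomerative clustering purely as examples of two distinct techniques; the branch features are random stand-ins.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

def clustering_block(branch_features, n_clusters=4, seed=0):
    """Cluster supervised-branch features with two techniques and join the results.

    Returns an (N, 2 * n_clusters) array holding the one-hot assignments of both
    clusterings, concatenated side by side, forming the joined clustering block.
    """
    first = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(branch_features)
    second = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(branch_features)
    first_onehot = np.eye(n_clusters)[first]
    second_onehot = np.eye(n_clusters)[second]
    return np.concatenate([first_onehot, second_onehot], axis=1)

rng = np.random.default_rng(0)
features = rng.normal(size=(32, 16))            # stand-in for supervised-branch activations
print(clustering_block(features).shape)         # (32, 8)
```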
  • Publication number: 20230061331
    Abstract: One embodiment provides a multi-chip module accelerator usable to execute tensor data processing operations on a multi-chip module. The multi-chip module may include a memory stack including multiple memory dies and parallel processor circuitry communicatively coupled to the memory stack. The parallel processor circuitry may include multiprocessor cores to execute matrix multiplication and accumulate operations. The matrix multiplication and accumulate operations may include floating-point operations that are configurable to include two-dimensional matrix multiply and accumulate operations involving inputs that have differing floating-point precisions. The floating-point operations may include a first operation at a first precision and a second operation at a second precision. The first operation may include a multiply having at least one 16-bit floating-point input and the second operation may include an accumulate having a 32-bit floating-point input.
    Type: Application
    Filed: October 5, 2022
    Publication date: March 2, 2023
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
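The mixed-precision arithmetic described above (16-bit floating-point multiplies feeding a 32-bit accumulator) can be modeled numerically as below. This is only a software model of the arithmetic, not the multi-chip module hardware, and the loop structure is chosen for clarity rather than speed.

```python
import numpy as np

def mixed_precision_matmul(a, b):
    """2D matrix multiply-accumulate with fp16 inputs and an fp32 accumulator.

    The operands are rounded to float16; each rank-1 product term is formed exactly
    in float32 (an fp16 x fp16 product fits in fp32) and added into a 32-bit
    accumulator, avoiding the overflow/rounding loss of pure fp16 accumulation.
    """
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    m, k = a16.shape
    k2, n = b16.shape
    assert k == k2
    acc = np.zeros((m, n), dtype=np.float32)     # 32-bit accumulate
    for i in range(k):
        prod = a16[:, i:i + 1].astype(np.float32) * b16[i:i + 1, :].astype(np.float32)
        acc += prod                              # accumulate each product at fp32
    return acc

rng = np.random.default_rng(0)
a, b = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
print("max abs error vs fp64:", np.max(np.abs(mixed_precision_matmul(a, b) - a @ b)))
```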
  • Publication number: 20230061670
    Abstract: One embodiment provides an apparatus comprising a memory stack including multiple memory dies and a parallel processor including a plurality of multiprocessors. Each multiprocessor has a single instruction, multiple thread (SIMT) architecture, the parallel processor coupled to the memory stack via one or more memory interfaces. At least one multiprocessor comprises a multiply-accumulate circuit to perform multiply-accumulate operations on matrix data in a stage of a neural network implementation to produce a result matrix comprising a plurality of matrix data elements at a first precision, precision tracking logic to evaluate metrics associated with the matrix data elements and indicate if an optimization is to be performed for representing data at a second stage of the neural network implementation, and a numerical transform unit to dynamically perform a numerical transform operation on the matrix data elements based on the indication to produce transformed matrix data elements at a second precision.
    Type: Application
    Filed: November 1, 2022
    Publication date: March 2, 2023
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, Altug Koker, Abhishek R. Appu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Ben J. Ashbaugh, Barath Lakshmanan, Liwei Ma, Joydeep Ray, Ping T. Tang, Michael S. Strickland
  • Patent number: 11593686
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to improve deep learning resource efficiency. An example apparatus includes a graph monitor to select a candidate operation node in response to receiving an operation graph, the operation graph including one or more other operation nodes, a node rule evaluator to evaluate the candidate operation node based on an operating principle, the operating principle to determine an output storage destination of the candidate operation node based on a topology of the operation graph, and a tag engine to tag the candidate operation node with a memory tag value based on the determined output storage destination.
    Type: Grant
    Filed: March 23, 2017
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Liu Yang, Anbang Yao
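The abstract describes tagging each operation node with an output storage destination derived from graph topology. The rule in the sketch below (single-consumer outputs stay in fast scratchpad memory, everything else spills to DDR) is an assumed example of such an operating principle, not the claimed one, and the graph is hypothetical.

```python
def tag_output_destinations(graph):
    """Tag each operation node with a memory destination based on graph topology.

    graph: dict mapping node -> list of consumer nodes.
    Returns dict mapping node -> memory tag ("SCRATCHPAD" or "DDR").
    A node whose output feeds exactly one consumer can stay in fast scratchpad
    memory; fan-out > 1, or no consumer (a network output), is spilled to DDR.
    """
    tags = {}
    for node, consumers in graph.items():
        tags[node] = "SCRATCHPAD" if len(consumers) == 1 else "DDR"
    return tags

# Hypothetical operation graph: conv1 feeds two branches, which are concatenated.
op_graph = {
    "conv1":  ["conv2a", "conv2b"],
    "conv2a": ["concat"],
    "conv2b": ["concat"],
    "concat": [],
}
print(tag_output_destinations(op_graph))
```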
  • Patent number: 11594010
    Abstract: An example apparatus for semantic image segmentation includes a receiver to receive an image to be segmented. The apparatus also includes a gated dense pyramid network including a plurality of gated dense pyramid (GDP) blocks to be trained to generate semantic labels for respective pixels in the received image. The apparatus further includes a generator to generate a segmented image based on the generated semantic labels.
    Type: Grant
    Filed: October 25, 2021
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Libin Wang, Anbang Yao, Jianguo Li, Yurong Chen
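The abstract does not spell out the internals of a gated dense pyramid block, but the core gating idea can be sketched generically: a learned gate blends two feature maps element-wise. Everything in the example below (shapes, the sigmoid gate, the random stand-in tensors) is assumed for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(feature_a, feature_b, gate_logits):
    """Fuse two feature maps with a learned gate, as in a gated pyramid-style block.

    gate_logits has the same shape as the features; sigmoid(gate_logits) decides,
    per position and channel, how much of feature_a versus feature_b is passed on.
    """
    gate = sigmoid(gate_logits)
    return gate * feature_a + (1.0 - gate) * feature_b

rng = np.random.default_rng(0)
a = rng.random((8, 8, 4))                       # e.g. fine-resolution pyramid features
b = rng.random((8, 8, 4))                       # e.g. upsampled coarse pyramid features
logits = rng.normal(size=(8, 8, 4))             # hypothetical learned gate parameters
print(gated_fusion(a, b, logits).shape)         # (8, 8, 4)
```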
  • Publication number: 20230046506
    Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
    Type: Application
    Filed: October 17, 2022
    Publication date: February 16, 2023
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20230039729
    Abstract: Methods and apparatus relating to autonomous vehicle neural network optimization techniques are described. In an embodiment, the difference between a first training dataset to be used for a neural network and a second training dataset to be used for the neural network is detected. The second training dataset is authenticated in response to the detection of the difference. The neural network is used to assist in autonomous vehicle driving. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: October 11, 2022
    Publication date: February 9, 2023
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. MacPherson, John C. Weast, Justin E. Gottschlich, Jingyi Jin, Barath Lakshmanan, Chandrasekaran Sakthivel, Michael S. Strickland, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Balaji Vembu, Ping T. Tang, Anbang Yao, Tatiana Shpeisman, Xiaoming Chen
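One simple way to detect that a second training dataset differs from the first and then authenticate it before use is to compare content digests and verify a keyed MAC, as sketched below. The abstract does not name a mechanism, so the HMAC construction, key handling, and sample serialization here are purely illustrative.

```python
import hashlib
import hmac

def dataset_digest(samples):
    """Hash an iterable of byte strings (serialized training samples) into one digest."""
    h = hashlib.sha256()
    for sample in samples:
        h.update(sample)
    return h.digest()

def differs(first_dataset, second_dataset):
    """Detect a difference between two training datasets via their digests."""
    return dataset_digest(first_dataset) != dataset_digest(second_dataset)

def authenticate(dataset, provided_mac, key):
    """Authenticate a dataset by checking a keyed MAC over its digest."""
    expected = hmac.new(key, dataset_digest(dataset), hashlib.sha256).digest()
    return hmac.compare_digest(expected, provided_mac)

key = b"shared-secret-key"                      # hypothetical provisioning key
first = [b"sample-0", b"sample-1"]
second = [b"sample-0", b"sample-1", b"sample-2"]
mac_for_second = hmac.new(key, dataset_digest(second), hashlib.sha256).digest()

if differs(first, second):                      # new training data detected ...
    print("authenticated:", authenticate(second, mac_for_second, key))  # ... so verify it
```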
  • Patent number: 11568682
    Abstract: Techniques are provided for recognition of activity in a sequence of video image frames that include depth information. A methodology embodying the techniques includes segmenting each of the received image frames into multiple windows and generating spatio-temporal image cells from groupings of windows from a selected sub-sequence of the frames. The method also includes calculating a four dimensional (4D) optical flow vector for each of the pixels of each of the image cells and calculating a three dimensional (3D) angular representation from each of the optical flow vectors. The method further includes generating a classification feature for each of the image cells based on a histogram of the 3D angular representations of the pixels in that image cell. The classification features are then provided to a recognition classifier configured to recognize the type of activity depicted in the video sequence, based on the generated classification features.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Shaopeng Tang, Anbang Yao, Yurong Chen
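The histogram-of-angles feature can be sketched for a single spatio-temporal cell: map each 4D flow vector to three angles and histogram them. The hyperspherical parameterization, bin count, and random stand-in data below are assumptions; the patent's exact angular representation may differ.

```python
import numpy as np

def angles_from_flow(flow4d):
    """Map each 4D optical-flow vector to a 3D angular representation.

    flow4d: (N, 4) array of per-pixel flow vectors.  Uses hyperspherical angles
    (two polar angles plus one azimuth) as one possible 3D parameterization.
    """
    x1, x2, x3, x4 = flow4d.T
    r = np.linalg.norm(flow4d, axis=1) + 1e-8
    theta1 = np.arccos(np.clip(x1 / r, -1.0, 1.0))
    theta2 = np.arccos(np.clip(x2 / (np.linalg.norm(flow4d[:, 1:], axis=1) + 1e-8), -1.0, 1.0))
    theta3 = np.arctan2(x4, x3)
    return np.stack([theta1, theta2, theta3], axis=1)

def cell_feature(flow4d, bins=8):
    """Classification feature for one image cell: a normalized histogram per angle."""
    angles = angles_from_flow(flow4d)
    ranges = [(0, np.pi), (0, np.pi), (-np.pi, np.pi)]
    hists = [np.histogram(angles[:, k], bins=bins, range=ranges[k], density=True)[0]
             for k in range(3)]
    return np.concatenate(hists)                # fed to the activity-recognition classifier

rng = np.random.default_rng(0)
cell_flow = rng.normal(size=(500, 4))           # stand-in for one cell's 4D flow vectors
print(cell_feature(cell_flow).shape)            # (24,)
```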
  • Patent number: 11551335
    Abstract: Methods and systems are disclosed using camera devices for deep channel and Convolutional Neural Network (CNN) images and formats. In one example, image values are captured by a color sensor array in an image capturing device or camera. The image values provide color channel data. The image values captured by the color sensor array are input to a CNN having at least one CNN layer. The CNN provides CNN channel data for each layer. The color channel data and CNN channel data form a deep channel image that is stored in a memory. In another example, image values are captured by a sensor array. The image values captured by the sensor array are input to a CNN having a first CNN layer. An output is generated at the first CNN layer using the image values captured by the color sensor array. The output of the first CNN layer is stored as a feature map of the captured image.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: January 10, 2023
    Assignee: Intel Corporation
    Inventors: Lin Xu, Liu Yang, Anbang Yao, Dongqi Cai, Libin Wang, Ping Hu, Shandong Wang, Wenhua Cheng, Yiwen Guo, Yurong Chen
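A minimal NumPy sketch of the deep channel image idea: run the captured color channels through a first convolutional layer and stack the resulting feature maps with the original color channels. The kernel shapes and weights below are hypothetical stand-ins for a trained first CNN layer.

```python
import numpy as np

def conv2d_same(image, kernels):
    """Minimal 'same'-padded 2D convolution: (H, W, C_in) x (K, K, C_in, C_out) -> (H, W, C_out)."""
    k = kernels.shape[0]
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = image.shape
    out = np.zeros((h, w, kernels.shape[-1]))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + k, j:j + k, :]
            out[i, j] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out

def deep_channel_image(rgb, kernels):
    """Stack the captured color channels with the first CNN layer's feature maps."""
    feature_maps = np.maximum(conv2d_same(rgb, kernels), 0.0)   # first layer + ReLU
    return np.concatenate([rgb, feature_maps], axis=-1)         # color + CNN channels

rng = np.random.default_rng(0)
rgb = rng.random((16, 16, 3))                    # captured color sensor values
kernels = rng.normal(size=(3, 3, 3, 8))          # hypothetical first-layer weights
print(deep_channel_image(rgb, kernels).shape)    # (16, 16, 11): 3 color + 8 CNN channels
```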
  • Patent number: 11538164
    Abstract: Techniques related to implementing fully convolutional networks for semantic image segmentation are discussed. Such techniques may include combining feature maps from multiple stages of a multi-stage fully convolutional network to generate a hyper-feature corresponding to an input image, up-sampling the hyper-feature and summing it with a feature map of a previous stage to provide a final set of features, and classifying the final set of features to provide semantic image segmentation of the input image.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: December 27, 2022
    Assignee: Intel Corporation
    Inventors: Libin Wang, Anbang Yao, Yurong Chen
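The hyper-feature construction can be sketched with plain NumPy: upsample the deeper stages' feature maps to a common resolution, concatenate them, then upsample the result and sum it with an earlier-stage map. Nearest-neighbour upsampling and the particular shapes are assumptions made for this illustration.

```python
import numpy as np

def upsample(feature_map, factor):
    """Nearest-neighbour spatial upsampling of an (H, W, C) feature map."""
    return feature_map.repeat(factor, axis=0).repeat(factor, axis=1)

def hyper_feature(stage_maps):
    """Combine feature maps from several FCN stages into one hyper-feature.

    stage_maps: list of (H_i, W_i, C_i) maps ordered shallow -> deep, each stage at
    half the resolution of the previous one.  Deeper maps are upsampled to the
    shallowest resolution and all maps are concatenated along the channel axis.
    """
    target_h = stage_maps[0].shape[0]
    aligned = [upsample(m, target_h // m.shape[0]) for m in stage_maps]
    return np.concatenate(aligned, axis=-1)

def final_features(stage_maps, previous_stage_map):
    """Upsample the hyper-feature and sum it with an earlier-stage feature map."""
    hyper = hyper_feature(stage_maps)
    factor = previous_stage_map.shape[0] // hyper.shape[0]
    # assumes previous_stage_map already has a matching channel count (e.g. via 1x1 conv)
    return upsample(hyper, factor) + previous_stage_map

rng = np.random.default_rng(0)
stages = [rng.random((16, 16, 8)), rng.random((8, 8, 16)), rng.random((4, 4, 32))]
previous = rng.random((32, 32, 8 + 16 + 32))
print(final_features(stages, previous).shape)    # (32, 32, 56)
```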
  • Patent number: 11537851
    Abstract: Methods and systems are disclosed using improved training and learning for deep neural networks. In one example, a deep neural network includes a plurality of layers, and each layer has a plurality of nodes. The nodes of each L layer in the plurality of layers are randomly connected to nodes of an L+1 layer. The nodes of each L+1 layer are connected to nodes in a subsequent L layer in a one-to-one manner. Parameters related to the nodes of each L layer are fixed. Parameters related to the nodes of each L+1 layer are updated. In another example, inputs for the input layer and labels for the output layer of a deep neural network are determined for a first sample. A similarity between different pairs of inputs and labels is estimated using a Gaussian regression process.
    Type: Grant
    Filed: April 7, 2017
    Date of Patent: December 27, 2022
    Assignee: Intel Corporation
    Inventors: Yiwen Guo, Anbang Yao, Dongqi Cai, Libin Wang, Lin Xu, Ping Hu, Shandong Wang, Wenhua Cheng, Yurong Chen
  • Publication number: 20220391679
    Abstract: One embodiment provides a graphics processor comprising an instruction cache to store an instruction and a compute block configured to perform multiply-accumulate operations in response to execution of the instruction. The compute block includes a scheduler to schedule a plurality of threads for execution of the instruction and multiply-accumulate circuitry configured to execute the instruction via the plurality of threads, wherein the multiply-accumulate circuitry includes a plurality of functional units configured to process, in parallel via the plurality of threads, a corresponding plurality of matrix elements to multiply a first matrix and a second matrix, where multiplying the first matrix and the second matrix includes multiplying data elements in a row of the first matrix by corresponding data elements in a column of the second matrix to generate a plurality of products.
    Type: Application
    Filed: August 11, 2022
    Publication date: December 8, 2022
    Applicant: Intel Corporation
    Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
  • Publication number: 20220382555
    Abstract: One embodiment provides for a graphics processing unit (GPU) to accelerate machine learning operations, the GPU comprising an instruction cache to store a first instruction and a second instruction, the first instruction to cause the GPU to perform a floating-point operation, including a multi-dimensional floating-point operation, and the second instruction to cause the GPU to perform an integer operation; and a general-purpose graphics compute unit having a single instruction, multiple thread architecture, the general-purpose graphics compute unit to concurrently execute the first instruction and the second instruction.
    Type: Application
    Filed: June 14, 2022
    Publication date: December 1, 2022
    Applicant: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Publication number: 20220357945
    Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.
    Type: Application
    Filed: June 7, 2022
    Publication date: November 10, 2022
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
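Choosing a floating-point format from the statistics of the output data can be sketched as below: if nearly all observed magnitudes fit comfortably in float16, the intermediate data is stored as float16, otherwise it is kept in float32. The percentile threshold and the two-format menu are assumptions for this sketch, not the graphics processor's actual policy.

```python
import numpy as np

FP16_MAX = np.finfo(np.float16).max

def choose_format_and_quantize(intermediate, reference_output, percentile=99.9):
    """Quantize intermediate fp32 data to a floating-point format chosen from statistics.

    The target format is picked from the distribution of a reference output tensor:
    if almost all of its magnitudes fit comfortably in float16, the intermediate
    data is quantized to float16; otherwise it stays in float32.
    """
    spread = np.percentile(np.abs(reference_output), percentile)
    target = np.float16 if spread < FP16_MAX / 2 else np.float32
    return intermediate.astype(target), target

rng = np.random.default_rng(0)
activations = rng.normal(scale=3.0, size=(256, 256)).astype(np.float32)
quantized, fmt = choose_format_and_quantize(activations, reference_output=activations)
print(fmt, quantized.dtype)                      # float16 here, since the range is small
```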
  • Patent number: 11475286
    Abstract: One embodiment provides an apparatus comprising an instruction cache to store a plurality of instructions, a scheduler unit coupled to the instruction cache, the scheduler unit to schedule the plurality of instructions for execution, an instruction fetch and decode unit to decode the plurality of instructions to determine a set of operations to perform in response, one or more compute blocks to perform parallel multiply-accumulate operations based on the instruction fetch and decode unit decoding a first instruction of the plurality of instructions, and matrix multiplication logic to perform matrix multiplication operations based on the instruction fetch and decode unit decoding a second instruction of the plurality of instructions.
    Type: Grant
    Filed: December 21, 2021
    Date of Patent: October 18, 2022
    Assignee: Intel Corporation
    Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang