Patents by Inventor Tsung-Han Lin

Tsung-Han Lin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Supporting learned branch predictors

Patent number: 10534613

Abstract: Implementations of the disclosure provide a processing device comprising a branch predictor circuit to obtain a branch history for an application. The branch history comprises references to branching instructions associated with the application and an outcome of executing each branch. Using the branch history, a neural network is trained to produce a weighted value for each branch of the branching instructions. Features of the branching instructions are identified based on the weighted values. Each feature identifies predictive information regarding the outcome of at least one branch of correlated branches having corresponding outcomes. A feature vector is determined based on the features. The feature vector comprises a plurality of data fields that identify an occurrence of a corresponding feature of the correlated branches with respect to the branch history. Using the feature vector, a data model is produced to determine a predicted outcome associated with the correlated branches.

Type: Grant

Filed: April 28, 2017

Date of Patent: January 14, 2020

Assignee: Intel Corporation

Inventors: Gokce Keskin, Stephen J. Tarsa, Gautham N. Chinya, Tsung-Han Lin, Perry H. Wang, Hong Wang

Color wheel module and projection apparatus

Publication number: 20200004128

Abstract: A color wheel module includes a fixing bracket, a carbon-iron alloy bracket, a motor, an optical filter, a first damper, and a second damper. The carbon-iron alloy bracket is fixed to the fixing bracket. The motor is fixed to the carbon-iron alloy bracket, and the motor is connected with the optical filter, so as to drive the optical filter to rotate between the motor and the fixing bracket. The first damper is disposed between the carbon-iron alloy bracket and the fixing bracket. The second damper is disposed between the motor and the carbon-iron alloy bracket.

Type: Application

Filed: June 26, 2019

Publication date: January 2, 2020

Inventors: Tsung-Han Lin, Tung-Chou Hu, Min-Hsueh Lee

Extend GPU/CPU coherency to multi-GPU cores

Patent number: 10521349

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: February 15, 2019

Date of Patent: December 31, 2019

Assignee: Intel Corporation

Inventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker

Instructions and logic to perform floating-point and integer operations for machine learning

Publication number: 20190369988

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Application

Filed: June 5, 2019

Publication date: December 5, 2019

Applicant: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

Instructions and logic to perform floating-point and integer operations for machine learning

Patent number: 10474458

Abstract: One embodiment provides for a machine-learning hardware accelerator comprising a compute unit having an adder and a multiplier that are shared between an integer datapath and a floating-point datapath, the upper bits of input operands to the multiplier to be gated during floating-point operation.

Type: Grant

Filed: October 18, 2017

Date of Patent: November 12, 2019

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

Compute optimizations for neural networks

Publication number: 20190332903

Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a bipolar binary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the bipolar binary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.

Type: Application

Filed: July 8, 2019

Publication date: October 31, 2019

Applicant: Intel Corporation

Inventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha

Compute optimization mechanism for deep neural networks

Patent number: 10417734

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.

Type: Grant

Filed: September 7, 2017

Date of Patent: September 17, 2019

Assignee: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu

Compute optimization mechanism for deep neural networks

Patent number: 10417731

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type.

Type: Grant

Filed: April 24, 2017

Date of Patent: September 17, 2019

Assignee: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu

Compute optimizations for neural networks

Patent number: 10410098

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including an input value and a quantized weight value associated with a neural network and an arithmetic logic unit including a barrel shifter, an adder, and an accumulator register, wherein to execute the decoded instruction, the barrel shifter is to shift the input value by the quantized weight value to generate a shifted input value and the adder is to add the shifted input value to a value stored in the accumulator register and update the value stored in the accumulator register.

Type: Grant

Filed: April 24, 2017

Date of Patent: September 10, 2019

Assignee: Intel Corporation

Inventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
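
The shift-and-accumulate operation described in the abstract above can be illustrated in software. This is only a minimal sketch, not the patented hardware: it assumes each quantized weight is stored as a non-negative power-of-two exponent, so that a left shift stands in for a multiplication. The function name `shift_accumulate` and its parameters are hypothetical.

```python
def shift_accumulate(inputs, weight_exponents, accumulator=0):
    """Accumulate integer inputs scaled by power-of-two weights using shifts only.

    Assumption (not from the abstract): each quantized weight is a
    non-negative exponent e, representing the weight value 2**e.
    """
    for value, exponent in zip(inputs, weight_exponents):
        # Shifting left by `exponent` is equivalent to multiplying by 2**exponent,
        # which is the role the barrel shifter plays in the abstract.
        shifted = value << exponent
        # The adder folds the shifted input into the running accumulator value.
        accumulator += shifted
    return accumulator


# 3*2 + 5*1 + 2*8 = 27
print(shift_accumulate([3, 5, 2], [1, 0, 3]))
```

Replacing the multiplier with a shifter is the point of the sketch: when weights are restricted to powers of two, no general multiplication is needed.
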
Instruction and logic for nearest neighbor unit

Patent number: 10387797

Abstract: A processor includes a front end to decode an instruction, an allocator to pass the instruction to a nearest neighbor logic unit (NNLU) to execute the instruction, and a retirement unit to retire the instruction. The NNLU includes logic to determine input of the instruction for which nearest neighbors will be calculated, transform the input, retrieve candidate atoms for which the nearest neighbors will be calculated, compute distance between the candidate atoms and the input, and determine the nearest neighbors for the input based upon the computed distance.

Type: Grant

Filed: September 25, 2015

Date of Patent: August 20, 2019

Assignee: Intel Corporation

Inventors: Tsung-Han Lin, Gokce Keskin, Hsiang-Tsung Kung, She-Hwa Yen, Hong Wang
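
The sequence of steps listed in the abstract above (transform the input, retrieve candidate atoms, compute distances, select the nearest neighbors) can be sketched in software. This is an illustration under stated assumptions, not the NNLU itself: the transform is assumed to be L2 normalization, the distance is assumed to be Euclidean, and the names `normalize` and `nearest_neighbors` are hypothetical.

```python
import heapq
import math


def normalize(vector):
    """Assumed transform step: scale the input to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else list(vector)


def nearest_neighbors(query, atoms, k=1):
    """Return the indices of the k candidate atoms closest to the query."""
    transformed = normalize(query)
    # Compute the (assumed Euclidean) distance from the input to every atom.
    distances = [
        (math.dist(transformed, atom), index) for index, atom in enumerate(atoms)
    ]
    # Keep the k smallest distances and report which atoms they belong to.
    return [index for _, index in heapq.nsmallest(k, distances)]


atoms = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(nearest_neighbors([10.0, 0.5], atoms, k=2))
```
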
Extend GPU/CPU coherency to multi-GPU cores

Publication number: 20190243764

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

Type: Application

Filed: February 15, 2019

Publication date: August 8, 2019

Applicant: Intel Corporation

Inventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker

Instructions and logic to perform floating-point and integer operations for machine learning

Patent number: 10353706

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.

Type: Grant

Filed: November 21, 2017

Date of Patent: July 16, 2019

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

Machine learning sparse computation mechanism

Patent number: 10346944

Abstract: An apparatus to facilitate processing of a sparse matrix is disclosed. The apparatus includes a plurality of processing units each comprising one or more processing elements, including logic to read operands, a multiplication unit to multiply two or more operands and a scheduler to identify operands having a zero value and prevent scheduling of the operands having the zero value at the multiplication unit.

Type: Grant

Filed: April 9, 2017

Date of Patent: July 9, 2019

Assignee: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
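
The zero-skipping idea in the abstract above, where operands with a zero value are never scheduled at the multiplication unit, can be sketched as a software dot product. This is a rough analogy under the assumption that "scheduling" maps to deciding whether to issue a multiply; the function name `sparse_dot` and the multiply counter are hypothetical and exist only to make the saving visible.

```python
def sparse_dot(a, b):
    """Dot product that skips any operand pair containing a zero.

    Returns the result together with the number of multiplications
    actually issued, to show how much work the skip avoids.
    """
    total = 0
    multiplies = 0
    for x, y in zip(a, b):
        # Scheduler stand-in: a zero operand contributes nothing,
        # so the pair is never sent to the multiplier.
        if x == 0 or y == 0:
            continue
        total += x * y
        multiplies += 1
    return total, multiplies


# Only the first pair (1, 4) has two non-zero operands.
print(sparse_dot([1, 0, 3, 0], [4, 5, 0, 6]))
```
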
Method of fabricating semiconductor device

Publication number: 20190157392

Abstract: A method of fabricating a semiconductor device is provided. In the method, a gate structure is formed on a semiconductor substrate. A photolithography process is performed with a mask having two transparent regions to form a photoresist layer having two openings in the semiconductor substrate. A first photoresist layer of the photoresist layer between the two openings is aligned to the gate structure and formed on the gate structure. The width of the first photoresist layer is shorter than the width of the gate structure such that a first side portion and a second side portion of the gate structure are exposed from both sides of the first photoresist layer, respectively. Next, an ion implantation process is performed to form lightly doped drain regions in the semiconductor substrate which are on two opposite sides of the gate structure of the photoresist layer.

Type: Application

Filed: November 22, 2017

Publication date: May 23, 2019

Applicant: Vanguard International Semiconductor Corporation

Inventors: Chih-Wei Lin, Tsung-Han Lin, Chao-Wei Wu, Yen-Kai Chen

Programmable coarse grained and sparse matrix compute hardware with advanced scheduling

Publication number: 20190139182

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex machine learning compute operation.

Type: Application

Filed: November 21, 2018

Publication date: May 9, 2019

Applicant: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao

Systems, apparatuses, and methods for deep learning of feature detectors with sparse coding

Patent number: 10282465

Abstract: Detailed herein are embodiments of systems, methods, and apparatuses to be used for feature searching using an entry-based searching structure.

Type: Grant

Filed: June 20, 2014

Date of Patent: May 7, 2019

Assignee: Intel Corporation

Inventors: Tsung-Han Lin, Hsiang-Tsung Kung

Extend GPU/CPU coherency to multi-GPU cores

Patent number: 10261903

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: April 17, 2017

Date of Patent: April 16, 2019

Assignee: Intel Corporation

Inventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker

Object Motion Prediction and Autonomous Vehicle Control

Publication number: 20190049970

Abstract: Systems and methods for predicting object motion and controlling autonomous vehicles are provided. In one example embodiment, a computer implemented method includes obtaining state data indicative of at least a current or a past state of an object that is within a surrounding environment of an autonomous vehicle. The method includes obtaining data associated with a geographic area in which the object is located. The method includes generating a combined data set associated with the object based at least in part on a fusion of the state data and the data associated with the geographic area in which the object is located. The method includes obtaining data indicative of a machine-learned model. The method includes inputting the combined data set into the machine-learned model. The method includes receiving an output from the machine-learned model. The output can be indicative of a plurality of predicted trajectories of the object.

Type: Application

Filed: September 5, 2018

Publication date: February 14, 2019

Inventors: Nemanja Djuric, Vladan Radosavljevic, Thi Duong Nguyen, Tsung-Han Lin, Jeff Schneider, Henggang Cui, Fang-Chieh Chou, Tzu-Kuo Huang

Object Motion Prediction and Autonomous Vehicle Control

Publication number: 20190049987

Abstract: Systems and methods for predicting object motion and controlling autonomous vehicles are provided. In one example embodiment, a computer implemented method includes obtaining state data indicative of at least a current or a past state of an object that is within a surrounding environment of an autonomous vehicle. The method includes obtaining data associated with a geographic area in which the object is located. The method includes generating a combined data set associated with the object based at least in part on a fusion of the state data and the data associated with the geographic area in which the object is located. The method includes obtaining data indicative of a machine-learned model. The method includes inputting the combined data set into the machine-learned model. The method includes receiving an output from the machine-learned model. The output can be indicative of a predicted trajectory of the object.

Type: Application

Filed: October 13, 2017

Publication date: February 14, 2019

Inventors: Nemanja Djuric, Vladan Radosavljevic, Thi Duong Nguyen, Tsung-Han Lin, Jeff Schneider

Variable epoch spike train filtering

Publication number: 20190034782

Abstract: Systems and techniques for variable epoch spike train filtering are described herein. A spike trace storage may be initiated for an epoch. Here, the spike trace storage is included in a neural unit of neuromorphic hardware. Multiple spikes may be received at the neural unit during the epoch. The spike trace storage may be incremented for each of the multiple spikes to produce a count of received spikes. An epoch learning event may be obtained and a spike trace may be produced in response to the epoch learning event using the count of received spikes in the spike trace storage. Network parameters of the neural unit may be modified using the spike trace.

Type: Application

Filed: July 31, 2017

Publication date: January 31, 2019

Inventors: Michael I. Davies, Tsung-Han Lin
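
The epoch-based flow in the abstract above (reset a counter, count spikes, produce a trace at the learning event, update parameters) can be sketched as a small class. This is a toy model under loud assumptions: the abstract does not say how the trace is derived from the count or how parameters are modified, so the scaled-count trace and the additive weight update below are placeholders, and the class and method names are hypothetical.

```python
class NeuralUnit:
    """Toy neural unit that counts spikes per epoch."""

    def __init__(self, weight=0.0):
        self.weight = weight
        self.spike_count = 0  # stands in for the spike trace storage

    def start_epoch(self):
        # Initiate the spike trace storage for a new epoch.
        self.spike_count = 0

    def receive_spike(self):
        # Increment the storage once for each spike received in the epoch.
        self.spike_count += 1

    def on_epoch_learning_event(self, trace_scale=1.0, learning_rate=0.5):
        # Assumed rule: the trace is the spike count times a scale factor.
        trace = trace_scale * self.spike_count
        # Assumed rule: the weight moves in proportion to the trace.
        self.weight += learning_rate * trace
        return trace


unit = NeuralUnit()
unit.start_epoch()
for _ in range(3):
    unit.receive_spike()
print(unit.on_epoch_learning_event(), unit.weight)
```
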
