Patents by Inventor Kamal Sinha

Kamal Sinha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Publication number: 20180308200

Abstract: An apparatus to facilitate compute optimization is disclosed.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Publication number: 20180308206

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.

Type: Application

Filed: September 7, 2017

Publication date: October 25, 2018

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
Dynamically Reconfigurable Memory Subsystem for Graphics Processors

Publication number: 20180307429

Abstract: By predicting future memory subsystem request behavior based on live memory subsystem usage history collection, a preferred setting for handling predicted upcoming request behavior may be generated and used to dynamically reconfigure the memory subsystem. This mechanism can be applied continuously and in real time to ensure active tracking of system behavior.

Type: Application

Filed: April 21, 2017

Publication date: October 25, 2018

Inventors: Wenyin Fu, Abhishek R. Appu, Bhushan M. Borole, Altug Koker, Nikos Kaburlasos, Kamal Sinha
DYNAMIC DISTRIBUTED TRAINING OF MACHINE LEARNING MODELS

Publication number: 20180307984

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Publication number: 20180308208

Abstract: An apparatus to facilitate compute optimization is disclosed.

Type: Application

Filed: November 21, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
DYNAMIC PRECISION FOR NEURAL NETWORK COMPUTE OPERATIONS

Publication number: 20180307971

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
COMPRESSION MECHANISM

Publication number: 20180308256

Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Nadathur Rajagopalan Satish, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Farshad Akhbari
COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS

Publication number: 20180307950

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including an input value and a quantized weight value associated with a neural network and an arithmetic logic unit including a barrel shifter, an adder, and an accumulator register, wherein to execute the decoded instruction, the barrel shifter is to shift the input value by the quantized weight value to generate a shifted input value and the adder is to add the shifted input value to a value stored in the accumulator register and update the value stored in the accumulator register.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
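The shift-and-accumulate operation described in the abstract of publication 20180307950 can be illustrated with a minimal software sketch. It assumes each quantized weight is a non-negative integer exponent, so shifting left by the weight multiplies the input by a power of two. The function name `shift_accumulate`, the weight encoding, and the shift direction are illustrative assumptions, not details taken from the application.

```python
def shift_accumulate(inputs, quantized_weights):
    """Sum inputs scaled by power-of-two weights using shifts instead of multiplies.

    Assumption: each weight w is a non-negative integer exponent, so the
    effective weight is 2**w. The application may encode weights differently.
    """
    accumulator = 0  # stands in for the accumulator register
    for value, weight in zip(inputs, quantized_weights):
        shifted = value << weight  # barrel shifter: value * 2**weight
        accumulator += shifted     # adder updates the accumulator
    return accumulator


if __name__ == "__main__":
    # 3*2 + 5*4 + 1*1
    print(shift_accumulate([3, 5, 1], [1, 2, 0]))
```

Replacing the multiply in a multiply-accumulate with a shift is only exact when weights are restricted to powers of two, which is the trade-off this sketch makes.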
COORDINATION AND INCREASED UTILIZATION OF GRAPHICS PROCESSORS DURING INFERENCE

Publication number: 20180308202

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.

Type: Application

Filed: April 24, 2017

Publication date: October 25, 2018

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
Replacement Policies for a Hybrid Hierarchical Cache

Publication number: 20180300260

Abstract: A hybrid hierarchical cache is implemented at the same level in the access pipeline, to get the faster access behavior of a smaller cache and, at the same time, a higher hit rate at lower power for a larger cache, in some embodiments. A split cache at the same level in the access pipeline includes two caches that work together. In the hybrid, split, low level cache (e.g., L1) evictions are coordinated locally between the two L1 portions, and on a miss to both L1 portions, a line is allocated from a larger L2 cache to the smallest L1 cache.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Abhishek R. Appu, Joydeep Ray, James A. Valerio, Altug Koker, Prasoonkumar P. Surti, Balaji Vembu, Wenyin Fu, Bhushan M. Borole, Kamal Sinha
System, Apparatus And Method For Reducing Voltage Swing On An Interconnect

Publication number: 20180301120

Abstract: In an embodiment, an apparatus includes: a repeater to receive an input signal at an input node and output an output signal at an output node; a dynamic header device coupled between the repeater and a supply voltage node; and a feedback device coupled between the output node and the dynamic header device to dynamically control the dynamic header device based at least in part on the output signal. Other embodiments are described and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Anupama A. Thaploo, Jaydeep P. Kulkarni, Bhushan M. Borole, Abhishek R. Appu, Altug Koker, Kamal Sinha, Wenyin Fu
AVOID CACHE LOOKUP FOR COLD CACHE

Publication number: 20180300251

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive, in a read/modify/write (RMW) pipeline, a cache access request from a requestor, wherein the cache request comprises a cache set identifier associated with requested data in the cache set, determine whether the cache set associated with the cache set identifier is in an inaccessible state or an invalid state, and in response to a determination that the cache set is in an inaccessible state or an invalid state, to terminate the cache access request. Other embodiments are also disclosed and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Prasoonkumar Surti, Kamal Sinha, Kiran C. Veernapu, Balaji Vembu
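The early-termination check described in the abstract of publication 20180300251 can be sketched in software as follows. This is a toy model under stated assumptions: the `SetState` enum, the `ColdAwareCache` class, and the `lookups` counter are hypothetical names used for illustration, and a per-set dictionary stands in for the tag array.

```python
from enum import Enum


class SetState(Enum):
    VALID = "valid"
    INVALID = "invalid"
    INACCESSIBLE = "inaccessible"


class ColdAwareCache:
    """Toy cache that skips the tag lookup when a set is known to be cold."""

    def __init__(self, num_sets):
        # Every set starts cold (invalid) until something is filled into it.
        self.states = [SetState.INVALID] * num_sets
        self.sets = [dict() for _ in range(num_sets)]
        self.lookups = 0  # counts tag lookups actually performed

    def fill(self, set_id, tag, value):
        self.sets[set_id][tag] = value
        self.states[set_id] = SetState.VALID

    def access(self, set_id, tag):
        # Terminate the request early if the set is invalid or inaccessible.
        if self.states[set_id] is not SetState.VALID:
            return None
        self.lookups += 1
        return self.sets[set_id].get(tag)


if __name__ == "__main__":
    cache = ColdAwareCache(num_sets=4)
    print(cache.access(0, "a"), cache.lookups)  # cold set: no lookup performed
    cache.fill(0, "a", 1)
    print(cache.access(0, "a"), cache.lookups)
```

The point of the check is that a request to a cold set never reaches the lookup stage, so the counter only advances for sets that hold valid data.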
AUTONOMOUS VEHICLE NEURAL NETWORK OPTIMIZATION

Publication number: 20180299841

Abstract: Methods and apparatus relating to autonomous vehicle neural network optimization techniques are described. In an embodiment, the difference between a first training dataset to be used for a neural network and a second training dataset to be used for the neural network is detected. The second training dataset is authenticated in response to the detection of the difference. The neural network is used to assist in an autonomous vehicle/driving. Other embodiments are also disclosed and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. MacPherson, John C. Weast, Justin E. Gottschlich, Jingyi Jin, Barath Lakshmanan, Chandrasekaran Sakthivel, Michael S. Strickland, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Balaji Vembu, Ping T. Tang, Anbang Yao, Tatiana Shpeisman, Xiaoming Chen, Vasanth Ranganathan, Sanjeev S. Jahagirdar
Pulse Triggered Flip Flop

Publication number: 20180302064

Abstract: A pulse triggered flip flop circuit includes an exclusive OR clock generating stage that receives an input clock and data and produces an output clock pulse. The stage produces an output clock pulse that only goes away when the data is fully captured. The stage disables the output clock pulse only when the data is fully captured. Moreover, the circuit only toggles when the input data changes, reducing power consumption in some embodiments.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Bhushan M. Borole, Anupama A. Thaploo, Altug Koker, Abhishek R. Appu, Kamal Sinha, Wenyin Fu
System, Apparatus And Method For Increasing Performance In A Processor During A Voltage Ramp

Publication number: 20180301119

Abstract: In one embodiment, a processor includes: a graphics processor to execute a workload; and a power controller coupled to the graphics processor. The power controller may include a voltage ramp circuit to receive a request for the graphics processor to operate at a first performance state having a first operating voltage and a first operating frequency and cause an output voltage of a voltage regulator to increase to the first operating voltage. The voltage ramp circuit may be configured to enable the graphics processor to execute the workload at an interim performance state having an interim operating voltage and an interim operating frequency when the output voltage reaches a minimum operating voltage. Other embodiments are described and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Altug Koker, Abhishek R. Appu, Bhushan M. Borole, Wenyin Fu, Kamal Sinha, Joydeep Ray
Optimizing Read Only Memory Surface Accesses

Publication number: 20180300929

Abstract: In accordance with some embodiments, a separate pipe is used in a graphics processor for handling accesses, namely reads, to read only (RO) surfaces within caches. Moreover, the caches may have a defined read only section and defined read write (RW) sections. The read only section may be accessed through a dedicated read only pipe and the read write section may be accessed through a read write pipe for those surfaces that can also be written. Thus, the read only sections are handled in a read only fashion without the need to accommodate writes.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Kamal Sinha, Prasoonkumar Surti, Wenyin Fu, Bhushan M. Borole, Vasanth Ranganathan
System, Apparatus And Method For Providing A Local Clock Signal For A Memory Array

Publication number: 20180299921

Abstract: In an embodiment, a processor includes at least one processor core and at least one graphics processor. The at least one graphics processor may include a register file having a plurality of entries, where at least a portion of the at least one graphics processor is to operate at a first operating frequency and the register file is to operate at a second operating frequency greater than the first operating frequency, to enable the at least one graphics processor to issue a plurality of write requests to the register file in a single clock cycle at the first operating frequency and receive a plurality of data elements of a plurality of read requests from the register file in the single clock cycle at the first operating frequency. Other embodiments are described and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Iqbal R. Rajwani, Altug Koker, Bhushan M. Borole, Kamal Sinha, Abhishek R. Appu, Anupama A. Thaploo, Sunil Nekkanti, Wenyin Fu
SENSORY ENHANCED AUGMENTED REALITY AND VIRTUAL REALITY DEVICE

Publication number: 20180299952

Abstract: Systems, apparatuses and methods may provide a way to enhance an augmented reality (AR) and/or virtual reality (VR) user experience with environmental information captured from sensors located in one or more physical environments. More particularly, systems, apparatuses and methods may provide a way to track, by an eye tracker sensor, a gaze of a user, and capture, by the sensors, environmental information. The systems, apparatuses and methods may render feedback, by one or more feedback devices or display device, for a portion of the environmental information based on the gaze of the user.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Altug Koker, Michael Apodaca, Kai Xiao, Chandrasekaran Sakthivel, Jeffery S. Boles, Adam T. Lake, James M. Holland, Pattabhiraman K, Sayan Lahiri, Radhakrishnan Venkataraman, Kamal Sinha, Ankur N. Shah, Deepak S. Vembar, Abhishek R. Appu, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall
EXTEND GPU/CPU COHERENCY TO MULTI-GPU CORES

Publication number: 20180300246

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Applicant: Intel Corporation

Inventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker
ADAPTIVE CACHE SIZING PER WORKLOAD

Publication number: 20180300238

Abstract: Briefly, in accordance with one or more embodiments, an apparatus comprises a processor to monitor cache utilization of an application during execution of the application for a workload; and a memory to store cache utilization statistics responsive to the monitored cache utilization. The processor is to determine an optimal cache configuration for the application based at least in part on the cache utilization statistics for the workload such that a smallest amount of cache is turned on for subsequent executions of the workload by the application.

Type: Application

Filed: April 17, 2017

Publication date: October 18, 2018

Inventors: Balaji Vembu, Altug Koker, Josh B. Mastronarde, Nikos Kaburlasos, Abhishek R. Appu, Sanjeev S. Jahagirdar, Eric J. Asperheim, Subramaniam Maiyuran, Kiran C. Veernapu, Pattabhiraman K, Kamal Sinha, Bhushan M. Borole, Wenyin Fu, Joydeep Ray, Prasoonkumar Surti, Eric J. Hoekstra, Travis T. Schluessler, Linda L. Hurd
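The selection step described in the abstract of publication 20180300238, choosing the smallest cache configuration from recorded utilization statistics, can be sketched as below. The abstract does not say how "optimal" is decided, so this sketch assumes a hypothetical target hit rate. The function name `smallest_sufficient_cache`, the statistics layout (cache size mapped to observed hit rate), and the threshold are all illustrative assumptions.

```python
def smallest_sufficient_cache(hit_rate_by_size, target_hit_rate):
    """Return the smallest cache size whose recorded hit rate meets the target.

    hit_rate_by_size maps a cache size (e.g. in KiB) to the hit rate observed
    for the workload at that size. If no size meets the target, fall back to
    the largest size measured.
    """
    for size in sorted(hit_rate_by_size):
        if hit_rate_by_size[size] >= target_hit_rate:
            return size
    return max(hit_rate_by_size)


if __name__ == "__main__":
    stats = {256: 0.70, 512: 0.91, 1024: 0.93}  # hypothetical measurements
    print(smallest_sufficient_cache(stats, target_hit_rate=0.90))
```

On later runs of the same workload, only the returned amount of cache would need to be powered on under this assumed policy.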
