Patents by Inventor Dipankar Das

Dipankar Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190205745
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library, the instructions to cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
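As a rough illustration of the grouping idea in the abstract above, here is a hedged Python sketch that partitions worker ranks into communication groups; the helper name, group size, and leader scheme are hypothetical and not the patented interface.

```python
# Hypothetical sketch: split worker ranks into groups that match a communication
# pattern (intra-group allreduce plus an inter-group exchange between leaders).

def make_groups(num_workers, group_size):
    """Split worker ranks into contiguous groups for collective operations."""
    return [list(range(start, min(start + group_size, num_workers)))
            for start in range(0, num_workers, group_size)]

groups = make_groups(8, 4)         # [[0, 1, 2, 3], [4, 5, 6, 7]]
leaders = [g[0] for g in groups]   # [0, 4] exchange partial results across groups
print(groups, leaders)
```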
  • Patent number: 10228283
    Abstract: A spectral imaging system includes a spectrometer and an optics imaging system. The spectrometer is operable for generating spectral signatures of objects from a scene. The optics imaging system is operable to generate six or more responses from the same scene. Each of the six or more responses represents different spectral content of the objects in the scene. The responses generated by the optics imaging system can be used to generate a hypercube using spectral reconstruction techniques. In an embodiment, the spectral imaging system could be implemented as part of a mobile phone.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: March 12, 2019
    Assignee: SPECTRAL INSIGHTS PRIVATE LIMITED
    Inventors: Sumit Nath, Dipankar Das, Suhash Gerald
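To make the reconstruction step concrete, the following numpy sketch recovers a coarse spectrum for one pixel from six filter responses using regularized least squares; the sensitivity matrix, band count, and regularization weight are assumptions for illustration, not the patented method.

```python
import numpy as np

# Assumed setup: 6 filter responses per pixel, 31 wavelength bands to recover.
rng = np.random.default_rng(0)
S = rng.random((6, 31))            # filter sensitivities (known from calibration)
true_spectrum = rng.random(31)     # ground-truth spectrum of one pixel
responses = S @ true_spectrum      # the six measured responses

lam = 1e-2                         # Tikhonov regularization weight
recovered = np.linalg.solve(S.T @ S + lam * np.eye(31), S.T @ responses)
print(recovered.shape)             # (31,) -> one spectral slice of the hypercube
```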
  • Publication number: 20190042236
    Abstract: An apparatus and method for performing multiply-accumulate operations.
    Type: Application
    Filed: January 24, 2018
    Publication date: February 7, 2019
    Inventors: Alexander Heinecke, Dipankar Das, Robert Valentine, Mark Charney
  • Publication number: 20190042242
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: March 29, 2018
    Publication date: February 7, 2019
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
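The variable-precision accumulation can be pictured with a short Python sketch; it is only a behavioural illustration of multiplying 8-bit elements by 2-bit elements into a 32-bit accumulator, not the instruction's actual semantics.

```python
import numpy as np

def asymmetric_fma_lane(dst, src1_bytes, src2_crumbs):
    """Accumulate products of 8-bit values (src1) and 2-bit values (src2) into dst."""
    acc = np.int32(dst)
    for a, b in zip(src1_bytes, src2_crumbs):
        acc += np.int32(a) * np.int32(b)   # product accumulated with prior contents
    return acc

print(asymmetric_fma_lane(10, [3, -5, 7, 2], [1, 2, 0, 3]))   # 10 + 3 - 10 + 0 + 6 = 9
```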
  • Publication number: 20190042939
    Abstract: The present disclosure relates generally to techniques for improving the implementation of certain operations on an integrated circuit. In particular, deep learning techniques, which may use a deep neural network (DNN) topology, may be implemented more efficiently using low-precision weights and activation values by efficiently performing down conversion of data to a lower precision and by preventing data overflow during suitable computations. Further, by more efficiently mapping multipliers to programmable logic on the integrated circuit device, the resources used by the DNN topology to perform, for example, inference tasks may be reduced, resulting in improved integrated circuit operating speeds.
    Type: Application
    Filed: May 31, 2018
    Publication date: February 7, 2019
    Inventors: Martin Langhammer, Sudarshan Srinivasan, Gregg William Baeckler, Duncan Moss, Sasikanth Avancha, Dipankar Das
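One piece of the abstract, down-converting wide accumulator values to a lower precision without overflow, can be sketched as saturating rescaling; the scale value and int8 target are assumptions made for illustration.

```python
import numpy as np

def downconvert_saturate(acc32, scale):
    """Rescale a wide accumulator and clamp so the narrow result cannot overflow int8."""
    scaled = np.round(acc32.astype(np.float64) / scale)
    return np.clip(scaled, -128, 127).astype(np.int8)

acc = np.array([70000, -900, 12, 40000], dtype=np.int32)
print(downconvert_saturate(acc, scale=256))   # [127  -4   0 127] after saturation
```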
  • Publication number: 20180322382
    Abstract: One embodiment provides for a machine-learning accelerator device comprising a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit, the compute unit including a set of functional units, each functional unit to execute at least one of the parallel threads of the instruction stream. The compute unit includes compute logic configured to execute a single instruction to scale an input tensor associated with a layer of a neural network according to a scale factor, the input tensor stored in a floating-point data type, the compute logic to scale the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dipankar Das
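The scaling described above can be approximated in a few lines; the scale-factor heuristic below is an assumption, not the patented instruction, but it shows how a float32 tensor can be mapped into the representable range of float16.

```python
import numpy as np

def to_scaled_fp16(x):
    scale = max(np.max(np.abs(x)) / 60000.0, 1.0)    # keep values inside fp16 max (~65504)
    return (x / scale).astype(np.float16), scale     # keep the scale to undo the mapping later

x = np.array([1.5e5, -3.2e4, 7.0, 0.25], dtype=np.float32)
x16, s = to_scaled_fp16(x)
print(x16, s)
```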
  • Publication number: 20180322390
    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Roger Gramunt, Mikhail Smelyanskiy, Jesus Corbal, Dheevatsa Mudigere, Naveen K. Mellempudi, Alexander F. Heinecke
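A behavioural sketch of the unequal-bit-length idea: one 16-bit operand, one 8-bit operand, and an output emitted at the smaller width. Widening to a 32-bit accumulator and saturating on output are assumptions made for illustration.

```python
import numpy as np

def mixed_width_matmul(a16, b8):
    wide = a16.astype(np.int32) @ b8.astype(np.int32)   # compute in a wide accumulator
    return np.clip(wide, -128, 127).astype(np.int8)     # emit at the smaller bit-length

a = np.array([[300, -2], [1, 4]], dtype=np.int16)
b = np.array([[1, 0], [2, 3]], dtype=np.int8)
print(mixed_width_matmul(a, b))   # [[127 -6] [9 12]] after saturation to 8 bits
```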
  • Publication number: 20180322607
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic; a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
    Type: Application
    Filed: January 29, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
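Dynamic fixed-point can be pictured as an integer mantissa plus an adjustable number of fractional bits, with bits dropped before a multiply whenever the worst case would overflow the accumulator. The sketch below is a software analogy under assumed bit widths, not the hardware logic itself.

```python
import numpy as np

def quantize(x, frac_bits):
    return np.round(x * (1 << frac_bits)).astype(np.int32), frac_bits

def safe_fixed_matmul(a, a_bits, b, b_bits, acc_limit=2**31 - 1):
    # Worst-case magnitude of one accumulated dot product.
    worst = int(np.max(np.abs(a.astype(np.int64))) *
                np.max(np.abs(b.astype(np.int64))) * a.shape[1])
    while worst > acc_limit:               # drop one fractional bit until it is safe
        a, a_bits, worst = a >> 1, a_bits - 1, worst >> 1
    return a.astype(np.int64) @ b.astype(np.int64), a_bits + b_bits

a, a_bits = quantize(np.random.rand(4, 4), 15)
b, b_bits = quantize(np.random.rand(4, 4), 15)
print(safe_fixed_matmul(a, a_bits, b, b_bits)[1])   # fractional bits left after any adjustment
```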
  • Publication number: 20180322387
    Abstract: One embodiment provides for a system to compute and distribute data for distributed training of a neural network, the system including first memory to store a first set of instructions including a machine learning framework; a fabric interface to enable transmission and receipt of data associated with the set of trainable machine learning parameters; a first set of general-purpose processor cores to execute the first set of instructions, the first set of instructions to provide a training workflow for computation of gradients for the trainable machine learning parameters and to communicate with a second set of instructions, the second set of instructions to facilitate transmission and receipt of the gradients via the fabric interface; and a graphics processor to perform compute operations associated with the training workflow to generate the gradients for the trainable machine learning parameters.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
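Conceptually, the compute/communicate split above amounts to each worker producing gradients and an allreduce making them identical everywhere. The single-process simulation below stands in for the fabric interface; it is a sketch, not the patented system.

```python
import numpy as np

def allreduce_average(per_worker_grads):
    """Average gradients across workers, as an allreduce over a fabric would."""
    return np.sum(per_worker_grads, axis=0) / len(per_worker_grads)

grads = [np.random.randn(3) for _ in range(4)]   # gradients from 4 simulated workers
params = np.zeros(3)
params -= 0.01 * allreduce_average(grads)        # every worker applies the identical update
print(params)
```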
  • Publication number: 20180322606
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
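The halo-region exchange can be shown on a 1-D toy example: each simulated node sends one boundary element to its neighbour so a 3-tap convolution can be computed locally. The partitioning and kernel are illustrative only.

```python
import numpy as np

feature = np.arange(10.0)
left, right = feature[:5], feature[5:]     # feature map partitioned across two "nodes"
kernel = np.array([1.0, 0.0, -1.0])

# Exchange halos: left needs right's first element, right needs left's last element.
left_padded = np.concatenate([left, right[:1]])
right_padded = np.concatenate([left[-1:], right])

out = np.concatenate([np.convolve(left_padded, kernel, mode="valid"),
                      np.convolve(right_padded, kernel, mode="valid")])
print(np.allclose(out, np.convolve(feature, kernel, mode="valid")))   # True
```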
  • Publication number: 20180314940
    Abstract: One embodiment provides for a computing device comprising a parallel processor compute unit to perform a set of parallel integer compute operations; a ternarization unit including a weight ternarization circuit and an activation quantization circuit; wherein the weight ternarization circuit is to convert a weight tensor from a floating-point representation to a ternary representation including a ternary weight and a scale factor; wherein the activation quantization circuit is to convert an activation tensor from a floating-point representation to an integer representation; and wherein the parallel processor compute unit includes one or more circuits to perform the set of parallel integer compute operations on the ternary representation of the weight tensor and the integer representation of the activation tensor.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 1, 2018
    Applicant: Intel Corporation
    Inventors: Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das
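Weight ternarization in the spirit of the abstract (a {-1, 0, +1} tensor plus a scale factor, with activations quantized to integers) can be sketched as below; the 0.7 threshold heuristic and 8-bit activations are common choices assumed here, not necessarily the patented circuit's.

```python
import numpy as np

def ternarize(w):
    threshold = 0.7 * np.mean(np.abs(w))            # assumed heuristic
    t = np.where(np.abs(w) > threshold, np.sign(w), 0.0)
    scale = np.abs(w[t != 0]).mean() if np.any(t) else 0.0
    return t.astype(np.int8), scale                 # w is approximated by scale * t

def quantize_activations(a, bits=8):
    s = np.max(np.abs(a)) / (2 ** (bits - 1) - 1)
    return np.round(a / s).astype(np.int8), s

w, act = np.random.randn(16), np.random.randn(16)
t, ws = ternarize(w)
q, qs = quantize_activations(act)
approx = ws * qs * np.dot(t.astype(np.int32), q.astype(np.int32))   # integer-only inner kernel
print(approx, np.dot(w, act))                        # roughly equal
```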
  • Publication number: 20180293493
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
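A toy version of the cost estimate is a classic alpha-beta model: per-endpoint message overhead plus per-byte transfer time, minimized over candidate endpoint counts. The constants and the model itself are assumptions, not the patented scheme.

```python
ALPHA = 2e-6   # per-message latency in seconds (assumed)
BETA = 1e-9    # per-byte transfer time in seconds (assumed)

def transfer_cost(total_bytes, endpoints):
    # More endpoints split the payload, but each adds fixed message overhead.
    return endpoints * ALPHA + (total_bytes / endpoints) * BETA

best = min(range(1, 17), key=lambda e: transfer_cost(256 * 1024 * 1024, e))
print(best)    # endpoint count with the lowest modeled cost
```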
  • Publication number: 20180293492
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20180285733
    Abstract: Technologies for artificial neural network training include a computing node with a host fabric interface that sends a message that includes one or more artificial neural network training algorithm values to another computing node in response to receipt of a request to send the message. Prior to sending the message, the host fabric interface may receive a request to quantize the message and quantize the message based on a quantization level included in the request to generate a quantized message. The quantized message includes one or more quantized values such that each quantized value has a lower precision than a corresponding artificial neural network training algorithm value. The host fabric interface then transmits the quantized message, which includes metadata indicative of the quantization level, to another computing node in response to quantization of the message for artificial neural network training. Other embodiments are described and claimed.
    Type: Application
    Filed: April 1, 2017
    Publication date: October 4, 2018
    Inventors: Naveen K. Mellempudi, Srinivas Sridharan, Dheevatsa Mudigere, Dipankar Das
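The quantize-on-send step can be approximated as follows: the sender reduces float32 values to low-precision integers at a requested quantization level and attaches that level (and scale) as metadata so the receiver can reconstruct approximate values. Field names are hypothetical.

```python
import numpy as np

def quantize_message(values, bits):
    scale = float(np.max(np.abs(values))) / (2 ** (bits - 1) - 1) or 1.0
    payload = np.round(values / scale).astype(np.int8)
    return {"bits": bits, "scale": scale, "payload": payload}   # metadata travels with the data

def dequantize_message(msg):
    return msg["payload"].astype(np.float32) * msg["scale"]

grads = np.array([0.12, -0.03, 0.5, -0.44], dtype=np.float32)
print(dequantize_message(quantize_message(grads, bits=8)))      # close to the originals
```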
  • Publication number: 20180045569
    Abstract: A spectral imaging system includes a spectrometer and an optics imaging system. The spectrometer is operable for generating spectral signatures of objects from a scene. The optics imaging system is operable to generate six or more responses from the same scene. Each of the six or more responses represents different spectral content of the objects in the scene. The responses generated by the optics imaging system can be used to generate a hypercube using spectral reconstruction techniques. In an embodiment, the spectral imaging system could be implemented as part of a mobile phone.
    Type: Application
    Filed: August 14, 2017
    Publication date: February 15, 2018
    Applicant: SPECTRAL INSIGHTS PRIVATE LIMITED
    Inventors: Sumit Nath, Dipankar Das, Suhash Gerald
  • Patent number: 9594129
    Abstract: The present invention discloses a highly sensitive magnetic heterojunction device consisting of a composite comprising a ferromagnetic La0.66Sr0.34MnO3 (LSMO) layer with an ultra-thin ferrimagnetic CoFe2O4 (CFO) layer capable of giant resistive switching (RS), which can be tuned at microtesla magnetic fields at room temperature.
    Type: Grant
    Filed: June 25, 2012
    Date of Patent: March 14, 2017
    Assignee: COUNCIL OF SCIENTIFIC & INDUSTRIAL RESEARCH
    Inventors: Satishchandra Balkrishna Ogale, Dipankar Das Sarma, Abhimanyu Singh Rana, Vishal Prabhakar Thakare, Anil Kumar Puri
  • Patent number: 9430677
    Abstract: Methods and systems are provided for managing static memory associated with software of an embedded system. The method includes performing one or more steps on one or more processors. The steps include selectively assigning memory objects to static memory segments based on access of the memory object by the software; managing data of the memory segments based on the assigning; and selectively restoring the data of the memory segments based on the managing.
    Type: Grant
    Filed: July 10, 2012
    Date of Patent: August 30, 2016
    Assignee: GM Global Technology Operations LLC
    Inventor: Dipankar Das
  • Patent number: 8930036
    Abstract: An electrical network architecture including a reconfigurable interface layer, along with a corresponding reconfiguration methodology. The interface layer is comprised of reconfigurable interface devices which allow a plurality of sensors and actuators to communicate with a plurality of control units. Each sensor or actuator is connected to multiple interface devices, which in turn are connected to a bus. The control units are also connected to the bus. In the event of an interface device failure, other interface devices can be reconfigured to maintain communication between sensors, actuators and control units. In the event of a control unit failure, the interface devices can be reconfigured to route sensor and actuator message traffic to a different control unit which can handle the functions of the failed control unit. The overall number of control units can also be reduced, as each control unit has flexible access to many sensors and actuators.
    Type: Grant
    Filed: April 13, 2011
    Date of Patent: January 6, 2015
    Assignee: GM Global Technology Operations LLC
    Inventors: Dipankar Das, Vinod Kumar Agrawal, Seetharaman Rajappan
  • Publication number: 20140287534
    Abstract: The present invention discloses a highly sensitive magnetic heterojunction device consisting of a composite comprising a ferromagnetic La0.66Sr0.34MnO3 (LSMO) layer with an ultra-thin ferrimagnetic CoFe2O4 (CFO) layer capable of giant resistive switching (RS), which can be tuned at microtesla magnetic fields at room temperature.
    Type: Application
    Filed: June 25, 2012
    Publication date: September 25, 2014
    Applicant: COUNCIL OF SCIENTIFIC & INDUSTRIAL RESEARCH
    Inventors: Satishchandra Balkrishna Ogale, Dipankar Das Sarma, Abhimanyu Singh Rana, Vishal Prabhakar Thakare, Anil Kumar Puri
  • Patent number: 8806282
    Abstract: An apparatus for providing a data integrity field implementation in a data processing system includes a controller operative to interface between a host device and a destination device in the data processing system for transferring at least one data block therebetween. The data processing system further includes an error detection module associated with the controller. The error detection module is operative to determine a probability of an error occurrence based at least in part on a measured current error rate for the data processing system. The controller is operative to implement an error correction methodology which is selectively adaptable as a function of the probability of an error occurrence.
    Type: Grant
    Filed: February 16, 2012
    Date of Patent: August 12, 2014
    Assignee: LSI Corporation
    Inventors: Varun Shetty, Debjit Roy Choudhury, Dipankar Das, Ashank Reddy
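The selectively adaptable protection can be pictured as choosing a stronger (costlier) check only when the measured error rate pushes the estimated per-block error probability above a threshold; the scheme names and thresholds below are illustrative, not the patented methodology.

```python
def choose_protection(measured_bit_error_rate, block_bits):
    """Pick a protection level from an estimated per-block error probability."""
    p_block_error = 1.0 - (1.0 - measured_bit_error_rate) ** block_bits
    if p_block_error < 1e-6:
        return "checksum-only"
    if p_block_error < 1e-3:
        return "checksum-plus-retry"
    return "full-integrity-check"    # strongest, most expensive verification

print(choose_protection(1e-12, 4096), choose_protection(1e-7, 4096))
```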