Patents by Inventor Vikram Saletore

Vikram Saletore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Methods and apparatus for distributed training of a neural network

Patent number: 11966843

Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.

Type: Grant

Filed: June 13, 2022

Date of Patent: April 23, 2024

Assignee: Intel Corporation

Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore

INGESTION OF DATA FOR MACHINE LEARNING DISTRIBUTED TRAINING

Publication number: 20230274157

Abstract: Systems and methods include technology that identifies a first shard of shards of training data, where the training data is stored on a storage array, where the training data is divided into shards. The technology stores the first shard onto a first data storage of a first compute node, and copies the first shard from the first data storage of the first compute node onto a second data storage of the first compute node. The technology trains a machine learning model on the first shard stored in the second data storage during a first epoch of a training phase.

Type: Application

Filed: May 4, 2023

Publication date: August 31, 2023

Applicant: Intel Corporation

Inventors: Vikram Saletore, Yaser Afshar

METHODS AND APPARATUS FOR DISTRIBUTED TRAINING OF A NEURAL NETWORK

Publication number: 20220309349

Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.

Type: Application

Filed: June 13, 2022

Publication date: September 29, 2022

Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore

Automated resource usage configurations for deep learning neural network workloads on multi-generational computing architectures

Patent number: 11029971

Abstract: Systems, apparatuses and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.

Type: Grant

Filed: January 28, 2019

Date of Patent: June 8, 2021

Assignee: Intel Corporation

Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar
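
The abstract above describes choosing a compute node configuration that keeps the completion-time gap between slower and faster nodes small. A minimal sketch of that idea follows, assuming each node set has a known throughput in samples per second and that the "configuration" is simply a per-set batch size drawn from a candidate list. The function names, the throughput model, and the candidate list are illustrative assumptions and are not taken from the patent.

```python
from itertools import product


def completion_time(batch_size: int, throughput: float) -> float:
    """Seconds to process one batch, under an assumed linear throughput model."""
    return batch_size / throughput


def pick_configuration(slow_throughput: float, fast_throughput: float,
                       candidate_batches: list[int]) -> dict:
    """Search candidate batch sizes for the pair with the smallest time gap.

    Hypothetical helper: the patent abstract does not say how the
    configuration is searched or what parameters it contains.
    """
    best = None
    for slow_batch, fast_batch in product(candidate_batches, repeat=2):
        gap = abs(completion_time(slow_batch, slow_throughput)
                  - completion_time(fast_batch, fast_throughput))
        # Keep the first pair that achieves the smallest gap seen so far.
        if best is None or gap < best[0]:
            best = (gap, slow_batch, fast_batch)
    return {"slow_batch": best[1], "fast_batch": best[2], "gap_seconds": best[0]}


if __name__ == "__main__":
    # Slow nodes at 100 samples/s, fast nodes at 200 samples/s (made-up numbers).
    print(pick_configuration(100.0, 200.0, [32, 64, 128]))
```

With these made-up numbers, giving the faster set twice the batch size of the slower set equalizes their per-step completion times, which is the kind of balance the abstract refers to.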

Synchronization scheduler of distributed neural network training

Patent number: 10922610

Abstract: Systems, apparatuses and methods may provide for technology that conducts a first timing measurement of a blockage timing of a first window of the training of the neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.

Type: Grant

Filed: September 14, 2017

Date of Patent: February 16, 2021

Assignee: Intel Corporation

Inventors: Adam Procter, Vikram Saletore, Deepthi Karkada, Meenakshi Arunachalam

AUTOMATED RESOURCE USAGE CONFIGURATIONS FOR DEEP LEARNING NEURAL NETWORK WORKLOADS ON MULTI-GENERATIONAL COMPUTING ARCHITECTURES

Publication number: 20190155620

Abstract: Systems, apparatuses and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.

Type: Application

Filed: January 28, 2019

Publication date: May 23, 2019

Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar

SYNCHRONIZATION SCHEDULER OF DISTRIBUTED NEURAL NETWORK TRAINING

Publication number: 20190080233

Abstract: Systems, apparatuses and methods may provide for technology that conducts a first timing measurement of a blockage timing of a first window of the training of the neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.

Type: Application

Filed: September 14, 2017

Publication date: March 14, 2019

Inventors: Adam Procter, Vikram Saletore, Deepthi Karkada, Meenakshi Arunachalam

METHODS AND APPARATUS FOR DISTRIBUTED TRAINING OF A NEURAL NETWORK

Publication number: 20190042934

Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.

Type: Application

Filed: December 1, 2017

Publication date: February 7, 2019

Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore

Atomic transactions to non-volatile memory

Patent number: 9524219

Abstract: Durable atomic transactions for non-volatile media are described. A processor includes an interface to a non-volatile storage medium and a functional unit to perform instructions associated with an atomic transaction. The instructions are to update data at a set of addresses in the non-volatile storage medium atomically. The functional unit is operable to perform a first instruction to create the atomic transaction that declares a size of the data to be updated atomically. The functional unit is also operable to perform a second instruction to start execution of the atomic transaction. The functional unit is further operable to perform a third instruction to commit the atomic transaction to the set of addresses in the non-volatile storage medium, wherein the updated data is not visible to other functional units of the processing device until the atomic transaction is complete.

Type: Grant

Filed: September 27, 2013

Date of Patent: December 20, 2016

Assignee: Intel Corporation

Inventors: Robert Bahnsen, Sridharan Sakthivelu, Vikram A. Saletore, Krishnaswamy Viswanathan, Matthew E. Tolentino, Kanivenahalli Govindaraju, Vincent J. Zimmer
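
The abstract above describes three instructions: one that creates a transaction with a declared size, one that starts it, and one that commits it so that the updates become visible only on completion. The following is a toy software model of that sequence, not the processor mechanism itself. The class name, method names, and the use of a Python dict as the "non-volatile store" are all assumptions made for illustration, and a dict update gives neither hardware atomicity nor durability.

```python
class AtomicTransaction:
    """Toy model of the create / start / commit sequence (hypothetical API)."""

    def __init__(self, store: dict, size: int):
        # "Create": declare how many addresses will be updated atomically.
        self.store = store
        self.size = size
        self.staged = {}       # writes buffered here are not visible in store
        self.started = False

    def start(self) -> None:
        # "Start": begin accepting writes for this transaction.
        self.started = True

    def write(self, addr: int, value: int) -> None:
        if not self.started:
            raise RuntimeError("transaction not started")
        if addr not in self.staged and len(self.staged) >= self.size:
            raise ValueError("write exceeds declared transaction size")
        self.staged[addr] = value

    def commit(self) -> None:
        # "Commit": publish all staged writes to the store in one step.
        if not self.started:
            raise RuntimeError("transaction not started")
        self.store.update(self.staged)
        self.staged = {}
        self.started = False


if __name__ == "__main__":
    store = {0x10: 1, 0x18: 2}
    tx = AtomicTransaction(store, size=2)
    tx.start()
    tx.write(0x10, 7)
    print(store[0x10])   # still the old value before commit
    tx.write(0x18, 9)
    tx.commit()
    print(store)
```

Staging writes in a private buffer is one simple way to model the "not visible until complete" property; the patent abstract does not state how the hardware achieves it.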

ATOMIC TRANSACTIONS TO NON-VOLATILE MEMORY

Publication number: 20150095600

Abstract: Durable atomic transactions for non-volatile media are described. A processor includes an interface to a non-volatile storage medium and a functional unit to perform instructions associated with an atomic transaction. The instructions are to update data at a set of addresses in the non-volatile storage medium atomically. The functional unit is operable to perform a first instruction to create the atomic transaction that declares a size of the data to be updated atomically. The functional unit is also operable to perform a second instruction to start execution of the atomic transaction. The functional unit is further operable to perform a third instruction to commit the atomic transaction to the set of addresses in the non-volatile storage medium, wherein the updated data is not visible to other functional units of the processing device until the atomic transaction is complete.

Type: Application

Filed: September 27, 2013

Publication date: April 2, 2015

Inventors: Robert Bahnsen, Sridharan Sakthivelu, Vikram A. Saletore, Krishnaswamy Viswanathan, Matthew E. Tolentino, Kanivenahalli Govindaraju, Vincent J. Zimmer

Embedded transport acceleration architecture

Patent number: 7305493

Abstract: An apparatus and a system may include an adaptation module, a plurality of Direct Transport Interfaces (DTIs), a DTI accelerator, and a Transport Control Protocol/Internet Protocol (TCP/IP) accelerator. The adaptation module may provide a translated sockets call from an application program to one of the DTIs, where an included set of memory structures may couple the translated sockets call to the DTI accelerator, which may in turn couple the set of memory structures to the TCP/IP accelerator. An article may include data causing a machine to perform a method including: receiving an application program sockets call at the adaptation module, deriving a translated sockets call from the application program sockets call, receiving the translated sockets call at a DTI, coupling the translated sockets call to a DTI accelerator using a set of memory structures in the DTI, and coupling the set of memory structures to a TCP/IP accelerator.

Type: Grant

Filed: November 27, 2002

Date of Patent: December 4, 2007

Assignee: Intel Corporation

Inventors: Gary L. McAlpine, David B. Minturn, Hemal V. Shah, Annie Foong, Greg J. Regnier, Vikram A. Saletore

Sharing data in a user virtual address range with a kernel virtual address range

Patent number: 7290114

Abstract: Provided are a method, system, and program for sharing data in a user virtual address range with a kernel virtual address range. A user address in a user address space and length defining a user address range referencing physical locations in a memory are received. A determination is made of determining at least one page in the memory including the physical locations referenced by the user address range. For each determined page, one kernel address in a kernel address space is generated to reference the determined page, wherein at least one user address and at least one kernel address reference one page in the memory.

Type: Grant

Filed: November 17, 2004

Date of Patent: October 30, 2007

Assignee: Intel Corporation

Inventors: Paul M. Stillwell, Jr., Vikram A. Saletore
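
The abstract above starts from a user address and a length, determines which memory pages the range touches, and generates one kernel address per page. The arithmetic behind the page-determination step can be sketched as follows, assuming a 4096-byte page size and a contiguous block of kernel addresses starting at a made-up base. The page size, the function names, and the kernel base value are assumptions for illustration and are not taken from the patent.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the abstract


def pages_in_range(user_addr: int, length: int,
                   page_size: int = PAGE_SIZE) -> list[int]:
    """Return the page numbers touched by [user_addr, user_addr + length)."""
    if length <= 0:
        return []
    first = user_addr // page_size
    last = (user_addr + length - 1) // page_size
    return list(range(first, last + 1))


def map_kernel_addresses(user_addr: int, length: int, kernel_base: int,
                         page_size: int = PAGE_SIZE) -> dict[int, int]:
    """Assign one hypothetical kernel address to each page in the range."""
    pages = pages_in_range(user_addr, length, page_size)
    return {page: kernel_base + i * page_size for i, page in enumerate(pages)}


if __name__ == "__main__":
    # A 10-byte range starting 6 bytes before a page boundary spans two pages.
    print(pages_in_range(4090, 10))
    print(map_kernel_addresses(4090, 10, kernel_base=0xC0000000))
```

The example range is deliberately chosen to straddle a page boundary, since that is the case where a single user range needs more than one kernel address.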

Sharing data in a user virtual address range with a kernel virtual address range

Publication number: 20060107020

Abstract: Provided are a method, system, and program for sharing data in a user virtual address range with a kernel virtual address range. A user address in a user address space and length defining a user address range referencing physical locations in a memory are received. A determination is made of determining at least one page in the memory including the physical locations referenced by the user address range. For each determined page, one kernel address in a kernel address space is generated to reference the determined page, wherein at least one user address and at least one kernel address reference one page in the memory.

Type: Application

Filed: November 17, 2004

Publication date: May 18, 2006

Inventors: Paul Stillwell, Vikram Saletore

Packet processing

Publication number: 20060072563

Abstract: In general, the disclosure describes a variety of techniques that can enhance packet processing operations.

Type: Application

Filed: October 5, 2004

Publication date: April 6, 2006

Inventors: Greg Regnier, Vikram Saletore, Gary McAlpine, Ram Huggahalli, Ravishankar Iyer, Ramesh Illikkal, David Minturn, Donald Newell, Srihari Makineni

Distributed and dynamic content replication for server cluster acceleration

Publication number: 20050188055

Abstract: The present inventive subject matter relates to the field of network computing, and more specifically to methods, systems, and software for accelerated performance of server clusters, server farms, and server grids. Some such embodiments include methods, systems, and software, that when executing cause content into the memories of servers in a cluster, sharing the contents of the memories amongst all servers in the cluster over a high-speed interconnect to form a high-speed cluster-wide memory. Some such embodiments include servicing content requests from a server that may or may not have the requested content in its local memory, but is able to directly access the requested content in the memory of another server in the cluster over the high-speed cluster wide memory. One such embodiment includes caching the content obtained from the memory of the other server for use in servicing subsequent requests for that content.

Type: Application

Filed: December 31, 2003

Publication date: August 25, 2005

Inventor: Vikram Saletore

Embedded transport acceleration architecture

Publication number: 20040103225

Abstract: An apparatus and a system may include an adaptation module, a plurality of Direct Transport Interfaces (DTIs), a DTI accelerator, and a Transport Control Protocol/Internet Protocol (TCP/IP) accelerator. The adaptation module may provide a translated sockets call from an application program to one of the DTIs, where an included set of memory structures may couple the translated sockets call to the DTI accelerator, which may in turn couple the set of memory structures to the TCP/IP accelerator. An article may include data causing a machine to perform a method including: receiving an application program sockets call at the adaptation module, deriving a translated sockets call from the application program sockets call, receiving the translated sockets call at a DTI, coupling the translated sockets call to a DTI accelerator using a set of memory structures in the DTI, and coupling the set of memory structures to a TCP/IP accelerator.

Type: Application

Filed: November 27, 2002

Publication date: May 27, 2004

Applicant: Intel Corporation

Inventors: Gary L. McAlpine, David B. Minturn, Hemal V. Shah, Annie Foong, Greg J. Regnier, Vikram A. Saletore

Monolithic programmable gain-integrator stage

Patent number: 4438354

Abstract: A switched capacitor gain stage (110, 120) having a programmable gain factor. This gain factor is determined by the connection of desired gain determining components (14-17; 25-28) contained within a component array (100, 101). A sample and hold circuit (46) is provided for the storage of the error voltage of the entire gain-integrator stage. This stored error voltage (V_error) is inverted and integrated one time for each integration of the input voltage (V_in), thus eliminating the effects of the inherent offset voltages of the circuit from the output voltage (V_out).

Type: Grant

Filed: August 14, 1981

Date of Patent: March 20, 1984

Assignee: American Microsystems, Incorporated

Inventors: Yusuf A. Haque, Vikram Saletore, Jeffrey A. Schuler

Digital to analog and analog to digital converters with bipolar output signals

Patent number: 4431986

Abstract: A digital to analog converter (100) utilizes a current mirror connected to a reference voltage (V_REF) to generate a constant reference current (I_REF). A voltage divider (R_1 and R_2) is used in conjunction with a plurality of MOS transistors (X_1-X_N) serving as current mirrors having specific current carrying capabilities which are controlled by selected binary digits (bits) of a digital signal. By the appropriate connection of desired ones of said plurality of MOS transistors, a specific fraction of said reference current is caused to flow through said plurality of MOS transistors. The amount of current flowing through said plurality of MOS transistors generates an output voltage (V_OUT) from the digital to analog converter of this invention. This output voltage may be positive or negative with respect to the reference voltage, thus the output voltage is bipolar.

Type: Grant

Filed: October 9, 1981

Date of Patent: February 14, 1984

Assignee: American Microsystems, Incorporated

Inventors: Yusuf A. Haque, Vikram Saletore, Jeffrey A. Schuler