Patents by Inventor Vikram Saletore

Vikram Saletore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966843
    Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.
    Type: Grant
    Filed: June 13, 2022
    Date of Patent: April 23, 2024
    Assignee: Intel Corporation
    Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore
  • Publication number: 20230274157
    Abstract: Systems and methods include technology that identifies a first shard of shards of training data, where the training data is stored on a storage array, where the training data is divided into shards. The technology stores the first shard onto a first data storage of a first compute node, and copies the first shard from the first data storage of the first compute node onto a second data storage of the first compute node. The technology trains a machine learning model on the first shard stored in the second data storage during a first epoch of a training phase.
    Type: Application
    Filed: May 4, 2023
    Publication date: August 31, 2023
    Applicant: Intel Corporation
    Inventors: Vikram Saletore, Yaser Afshar
  • Publication number: 20220309349
    Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.
    Type: Application
    Filed: June 13, 2022
    Publication date: September 29, 2022
    Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore
  • Patent number: 11029971
    Abstract: Systems, apparatuses and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: June 8, 2021
    Assignee: Intel Corporation
    Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar
  • Patent number: 10922610
    Abstract: Systems, apparatuses and methods may provide for technology that conducts a first timing measurement of a blockage timing of a first window of the training of the neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: February 16, 2021
    Assignee: Intel Corporation
    Inventors: Adam Procter, Vikram Saletore, Deepthi Karkada, Meenakshi Arunachalam
  • Publication number: 20190155620
    Abstract: Systems, apparatuses and methods may provide for technology that identifies a first set of compute nodes and a second set of compute nodes, wherein the first set of compute nodes execute more slowly than the second set of compute nodes. The technology may also automatically determine a compute node configuration that results in a relatively low difference in completion time between the first set of compute nodes and the second set of compute nodes with respect to a neural network workload. In an example, the technology applies the compute node configuration to an execution of the neural network workload on one or more nodes in the first set of compute nodes and one or more nodes in the second set of compute nodes.
    Type: Application
    Filed: January 28, 2019
    Publication date: May 23, 2019
    Inventors: Meenakshi Arunachalam, Kushal Datta, Vikram Saletore, Vishal Verma, Deepthi Karkada, Vamsi Sripathi, Rahul Khanna, Mohan Kumar
  • Publication number: 20190080233
    Abstract: Systems, apparatuses and methods may provide for technology that conducts a first timing measurement of a blockage timing of a first window of the training of the neural network. The blockage timing measures a time that processing is impeded at layers of the neural network during the first window of the training due to synchronization of one or more synchronizing parameters of the layers. Based upon the first timing measurement, the technology is to determine whether to modify a synchronization barrier policy to include a synchronization barrier to impede synchronization of one or more synchronizing parameters of one of the layers during a second window of the training. The technology is further to impede the synchronization of the one or more synchronizing parameters of the one of the layers during the second window if the synchronization barrier policy is modified to include the synchronization barrier.
    Type: Application
    Filed: September 14, 2017
    Publication date: March 14, 2019
    Inventors: Adam Procter, Vikram Saletore, Deepthi Karkada, Meenakshi Arunachalam
  • Publication number: 20190042934
    Abstract: Methods, apparatus, systems and articles of manufacture for distributed training of a neural network are disclosed. An example apparatus includes a neural network trainer to select a plurality of training data items from a training data set based on a toggle rate of each item in the training data set. A neural network parameter memory is to store neural network training parameters. A neural network processor is to generate training data results from distributed training over multiple nodes of the neural network using the selected training data items and the neural network training parameters. The neural network trainer is to synchronize the training data results and to update the neural network training parameters.
    Type: Application
    Filed: December 1, 2017
    Publication date: February 7, 2019
    Inventors: Meenakshi Arunachalam, Arun Tejusve Raghunath Rajan, Deepthi Karkada, Adam Procter, Vikram Saletore
  • Patent number: 9524219
    Abstract: Durable atomic transactions for non-volatile media are described. A processor includes an interface to a non-volatile storage medium and a functional unit to perform instructions associated with an atomic transaction. The instructions are to update data at a set of addresses in the non-volatile storage medium atomically. The functional unit is operable to perform a first instruction to create the atomic transaction that declares a size of the data to be updated atomically. The functional unit is also operable to perform a second instruction to start execution of the atomic transaction. The functional unit is further operable to perform a third instruction to commit the atomic transaction to the set of addresses in the non-volatile storage medium, wherein the updated data is not visible to other functional units of the processing device until the atomic transaction is complete.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 20, 2016
    Assignee: Intel Corporation
    Inventors: Robert Bahnsen, Sridharan Sakthivelu, Vikram A. Saletore, Krishnaswamy Viswanathan, Matthew E. Tolentino, Kanivenahalli Govindaraju, Vincent J. Zimmer
  • Publication number: 20150095600
    Abstract: Durable atomic transactions for non-volatile media are described. A processor includes an interface to a non-volatile storage medium and a functional unit to perform instructions associated with an atomic transaction. The instructions are to update data at a set of addresses in the non-volatile storage medium atomically. The functional unit is operable to perform a first instruction to create the atomic transaction that declares a size of the data to be updated atomically. The functional unit is also operable to perform a second instruction to start execution of the atomic transaction. The functional unit is further operable to perform a third instruction to commit the atomic transaction to the set of addresses in the non-volatile storage medium, wherein the updated data is not visible to other functional units of the processing device until the atomic transaction is complete.
    Type: Application
    Filed: September 27, 2013
    Publication date: April 2, 2015
    Inventors: Robert Bahnsen, Sridharan Sakthivelu, Vikram A. Saletore, Krishnaswamy Viswanathan, Matthew E. Tolentino, Kanivenahalli Govindaraju, Vincent J. Zimmer
  • Patent number: 7305493
    Abstract: An apparatus and a system may include an adaptation module, a plurality of Direct Transport Interfaces (DTIs), a DTI accelerator, and a Transmission Control Protocol/Internet Protocol (TCP/IP) accelerator. The adaptation module may provide a translated sockets call from an application program to one of the DTIs, where an included set of memory structures may couple the translated sockets call to the DTI accelerator, which may in turn couple the set of memory structures to the TCP/IP accelerator. An article may include data causing a machine to perform a method including: receiving an application program sockets call at the adaptation module, deriving a translated sockets call from the application program sockets call, receiving the translated sockets call at a DTI, coupling the translated sockets call to a DTI accelerator using a set of memory structures in the DTI, and coupling the set of memory structures to a TCP/IP accelerator.
    Type: Grant
    Filed: November 27, 2002
    Date of Patent: December 4, 2007
    Assignee: Intel Corporation
    Inventors: Gary L. McAlpine, David B. Minturn, Hemal V. Shah, Annie Foong, Greg J. Regnier, Vikram A. Saletore
  • Patent number: 7290114
    Abstract: Provided are a method, system, and program for sharing data in a user virtual address range with a kernel virtual address range. A user address in a user address space and a length defining a user address range referencing physical locations in a memory are received. A determination is made of at least one page in the memory that includes the physical locations referenced by the user address range. For each determined page, one kernel address in a kernel address space is generated to reference the determined page, wherein at least one user address and at least one kernel address reference one page in the memory.
    Type: Grant
    Filed: November 17, 2004
    Date of Patent: October 30, 2007
    Assignee: Intel Corporation
    Inventors: Paul M. Stillwell, Jr., Vikram A. Saletore
  • Publication number: 20060107020
    Abstract: Provided are a method, system, and program for sharing data in a user virtual address range with a kernel virtual address range. A user address in a user address space and a length defining a user address range referencing physical locations in a memory are received. A determination is made of at least one page in the memory that includes the physical locations referenced by the user address range. For each determined page, one kernel address in a kernel address space is generated to reference the determined page, wherein at least one user address and at least one kernel address reference one page in the memory.
    Type: Application
    Filed: November 17, 2004
    Publication date: May 18, 2006
    Inventors: Paul Stillwell, Vikram Saletore
  • Publication number: 20060072563
    Abstract: In general, the disclosure describes a variety of techniques that can enhance packet processing operations.
    Type: Application
    Filed: October 5, 2004
    Publication date: April 6, 2006
    Inventors: Greg Regnier, Vikram Saletore, Gary McAlpine, Ram Huggahalli, Ravishankar Iyer, Ramesh Illikkal, David Minturn, Donald Newell, Srihari Makineni
  • Publication number: 20050188055
    Abstract: The present inventive subject matter relates to the field of network computing, and more specifically to methods, systems, and software for accelerated performance of server clusters, server farms, and server grids. Some such embodiments include methods, systems, and software that, when executing, cause content to be loaded into the memories of servers in a cluster, sharing the contents of the memories amongst all servers in the cluster over a high-speed interconnect to form a high-speed cluster-wide memory. Some such embodiments include servicing content requests from a server that may or may not have the requested content in its local memory, but is able to directly access the requested content in the memory of another server in the cluster over the high-speed cluster-wide memory. One such embodiment includes caching the content obtained from the memory of the other server for use in servicing subsequent requests for that content.
    Type: Application
    Filed: December 31, 2003
    Publication date: August 25, 2005
    Inventor: Vikram Saletore
  • Publication number: 20040103225
    Abstract: An apparatus and a system may include an adaptation module, a plurality of Direct Transport Interfaces (DTIs), a DTI accelerator, and a Transmission Control Protocol/Internet Protocol (TCP/IP) accelerator. The adaptation module may provide a translated sockets call from an application program to one of the DTIs, where an included set of memory structures may couple the translated sockets call to the DTI accelerator, which may in turn couple the set of memory structures to the TCP/IP accelerator. An article may include data causing a machine to perform a method including: receiving an application program sockets call at the adaptation module, deriving a translated sockets call from the application program sockets call, receiving the translated sockets call at a DTI, coupling the translated sockets call to a DTI accelerator using a set of memory structures in the DTI, and coupling the set of memory structures to a TCP/IP accelerator.
    Type: Application
    Filed: November 27, 2002
    Publication date: May 27, 2004
    Applicant: Intel Corporation
    Inventors: Gary L. McAlpine, David B. Minturn, Hemal V. Shah, Annie Foong, Greg J. Regnier, Vikram A. Saletore
  • Patent number: 4438354
    Abstract: A switched capacitor gain stage (110, 120) having a programmable gain factor. This gain factor is determined by the connection of desired gain determining components (14-17; 25-28) contained within a component array (100, 101). A sample and hold circuit (46) is provided for the storage of the error voltage of the entire gain-integrator stage. This stored error voltage (V_error) is inverted and integrated one time for each integration of the input voltage (V_in), thus eliminating the effects of the inherent offset voltages of the circuit from the output voltage (V_out).
    Type: Grant
    Filed: August 14, 1981
    Date of Patent: March 20, 1984
    Assignee: American Microsystems, Incorporated
    Inventors: Yusuf A. Haque, Vikram Saletore, Jeffrey A. Schuler
  • Patent number: 4431986
    Abstract: A digital to analog converter (100) utilizes a current mirror connected to a reference voltage (V_REF) to generate a constant reference current (I_REF). A voltage divider (R_1 and R_2) is used in conjunction with a plurality of MOS transistors (X_1 to X_N) serving as current mirrors having specific current carrying capabilities which are controlled by selected binary digits (bits) of a digital signal. By the appropriate connection of desired ones of said plurality of MOS transistors, a specific fraction of said reference current is caused to flow through said plurality of MOS transistors. The amount of current flowing through said plurality of MOS transistors generates an output voltage (V_OUT) from the digital to analog converter of this invention. This output voltage may be positive or negative with respect to the reference voltage, thus the output voltage is bipolar.
    Type: Grant
    Filed: October 9, 1981
    Date of Patent: February 14, 1984
    Assignee: American Microsystems, Incorporated
    Inventors: Yusuf A. Haque, Vikram Saletore, Jeffrey A. Schuler