Patents by Inventor Nir Shavit

Nir Shavit has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200160182
    Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 21, 2020
    Applicant: Neuralmagic Inc.
    Inventors: Alexander Matveev, Nir Shavit, Aleksandar Zlateski
  • Publication number: 20190370071
    Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel, wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.
    Type: Application
    Filed: May 30, 2019
    Publication date: December 5, 2019
    Applicant: Neuralmagic Inc.
    Inventors: Alexander Matveev, Nir Shavit
  • Publication number: 20190156214
    Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.
    Type: Application
    Filed: November 16, 2018
    Publication date: May 23, 2019
    Inventors: Alexander Matveev, Nir Shavit
  • Publication number: 20190156215
    Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.
    Type: Application
    Filed: November 16, 2018
    Publication date: May 23, 2019
    Applicant: Neuralmagic Inc.
    Inventors: Alexander Matveev, Nir Shavit
  • Publication number: 20190138902
    Abstract: A system and method for an improved convolutional layer in convolutional neural networks is provided. The convolution is performed via a transformation that includes relocating input, relocating convolution filters and performing an aggregate matrix multiply.
    Type: Application
    Filed: November 6, 2018
    Publication date: May 9, 2019
    Applicant: Neuralmagic Inc.
    Inventors: Alexander Matveev, Nir Shavit
  • Patent number: 10070803
    Abstract: Spirometer apparatus comprising main inhale-exhale tube having first end, main interior, and second open end, a plurality of smaller tubes intersecting said main inhale-exhale tube at first and second respective locations and having a plurality of smaller interiors respectively, the first location being closer to the first end than is the second location, wherein each of the smaller interiors are in fluid communication with the main interior solely via at least one aperture formed in each of the intersecting tubes at locations facing said second end, the intersecting tubes having first and second external cross-sections, the main tube having first and second internal cross-sections, wherein said first external cross-section is smaller than said first internal cross-section, said second external cross-section is smaller than said second internal cross-section, and wherein said second external cross-section is smaller than said first external cross-section, and a differential pressure sensor sensing the pressur
    Type: Grant
    Filed: August 12, 2010
    Date of Patent: September 11, 2018
    Assignee: LUNGTEK LTD.
    Inventor: Nir Shavit
  • Publication number: 20120136271
    Abstract: Spirometer apparatus comprising main inhale-exhale tube having first end, main interior, and second open end, a plurality of smaller tubes intersecting said main inhale-exhale tube at first and second respective locations and having a plurality of smaller interiors respectively, the first location being closer to the first end than is the second location, wherein each of the smaller interiors are in fluid communication with the main interior solely via at least one aperture formed in each of the intersecting tubes at locations facing said second end, the intersecting tubes having first and second external cross-sections, the main tube having first and second internal cross-sections, wherein said first external cross-section is smaller than said first internal cross-section, said second external cross-section is smaller than said second internal cross-section, and wherein said second external cross-section is smaller than said first external cross-section, and a differential pressure sensor sensing the pressur
    Type: Application
    Filed: August 12, 2010
    Publication date: May 31, 2012
    Applicant: LUNGTEK LTD.
    Inventor: Nir Shavit
  • Patent number: 7836228
    Abstract: A scalable first-in-first-out queue implementation adjusts to load on a host system. The scalable FIFO queue implementation is lock-free and linearizable, and scales to large numbers of threads. The FIFO queue implementation includes a central queue and an elimination structure for eliminating enqueue-dequeue operation pairs. The elimination mechanism tracks enqueue operations and/or dequeue operations and eliminates without synchronizing on the FIFO queue implementation.
    Type: Grant
    Filed: October 15, 2004
    Date of Patent: November 16, 2010
    Assignee: Oracle America, Inc.
    Inventors: Mark Moir, Ori Shalev, Nir Shavit
  • Publication number: 20080109608
    Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
    Type: Application
    Filed: September 28, 2007
    Publication date: May 8, 2008
    Inventors: Nir Shavit, Mark Moir, Victor Luchangco
  • Publication number: 20080077748
    Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
    Type: Application
    Filed: September 28, 2007
    Publication date: March 27, 2008
    Inventors: Nir Shavit, Mark Moir, Victor Luchangco
  • Publication number: 20080077775
    Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
    Type: Application
    Filed: September 28, 2007
    Publication date: March 27, 2008
    Inventors: Nir Shavit, Mark Moir, Victor Luchangco
  • Publication number: 20080034166
    Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
    Type: Application
    Filed: September 28, 2007
    Publication date: February 7, 2008
    Inventors: Nir Shavit, Mark Moir, Victor Luchangco
  • Publication number: 20070198792
    Abstract: A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including different types of metadata enabling the processing threads to carry out simultaneous execution depending on a currently selected type of lock mode. A mode controller monitoring the processing threads initiates switching from one type of lock mode to another depending on current operating conditions such as an amount of contention amongst the multiple processing threads to modify the shared data. The mode controller can switch from one lock mode to another regardless of whether any of the multiple processes are in the midst of executing a respective transaction. A most efficient lock mode can be selected to carry out the parallel transactions. In certain cases, switching of lock modes causes one or more of the processing threads to abort and retry a respective transaction according to the new mode.
    Type: Application
    Filed: June 27, 2006
    Publication date: August 23, 2007
    Inventors: David Dice, Nir Shavit
  • Publication number: 20070198979
    Abstract: For each of multiple processes executing in parallel, as long as corresponding version information associated with a respective set of one or more shared variables used for computational purposes has not changed during execution of a respective transaction, results of the respective transaction can be globally committed to memory without causing data corruption. If version information associated with one or more respective shared variables (used to produce the transaction results) happens to change during a process of generating respective results, then a respective process can identify that another process modified the one or more respective shared variables during execution and that its transaction results should not be committed to memory. In this latter case, the transaction repeats itself until it is able to commit respective results without causing data corruption.
    Type: Application
    Filed: June 27, 2006
    Publication date: August 23, 2007
    Inventors: David Dice, Nir Shavit
  • Publication number: 20070198519
    Abstract: The present disclosure describes a unique way for each of multiple processes to operate in parallel using (e.g., reading, modifying, and writing to) the same shared data without causing corruption to the shared data. For example, each of multiple processes utilizes current and past data values associated with a global counter or clock for purposes of determining whether any shared variables used to produce a respective transaction outcome were modified (by another process) when executing a respective transaction. If a respective process detects that shared data used by respective process was modified during a transaction, the process can abort and retry the transaction rather than cause data corruption by storing locally maintained results associated with the transaction to a globally shared data space.
    Type: Application
    Filed: June 27, 2006
    Publication date: August 23, 2007
    Inventors: David Dice, Ori Shalev, Nir Shavit
  • Publication number: 20070198978
    Abstract: A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including i) shared data utilized by the multiple processing threads, ii) a globally accessible register or buffer of version information that changes each time a respective one of the multiple processing threads modifies the shared data, and iii) respective lock information indicating whether one of the multiple processing threads has locked the shared data preventing other processing threads from modifying the shared data. To prevent data corruption, each of the processing threads aborts if a given processing thread detects a change in the version information or another processing thread has a lock on the shared data. This technique is well suited for use in applications such as processing threads that support a high number of reads with a corresponding number of fewer respective writes to shared data.
    Type: Application
    Filed: June 27, 2006
    Publication date: August 23, 2007
    Inventors: David Dice, Nir Shavit
  • Publication number: 20070198781
    Abstract: Cache logic associated with a respective one of multiple processing threads executing in parallel updates corresponding data fields of a cache to uniquely mark its contents. The marked contents represent a respective read set for a transaction. For example, at an outset of executing a transaction, a respective processing thread chooses a data value to mark contents of the cache used for producing a transaction outcome for the processing thread. Upon each read of shared data from main memory, the cache stores a copy of the data and marks it as being used during execution of the processing thread. If uniquely marked contents of a respective cache line happen to be displaced (e.g., overwritten) during execution of a processing thread, then the transaction is aborted (rather than being committed to main memory) because there is a possibility that another transaction overwrote a shared data value used during the respective transaction.
    Type: Application
    Filed: July 18, 2006
    Publication date: August 23, 2007
    Inventors: David Dice, Nir Shavit
  • Publication number: 20070157202
    Abstract: One embodiment of the present invention provides a system that ensures that progress is made in an environment that supports execution of obstruction-free operations. During execution, when a process p_i invokes an operation, the system checks a panic flag, which indicates whether a progress-ensuring mechanism is to be activated. If the panic flag is set, the progress-ensuring mechanism is activated, which causes the system to attempt to perform the operation by coordinating actions between processes to ensure that progress is made in spite of contention between the processes. On the other hand, if the panic flag is not set, the system attempts to perform the operation essentially as if the progress-ensuring mechanism were not present. In this case, if there is an indication that contention between processes is impeding progress, the system sets the panic flag, which causes the progress-ensuring mechanism to be activated so that processes will coordinate their actions to ensure that progress is made.
    Type: Application
    Filed: January 3, 2006
    Publication date: July 5, 2007
    Inventors: Mark Moir, Victor Luchangco, Nir Shavit
  • Publication number: 20060123156
    Abstract: Producer and consumer processes may synchronize and transfer data using a shared data structure. After locating a potential transfer location that indicates an EMPTY status, a producer may store data to be transferred in the transfer location. A producer may use a compare-and-swap (CAS) operation to store the transfer data to the transfer location. A consumer may subsequently read the transfer data from the transfer location and store, such as by using a CAS operation, a DONE status indicator in the transfer location. The producer may notice the DONE indication and may then set the status location back to EMPTY to indicate that the location is available for future transfers, by the same or a different producer. The producer may also monitor the transfer location and time out if no consumer has picked up the transfer data.
    Type: Application
    Filed: January 4, 2006
    Publication date: June 8, 2006
    Applicant: Sun Microsystems, Inc.
    Inventors: Mark Moir, Daniel Nussbaum, Ori Shalev, Nir Shavit
  • Publication number: 20050132374
    Abstract: A multiprocessor, multi-program, stop-the-world garbage collection program is described. The system initially over-partitions the root sources, and then iteratively employs static and dynamic work balancing. Garbage collection threads compete dynamically for the initial partitions. Work-stealing double-ended queues, where contention is reduced, are described to provide dynamic load balancing among the threads. Contention is resolved by using atomic instructions. The heap is broken into a young and an old generation where parallel semi-space copying is used to collect the young generation and parallel mark-compacting the old generation. Speed and efficiency of collection is enhanced by use of card tables and linking objects, and overflow conditions are efficiently handled by linking using class pointers. A garbage collection termination employs a global status word.
    Type: Application
    Filed: November 23, 2004
    Publication date: June 16, 2005
    Applicant: Sun Microsystems, Inc.
    Inventors: Christine Flood, David Detlefs, Nir Shavit, Xiaolan Zhang, Ole Agesen
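
Illustrative sketch (not taken from any of the filings): several of the 2006 abstracts above, notably publication 20070198519, describe transactions that compare a global counter or clock against per-variable version information and abort and retry when shared data changed underneath them. The Python below is one minimal way to picture that idea, under loud assumptions: the names GlobalClock, VersionedVar, Transaction, and atomically are hypothetical, and a single coarse lock stands in for whatever finer-grained locking a real design would use.

```python
import threading


class AbortTransaction(Exception):
    """Raised when a transaction sees data newer than its starting clock value."""


class GlobalClock:
    """Hypothetical global counter, advanced once per committing writer."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def tick(self):
        with self._lock:
            self._value += 1
            return self._value


class VersionedVar:
    """Shared variable stamped with the clock value of its last committed write."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self, clock, commit_lock):
        self.clock = clock
        self.commit_lock = commit_lock  # assumption: one coarse lock for all variables
        self.start = clock.read()       # clock value sampled when the transaction begins
        self.read_set = set()
        self.writes = {}                # locally buffered results, published only on commit

    def read(self, var):
        if var in self.writes:
            return self.writes[var]
        with self.commit_lock:          # snapshot value and version together
            value, version = var.value, var.version
        if version > self.start:        # someone committed after we started
            raise AbortTransaction()
        self.read_set.add(var)
        return value

    def write(self, var, value):
        self.writes[var] = value

    def commit(self):
        with self.commit_lock:
            # Re-validate every variable that was read before publishing anything.
            for var in self.read_set:
                if var.version > self.start:
                    raise AbortTransaction()
            new_version = self.clock.tick()
            for var, value in self.writes.items():
                var.value = value
                var.version = new_version


def atomically(clock, commit_lock, body):
    """Run body(tx) and commit, retrying from scratch whenever the transaction aborts."""
    while True:
        tx = Transaction(clock, commit_lock)
        try:
            result = body(tx)
            tx.commit()
            return result
        except AbortTransaction:
            continue
```

A caller would wrap each unit of work in atomically(clock, lock, body), where body reads and writes VersionedVar objects only through the tx argument. Buffering writes locally until commit matches the abstract's description of aborting "rather than cause data corruption by storing locally maintained results".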