Patents by Inventor Nir Shavit
Nir Shavit has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20200160182
Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns. The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.
Type: Application
Filed: January 24, 2020
Publication date: May 21, 2020
Applicant: Neuralmagic Inc.
Inventors: Alexander Matveev, Nir Shavit, Aleksandar Zlateski
-
Publication number: 20190370071
Abstract: A method and system for computing one or more outputs of a neural network having a plurality of layers is provided. The method and system can include determining a plurality of sub-computations from total computations of the neural network to execute in parallel, wherein the computations to execute in parallel involve computations from multiple layers. The method and system can also include avoiding repeating overlapped computations and/or multiple memory reads and writes during execution.
Type: Application
Filed: May 30, 2019
Publication date: December 5, 2019
Applicant: Neuralmagic Inc.
Inventors: Alexander Matveev, Nir Shavit
-
Publication number: 20190156214
Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.
Type: Application
Filed: November 16, 2018
Publication date: May 23, 2019
Inventors: Alexander Matveev, Nir Shavit
-
Publication number: 20190156215
Abstract: Systems and methods may make exchanging data in a neural network (NN) during training more efficient. Exchanging weights among a number of processors training a NN across iterations may include sorting generated weights, compressing the sorted weights, and transmitting the compressed sorted weights. On each Kth iteration a sort order of the sorted weights may be created and transmitted. Exchanging weights among processors training a NN may include executing a forward pass to produce a set of loss values for processors, transmitting loss values to other processors, and at each of the processors, performing backpropagation on at least one layer of the NN using loss values received from other processors.
Type: Application
Filed: November 16, 2018
Publication date: May 23, 2019
Applicant: Neuralmagic Inc.
Inventors: Alexander Matveev, Nir Shavit
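The weight exchange described in the abstract above (sort, compress, transmit, with the sort order refreshed on each Kth iteration) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the applicants' method: the abstract does not name a compression scheme, so zlib over packed doubles is an assumption, and the names `K`, `make_order`, `encode`, `decode` and `exchange` are hypothetical. The sender and receiver are simulated in a single process.

```python
import struct
import zlib

K = 4  # hypothetical refresh period for the sort order


def make_order(weights):
    """Return the indices that sort the weights in ascending order."""
    return sorted(range(len(weights)), key=weights.__getitem__)


def encode(weights, order):
    """Permute the weights by `order`, then pack and compress them.

    zlib is an assumed stand-in; the abstract names no compressor.
    """
    permuted = [weights[i] for i in order]
    raw = struct.pack(f"<{len(permuted)}d", *permuted)
    return zlib.compress(raw)


def decode(payload, order):
    """Decompress the payload and undo the permutation."""
    raw = zlib.decompress(payload)
    permuted = struct.unpack(f"<{len(order)}d", raw)
    weights = [0.0] * len(order)
    for pos, idx in enumerate(order):
        weights[idx] = permuted[pos]
    return weights


def exchange(iterations):
    """Simulate one sender/receiver pair; yield the receiver's view."""
    order = None
    for step, weights in enumerate(iterations):
        if step % K == 0:
            # On each Kth iteration a fresh order is created; in a real
            # system it would be transmitted alongside the payload.
            order = make_order(weights)
        yield decode(encode(weights, order), order)
```

Between refreshes the weights are permuted by a slightly stale order, so they are only approximately sorted; the receiver can still restore them exactly because both sides hold the same order.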
-
Publication number: 20190138902
Abstract: A system and method for an improved convolutional layer in convolutional neural networks is provided. The convolution is performed via a transformation that includes relocating input, relocating convolution filters and performing an aggregate matrix multiply.
Type: Application
Filed: November 6, 2018
Publication date: May 9, 2019
Applicant: Neuralmagic Inc.
Inventors: Alexander Matveev, Nir Shavit
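For readers unfamiliar with the general idea of expressing a convolution as a matrix multiply, here is a minimal sketch of the well-known im2col lowering. It is a swapped-in textbook technique and is not claimed to be the transformation in this application; the function names `im2col` and `conv2d_as_matmul` are hypothetical, and the sketch assumes a single-channel 2-D input, stride 1 and no padding.

```python
def im2col(image, kh, kw):
    """Copy each kh x kw patch of a 2-D image into one row."""
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    return [
        [image[r + i][c + j] for i in range(kh) for j in range(kw)]
        for r in range(rows)
        for c in range(cols)
    ]


def conv2d_as_matmul(image, filters):
    """Cross-correlate `image` with every filter via one matrix multiply.

    Returns one row per output position and one column per filter.
    """
    kh, kw = len(filters[0]), len(filters[0][0])
    patches = im2col(image, kh, kw)                 # relocated input
    flat_filters = [[v for row in f for v in row]   # relocated filters
                    for f in filters]
    # (positions x kh*kw) times (kh*kw x filters)
    return [
        [sum(p * w for p, w in zip(patch, flt)) for flt in flat_filters]
        for patch in patches
    ]
```

Flattening both the patches and the filters lets a single multiply cover all filters at once, which is the usual motivation for this kind of lowering.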
-
Patent number: 10070803
Abstract: Spirometer apparatus comprising main inhale-exhale tube having first end, main interior, and second open end, a plurality of smaller tubes intersecting said main inhale-exhale tube at first and second respective locations and having a plurality of smaller interiors respectively, the first location being closer to the first end than is the second location, wherein each of the smaller interiors are in fluid communication with the main interior solely via at least one aperture formed in each of the intersecting tubes at locations facing said second end, the intersecting tubes having first and second external cross-sections, the main tube having first and second internal cross-sections, wherein said first external cross-section is smaller than said first internal cross-section, said second external cross-section is smaller than said second internal cross-section, and wherein said second external cross-section is smaller than said first external cross-section, and a differential pressure sensor sensing the pressur
Type: Grant
Filed: August 12, 2010
Date of Patent: September 11, 2018
Assignee: LUNGTEK LTD.
Inventor: Nir Shavit
-
Publication number: 20120136271
Abstract: Spirometer apparatus comprising main inhale-exhale tube having first end, main interior, and second open end, a plurality of smaller tubes intersecting said main inhale-exhale tube at first and second respective locations and having a plurality of smaller interiors respectively, the first location being closer to the first end than is the second location, wherein each of the smaller interiors are in fluid communication with the main interior solely via at least one aperture formed in each of the intersecting tubes at locations facing said second end, the intersecting tubes having first and second external cross-sections, the main tube having first and second internal cross-sections, wherein said first external cross-section is smaller than said first internal cross-section, said second external cross-section is smaller than said second internal cross-section, and wherein said second external cross-section is smaller than said first external cross-section, and a differential pressure sensor sensing the pressur
Type: Application
Filed: August 12, 2010
Publication date: May 31, 2012
Applicant: LUNGTEK LTD.
Inventor: Nir Shavit
-
Patent number: 7836228
Abstract: A scalable first-in-first-out queue implementation adjusts to load on a host system. The scalable FIFO queue implementation is lock-free and linearizable, and scales to large numbers of threads. The FIFO queue implementation includes a central queue and an elimination structure for eliminating enqueue-dequeue operation pairs. The elimination mechanism tracks enqueue operations and/or dequeue operations and eliminates without synchronizing on the FIFO queue implementation.
Type: Grant
Filed: October 15, 2004
Date of Patent: November 16, 2010
Assignee: Oracle America, Inc.
Inventors: Mark Moir, Ori Shalev, Nir Shavit
-
Publication number: 20080109608
Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
Type: Application
Filed: September 28, 2007
Publication date: May 8, 2008
Inventors: Nir Shavit, Mark Moir, Victor Luchangco
-
Publication number: 20080077748
Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
Type: Application
Filed: September 28, 2007
Publication date: March 27, 2008
Inventors: Nir Shavit, Mark Moir, Victor Luchangco
-
Publication number: 20080077775
Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
Type: Application
Filed: September 28, 2007
Publication date: March 27, 2008
Inventors: Nir Shavit, Mark Moir, Victor Luchangco
-
Publication number: 20080034166
Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
Type: Application
Filed: September 28, 2007
Publication date: February 7, 2008
Inventors: Nir Shavit, Mark Moir, Victor Luchangco
-
Publication number: 20070198792
Abstract: A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including different types of metadata enabling the processing threads to carry out simultaneous execution depending on a currently selected type of lock mode. A mode controller monitoring the processing threads initiates switching from one type of lock mode to another depending on current operating conditions such as an amount of contention amongst the multiple processing threads to modify the shared data. The mode controller can switch from one lock mode to another regardless of whether any of the multiple processes are in the midst of executing a respective transaction. A most efficient lock mode can be selected to carry out the parallel transactions. In certain cases, switching of lock modes causes one or more of the processing threads to abort and retry a respective transaction according to the new mode.
Type: Application
Filed: June 27, 2006
Publication date: August 23, 2007
Inventors: David Dice, Nir Shavit
-
Publication number: 20070198979
Abstract: For each of multiple processes executing in parallel, as long as corresponding version information associated with a respective set of one or more shared variables used for computational purposes has not changed during execution of a respective transaction, results of the respective transaction can be globally committed to memory without causing data corruption. If version information associated with one or more respective shared variables (used to produce the transaction results) happens to change during a process of generating respective results, then a respective process can identify that another process modified the one or more respective shared variables during execution and that its transaction results should not be committed to memory. In this latter case, the transaction repeats itself until it is able to commit respective results without causing data corruption.
Type: Application
Filed: June 27, 2006
Publication date: August 23, 2007
Inventors: David Dice, Nir Shavit
-
Publication number: 20070198519
Abstract: The present disclosure describes a unique way for each of multiple processes to operate in parallel using (e.g., reading, modifying, and writing to) the same shared data without causing corruption to the shared data. For example, each of multiple processes utilizes current and past data values associated with a global counter or clock for purposes of determining whether any shared variables used to produce a respective transaction outcome were modified (by another process) when executing a respective transaction. If a respective process detects that shared data used by the respective process was modified during a transaction, the process can abort and retry the transaction rather than cause data corruption by storing locally maintained results associated with the transaction to a globally shared data space.
Type: Application
Filed: June 27, 2006
Publication date: August 23, 2007
Inventors: David Dice, Ori Shalev, Nir Shavit
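The validate-then-commit idea in the abstract above (compare a past value of a global clock against per-variable versions, and abort and retry on a mismatch) could look roughly like this. It is a single-threaded simulation under stated assumptions, not the disclosed implementation: the classes `Store` and `Transaction` and the helper `run` are hypothetical names, and a real concurrent version would need locking or atomic operations around the commit step, which this sketch omits.

```python
class Store:
    """Shared variables, each tagged with the clock value of its last write."""

    def __init__(self, **initial):
        self.clock = 0
        self.values = dict(initial)
        self.versions = {name: 0 for name in initial}


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start = store.clock   # past clock value, sampled at begin
        self.reads = set()
        self.writes = {}

    def read(self, name):
        if name in self.writes:    # see our own buffered write first
            return self.writes[name]
        self.reads.add(name)
        return self.store.values[name]

    def write(self, name, value):
        self.writes[name] = value  # kept locally until commit

    def commit(self):
        s = self.store
        # Abort if any variable we read was written after we started.
        if any(s.versions[n] > self.start for n in self.reads):
            return False
        s.clock += 1
        for name, value in self.writes.items():
            s.values[name] = value
            s.versions[name] = s.clock
        return True


def run(store, body):
    """Retry `body` until it commits; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        tx = Transaction(store)
        body(tx)
        if tx.commit():
            return attempts
```

Because writes stay in the transaction's local buffer until `commit` succeeds, an aborted transaction leaves the shared values untouched and can simply be rerun.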
-
Publication number: 20070198978
Abstract: A computer system includes multiple processing threads that execute in parallel. The multiple processing threads have access to a global environment including i) shared data utilized by the multiple processing threads, ii) a globally accessible register or buffer of version information that changes each time a respective one of the multiple processing threads modifies the shared data, and iii) respective lock information indicating whether one of the multiple processing threads has locked the shared data preventing other processing threads from modifying the shared data. To prevent data corruption, each of the processing threads aborts if a given processing thread detects a change in the version information or another processing thread has a lock on the shared data. This technique is well suited for use in applications such as processing threads that support a high number of reads and correspondingly fewer writes to shared data.
Type: Application
Filed: June 27, 2006
Publication date: August 23, 2007
Inventors: David Dice, Nir Shavit
-
Publication number: 20070198781
Abstract: Cache logic associated with a respective one of multiple processing threads executing in parallel updates corresponding data fields of a cache to uniquely mark its contents. The marked contents represent a respective read set for a transaction. For example, at an outset of executing a transaction, a respective processing thread chooses a data value to mark contents of the cache used for producing a transaction outcome for the processing thread. Upon each read of shared data from main memory, the cache stores a copy of the data and marks it as being used during execution of the processing thread. If uniquely marked contents of a respective cache line happen to be displaced (e.g., overwritten) during execution of a processing thread, then the transaction is aborted (rather than being committed to main memory) because there is a possibility that another transaction overwrote a shared data value used during the respective transaction.
Type: Application
Filed: July 18, 2006
Publication date: August 23, 2007
Inventors: David Dice, Nir Shavit
-
Publication number: 20070157202
Abstract: One embodiment of the present invention provides a system that ensures that progress is made in an environment that supports execution of obstruction-free operations. During execution, when a process pi invokes an operation, the system checks a panic flag, which indicates whether a progress-ensuring mechanism is to be activated. If the panic flag is set, the progress-ensuring mechanism is activated, which causes the system to attempt to perform the operation by coordinating actions between processes to ensure that progress is made in spite of contention between the processes. On the other hand, if the panic flag is not set, the system attempts to perform the operation essentially as if the progress-ensuring mechanism were not present. In this case, if there is an indication that contention between processes is impeding progress, the system sets the panic flag, which causes the progress-ensuring mechanism to be activated so that processes will coordinate their actions to ensure that progress is made.
Type: Application
Filed: January 3, 2006
Publication date: July 5, 2007
Inventors: Mark Moir, Victor Luchangco, Nir Shavit
-
Publication number: 20060123156
Abstract: Producer and consumer processes may synchronize and transfer data using a shared data structure. After locating a potential transfer location that indicates an EMPTY status, a producer may store data to be transferred in the transfer location. A producer may use a compare-and-swap (CAS) operation to store the transfer data to the transfer location. A consumer may subsequently read the transfer data from the transfer location and store, such as by using a CAS operation, a DONE status indicator in the transfer location. The producer may notice the DONE indication and may then set the status location back to EMPTY to indicate that the location is available for future transfers, by the same or a different producer. The producer may also monitor the transfer location and time out if no consumer has picked up the transfer data.
Type: Application
Filed: January 4, 2006
Publication date: June 8, 2006
Applicant: Sun Microsystems, Inc.
Inventors: Mark Moir, Daniel Nussbaum, Ori Shalev, Nir Shavit
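The EMPTY / data / DONE handshake in the abstract above can be illustrated with a single transfer location. This is a rough sketch under stated assumptions rather than the applicants' design: Python has no hardware compare-and-swap, so `Slot.cas` emulates one with a lock, and the names `Slot`, `offer`, `take` and the `spins` timeout parameter are all hypothetical.

```python
import threading

# Sentinel status markers; transferred items must not be these objects.
EMPTY, DONE = object(), object()


class Slot:
    """One transfer location; `cas` emulates compare-and-swap with a lock."""

    def __init__(self):
        self._value = EMPTY
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


def offer(slot, item, spins=1000):
    """Producer: publish `item`, wait for DONE, then reset to EMPTY."""
    if not slot.cas(EMPTY, item):
        return False               # busy; a caller would try elsewhere
    for _ in range(spins):
        if slot.get() is DONE:
            slot.cas(DONE, EMPTY)  # free the location for future transfers
            return True
    # Timed out: withdraw the item unless a consumer took it just now.
    if slot.cas(item, EMPTY):
        return False
    slot.cas(DONE, EMPTY)
    return True


def take(slot):
    """Consumer: claim the item by swapping in DONE; None if nothing is there."""
    item = slot.get()
    if item is EMPTY or item is DONE:
        return None
    return item if slot.cas(item, DONE) else None
```

The withdrawal on timeout uses a CAS rather than a plain store, so a consumer that claims the item at the last moment is still detected and the transfer counts as successful.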
-
Publication number: 20050132374
Abstract: A multiprocessor, multi-program, stop-the-world garbage collection program is described. The system initially over-partitions the root sources, and then iteratively employs static and dynamic work balancing. Garbage collection threads compete dynamically for the initial partitions. Work-stealing double-ended queues, where contention is reduced, are described to provide dynamic load balancing among the threads. Contention is resolved by using atomic instructions. The heap is broken into a young and an old generation, where parallel semi-space copying is used to collect the young generation and parallel mark-compacting the old generation. Speed and efficiency of collection is enhanced by use of card tables and linking objects, and overflow conditions are efficiently handled by linking using class pointers. A garbage collection termination employs a global status word.
Type: Application
Filed: November 23, 2004
Publication date: June 16, 2005
Applicant: Sun Microsystems, Inc.
Inventors: Christine Flood, David Detlefs, Nir Shavit, Xiaolan Zhang, Ole Agesen