Patents by Inventor David Nellans
David Nellans has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12130750
Abstract: Computer systems often employ virtual address translation hierarchies in which virtual memory addresses are mapped to physical memory. Use of the virtual address translation hierarchy speeds up the virtual address translation when the required mapping is stored in one of the higher levels of the hierarchy. To reduce a number of misses occurring in the virtual address translation hierarchy, huge memory pages may be selectively employed, which map larger continuous regions of virtual memory to continuous regions of physical memory, thereby increasing the coverage of each entry in the virtual address translation hierarchy. The present disclosure provides hardware support for optimizing this huge memory page selection.
Type: Grant
Filed: March 6, 2023
Date of Patent: October 29, 2024
Assignee: NVIDIA CORPORATION
Inventors: Aninda Manocha, Zi Yan, David Nellans
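The coverage claim in this abstract can be illustrated with simple arithmetic. The sketch below is not taken from the patent: the entry count (64) and the page sizes (4 KiB base pages, 2 MiB huge pages) are assumed values chosen only to show how the memory reachable through a fixed number of translation entries grows with page size.

```python
KIB = 1024
MIB = 1024 * KIB


def translation_reach(entries: int, page_size: int) -> int:
    """Bytes of virtual memory covered by `entries` translation entries."""
    return entries * page_size


# Hypothetical figures, not from the patent: 64 entries, 4 KiB vs 2 MiB pages.
base_reach = translation_reach(64, 4 * KIB)
huge_reach = translation_reach(64, 2 * MIB)

print(f"base pages: {base_reach // KIB} KiB covered")
print(f"huge pages: {huge_reach // MIB} MiB covered")
print(f"coverage ratio: {huge_reach // base_reach}x")
```

With the same number of entries, each huge-page mapping covers 512 times as much memory as a base-page mapping under these assumed sizes.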
-
Publication number: 20240303201
Abstract: Computer systems often employ virtual address translation hierarchies in which virtual memory addresses are mapped to physical memory. Use of the virtual address translation hierarchy speeds up the virtual address translation when the required mapping is stored in one of the higher levels of the hierarchy. To reduce a number of misses occurring in the virtual address translation hierarchy, huge memory pages may be selectively employed, which map larger continuous regions of virtual memory to continuous regions of physical memory, thereby increasing the coverage of each entry in the virtual address translation hierarchy. The present disclosure provides hardware support for optimizing this huge memory page selection.
Type: Application
Filed: March 6, 2023
Publication date: September 12, 2024
Inventors: Aninda Manocha, Zi Yan, David Nellans
-
Patent number: 11880261
Abstract: A system, method, and apparatus of power management for computing systems are included herein that optimize individual frequencies of components of the computing systems using machine learning. The computing systems can be tightly integrated systems that consider an overall operating budget that is shared between the components of the computing system while adjusting the frequencies of the individual components. An example of an automated method of power management includes: (1) learning, using a power management (PM) agent, frequency settings for different components of a computing system during execution of a repetitive application, and (2) adjusting the frequency settings of the different components using the PM agent, wherein the adjusting is based on the repetitive application and one or more limitations corresponding to a shared operating budget for the computing system.
Type: Grant
Filed: March 31, 2022
Date of Patent: January 23, 2024
Assignee: NVIDIA Corporation
Inventors: Evgeny Bolotin, Yaosheng Fu, Zi Yan, Gal Dalal, Shie Mannor, David Nellans
-
Publication number: 20230137205
Abstract: Introduced herein is a technique that uses ML to autonomously find a cache management policy that achieves an optimal execution of a given workload of an application. Leveraging ML such as reinforcement learning, the technique trains an agent in an ML environment over multiple episodes of a stabilization process. For each time step in these training episodes, the agent executes the application while making an incremental change to the current policy, i.e., cache-residency statuses of memory address space associated with the workload, until the application can be executed at a stable level. The stable level of execution, for example, can be indicated by performance variations, such as standard deviations, between a certain number of neighboring measurement periods remaining within a certain threshold. The agent, who has been trained in the training episodes, infers the final cache management policy during the final, inferring episode.
Type: Application
Filed: October 29, 2021
Publication date: May 4, 2023
Inventors: Yaosheng Fu, Shie Mannor, Evgeny Bolotin, David Nellans, Gal Dalal
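The stability criterion mentioned in this abstract (standard deviation across neighboring measurement periods staying within a threshold) might be checked along the following lines. This is only a sketch: the function name `is_stable`, the window of 4 periods, and the 0.05 threshold are assumptions for illustration and do not come from the application.

```python
from statistics import pstdev


def is_stable(measurements, window=4, threshold=0.05):
    """Return True when the last `window` measurements vary within `threshold`.

    `measurements` is a sequence of per-period performance numbers
    (for example, normalized execution times). The window size and
    threshold are hypothetical defaults, not values from the application.
    """
    if len(measurements) < window:
        return False  # not enough neighboring periods to judge stability
    recent = measurements[-window:]
    return pstdev(recent) <= threshold


# Early periods fluctuate, later ones settle down.
history = [5.0, 1.00, 1.01, 0.99, 1.00]
print(is_stable(history))
print(is_stable([1.0, 2.0, 1.0, 2.0]))
```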
-
Patent number: 11625279
Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.
Type: Grant
Filed: February 11, 2020
Date of Patent: April 11, 2023
Assignee: NVIDIA CORPORATION
Inventors: Daniel Lustig, Oreste Villa, David Nellans
-
Patent number: 11609879
Abstract: In various embodiments, a parallel processor includes a parallel processor module implemented within a first die and a memory system module implemented within a second die. The memory system module is coupled to the parallel processor module via an on-package link. The parallel processor module includes multiple processor cores and multiple cache memories. The memory system module includes a memory controller for accessing a DRAM. Advantageously, the performance of the parallel processor module can be effectively tailored for memory bandwidth demands that typify one or more application domains via the memory system module.
Type: Grant
Filed: July 1, 2021
Date of Patent: March 21, 2023
Assignee: NVIDIA Corporation
Inventors: Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, Stephen William Keckler, David Nellans
-
Publication number: 20230079978
Abstract: A system, method, and apparatus of power management for computing systems are included herein that optimize individual frequencies of components of the computing systems using machine learning. The computing systems can be tightly integrated systems that consider an overall operating budget that is shared between the components of the computing system while adjusting the frequencies of the individual components. An example of an automated method of power management includes: (1) learning, using a power management (PM) agent, frequency settings for different components of a computing system during execution of a repetitive application, and (2) adjusting the frequency settings of the different components using the PM agent, wherein the adjusting is based on the repetitive application and one or more limitations corresponding to a shared operating budget for the computing system.
Type: Application
Filed: March 31, 2022
Publication date: March 16, 2023
Inventors: Evgeny Bolotin, Yaosheng Fu, Zi Yan, Gal Dalal, Shie Mannor, David Nellans
-
Publication number: 20220276984
Abstract: In various embodiments, a parallel processor includes a parallel processor module implemented within a first die and a memory system module implemented within a second die. The memory system module is coupled to the parallel processor module via an on-package link. The parallel processor module includes multiple processor cores and multiple cache memories. The memory system module includes a memory controller for accessing a DRAM. Advantageously, the performance of the parallel processor module can be effectively tailored for memory bandwidth demands that typify one or more application domains via the memory system module.
Type: Application
Filed: July 1, 2021
Publication date: September 1, 2022
Inventors: Yaosheng FU, Evgeny BOLOTIN, Niladrish CHATTERJEE, Stephen William KECKLER, David NELLANS
-
Publication number: 20210248014
Abstract: In general, an application executes on a compute unit, such as a central processing unit (CPU) or graphics processing unit (GPU), to perform some function(s). In some circumstances, improved performance of an application, such as a graphics application, may be provided by executing the application across multiple compute units. However, when using multiple compute units in this manner, synchronization must be provided between the compute units. Synchronization, including the sharing of the data, is typically accomplished through memory. While a shared memory may cause bottlenecks, employing local memory for each compute unit may itself require synchronization (coherence) which can be costly in terms of resources, delay, etc. The present disclosure provides read-write page replication for multiple compute units that avoids the traditional challenges associated with coherence.
Type: Application
Filed: February 11, 2020
Publication date: August 12, 2021
Inventors: Daniel Lustig, Oreste Villa, David Nellans
-
Patent number: 10489295
Abstract: A system includes a data store and a memory cache subsystem. A method for pre-fetching data from the data store for the cache includes determining a performance characteristic of a data store. The method also includes identifying a pre-fetch policy configured to utilize the determined performance characteristic of the data store. The method also includes pre-fetching data stored in the data store by copying data from the data store to the cache according to the pre-fetch policy identified to utilize the determined performance characteristic of the data store.
Type: Grant
Filed: March 14, 2013
Date of Patent: November 26, 2019
Assignee: SANDISK TECHNOLOGIES LLC
Inventors: David Nellans, Torben Mathiasen, David Flynn, Nisha Talagala
-
Patent number: 10318324
Abstract: Techniques are disclosed relating to enabling virtual machines to access data on a physical recording medium. In one embodiment, a computing system provides a logical address space for a storage device to an allocation agent that is executable to allocate the logical address space to a plurality of virtual machines having access to the storage device. In such an embodiment, the logical address space is larger than a physical address space of the storage device. The computing system may then process a storage request from one of the plurality of virtual machines. In some embodiments, the allocation agent is a hypervisor executing on the computing system. In some embodiments, the computing system tracks utilizations of the storage device by the plurality of virtual machines, and based on the utilizations, enforces a quality of service level associated with one or more of the plurality of virtual machines.
Type: Grant
Filed: July 13, 2017
Date of Patent: June 11, 2019
Assignee: SANDISK TECHNOLOGIES LLC
Inventors: Neil Carson, Nisha Talagala, Mark Brinicombe, Robert Wipfel, Anirudh Badam, David Nellans
-
Publication number: 20190073296
Abstract: Data is stored on a non-volatile storage media in a sequential, log-based format. The formatted data defines an ordered sequence of storage operations performed on the non-volatile storage media. A storage layer maintains volatile metadata, which may include a forward index associating logical identifiers with respective physical storage units on the non-volatile storage media. The volatile metadata may be reconstructed from the ordered sequence of storage operations. Persistent notes may be used to maintain consistency between the volatile metadata and the contents of the non-volatile storage media. Persistent notes may identify data that does not need to be retained on the non-volatile storage media and/or is no longer valid.
Type: Application
Filed: November 1, 2018
Publication date: March 7, 2019
Inventors: David Atkisson, David Nellans, David Flynn, Jens Axboe, Michael Zappe
-
Patent number: 10133663
Abstract: Data is stored on a non-volatile storage media in a sequential, log-based format. The formatted data defines an ordered sequence of storage operations performed on the non-volatile storage media. A storage layer maintains volatile metadata, which may include a forward index associating logical identifiers with respective physical storage units on the non-volatile storage media. The volatile metadata may be reconstructed from the ordered sequence of storage operations. Persistent notes may be used to maintain consistency between the volatile metadata and the contents of the non-volatile storage media. Persistent notes may identify data that does not need to be retained on the non-volatile storage media and/or is no longer valid.
Type: Grant
Filed: October 3, 2013
Date of Patent: November 20, 2018
Assignee: Longitude Enterprise Flash S.A.R.L.
Inventors: David Atkisson, David Nellans, David Flynn, Jens Axboe, Michael Zappe
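One way to picture the reconstruction step described in this abstract is to replay an ordered log and rebuild the forward index from it. The sketch below is a simplified illustration, not the patented design: the `LogEntry` record, the `"write"` and `"note"` operation names, and the use of a plain dictionary as the forward index are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    seq: int                              # position in the ordered log
    op: str                               # "write" or "note" (hypothetical names)
    logical_id: int
    physical_unit: Optional[int] = None   # only meaningful for writes


def rebuild_forward_index(log):
    """Replay the log in sequence order to rebuild logical -> physical mappings."""
    index = {}
    for entry in sorted(log, key=lambda e: e.seq):
        if entry.op == "write":
            # A later write to the same logical identifier supersedes earlier ones.
            index[entry.logical_id] = entry.physical_unit
        elif entry.op == "note":
            # A persistent note marks the data as no longer valid.
            index.pop(entry.logical_id, None)
        else:
            raise ValueError(f"unknown log operation: {entry.op}")
    return index


log = [
    LogEntry(1, "write", logical_id=10, physical_unit=100),
    LogEntry(2, "write", logical_id=11, physical_unit=101),
    LogEntry(3, "write", logical_id=10, physical_unit=102),  # overwrite of 10
    LogEntry(4, "note", logical_id=11),                      # 11 invalidated
]
print(rebuild_forward_index(log))
```

Because the note is recorded in the log itself, the replay does not resurrect the invalidated mapping, which is the consistency role the abstract assigns to persistent notes.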
-
Patent number: 10102075
Abstract: A storage layer of a non-volatile storage device may be configured to provide key-value storage services. Key conflicts may be resolved by modifying the logical interface of data stored on the non-volatile storage device. Resolving a key conflict may comprise identifying an alternative key and implementing one or more range move operations configured to bind the stored data to the alternative key. The move operations may be implemented without relocating the data on the non-volatile storage device.
Type: Grant
Filed: March 24, 2016
Date of Patent: October 16, 2018
Assignee: SanDisk Technologies LLC
Inventors: Nisha Talagala, David Flynn, Swaminathan Sundararaman, Sriram Subramanian, David Nellans, Robert Wipfel, John Strasser
-
Patent number: 10013354
Abstract: A storage layer (SL) for a non-volatile storage device presents a logical address space of a non-volatile storage device to storage clients. Storage metadata assigns logical identifiers in the logical address space to physical storage locations on the non-volatile storage device. Data is stored on the non-volatile storage device in a sequential log-based format. Data on the non-volatile storage device comprises an event log of the storage operations performed on the non-volatile storage device. The SL presents an interface for requesting atomic storage operations. Previous versions of data overwritten by the atomic storage device are maintained until the atomic storage operation is successfully completed. Data pertaining to a failed atomic storage operation may be identified using a persistent metadata flag stored with the data on the non-volatile storage device. Data pertaining to failed or incomplete atomic storage requests may be invalidated and removed from the non-volatile storage device.
Type: Grant
Filed: July 28, 2011
Date of Patent: July 3, 2018
Assignee: SANDISK TECHNOLOGIES LLC
Inventors: David Flynn, Stephan Uphoff, Xiangyong Ouyang, David Nellans, Robert Wipfel
-
Patent number: 9983993
Abstract: An apparatus, system, and method are disclosed for implementing conditional storage operations. Storage clients access and allocate portions of an address space of a non-volatile storage device. A conditional storage request is provided, which causes data to be stored to the non-volatile storage device on the condition that the address space of the device can satisfy the entire request. If only a portion of the request can be satisfied, the conditional storage request may be deferred or fail. An atomic storage request is provided, which may comprise one or more storage operations. The atomic storage request succeeds if all of the one or more storage operations are complete successfully. If one or more of the storage operations fails, the atomic storage request is invalidated, which may comprise deallocating logical identifiers of the request and/or invalidating data on the non-volatile storage device pertaining to the request.
Type: Grant
Filed: January 13, 2016
Date of Patent: May 29, 2018
Assignee: SanDisk Technologies LLC
Inventors: David Flynn, David Nellans, Xiangyong Ouyang
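The all-or-nothing behavior of a conditional storage request can be sketched with a toy in-memory device. Everything here is hypothetical: the `ToyDevice` class, its fixed `capacity` counted in storage units, and the choice to fail (rather than defer) a request that cannot be fully satisfied are assumptions made for illustration only.

```python
class ToyDevice:
    """In-memory stand-in for a storage device with a bounded address space."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # number of logical identifiers available
        self.blocks = {}           # logical identifier -> stored data

    def free_units(self) -> int:
        return self.capacity - len(self.blocks)

    def conditional_store(self, items: dict) -> bool:
        """Store every item, or none of them if the whole request cannot fit."""
        # Overwrites of existing identifiers consume no additional space.
        new_ids = [key for key in items if key not in self.blocks]
        if len(new_ids) > self.free_units():
            return False           # only part could be satisfied: fail the request
        self.blocks.update(items)
        return True


device = ToyDevice(capacity=2)
print(device.conditional_store({1: b"a"}))             # fits
print(device.conditional_store({2: b"b", 3: b"c"}))    # needs 2 units, 1 free
print(device.blocks)                                    # unchanged by the failure
```

The capacity check happens before any data is written, so a rejected request leaves the device exactly as it was.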
-
Patent number: 9842128
Abstract: An atomic storage module may be configured to implement atomic storage operation directed to a first set of identifiers in reference to a second, different set of identifiers. In response to completing the atomic storage operation, the atomic storage module may move the corresponding data to the first, target set of identifiers. The move operation may comprise modifying a logical interface of the data. The move operation may further include storing persistent metadata configured to bind the data to the first set of identifiers.
Type: Grant
Filed: June 24, 2014
Date of Patent: December 12, 2017
Assignee: SanDisk Technologies LLC
Inventors: Nisha Talagala, David Flynn, Swaminathan Sundararaman, Sriram Subramanian, David Nellans, Robert Wipfel, John Strasser
-
Publication number: 20170315832
Abstract: Techniques are disclosed relating to enabling virtual machines to access data on a physical recording medium. In one embodiment, a computing system provides a logical address space for a storage device to an allocation agent that is executable to allocate the logical address space to a plurality of virtual machines having access to the storage device. In such an embodiment, the logical address space is larger than a physical address space of the storage device. The computing system may then process a storage request from one of the plurality of virtual machines. In some embodiments, the allocation agent is a hypervisor executing on the computing system. In some embodiments, the computing system tracks utilizations of the storage device by the plurality of virtual machines, and based on the utilizations, enforces a quality of service level associated with one or more of the plurality of virtual machines.
Type: Application
Filed: July 13, 2017
Publication date: November 2, 2017
Inventors: Neil Carson, Nisha Talagala, Mark Brinicombe, Robert Wipfel, Anirudh Badam, David Nellans
-
Patent number: 9720717
Abstract: Techniques are disclosed relating to enabling virtual machines to access data on a physical recording medium. In one embodiment, a computing system provides a logical address space for a storage device to an allocation agent that is executable to allocate the logical address space to a plurality of virtual machines having access to the storage device. In such an embodiment, the logical address space is larger than a physical address space of the storage device. The computing system may then process a storage request from one of the plurality of virtual machines. In some embodiments, the allocation agent is a hypervisor executing on the computing system. In some embodiments, the computing system tracks utilizations of the storage device by the plurality of virtual machines, and based on the utilizations, enforces a quality of service level associated with one or more of the plurality of virtual machines.
Type: Grant
Filed: March 14, 2013
Date of Patent: August 1, 2017
Assignee: SanDisk Technologies LLC
Inventors: Neil Carson, Nisha Talagala, Mark Brinicombe, Robert Wipfel, Anirudh Badam, David Nellans
-
Patent number: 9690694
Abstract: An apparatus, system, and method are disclosed for storage address translation. The method includes storing, in volatile memory, a plurality of logical-to-physical mapping entries for a non-volatile recording device. The method includes persisting a logical-to-physical mapping entry from the volatile memory to recording media of the non-volatile recording device. The logical-to-physical mapping entry may be selected for persisting based on a mapping policy indicated by a client. The method includes loading the logical-to-physical mapping entry from the recording media of the non-volatile recording device into the volatile memory in response to a storage request associated with the logical-to-physical mapping entry.
Type: Grant
Filed: September 27, 2012
Date of Patent: June 27, 2017
Assignee: SanDisk Technologies, LLC
Inventors: David Nellans, Jens Axboe, Nick Piggin
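The three steps named in this abstract (keep mapping entries in volatile memory, persist selected entries, reload them on demand) can be sketched as follows. This is an illustrative model only: the `MappingTable` class, the client-supplied predicate standing in for the mapping policy, and the decision to drop an entry from volatile memory once it is persisted are assumptions, not details from the patent.

```python
class MappingTable:
    """Toy logical-to-physical table split between memory and media."""

    def __init__(self):
        self.volatile = {}    # entries held in volatile memory
        self.persisted = {}   # stands in for the device's recording media

    def map(self, logical: int, physical: int) -> None:
        self.volatile[logical] = physical

    def persist_where(self, policy) -> None:
        """Persist every in-memory entry the client's policy selects.

        `policy` is a hypothetical callable taking a logical address and
        returning True when that entry should be moved to the media.
        """
        for logical in list(self.volatile):   # copy keys: we mutate the dict
            if policy(logical):
                self.persisted[logical] = self.volatile.pop(logical)

    def lookup(self, logical: int) -> int:
        """Translate an address, loading its entry from media if needed."""
        if logical not in self.volatile:
            self.volatile[logical] = self.persisted[logical]
        return self.volatile[logical]


table = MappingTable()
table.map(1, 100)
table.map(2, 200)
table.persist_where(lambda logical: logical == 2)   # client-chosen policy
print(sorted(table.volatile))    # entry 2 is no longer in memory
print(table.lookup(2))           # request for 2 loads it back
print(sorted(table.volatile))
```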