Patents by Inventor Brian Fahs

Brian Fahs has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954518
    Abstract: Apparatuses, systems, and techniques to optimize processor resources at a user-defined level. In at least one embodiment, the priority of one or more tasks is adjusted to prevent one or more other dependent tasks from entering an idle state due to a lack of resources to consume.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: April 9, 2024
    Assignee: NVIDIA Corporation
    Inventors: Jonathon Evans, Lacky Shah, Phil Johnson, Jonah Alben, Brian Pharris, Greg Palmer, Brian Fahs
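    The following host-side sketch illustrates the kind of priority adjustment the abstract describes: when a dependent (consumer) task would sit idle because its producer has not finished, the producer inherits the consumer's priority. It is a minimal model under assumed structures (`Task`, `boost_starved_producers`), not the patented mechanism.
    ```cuda
    // Minimal host-side model of boosting a producer task whose dependent
    // consumer would otherwise idle. The structures and the boosting rule
    // are invented for illustration.
    #include <vector>
    #include <cstdio>

    struct Task {
        int  id;
        int  priority;   // larger value = scheduled sooner
        int  producer;   // id of the task this one depends on, or -1
        bool ready;      // true once the producer's output is available
    };

    // Raise a producer's priority when its consumer is blocked waiting on it,
    // so the consumer is not starved of input.
    void boost_starved_producers(std::vector<Task>& tasks) {
        for (const Task& consumer : tasks) {
            if (consumer.producer < 0 || consumer.ready) continue;
            Task& producer = tasks[consumer.producer];   // task ids double as indices here
            if (producer.priority < consumer.priority)
                producer.priority = consumer.priority;   // inherit the consumer's priority
        }
    }

    int main() {
        std::vector<Task> tasks = {
            {0, /*priority=*/1, /*producer=*/-1, false},  // slow producer
            {1, /*priority=*/5, /*producer=*/ 0, false},  // high-priority consumer
        };
        boost_starved_producers(tasks);
        std::printf("producer priority after boost: %d\n", tasks[0].priority);  // prints 5
        return 0;
    }
    ```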
  • Patent number: 11210253
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: December 28, 2021
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
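    As a rough illustration of the access-counter cache described in the abstract, the sketch below increments a page's counter on a hit, allocates an unused entry on a miss, and recycles a valid entry when none are free. The entry layout, cache size, and the recycle-slot-0 choice are assumptions for illustration, not the patented hardware.
    ```cuda
    // Host-side sketch of the access-counter cache behavior: hit, allocate,
    // or evict. All sizes and names are invented.
    #include <cstdint>
    #include <cstdio>

    constexpr int kEntries = 4;

    struct TrackEntry {
        bool     valid;
        uint64_t page;     // tracked memory page number
        uint32_t counter;  // accesses observed for this page
    };

    void record_access(TrackEntry (&cache)[kEntries], uint64_t page) {
        // 1) Hit: increment the existing counter.
        for (TrackEntry& e : cache)
            if (e.valid && e.page == page) { e.counter++; return; }
        // 2) Miss with an unused slot available: claim it and start counting.
        for (TrackEntry& e : cache)
            if (!e.valid) { e = {true, page, 1}; return; }
        // 3) Miss with no free slot: recycle a valid entry (slot 0 here;
        //    a real unit would apply a replacement policy).
        cache[0] = {true, page, 1};
    }

    int main() {
        TrackEntry cache[kEntries] = {};                  // all entries start invalid
        for (uint64_t p : {7, 7, 9, 7}) record_access(cache, p);
        std::printf("page 7 access counter: %u\n", cache[0].counter);  // prints 3
        return 0;
    }
    ```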
  • Publication number: 20210191754
    Abstract: Apparatuses, systems, and techniques to optimize processor resources at a user-defined level. In at least one embodiment, the priority of one or more tasks is adjusted to prevent one or more other dependent tasks from entering an idle state due to a lack of resources to consume.
    Type: Application
    Filed: December 20, 2019
    Publication date: June 24, 2021
    Inventors: Jonathon Evans, Lacky Shah, Phil Johnson, Jonah Alben, Brian Pharris, Greg Palmer, Brian Fahs
  • Publication number: 20200334076
    Abstract: An application binary interface (ABI) can be exposed in a processor to enable blocks of threads, which may correspond to separately compiled operators, to communicate without storing data to global memory external to the processor. The ABI can define how the results of one computation, corresponding to a first thread block, will be organized in the registers and shared memory of a processor at the end of one operator (i.e., kernel). The start of the next operator (i.e., kernel), corresponding to a second thread block, can consume the results from the registers and shared memory. Data can be stored to processor local storage for individual threads as they exit the block. Once the ABI is published, libraries can be separately compiled, optimized, and tested as long as they adhere to it.
    Type: Application
    Filed: April 19, 2019
    Publication date: October 22, 2020
    Inventors: Brian Fahs, Michael Lightstone, Mostafa Hagog
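    The sketch below illustrates the general idea of two separately written operators exchanging results through a thread block's shared memory rather than through global memory. The `StageResult` layout stands in for a published ABI; the functions and layout are invented for illustration and are not the interface described in the application.
    ```cuda
    // Two device "operators" hand results from one to the next through a
    // shared-memory buffer inside a single thread block.
    #include <cstdio>

    struct StageResult {          // the agreed handoff layout between operators
        float value;
    };

    __device__ void operator_scale(float x, StageResult* out) {          // "operator" 1
        out[threadIdx.x].value = 2.0f * x;
    }

    __device__ void operator_offset(const StageResult* in, float* out) { // "operator" 2
        out[threadIdx.x] = in[threadIdx.x].value + 1.0f;
    }

    __global__ void fused(const float* in, float* out) {
        __shared__ StageResult handoff[256];        // shared-memory handoff buffer
        operator_scale(in[threadIdx.x], handoff);   // producer writes its results
        __syncthreads();                            // results visible block-wide
        operator_offset(handoff, out);              // consumer reads them directly
    }

    int main() {
        const int n = 256;
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = float(i);
        fused<<<1, n>>>(in, out);
        cudaDeviceSynchronize();
        std::printf("out[3] = %f\n", out[3]);       // 2*3 + 1 = 7
        cudaFree(in); cudaFree(out);
        return 0;
    }
    ```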
  • Publication number: 20190340145
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Application
    Filed: June 24, 2019
    Publication date: November 7, 2019
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
  • Patent number: 10409730
    Abstract: One embodiment of the present invention includes a microcontroller coupled to a memory management unit (MMU). The MMU is coupled to a page table included in a physical memory, and the microcontroller is configured to perform one or more virtual memory operations associated with the physical memory and the page table. In operation, the microcontroller receives a page fault generated by the MMU in response to an invalid memory access via a virtual memory address. To remedy such a page fault, the microcontroller performs actions to map the virtual memory address to an appropriate location in the physical memory. By contrast, in prior-art systems, a fault handler would typically remedy the page fault. Advantageously, because the microcontroller executes these tasks locally with respect to the MMU and the physical memory, latency associated with remedying page faults may be decreased. Consequently, overall system performance may be increased.
    Type: Grant
    Filed: August 27, 2013
    Date of Patent: September 10, 2019
    Assignee: NVIDIA Corporation
    Inventors: Cameron Buschardt, Jerome F. Duluk, Jr., John Mashey, Mark Hairgrove, James Leroy Deming, Brian Fahs
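    A minimal host-side model of the fault path described above: the MMU reports a faulting virtual address, and a handler local to the MMU maps the page and updates the page table so that no central fault handler needs to run. The `PageTable` structure, page size, and frame-allocation policy are assumptions, not the patented design.
    ```cuda
    // Stand-in for the on-chip microcontroller's page fault service routine.
    #include <cstdint>
    #include <unordered_map>
    #include <cstdio>

    constexpr uint64_t kPageShift = 12;                   // 4 KiB pages (assumed)

    struct PageTable {
        std::unordered_map<uint64_t, uint64_t> vpn_to_pfn;  // virtual -> physical page
    };

    // Map the faulting virtual page to a free physical frame, locally to the MMU.
    void service_page_fault(PageTable& pt, uint64_t fault_va, uint64_t& next_free_pfn) {
        uint64_t vpn = fault_va >> kPageShift;
        if (pt.vpn_to_pfn.count(vpn)) return;             // already mapped (spurious fault)
        pt.vpn_to_pfn[vpn] = next_free_pfn++;             // back the page with a free frame
    }

    int main() {
        PageTable pt;
        uint64_t next_free_pfn = 100;
        uint64_t fault_va = 0x7fff0000;                   // address reported by the MMU
        service_page_fault(pt, fault_va, next_free_pfn);
        std::printf("vpn %llu -> pfn %llu\n",
                    (unsigned long long)(fault_va >> kPageShift),
                    (unsigned long long)pt.vpn_to_pfn[fault_va >> kPageShift]);
        return 0;
    }
    ```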
  • Patent number: 10331603
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Grant
    Filed: April 9, 2018
    Date of Patent: June 25, 2019
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
  • Patent number: 10310973
    Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
    Type: Grant
    Filed: October 25, 2012
    Date of Patent: June 4, 2019
    Assignee: NVIDIA Corporation
    Inventors: Nick Barrow-Williams, Brian Fahs, Jerome F. Duluk, Jr., James Leroy Deming, Timothy John Purcell, Lucien Dunning, Mark Hairgrove
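    The sketch below models translation in which the ASID both tags TLB entries and selects a per-task page table, so the same virtual page can map to different physical pages for different tasks. The container choices, field widths, and miss handling are invented for illustration.
    ```cuda
    // (ASID, virtual page) -> physical page translation with a tiny ASID-tagged TLB.
    #include <cstdint>
    #include <map>
    #include <cstdio>

    using Asid = uint16_t;
    using Vpn  = uint64_t;
    using Pfn  = uint64_t;

    struct TlbKey {
        Asid asid;
        Vpn  vpn;
        bool operator<(const TlbKey& o) const {
            return asid != o.asid ? asid < o.asid : vpn < o.vpn;
        }
    };

    std::map<TlbKey, Pfn>              tlb;          // entries tagged with ASID + virtual page
    std::map<Asid, std::map<Vpn, Pfn>> page_tables;  // one page table per ASID

    bool translate(Asid asid, Vpn vpn, Pfn* out) {
        auto hit = tlb.find({asid, vpn});
        if (hit != tlb.end()) { *out = hit->second; return true; }   // TLB hit
        auto& pt = page_tables[asid];                                // ASID selects its page table
        auto walk = pt.find(vpn);
        if (walk == pt.end()) return false;                          // would page-fault
        tlb[{asid, vpn}] = walk->second;                             // refill the TLB
        *out = walk->second;
        return true;
    }

    int main() {
        page_tables[1][0x10] = 0xAAA;   // same virtual page, different tasks,
        page_tables[2][0x10] = 0xBBB;   // different physical pages
        Pfn p;
        translate(1, 0x10, &p); std::printf("ASID 1: 0x%llx\n", (unsigned long long)p);
        translate(2, 0x10, &p); std::printf("ASID 2: 0x%llx\n", (unsigned long long)p);
        return 0;
    }
    ```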
  • Patent number: 10216413
    Abstract: Techniques are provided by which memory pages may be migrated among PPU memories in a multi-PPU system. According to the techniques, a UVM driver determines that a particular memory page should change ownership state and/or be migrated between one PPU memory and another PPU memory. In response to this determination, the UVM driver initiates a peer transition sequence to cause the ownership state and/or location of the memory page to change. Various peer transition sequences involve modifying mappings for one or more PPUs, and copying a memory page from one PPU memory to another PPU memory. Several steps in peer transition sequences may be performed in parallel for increased processing speed.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: February 26, 2019
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., John Mashey, Mark Hairgrove, Chenghuan Jia, Cameron Buschardt, Lucien Dunning, Brian Fahs
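    A simplified host-side model of a peer transition sequence: the source mapping is invalidated, the page contents are copied between PPU memories, and ownership is installed on the destination. The ownership states and step ordering here are assumptions, not the UVM driver's actual sequence.
    ```cuda
    // Migrate a page's backing store and ownership from one PPU memory to another.
    #include <cstdint>
    #include <cstring>
    #include <cstdio>

    enum class Owner { Ppu0, Ppu1, CpuShared };

    struct PageState {
        Owner   owner;
        uint8_t data[4096];   // stand-in for the page's backing storage
    };

    void migrate_page(PageState& src, PageState& dst, Owner new_owner) {
        src.owner = Owner::CpuShared;                    // 1) invalidate the old mapping
        std::memcpy(dst.data, src.data, sizeof(src.data));  // 2) copy between PPU memories
        dst.owner = new_owner;                           // 3) install the new ownership/mapping
    }

    int main() {
        PageState on_ppu0{Owner::Ppu0, {42}};
        PageState on_ppu1{Owner::Ppu1, {0}};
        migrate_page(on_ppu0, on_ppu1, Owner::Ppu1);
        std::printf("first byte after migration: %d\n", on_ppu1.data[0]);  // prints 42
        return 0;
    }
    ```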
  • Patent number: 10169091
    Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
    Type: Grant
    Filed: October 25, 2012
    Date of Patent: January 1, 2019
    Assignee: NVIDIA Corporation
    Inventors: Nick Barrow-Williams, Brian Fahs, Jerome F. Duluk, Jr., James Leroy Deming, Timothy John Purcell, Lucien Dunning, Mark Hairgrove
  • Patent number: 10152328
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.
    Type: Grant
    Filed: May 31, 2012
    Date of Patent: December 11, 2018
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke
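    CUDA exposes warp-level vote intrinsics (`__all_sync`, `__any_sync`, `__ballot_sync`) that reflect this kind of hardware vote. The kernel below is an invented example showing each thread posting a predicate and every thread receiving the combined result without shared-memory traffic; it is illustrative and not code from the patent.
    ```cuda
    // Each thread in a warp posts a predicate; the vote intrinsics return the
    // combined result to every thread.
    #include <cstdio>

    __global__ void vote_demo(const int* values, int threshold, int* results) {
        const unsigned full_warp = 0xffffffffu;
        int pred = values[threadIdx.x] > threshold;          // this thread's vote

        int all_passed  = __all_sync(full_warp, pred);       // 1 if every thread voted yes
        int any_passed  = __any_sync(full_warp, pred);       // 1 if at least one voted yes
        unsigned ballot = __ballot_sync(full_warp, pred);    // one bit per lane

        if (threadIdx.x == 0) {
            results[0] = all_passed;
            results[1] = any_passed;
            results[2] = (int)__popc(ballot);                // how many threads voted yes
        }
    }

    int main() {
        const int n = 32;                                    // exactly one warp
        int *values, *results;
        cudaMallocManaged(&values, n * sizeof(int));
        cudaMallocManaged(&results, 3 * sizeof(int));
        for (int i = 0; i < n; ++i) values[i] = i;
        vote_demo<<<1, n>>>(values, /*threshold=*/15, results);
        cudaDeviceSynchronize();
        std::printf("all=%d any=%d count=%d\n", results[0], results[1], results[2]);
        cudaFree(values); cudaFree(results);
        return 0;
    }
    ```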
  • Patent number: 10133677
    Abstract: Techniques are disclosed for transitioning a memory page between memories in a virtual memory subsystem. A unified virtual memory (UVM) driver detects a page fault in response to a memory access request associated with a first memory page, where a local page table does not include an entry corresponding to a virtual memory address included in the memory access request. The UVM driver, in response to the page fault, executes a page fault sequence. The page fault sequence includes modifying the ownership state associated with the first memory page to be central-processing-unit-shared. The page fault sequence further includes scheduling the first memory page for migration from a system memory associated with a central processing unit (CPU) to a local memory associated with a parallel processing unit (PPU). One advantage of the disclosed approach is that the PPU accesses memory pages with greater efficiency.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: November 20, 2018
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, John Mashey
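    The sketch below models the described page fault sequence at a high level: on a PPU fault for a CPU-owned page, the page's ownership state becomes CPU-shared and the page is scheduled for migration to PPU memory. The state names, data structures, and queueing are assumptions, not the UVM driver's implementation.
    ```cuda
    // Host-side model of the page fault sequence: mark CPU-shared, then queue migration.
    #include <cstdint>
    #include <queue>
    #include <unordered_map>
    #include <cstdio>

    enum class Ownership { CpuExclusive, CpuShared, PpuExclusive };

    struct UvmState {
        std::unordered_map<uint64_t, Ownership> pages;   // per-page ownership state
        std::queue<uint64_t> migration_queue;            // pages scheduled for migration
    };

    // Invoked when the PPU faults on a virtual page with no local mapping.
    void handle_ppu_fault(UvmState& uvm, uint64_t vpn) {
        auto it = uvm.pages.find(vpn);
        if (it == uvm.pages.end() || it->second != Ownership::CpuExclusive) return;
        it->second = Ownership::CpuShared;               // step 1: make the page CPU-shared
        uvm.migration_queue.push(vpn);                   // step 2: schedule the copy to PPU memory
    }

    int main() {
        UvmState uvm;
        uvm.pages[0x42] = Ownership::CpuExclusive;
        handle_ppu_fault(uvm, 0x42);
        std::printf("queued pages: %zu, shared: %d\n",
                    uvm.migration_queue.size(),
                    (int)(uvm.pages[0x42] == Ownership::CpuShared));
        return 0;
    }
    ```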
  • Patent number: 10061526
    Abstract: One embodiment of the present invention is a memory subsystem that includes a sliding window tracker that tracks memory accesses associated with a sliding window of memory page groups. When the sliding window tracker detects an access operation associated with a memory page group within the sliding window, the sliding window tracker sets a reference bit that is associated with the memory page group and is included in a reference vector that represents accesses to the memory page groups within the sliding window. Based on the values of the reference bits, the sliding window tracker causes a memory page in a memory page group that has fallen into disuse to be migrated from a first memory to a second memory. Because the sliding window tracker tunes the memory pages that are resident in the first memory to reflect memory access patterns, the overall performance of the memory subsystem is improved.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: August 28, 2018
    Assignee: NVIDIA Corporation
    Inventors: John Mashey, Cameron Buschardt, James Leroy Deming, Jerome F. Duluk, Jr., Brian Fahs
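    A minimal model of the reference-vector idea: one bit per page group in the sliding window is set on access, and a group whose bit remains clear is a candidate to migrate out of the first memory. The window size, bit representation, and scan policy are invented for illustration.
    ```cuda
    // Sliding-window reference bits: set on access, scan for disused groups.
    #include <cstdint>
    #include <cstdio>

    constexpr int kWindowGroups = 8;

    struct SlidingWindowTracker {
        uint64_t window_base = 0;                 // first page group in the window
        uint8_t  referenced[kWindowGroups] = {};  // reference bit per group
    };

    void note_access(SlidingWindowTracker& t, uint64_t group) {
        if (group >= t.window_base && group < t.window_base + kWindowGroups)
            t.referenced[group - t.window_base] = 1;     // set the group's reference bit
    }

    // Return the first group in the window that has fallen into disuse, or -1.
    int find_disused_group(const SlidingWindowTracker& t) {
        for (int i = 0; i < kWindowGroups; ++i)
            if (!t.referenced[i]) return (int)(t.window_base + i);
        return -1;
    }

    int main() {
        SlidingWindowTracker t;
        note_access(t, 0); note_access(t, 2); note_access(t, 3);
        std::printf("migration candidate group: %d\n", find_disused_group(t));  // prints 1
        return 0;
    }
    ```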
  • Publication number: 20180232332
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Application
    Filed: April 9, 2018
    Publication date: August 16, 2018
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
  • Patent number: 10037228
    Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
    Type: Grant
    Filed: October 25, 2012
    Date of Patent: July 31, 2018
    Assignee: NVIDIA Corporation
    Inventors: Nick Barrow-Williams, Brian Fahs, Jerome F. Duluk, Jr., James Leroy Deming, Timothy John Purcell, Lucien Dunning, Mark Hairgrove
  • Patent number: 9952977
    Abstract: A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction that includes a cache operations modifier identifying a level of the parallel cache hierarchy in which to cache data associated with the instruction, and implementing a cache replacement policy based on the cache operations modifier.
    Type: Grant
    Filed: September 24, 2010
    Date of Patent: April 24, 2018
    Assignee: NVIDIA Corporation
    Inventors: Steven James Heinrich, Alexander L. Minkin, Brett W. Coon, Rajeshwaran Selvanesan, Robert Steven Glanville, Charles McCarver, Anjana Rajendran, Stewart Glenn Carlton, John R. Nickolls, Brian Fahs
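    PTX exposes per-instruction cache operators on loads and stores (for example `.ca`, `.cg`, `.cs`), which are the public counterpart of this kind of modifier. The host-side sketch below simply dispatches on an assumed modifier enum to pick an insertion behavior per cache level; the modifier values and the policies chosen for them are invented, not the patented policy.
    ```cuda
    // Pick a replacement/insertion behavior for a memory instruction based on
    // the cache-operations modifier it carries. Illustrative only.
    #include <cstdio>

    enum class CacheOp { CacheAllLevels, CacheGlobalOnly, Streaming };
    enum class Policy  { InsertMru, InsertLru, Bypass };

    Policy policy_for(CacheOp op, int cache_level) {
        switch (op) {
            case CacheOp::CacheAllLevels: return Policy::InsertMru;   // keep in every level
            case CacheOp::CacheGlobalOnly:                            // skip the per-core level
                return cache_level == 1 ? Policy::Bypass : Policy::InsertMru;
            case CacheOp::Streaming:      return Policy::InsertLru;   // evict-first data
        }
        return Policy::InsertMru;
    }

    int main() {
        std::printf("streaming @L2 -> %d\n", (int)policy_for(CacheOp::Streaming, 2));
        std::printf("global-only @L1 -> %d\n", (int)policy_for(CacheOp::CacheGlobalOnly, 1));
        return 0;
    }
    ```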
  • Patent number: 9940286
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Grant
    Filed: December 9, 2013
    Date of Patent: April 10, 2018
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
  • Publication number: 20170371802
    Abstract: One embodiment of the present invention includes a microcontroller coupled to a memory management unit (MMU). The MMU is coupled to a page table included in a physical memory, and the microcontroller is configured to perform one or more virtual memory operations associated with the physical memory and the page table. In operation, the microcontroller receives a page fault generated by the MMU in response to an invalid memory access via a virtual memory address. To remedy such a page fault, the microcontroller performs actions to map the virtual memory address to an appropriate location in the physical memory. By contrast, in prior-art systems, a fault handler would typically remedy the page fault. Advantageously, because the microcontroller executes these tasks locally with respect to the MMU and the physical memory, latency associated with remedying page faults may be decreased. Consequently, overall system performance may be increased.
    Type: Application
    Filed: August 27, 2013
    Publication date: December 28, 2017
    Applicant: NVIDIA Corporation
    Inventors: Cameron Buschardt, Jerome F. Duluk, Jr., John Mashey, Mark Hairgrove, James Leroy Deming, Brian Fahs
  • Publication number: 20170371822
    Abstract: Techniques are disclosed for tracking memory page accesses in a unified virtual memory system. An access tracking unit detects a memory page access generated by a first processor for accessing a memory page in a memory system of a second processor. The access tracking unit determines whether a cache memory includes an entry for the memory page. If so, then the access tracking unit increments an associated access counter. Otherwise, the access tracking unit attempts to find an unused entry in the cache memory that is available for allocation. If one is found, then the access tracking unit associates that entry with the memory page and sets the entry's access counter to an initial value. Otherwise, the access tracking unit selects a valid entry in the cache memory; clears its valid bit; associates the entry with the memory page; and initializes the associated access counter.
    Type: Application
    Filed: December 9, 2013
    Publication date: December 28, 2017
    Applicant: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Brian Fahs, Mark Hairgrove, John Mashey
  • Patent number: 9830210
    Abstract: One embodiment of the present invention includes techniques for a first processing unit to perform an atomic operation on a memory page shared with a second processing unit. The memory page is associated with a page table entry corresponding to the first processing unit. Before executing the atomic operation, an MMU included in the first processing unit evaluates an atomic permission bit that is included in the page table entry. If the MMU determines that the atomic permission bit is inactive, then the two processing units coordinate to change the permission status of the memory page. As part of the status change, the atomic permission bit in the page table entry is activated. Subsequently, the first processing unit performs the atomic operation uninterrupted by the second processing unit. Advantageously, coordinating the processing units via the atomic permission bit ensures the proper and efficient execution of the atomic operation.
    Type: Grant
    Filed: August 27, 2013
    Date of Patent: November 28, 2017
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., John Mashey, Mark Hairgrove, James Leroy Deming, Cameron Buschardt, Brian Fahs
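    The sketch below models the permission check described above: before performing an atomic operation on a shared page, the per-page atomic permission bit is consulted, and a clear bit triggers a coordinated upgrade first. The structure fields and the upgrade handshake are stand-ins invented for illustration, not the patented mechanism.
    ```cuda
    // Check the page table entry's atomic permission bit before an atomic op.
    #include <cstdint>
    #include <cstdio>

    struct PageTableEntry {
        uint64_t pfn;
        bool     atomic_allowed;   // the "atomic permission bit"
    };

    // Stand-in for the cross-processor handshake that upgrades the page so
    // this processor may perform atomics on it.
    void upgrade_atomic_permission(PageTableEntry& pte) {
        pte.atomic_allowed = true;
    }

    // Perform an atomic add on the shared page, upgrading permission if needed.
    uint32_t atomic_add_on_page(PageTableEntry& pte, uint32_t* word, uint32_t v) {
        if (!pte.atomic_allowed)             // bit clear: coordinate before proceeding
            upgrade_atomic_permission(pte);
        uint32_t old = *word;                // with permission held, the op runs
        *word += v;                          // uninterrupted by the other processor
        return old;
    }

    int main() {
        PageTableEntry pte{/*pfn=*/7, /*atomic_allowed=*/false};
        uint32_t word = 10;
        atomic_add_on_page(pte, &word, 5);
        std::printf("word = %u, atomic bit = %d\n", word, (int)pte.atomic_allowed);
        return 0;
    }
    ```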