Patents by Inventor John H. Edmondson
John H. Edmondson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11055097
Abstract: One embodiment of the present invention includes techniques to decrease power consumption by reducing the number of redundant operations performed. In operation, a streaming multiprocessor (SM) identifies uniform groups of threads that, when executed, apply the same deterministic operation to uniform sets of input operands. Within each uniform group of threads, the SM designates one thread as the anchor thread. The SM disables execution units assigned to all of the threads except the anchor thread. The anchor execution unit, assigned to the anchor thread, executes the operation on the uniform set of input operands. Subsequently, the SM sets the outputs of the non-anchor threads included in the uniform group of threads to equal the value of the anchor execution unit output.
Type: Grant
Filed: October 8, 2013
Date of Patent: July 6, 2021
Assignee: NVIDIA Corporation
Inventors: Gary M. Tarolli, John H. Edmondson, John Matthew Burgess, Robert Ohannessian
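The anchor-thread technique reduces to a simple software model: scan a warp's operands, execute the operation once per uniform group, and broadcast the anchor's result to the rest of the group. The C sketch below is a minimal illustration under assumed parameters (an 8-lane warp and squaring as the stand-in deterministic operation); it is not the patent's hardware design, and all names are illustrative.

```c
#include <stdio.h>
#include <stdbool.h>

#define WARP_SIZE 8  /* assumed; real warps are typically 32 threads */

/* Behavioral model: each thread has one input operand and one output.
 * Returns how many execution units actually fired. */
int warp_execute(const int in[WARP_SIZE], int out[WARP_SIZE]) {
    bool done[WARP_SIZE] = { false };
    int units_fired = 0;

    for (int i = 0; i < WARP_SIZE; i++) {
        if (done[i]) continue;
        /* Thread i is the anchor for its uniform group. */
        int result = in[i] * in[i];   /* stand-in deterministic operation */
        units_fired++;                /* only the anchor's unit executes */
        for (int j = i; j < WARP_SIZE; j++) {
            if (!done[j] && in[j] == in[i]) {
                out[j] = result;      /* broadcast the anchor's output */
                done[j] = true;
            }
        }
    }
    return units_fired;
}

int main(void) {
    int in[WARP_SIZE] = { 3, 3, 3, 7, 7, 3, 7, 7 };
    int out[WARP_SIZE];
    int fired = warp_execute(in, out);
    printf("execution units fired: %d of %d\n", fired, WARP_SIZE);
    return 0;
}
```

With two uniform groups in the example warp, only 2 of 8 units fire, which is the source of the power saving the abstract claims.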
-
Patent number: 10241798
Abstract: An issue control unit is configured to control the rate at which an instruction issue unit issues instructions to an execution pipeline in order to avoid spikes in power drawn by that execution pipeline. The issue control unit maintains a history buffer that reflects, for N previous cycles, the number of instructions issued during each of those N cycles. If the total number of instructions issued during the N previous cycles exceeds a threshold value, then the issue control unit throttles the instruction issue unit from issuing instructions during a subsequent cycle. In addition, the issue control unit increases the threshold value in proportion to the number of previously issued instructions and based on a variety of configurable parameters. Accordingly, the issue control unit maintains granular control over the rate with which the instruction issue unit “ramps up” to a maximum instruction issue rate.
Type: Grant
Filed: September 20, 2013
Date of Patent: March 26, 2019
Assignee: NVIDIA Corporation
Inventors: Peter Sommers, Peter Nelson, Aniket Naik, John H. Edmondson
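The throttling loop the abstract describes can be modeled in a few lines: keep a circular history of per-cycle issue counts, block issue when the windowed total crosses a threshold, and ramp the threshold with recent activity. The window size of 8, the ramp rule, and every identifier in this C sketch are assumptions for illustration, not the patent's configurable parameters.

```c
#include <stdio.h>

#define HISTORY 8          /* assumed N: previous cycles tracked */

typedef struct {
    int history[HISTORY];  /* instructions issued in each of the last N cycles */
    int head;              /* circular-buffer write position */
    int sum;               /* running total over the window */
    int threshold;         /* issue budget over the window */
} IssueControl;

/* Returns how many instructions may issue this cycle, then records the count. */
int issue_cycle(IssueControl *ic, int wanted) {
    int allowed = (ic->sum >= ic->threshold) ? 0 : wanted;  /* throttle on spike */
    /* slide the window: drop the oldest cycle, record this one */
    ic->sum -= ic->history[ic->head];
    ic->history[ic->head] = allowed;
    ic->sum += allowed;
    ic->head = (ic->head + 1) % HISTORY;
    /* ramp the budget up in proportion to what was just issued (assumed rule) */
    ic->threshold += allowed / 2;
    return allowed;
}

int main(void) {
    IssueControl ic = { .threshold = 4 };   /* assumed starting budget */
    for (int cycle = 0; cycle < 12; cycle++)
        printf("cycle %2d: issued %d\n", cycle, issue_cycle(&ic, 2));
    return 0;
}
```

Running this shows the characteristic pattern: bursts of issue, a throttled stretch once the window fills, then a gradual ramp back up as the threshold grows.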
-
Patent number: 10095548
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Grant
Filed: May 21, 2012
Date of Patent: October 9, 2018
Assignee: NVIDIA Corporation
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich
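The three behaviors the abstract names (oldest-first grant, sleeping on a busy resource, and an older request stealing from a younger one) can be modeled as a small scheduler loop. The four-request scenario, the ownership table, and all names in this C sketch are assumptions for illustration; they say nothing about the TOQ's actual structure.

```c
#include <stdio.h>

#define NREQ 4
#define NRES 2

int main(void) {
    /* requests listed oldest first; each needs one common resource */
    int needs[NREQ] = { 0, 0, 1, 1 };
    int owner[NRES] = { 3, -1 };   /* resource 0 starts held by the youngest req */
    int done[NREQ]  = { 0 };

    for (int pass = 0; pass < NREQ; pass++) {
        for (int r = 0; r < NREQ; r++) {   /* age order: oldest first */
            if (done[r]) continue;
            int res = needs[r];
            if (owner[res] == -1 || owner[res] > r) {
                if (owner[res] > r)        /* holder is younger: steal it */
                    printf("req %d steals resource %d from younger req %d\n",
                           r, res, owner[res]);
                printf("req %d completes on resource %d\n", r, res);
                done[r] = 1;
                owner[res] = -1;           /* release on completion */
                break;                     /* one grant per scheduling pass */
            }
            printf("req %d sleeps: resource %d busy\n", r, res);
        }
    }
    return 0;
}
```

Because the inner loop always visits requests oldest-first, a newer request can never repeatedly block an older one, which is the fairness property the abstract highlights.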
-
Patent number: 9836325
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Grant
Filed: May 21, 2012
Date of Patent: December 5, 2017
Assignee: NVIDIA Corporation
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich
-
Patent number: 9755994
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Grant
Filed: May 21, 2012
Date of Patent: September 5, 2017
Assignee: NVIDIA Corporation
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich
-
Patent number: 9436625
Abstract: Banks within a dynamic random access memory (DRAM) are managed with virtual bank managers. A DRAM controller receives a new memory access request to DRAM including a plurality of banks. If the request accesses a location in DRAM where no virtual bank manager includes parameters for the corresponding DRAM page, then a virtual bank manager is allocated to the physical bank associated with the DRAM page. The bank manager is initialized to include parameters needed by the DRAM controller to access the DRAM page. The memory access request is then processed using the parameters associated with the virtual bank manager. One advantage of the disclosed technique is that the banks of a DRAM module are controlled with fewer bank managers than in previous DRAM controller designs. As a result, less surface area on the DRAM controller circuit is dedicated to bank managers.
Type: Grant
Filed: June 13, 2012
Date of Patent: September 6, 2016
Assignee: NVIDIA Corporation
Inventors: Shu-Yi Yu, Ram Gummadi, John H. Edmondson
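The core idea (a small pool of managers shared across many physical banks, allocated on demand) can be sketched as a lookup-or-recycle routine. The pool sizes, the LRU recycling policy, and the field names in this C sketch are all assumed for illustration; the patent does not specify a replacement policy here.

```c
#include <stdio.h>
#include <string.h>

#define PHYS_BANKS 16   /* banks in the DRAM module (assumed) */
#define VIRT_MGRS   4   /* far fewer managers than banks (assumed) */

typedef struct {
    int bank;       /* physical bank currently managed, -1 if free */
    int open_page;  /* page parameters the controller needs */
    int last_use;   /* for LRU recycling */
} BankMgr;

static BankMgr mgrs[VIRT_MGRS];
static int clk;

/* Find or allocate a virtual bank manager for the request's bank/page. */
BankMgr *get_mgr(int bank, int page) {
    int lru = 0;
    for (int i = 0; i < VIRT_MGRS; i++) {
        if (mgrs[i].bank == bank) {          /* manager already tracks this bank */
            mgrs[i].open_page = page;        /* (re)open the requested page */
            mgrs[i].last_use = ++clk;
            return &mgrs[i];
        }
        if (mgrs[i].last_use < mgrs[lru].last_use) lru = i;
    }
    /* no manager tracks this bank: recycle the least recently used one */
    mgrs[lru] = (BankMgr){ .bank = bank, .open_page = page, .last_use = ++clk };
    printf("mgr %d now tracks bank %d (page %d)\n", lru, bank, page);
    return &mgrs[lru];
}

int main(void) {
    memset(mgrs, -1, sizeof mgrs);   /* mark all managers free */
    int reqs[][2] = { {2, 10}, {2, 10}, {7, 3}, {9, 5}, {2, 11} };
    for (int i = 0; i < 5; i++)
        get_mgr(reqs[i][0], reqs[i][1]);
    return 0;
}
```

Only 4 manager structures serve all 16 banks, which models the area saving the abstract claims.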
-
Patent number: 9274985
Abstract: Banks within a dynamic random access memory (DRAM) are managed with virtual bank managers. A DRAM controller receives a new memory access request to DRAM including a plurality of banks. If the request accesses a location in DRAM where no virtual bank manager includes parameters for the corresponding DRAM page, then a virtual bank manager is allocated to the physical bank associated with the DRAM page. The bank manager is initialized to include parameters needed by the DRAM controller to access the DRAM page. The memory access request is then processed using the parameters associated with the virtual bank manager. One advantage of the disclosed technique is that the banks of a DRAM module are controlled with fewer bank managers than in previous DRAM controller designs. As a result, less surface area on the DRAM controller circuit is dedicated to bank managers.
Type: Grant
Filed: June 13, 2012
Date of Patent: March 1, 2016
Assignee: NVIDIA Corporation
Inventors: Shu-Yi Yu, Ram Gummadi, John H. Edmondson
-
Patent number: 9245601
Abstract: A system and device are provided for implementing memory arrays using high-density latch cells. The device includes an array of cells arranged into columns and rows. Each cell comprises a latch cell that includes a transmission gate, a pair of inverters, and an output buffer. Each row of latch cells is connected to at least one common node for addressing the row of latch cells, and each column of latch cells is connected to a particular bit of an input signal and a particular bit of an output signal. A register file may be implemented using one or more arrays of the high-density latch cells to replace any or all of the banks of SRAM cells typically used to implement the register file.
Type: Grant
Filed: June 4, 2014
Date of Patent: January 26, 2016
Assignee: NVIDIA Corporation
Inventors: Mahmut Ersin Sinangil, John W. Poulton, Brucek Kurdo Khailany, John H. Edmondson
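The latch array is a circuit, but its addressing structure (a shared row-select node, per-column data bits) can be shown with a purely behavioral model. The C sketch below assumes an 8-row by 32-bit array used as one register-file bank; it models only the read/write behavior and says nothing about the transistor-level design in the patent.

```c
#include <stdio.h>
#include <stdint.h>

#define ROWS 8     /* assumed array depth */
#define BITS 32    /* assumed word width: one latch cell per column bit */

/* Behavioral model of a latch array: each row shares one select node,
 * each column carries one bit of the input and output buses. */
typedef struct {
    uint32_t row[ROWS];   /* one word of latch cells per row */
} LatchArray;

/* Write: the selected row's transmission gates open and capture the input. */
void latch_write(LatchArray *a, int row_sel, uint32_t data_in) {
    a->row[row_sel] = data_in;
}

/* Read: the selected row drives the output bus through its output buffers. */
uint32_t latch_read(const LatchArray *a, int row_sel) {
    return a->row[row_sel];
}

int main(void) {
    LatchArray rf = { 0 };   /* one register-file bank built from latch cells */
    latch_write(&rf, 3, 0xDEADBEEFu);
    printf("r3 = 0x%08X\n", latch_read(&rf, 3));
    return 0;
}
```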
-
Publication number: 20150357009
Abstract: A system and device are provided for implementing memory arrays using high-density latch cells. The device includes an array of cells arranged into columns and rows. Each cell comprises a latch cell that includes a transmission gate, a pair of inverters, and an output buffer. Each row of latch cells is connected to at least one common node for addressing the row of latch cells, and each column of latch cells is connected to a particular bit of an input signal and a particular bit of an output signal. A register file may be implemented using one or more arrays of the high-density latch cells to replace any or all of the banks of SRAM cells typically used to implement the register file.
Type: Application
Filed: June 4, 2014
Publication date: December 10, 2015
Inventors: Mahmut Ersin Sinangil, John W. Poulton, Brucek Kurdo Khailany, John H. Edmondson
-
Patent number: 9058792
Abstract: Sequential write operations to a unit of compressed memory, known as a compression tile, are examined to see if the same compression tile is being written. If the same compression tile is being written, the sequential write operations are coalesced into a single write operation and the entire compression tile is overwritten with the new data. Coalescing multiple write operations into a single write operation improves performance, because it avoids the read-modify-write operations that would otherwise be needed.
Type: Grant
Filed: November 1, 2006
Date of Patent: June 16, 2015
Assignee: NVIDIA Corporation
Inventors: John H. Edmondson, Robert A. Alfieri, Michael F. Harris, Steven E. Molnar
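A minimal C sketch of the coalescing idea: buffer sequential writes that target the current compression tile, and emit one full-tile write once the tile is completely covered, skipping the read-modify-write. The 64-byte tile size, the assumption of non-overlapping sequential writes, and all names are illustrative, not taken from the patent.

```c
#include <stdio.h>
#include <string.h>

#define TILE_BYTES 64   /* assumed compression tile size */

typedef struct {
    int tile_addr;                  /* which tile is buffered, -1 if none */
    unsigned char data[TILE_BYTES];
    int valid_bytes;                /* assumes non-overlapping sequential writes */
} CoalesceBuf;

/* Flush: a full tile needs one overwrite; a partial one still needs RMW. */
void flush(CoalesceBuf *b) {
    if (b->tile_addr < 0) return;
    printf("write tile %d (%d bytes, %s)\n", b->tile_addr, b->valid_bytes,
           b->valid_bytes == TILE_BYTES ? "full overwrite" : "read-modify-write");
    b->tile_addr = -1;
    b->valid_bytes = 0;
}

/* Coalesce sequential writes that target the same compression tile. */
void write_bytes(CoalesceBuf *b, int tile, int off, const void *src, int n) {
    if (b->tile_addr != tile) {     /* new tile: flush whatever is buffered */
        flush(b);
        b->tile_addr = tile;
    }
    memcpy(b->data + off, src, n);
    b->valid_bytes += n;
    if (b->valid_bytes == TILE_BYTES) flush(b);   /* tile complete: one write */
}

int main(void) {
    CoalesceBuf buf = { .tile_addr = -1 };
    unsigned char chunk[16] = { 0 };
    for (int i = 0; i < 4; i++)                   /* four sequential 16B writes */
        write_bytes(&buf, 7, i * 16, chunk, 16);  /* coalesce into one 64B write */
    flush(&buf);
    return 0;
}
```

The four partial writes become a single "full overwrite" of tile 7, which is exactly the read-modify-write avoidance the abstract describes.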
-
Publication number: 20150100764
Abstract: One embodiment of the present invention includes techniques to decrease power consumption by reducing the number of redundant operations performed. In operation, a streaming multiprocessor (SM) identifies uniform groups of threads that, when executed, apply the same deterministic operation to uniform sets of input operands. Within each uniform group of threads, the SM designates one thread as the anchor thread. The SM disables execution units assigned to all of the threads except the anchor thread. The anchor execution unit, assigned to the anchor thread, executes the operation on the uniform set of input operands. Subsequently, the SM sets the outputs of the non-anchor threads included in the uniform group of threads to equal the value of the anchor execution unit output. Advantageously, by exploiting the uniformity of data to reduce the number of execution units that execute, the SM dramatically reduces the power consumption compared to conventional SMs.
Type: Application
Filed: October 8, 2013
Publication date: April 9, 2015
Applicant: NVIDIA Corporation
Inventors: Gary M. Tarolli, John H. Edmondson, John Matthew Burgess, Robert Ohannessian
-
Publication number: 20150089198
Abstract: An issue control unit is configured to control the rate at which an instruction issue unit issues instructions to an execution pipeline in order to avoid spikes in power drawn by that execution pipeline. The issue control unit maintains a history buffer that reflects, for N previous cycles, the number of instructions issued during each of those N cycles. If the total number of instructions issued during the N previous cycles exceeds a threshold value, then the issue control unit throttles the instruction issue unit from issuing instructions during a subsequent cycle. In addition, the issue control unit increases the threshold value in proportion to the number of previously issued instructions and based on a variety of configurable parameters. Accordingly, the issue control unit maintains granular control over the rate with which the instruction issue unit “ramps up” to a maximum instruction issue rate.
Type: Application
Filed: September 20, 2013
Publication date: March 26, 2015
Applicant: NVIDIA Corporation
Inventors: Peter Sommers, Peter Nelson, Aniket Naik, John H. Edmondson
-
Patent number: 8949541
Abstract: A method for cleaning dirty data in an intermediate cache is disclosed. A dirty data notification, including a memory address and a data class, is transmitted by a level 2 (L2) cache to frame buffer logic when dirty data is stored in the L2 cache. The data classes may include evict first, evict normal and evict last. In one embodiment, data belonging to the evict first data class is raster operations data with little reuse potential. The frame buffer logic uses a notification sorter to organize dirty data notifications, where an entry in the notification sorter stores the DRAM bank page number, a first count of cache lines that have resident dirty data and a second count of cache lines that have resident evict first dirty data associated with that DRAM bank. The frame buffer logic transmits dirty data associated with an entry when the first count reaches a threshold.
Type: Grant
Filed: November 14, 2011
Date of Patent: February 3, 2015
Assignee: NVIDIA Corporation
Inventors: David B. Glasco, Peter B. Holmqvist, George R. Lynch, Patrick R. Marchand, James Roberts, John H. Edmondson
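The notification sorter can be sketched as a small table keyed by DRAM bank page, with each entry counting dirty lines and evict-first lines and triggering a clean at a threshold. The entry count, the threshold of 3, and the behavior when the sorter is full are all assumptions in this C sketch, not details from the patent.

```c
#include <stdio.h>

#define ENTRIES   4   /* assumed sorter capacity */
#define THRESHOLD 3   /* assumed dirty-line count that triggers a clean */

typedef struct {
    int bank_page;     /* DRAM bank page number, -1 if entry unused */
    int dirty;         /* count of cache lines with resident dirty data */
    int evict_first;   /* count of those in the evict-first class */
} Notif;

static Notif sorter[ENTRIES];

/* Record a dirty-data notification from the L2 cache. */
void notify(int bank_page, int is_evict_first) {
    Notif *e = NULL;
    for (int i = 0; i < ENTRIES && !e; i++)       /* existing entry for page? */
        if (sorter[i].bank_page == bank_page) e = &sorter[i];
    for (int i = 0; i < ENTRIES && !e; i++)       /* else grab a free entry */
        if (sorter[i].bank_page == -1) e = &sorter[i];
    if (!e) return;   /* sorter full: real hardware would stall or flush */
    e->bank_page = bank_page;
    e->dirty++;
    e->evict_first += is_evict_first;
    if (e->dirty >= THRESHOLD) {                  /* enough dirty lines: clean */
        printf("flush bank page %d (%d dirty, %d evict-first)\n",
               e->bank_page, e->dirty, e->evict_first);
        *e = (Notif){ .bank_page = -1 };
    }
}

int main(void) {
    for (int i = 0; i < ENTRIES; i++) sorter[i].bank_page = -1;
    notify(5, 1); notify(5, 0); notify(9, 0); notify(5, 1);  /* 3rd hit flushes page 5 */
    return 0;
}
```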
-
Patent number: 8928681
Abstract: Sequential write operations to a unit of compressed memory, known as a compression tile, are examined to see if the same compression tile is being written. If the same compression tile is being written, the sequential write operations are coalesced into a single write operation and the entire compression tile is overwritten with the new data. Coalescing multiple write operations into a single write operation improves performance, because it avoids the read-modify-write operations that would otherwise be needed.
Type: Grant
Filed: December 29, 2009
Date of Patent: January 6, 2015
Assignee: NVIDIA Corporation
Inventors: John H. Edmondson, Robert A. Alfieri, Michael F. Harris, Steven E. Molnar
-
Patent number: 8700865
Abstract: A shared resource management system and method are described. In one embodiment a shared resource management system includes a plurality of engines, a shared resource, and a shared resource management unit. In one exemplary implementation the shared resource is a memory and the shared resource management unit is a memory management unit (MMU). The plurality of engines perform processing. The shared resource supports the processing. For example, a memory stores information and instructions for the engines. The shared resource management unit manages memory operations and handles access requests associated with compressed data.
Type: Grant
Filed: November 2, 2006
Date of Patent: April 15, 2014
Assignee: NVIDIA Corporation
Inventors: James M. Van Dyke, John H. Edmondson, Lingfeng Yuan, Brian D. Hutsell
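The abstract stays at a high level, but one can picture the MMU as a per-page table that each engine's access request consults, recording whether the page holds compressed data. Everything in this C sketch (the table layout, the addresses, the decompress-on-read behavior, and all names) is assumed purely for illustration.

```c
#include <stdio.h>
#include <stdbool.h>

#define PAGES 8   /* assumed number of tracked pages */

/* Per-page state the shared MMU consults when an engine accesses memory. */
typedef struct {
    bool compressed;   /* page holds compressed data */
    int  phys_base;    /* translated physical base address */
} PageInfo;

static PageInfo table[PAGES] = {
    [2] = { true, 0x2000 }, [5] = { false, 0x5000 },   /* assumed contents */
};

/* Handle one access request from an engine through the shared MMU. */
void mmu_access(int engine, int page, int offset) {
    PageInfo *p = &table[page];
    int addr = p->phys_base + offset;
    if (p->compressed)
        printf("engine %d -> page %d: decompress then read 0x%x\n",
               engine, page, addr);
    else
        printf("engine %d -> page %d: direct read 0x%x\n", engine, page, addr);
}

int main(void) {
    mmu_access(0, 2, 0x10);   /* engine 0 hits a compressed page */
    mmu_access(1, 5, 0x40);   /* engine 1 hits an uncompressed page */
    return 0;
}
```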
-
Patent number: 8656093
Abstract: One embodiment of the invention sets forth a mechanism to transmit commands received from an L2 cache to a bank page within the DRAM. An arbiter unit determines which commands from a command sorter to transmit to a command queue. An activate command associated with the bank page related to the commands is also transmitted to an activate queue. The last command in the command queue is marked as “last.” An interlock counter stores a count of “last” commands in the read/write command queue. A DRAM controller transmits activate commands and read/write commands from the activate queue and the command queue to the DRAM. Each time a command marked as “last” is encountered, the DRAM controller decrements the interlock counter. If the count in the interlock counter is zero, then the command marked as “last” is marked as “auto-precharge.” The “auto-precharge” command, when processed, causes the bank page to be closed.
Type: Grant
Filed: December 1, 2008
Date of Patent: February 18, 2014
Assignee: NVIDIA Corporation
Inventors: John H. Edmondson, Shane Keil
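The interlock-counter rule is easy to model: count the queued “last” commands up front, decrement as the controller drains them, and tag auto-precharge onto the one that brings the count to zero. The queue contents and every name in this C sketch are assumed for illustration.

```c
#include <stdio.h>

#define QLEN 8

typedef struct {
    int is_last;         /* marked "last" for its bank page */
    int auto_precharge;  /* close the bank page after this command */
} Cmd;

int main(void) {
    /* commands for one bank page; two are marked "last" (assumed scenario) */
    Cmd q[QLEN] = { {0,0}, {1,0}, {0,0}, {1,0} };
    int n = 4, interlock = 0;

    for (int i = 0; i < n; i++)
        interlock += q[i].is_last;       /* interlock counter: "last" cmds queued */

    for (int i = 0; i < n; i++) {        /* DRAM controller drains the queue */
        if (q[i].is_last) {
            interlock--;                 /* one fewer "last" still pending */
            if (interlock == 0)
                q[i].auto_precharge = 1; /* truly final: close the bank page */
        }
        printf("cmd %d%s%s\n", i,
               q[i].is_last ? " [last]" : "",
               q[i].auto_precharge ? " [auto-precharge]" : "");
    }
    return 0;
}
```

Only the final “last” command picks up the auto-precharge mark, so the bank page stays open while more work for it is still queued.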
-
Publication number: 20130339592
Abstract: Banks within a dynamic random access memory (DRAM) are managed with virtual bank managers. A DRAM controller receives a new memory access request to DRAM including a plurality of banks. If the request accesses a location in DRAM where no virtual bank manager includes parameters for the corresponding DRAM page, then a virtual bank manager is allocated to the physical bank associated with the DRAM page. The bank manager is initialized to include parameters needed by the DRAM controller to access the DRAM page. The memory access request is then processed using the parameters associated with the virtual bank manager. One advantage of the disclosed technique is that the banks of a DRAM module are controlled with fewer bank managers than in previous DRAM controller designs. As a result, less surface area on the DRAM controller circuit is dedicated to bank managers.
Type: Application
Filed: June 13, 2012
Publication date: December 19, 2013
Inventors: Shu-Yi Yu, Ram Gummadi, John H. Edmondson
-
Publication number: 20130311996
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Application
Filed: May 21, 2012
Publication date: November 21, 2013
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich
-
Publication number: 20130311686
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Application
Filed: May 21, 2012
Publication date: November 21, 2013
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich
-
Publication number: 20130311999
Abstract: One embodiment of the present disclosure sets forth an effective way to maintain fairness and order in the scheduling of common resource access requests related to replay operations. Specifically, a streaming multiprocessor (SM) includes a total order queue (TOQ) configured to schedule the access requests over one or more execution cycles. Access requests are allowed to make forward progress when needed common resources have been allocated to the request. Where multiple access requests require the same common resource, priority is given to the older access request. Access requests may be placed in a sleep state pending availability of certain common resources. Deadlock may be avoided by allowing an older access request to steal resources from a younger resource request. One advantage of the disclosed technique is that older common resource access requests are not repeatedly blocked from making forward progress by newer access requests.
Type: Application
Filed: May 21, 2012
Publication date: November 21, 2013
Inventors: Michael Fetterman, Shirish Gadre, John H. Edmondson, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Rajeshwaran Selvanesan, Charles McCarver, Kevin Mitchell, Steven James Heinrich