Patents by Inventor Jerome F. Duluk

Jerome F. Duluk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8493395
    Abstract: A method and system for overriding state information programmed into a processor using an application programming interface (API) avoids introducing error conditions in the processor. An override monitor unit within the processor stores the programmed state for any setting that is overridden so that the programmed state can be restored when the error condition no longer exists. The override monitor unit overrides the programmed state by forcing the setting to a legal value that does not cause an error condition. The processor is able to continue operating without notifying a device driver that an error condition has occurred since the error condition is avoided.
    Type: Grant
    Filed: July 16, 2012
    Date of Patent: July 23, 2013
    Assignee: Nvidia Corporation
    Inventors: Jerome F. Duluk, Jr., Henry P. Moreton, Steven E. Molnar, John S. Montrym
  • Publication number: 20130185725
    Abstract: One embodiment of the present invention sets forth a technique for selecting a first processor included in a plurality of processors to receive work related to a compute task. The technique involves analyzing state data of each processor in the plurality of processors to identify one or more processors that have already been assigned one compute task and are eligible to receive work related to the one compute task, receiving, from each of the one or more processors identified as eligible, an availability value that indicates the capacity of the processor to receive new work, selecting a first processor to receive work related to the one compute task based on the availability values received from the one or more processors, and issuing, to the first processor via a cooperative thread array (CTA), the work related to the one compute task.
    Type: Application
    Filed: January 18, 2012
    Publication date: July 18, 2013
    Inventors: Karim M. ABDALLA, Lacky V. Shah, Jerome F. Duluk, JR., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
  • Publication number: 20130185728
    Abstract: One embodiment of the present invention sets forth a technique for assigning a compute task to a first processor included in a plurality of processors. The technique involves analyzing each compute task in a plurality of compute tasks to identify one or more compute tasks that are eligible for assignment to the first processor, where each compute task is listed in a first table and is associated with a priority value and an allocation order that indicates relative time at which the compute task was added to the first table. The technique further involves selecting a first task compute from the identified one or more compute tasks based on at least one of the priority value and the allocation order, and assigning the first compute task to the first processor for execution.
    Type: Application
    Filed: January 18, 2012
    Publication date: July 18, 2013
    Inventors: Karim M. Abdalla, Lacky V. Shah, Jerome F. Duluk, JR., Timothy John Purcell, Tanmoy Mandal, Gentaro Hirota
  • Publication number: 20130160021
    Abstract: One embodiment of the present invention sets forth a technique for enabling the insertion of generated tasks into a scheduling pipeline of a multiple processor system allows a compute task that is being executed to dynamically generate a dynamic task and notify a scheduling unit of the multiple processor system without intervention by a CPU. A reflected notification signal is generated in response to a write request when data for the dynamic task is written to a queue. Additional reflected notification signals are generated for other events that occur during execution of a compute task, e.g., to invalidate cache entries storing data for the compute task and to enable scheduling of another compute task.
    Type: Application
    Filed: December 16, 2011
    Publication date: June 20, 2013
    Inventors: Timothy John PURCELL, Lacky V. Shah, Jerome F. Duluk, JR., Sean J. Treichler, Karim M. Abdalla, Philip Alexander Cuadra, Brian Pharris
  • Publication number: 20130152094
    Abstract: One embodiment of the present invention sets forth a technique for error-checking a compute task. The technique involves receiving a pointer to a compute task, storing the pointer in a scheduling queue, determining that the compute task should be executed, retrieving the pointer from the scheduling queue, determining via an error-check procedure that the compute task is eligible for execution, and executing the compute task.
    Type: Application
    Filed: December 9, 2011
    Publication date: June 13, 2013
    Inventors: Jerome F. Duluk, JR., Timothy John Purcell, Jesse David Hall, Phlip Alexander Cuadra
  • Publication number: 20130152093
    Abstract: A time slice group (TSG) is a grouping of different streams of work (referred to herein as “channels”) that share the same context information. The set of channels belonging to a TSG are processed in a pre-determined order. However, when a channel stalls while processing, the next channel with independent work can be switched to fully load the parallel processing unit. Importantly, because each channel in the TSG shares the same context information, a context switch operation is not needed when the processing of a particular channel in the TSG stops and the processing of a next channel in the TSG begins. Therefore, multiple independent streams of work are allowed to run concurrently within a single context increasing utilization of parallel processing units.
    Type: Application
    Filed: December 9, 2011
    Publication date: June 13, 2013
    Inventors: Samuel H. DUNCAN, Lacky V. SHAH, Sean J. TREICHLER, Daniel Elliot WEXLER, Jerome F. DULUK, JR., Phillip Browning JOHNSON, Jonathon Stuart Ramsay EVANS
  • Publication number: 20130117751
    Abstract: One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.
    Type: Application
    Filed: November 9, 2011
    Publication date: May 9, 2013
    Inventors: Jerome F. DULUK, JR., Lacky V. SHAH, Sean J. TREICHLER
  • Publication number: 20130117758
    Abstract: One embodiment of the present invention sets forth a technique for managing the allocation and release of resources during multi-threaded program execution. Programmable reference counters are initialized to values that limit the amount of resources for allocation to tasks that share the same reference counter. Resource parameters are specified for each task to define the amount of resources allocated for consumption by each array of execution threads that is launched to execute the task. The resource parameters also specify the behavior of the array for acquiring and releasing resources. Finally, during execution of each thread in the array, an exit instruction may be configured to override the release of the resources that were allocated to the array. The resources may then be retained for use by a child task that is generated during execution of a thread.
    Type: Application
    Filed: November 8, 2011
    Publication date: May 9, 2013
    Inventors: Philip Alexander Cuadra, Karim M. Abdalla, Jerome F. Duluk, JR., Luke Durant, Gerald F. Luiz, Timothy John Purcell, Lacky V. Shah
  • Patent number: 8429656
    Abstract: Methods and apparatuses are presented for graphics operations with thread count throttling, involving operating a processor to carry out multiple threads of execution of, wherein the processor comprises at least one execution unit capable of supporting up to a maximum number of threads, obtaining a defined memory allocation size for allocating, in at least one memory device, a thread-specific memory space for the multiple threads, obtaining a per thread memory requirement corresponding to the thread-specific memory space, determining a thread count limit based on the defined memory allocation size and the per thread memory requirement, and sending a command to the processor to cause the processor to limit the number of threads carried out by the at least one execution unit to a reduced number of threads, the reduced number of threads being less than the maximum number of threads.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: April 23, 2013
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Bryon S. Nordquist
  • Patent number: 8427493
    Abstract: One embodiment of the present invention sets forth a technique for reducing the overhead for transmitting explicit begin and explicit end commands that are needed in primitive draw command sequences. A draw method includes a header to specify an implicit begin command, an implicit end command, and instancing information for a primitive draw command sequence. The header is followed by a packet including one or more data words (dwords) that each specify a primitive topology, starting offset into a vertex or index buffer, and vertex or index count. Only a single clock cycle is consumed to transmit and process the header. The performance of graphics application programs that have many small batches of geometry (as is typical of many workstation applications) may be improved since the overhead of transmitting and processing the explicit begin and explicit end draw commands is reduced.
    Type: Grant
    Filed: September 29, 2010
    Date of Patent: April 23, 2013
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Thomas Roell
  • Publication number: 20130074088
    Abstract: One embodiment of the present invention sets forth a technique for dynamically scheduling and managing compute tasks with different execution priority levels. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes, such as round-robin, priority, and partitioned priority. Each group is maintained as a linked list of pointers to compute tasks that are encoded as queue metadata (QMD) stored in memory. A QMD encapsulates the state needed to execute a compute task. When a task is selected for execution by the scheduling circuitry, the QMD is removed for a group and transferred to a table of active compute tasks. Compute tasks are then selected from the active task table for execution by a streaming multiprocessor.
    Type: Application
    Filed: September 19, 2011
    Publication date: March 21, 2013
    Inventors: Timothy John PURCELL, Lacky V. Shah, Jerome F. Duluk, JR.
  • Patent number: 8330766
    Abstract: A system and method for performing zero-bandwidth-clears reduces external memory accesses by a graphics processor when performing clears and subsequent read operations. A set of clear values is stored in the graphics processor. Each region of a color or z buffer may be configured using a zero-bandwidth-clear command to reference a clear value without writing the external memory. The clear value is provided to a requestor without accessing the external memory when a read access is performed.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: December 11, 2012
    Assignee: NVIDIA Corporation
    Inventors: David Kirk McAllister, Steven E. Molnar, Jerome F. Duluk, Jr., Emmett M. Kilgariff, Patrick R. Brown, Christian Johannes Amsinck, James Michael O'Connor, John Matthew Burgess, Gregory Alan Muthler, James Robertson
  • Patent number: 8319783
    Abstract: A system and method for performing zero-bandwidth-clears reduces external memory accesses by a graphics processor when performing clears and subsequent read operations. A set of clear values is stored in the graphics processor. Each portion of a color or z buffer may be configured using a zero-bandwidth-clear command to reference a clear value without writing the external memory. The clear value is provided to a requestor without accessing the external memory when a read access is performed.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 27, 2012
    Assignee: NVIDIA Corporation
    Inventors: David Kirk McAllister, Steven E. Molnar, Peter B. Holmqvist, Jerome F. Duluk, Jr., Cass W. Everitt, Emmett M. Kilgariff, Patrick R. Brown, Christian Johannes Amsinck
  • Publication number: 20120284568
    Abstract: A method and system for overriding state information programmed into a processor using an application programming interface (API) avoids introducing error conditions in the processor. An override monitor unit within the processor stores the programmed state for any setting that is overridden so that the programmed state can be restored when the error condition no longer exists. The override monitor unit overrides the programmed state by forcing the setting to a legal value that does not cause an error condition. The processor is able to continue operating without notifying a device driver that an error condition has occurred since the error condition is avoided.
    Type: Application
    Filed: July 16, 2012
    Publication date: November 8, 2012
    Applicant: NVIDIA Corporation
    Inventors: Jerome F. Duluk, JR., Henry P. Moreton, Steven E. Molnar, John S. Montrym
  • Patent number: 8228338
    Abstract: A method and system for overriding state information programmed into a processor using an application programming interface (API) avoids introducing error conditions in the processor. An override monitor unit within the processor stores the programmed state for any setting that is overridden so that the programmed state can be restored when the error condition no longer exists. The override monitor unit overrides the programmed state by forcing the setting to a legal value that does not cause an error condition. The processor is able to continue operating without notifying a device driver that an error condition has occurred since the error condition is avoided.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: July 24, 2012
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Henry P. Moreton, Steven E. Molnar, John S. Montrym
  • Patent number: 8223158
    Abstract: A method and system for connecting multiple shaders are disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of configuring a set of shaders in a user-defined sequence within a modular pipeline (MPipe), allocating resources to execute the programming instructions of each of the set of shaders in the user-defined sequence to operate on the data unit, and directing the output of the MPipe to an external sink.
    Type: Grant
    Filed: December 19, 2006
    Date of Patent: July 17, 2012
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Michael C. Shebanow, Jerome F. Duluk, Jr.
  • Patent number: 8134570
    Abstract: A system, method and computer program product are provided for packing graphics attributes. In use, a plurality of graphics attributes is identified. Such graphics attributes are packed, such that the packed graphics attributes are capable of being processed utilizing a pixel shader.
    Type: Grant
    Filed: September 18, 2006
    Date of Patent: March 13, 2012
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Andrew J. Tao, Roger L. Allen, Svetoslav D. Tzvetkov, Yan Yan Tang, Elena M. Ing
  • Patent number: 8085275
    Abstract: A push buffer-related system, method and computer program product are provided. Initially, an entry is obtained from a buffer storage describing a size and location of a portion of a push buffer. To this end, the portion of the push buffer is capable of being retrieved, utilizing the entry from the buffer storage.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: December 27, 2011
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Paolo E. Sabella, Henry Packard Moreton
  • Patent number: 8044966
    Abstract: Method and apparatus for display image adjustment is described. More particularly, handles associated with polygon vertices of a polygon rendered image are provided as a graphical user interface (GUI). These handles may be selected and moved by a user with a cursor pointing device to adjust a displayed image for keystoning, among other types of distortion. This GUI allows a user to adjust a projected image for position of a projector with respect to imaging surface, as well as for imaging surface contour, where such contour may be at least substantially planar, cylindrical, or spherical and where such contour may comprise multiple imaging surfaces. This advantageously may be done without special optics or special equipment. An original image is used as texture for rendering polygons, where the image is applied to the rendered polygons.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: October 25, 2011
    Assignee: NVIDIA Corporation
    Inventors: Michael B. Diamond, Abraham B. de Waal, David R. Morey, Jerome F. Duluk, Jr.
  • Publication number: 20110242119
    Abstract: One embodiment of the present invention sets forth a method for generating work to be processed by a graphics pipeline residing within a graphics processor. The method includes the steps of receiving an indication that a first graphics workload is to be submitted to a command queue associated with the graphics processor, allocating a first portion of shader accessible memory for one or more units of state information that are necessary for processing the first graphics workload, populating the first portion of shader accessible memory with the one or more units of state information, and transmitting to the command queue of the graphics processor the one or more units of state information stored within the first portion of shader accessible memory, wherein the first graphics workload is processed within the graphics pipeline based on the one or more units of state information.
    Type: Application
    Filed: April 1, 2011
    Publication date: October 6, 2011
    Inventors: Jeffrey A. BOLZ, Jesse David Hall, Jerome F. Duluk, JR., Patrick R. Brown, Gregory Scott Palmer