Patents by Inventor Michael C. Shebanow
Michael C. Shebanow has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20120281004
Abstract: A technique for caching coverage information for edges that are shared between adjacent graphics primitives may reduce the number of times a shared edge is rasterized. Consequently, power consumed during rasterization may be reduced. During rasterization of a first graphics primitive, coverage information is generated that (1) indicates cells within a sampling grid that are entirely outside an edge of the first graphics primitive and (2) indicates cells within the sampling grid that are intersected by the edge and are only partially covered by the first graphics primitive. The coverage information for the edge is stored in a cache. When a second graphics primitive that shares the edge with the first graphics primitive is rasterized, the coverage information is read from the cache instead of being recomputed.
Type: Application
Filed: May 2, 2012
Publication date: November 8, 2012
Inventors: Michael C. Shebanow, Anjul Patney
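The caching scheme described in this abstract can be sketched as follows. This is a simplified illustration, not the patented implementation: the edge key, the signed-area classification test, and the grid representation are all assumptions made for the example.

```python
# Sketch of caching per-edge coverage so an edge shared by two adjacent
# primitives is classified only once. All names here are illustrative.

def edge_coverage(edge, grid):
    """Classify each unit cell of the sampling grid against one edge."""
    (x0, y0), (x1, y1) = edge
    outside, partial = set(), set()
    for cx, cy in grid:
        # Signed-area test of the cell's four corners against the edge.
        corners = [(cx + dx, cy + dy) for dx in (0, 1) for dy in (0, 1)]
        sides = [(x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
                 for px, py in corners]
        if all(s < 0 for s in sides):
            outside.add((cx, cy))        # entirely outside the edge
        elif any(s < 0 for s in sides):
            partial.add((cx, cy))        # intersected by the edge
    return outside, partial

cache = {}

def rasterize_edge(edge, grid):
    # Canonical key: the edge shared by two adjacent primitives, listed
    # in either vertex order, hits the same cache entry.
    key = tuple(sorted(edge))
    if key not in cache:                 # first primitive: compute and store
        cache[key] = edge_coverage(edge, grid)
    return cache[key]                    # second primitive: reuse
```

When the second primitive presents the same edge with reversed vertex order, the lookup hits and the per-cell classification is skipped.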
-
Patent number: 8223158
Abstract: A method and system for connecting multiple shaders are disclosed. Specifically, one embodiment of the present invention sets forth a method that includes the steps of configuring a set of shaders in a user-defined sequence within a modular pipeline (MPipe), allocating resources to execute the programming instructions of each of the shaders in the user-defined sequence to operate on a data unit, and directing the output of the MPipe to an external sink.
Type: Grant
Filed: December 19, 2006
Date of Patent: July 17, 2012
Assignee: NVIDIA Corporation
Inventors: John Erik Lindholm, Michael C. Shebanow, Jerome F. Duluk, Jr.
-
Patent number: 8127181
Abstract: Processing units are configured to capture the unit state in unit-level error status registers when a runtime error event is detected, in order to facilitate debugging of runtime errors. The reporting of warnings may be disabled or enabled to selectively monitor each processing unit. Warnings for each processing unit are propagated to an exception register in a front end monitoring unit. The warnings are then aggregated and propagated to an interrupt register in the front end monitoring unit in order to selectively generate an interrupt and facilitate debugging. A debugging application may be used to query the interrupt, exception, and unit-level error status registers to determine the cause of the error. A default error handling behavior that overrides error conditions may be used in conjunction with the hardware warning protocol to allow the processing units to continue operating and facilitate the debugging of runtime errors.
Type: Grant
Filed: November 2, 2007
Date of Patent: February 28, 2012
Assignee: NVIDIA Corporation
Inventors: Michael C. Shebanow, John S. Montrym, Richard A. Silkebakken, Robert C. Keller
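The warning-propagation path in this abstract can be modeled as a pair of registers. This is a toy sketch: the bit-per-unit layout and the enable mask are assumptions for illustration, not details from the patent.

```python
# Sketch of per-unit warnings propagated into an exception register and
# aggregated into an interrupt. Bit assignments are invented.

def aggregate_warnings(warnings, enable_mask):
    """warnings: list of per-unit warning flags.
    enable_mask: bitmask selecting which units are monitored.
    Returns (exception_register, interrupt_pending)."""
    exception_reg = 0
    for unit, warned in enumerate(warnings):
        # A warning reaches the exception register only if its unit's
        # reporting is enabled.
        if warned and (enable_mask >> unit) & 1:
            exception_reg |= 1 << unit
    interrupt = exception_reg != 0       # aggregate into one interrupt bit
    return exception_reg, interrupt
```

A debugger would then read the interrupt bit, the exception register, and the per-unit status registers in turn to localize the fault.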
-
Patent number: 8019978
Abstract: A unit status reporting protocol may be used for context switching, debugging, and removing deadlock conditions in a processing unit. A processing unit is in one of five states: empty, active, stalled, quiescent, or halted. Each processing unit's state is reported to a front end monitoring unit, enabling the front end monitoring unit to determine when a context switch may be performed or when a deadlock condition exists. The front end monitoring unit can issue a halt command to perform a context switch, or take action to remove a deadlock condition and allow processing to resume.
Type: Grant
Filed: August 13, 2007
Date of Patent: September 13, 2011
Assignee: NVIDIA Corporation
Inventors: Michael C. Shebanow, Robert C. Keller, Richard A. Silkebakken
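The five states named in this abstract suggest a simple decision model for the front end monitoring unit. The state names come from the abstract; the decision rules below are plausible assumptions, not the patented logic.

```python
# Illustrative model of the five-state unit status reporting protocol.
from enum import Enum

class UnitState(Enum):
    EMPTY = "empty"
    ACTIVE = "active"
    STALLED = "stalled"
    QUIESCENT = "quiescent"
    HALTED = "halted"

def can_context_switch(states):
    """Assume a context switch is safe once no unit is actively
    processing or stalled mid-operation."""
    return all(s in (UnitState.EMPTY, UnitState.QUIESCENT, UnitState.HALTED)
               for s in states)

def deadlock_suspected(states):
    """Assume deadlock is suspected when every non-empty unit is stalled."""
    busy = [s for s in states if s is not UnitState.EMPTY]
    return bool(busy) and all(s is UnitState.STALLED for s in busy)
```

On a suspected deadlock the front end would issue a halt command, forcing units into the halted state so a switch or recovery action can proceed.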
-
Publication number: 20110141122
Abstract: A technique for performing stream output operations in a parallel processing system is disclosed. A stream synchronization unit is provided that enables the parallel processing unit to track batches of vertices being processed in a graphics processing pipeline. A plurality of stream output units is also provided, where each stream output unit writes vertex attribute data to one or more stream output buffers for a portion of the batches of vertices. A messaging protocol implemented between the stream synchronization unit and the stream output units ensures that each stream output unit writes the vertex attribute data for its batches of vertices to the stream output buffers in the same order in which those batches were received from the device driver by the parallel processing unit.
Type: Application
Filed: September 29, 2010
Publication date: June 16, 2011
Inventors: Ziyad S. Hakura, Rohit Gupta, Michael C. Shebanow, Emmett M. Kilgariff
-
Publication number: 20110078427
Abstract: A trap handler architecture is incorporated into a parallel processing subsystem such as a GPU. The trap handler architecture minimizes design complexity and verification efforts for concurrently executing threads by imposing a property that all thread groups associated with a streaming multi-processor are either all executing within their respective code segments or are all executing within the trap handler code segment.
Type: Application
Filed: September 29, 2009
Publication date: March 31, 2011
Inventors: Michael C. Shebanow, Jack Choquette, Brett W. Coon, Steven J. Heinrich, Aravind Kalaiah, John R. Nickolls, Daniel Salinas, Ming Y. Siu, Tommy Thorn, Nicholas Wang
-
Publication number: 20110078358
Abstract: One embodiment of the present invention sets forth a technique for computing virtual addresses for accessing thread data. Components of the complete virtual address for a thread group are used to determine whether a cache line corresponding to the complete virtual address is allocated in the cache. Actual computation of the complete virtual address is deferred until after determining that no cache line corresponding to the complete virtual address is allocated in the cache.
Type: Application
Filed: August 17, 2010
Publication date: March 31, 2011
Inventor: Michael C. Shebanow
-
Publication number: 20110078692
Abstract: One embodiment of the present invention sets forth a technique for coalescing memory barrier operations across multiple parallel threads. Memory barrier requests from a given parallel thread processing unit are coalesced to reduce the impact to the rest of the system. Additionally, memory barrier requests may specify a level of a set of threads with respect to which the memory transactions are committed. For example, a first type of memory barrier instruction may commit the memory transactions to a level of a set of cooperating threads that share an L1 (level one) cache. A second type of memory barrier instruction may commit the memory transactions to a level of a set of threads sharing a global memory. Finally, a third type of memory barrier instruction may commit the memory transactions to a system level of all threads sharing all system memories. The latency required to execute the memory barrier instruction varies based on the type of memory barrier instruction.
Type: Application
Filed: September 21, 2010
Publication date: March 31, 2011
Inventors: John R. Nickolls, Steven James Heinrich, Brett W. Coon, Michael C. Shebanow
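The three barrier levels and the coalescing idea in this abstract can be modeled compactly. The scope names and relative ordering below are assumptions chosen to mirror the abstract's three levels; the patent text does not name them.

```python
# Toy model of the three memory-barrier scopes and request coalescing.
from enum import IntEnum

class BarrierScope(IntEnum):
    # Ordered by how widely the transactions must be committed.
    CTA = 1     # threads sharing an L1 (level one) cache
    GPU = 2     # all threads sharing a global memory
    SYSTEM = 3  # all threads sharing all system memories

def coalesce(pending):
    """Coalesce outstanding barrier requests from one processing unit:
    a single barrier at the widest requested scope satisfies them all."""
    return max(pending) if pending else None
```

Coalescing this way means the unit issues one barrier instead of many, which is the "reduce the impact to the rest of the system" point; wider scopes cost more latency, matching the abstract's final sentence.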
-
Publication number: 20110078689
Abstract: A method for thread address mapping in a parallel thread processor. The method includes receiving a thread address associated with a first thread in a thread group; computing an effective address based on the location of the thread address within a local window of a thread address space; computing a thread group address in an address space associated with the thread group based on the effective address and a thread identifier associated with the first thread; and computing a virtual address associated with the first thread based on the thread group address and a thread group identifier, where the virtual address is used to access a location in a memory associated with the thread address to load or store data.
Type: Application
Filed: September 24, 2010
Publication date: March 31, 2011
Inventors: Michael C. Shebanow, Yan Yan Tang, John R. Nickolls
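The three mapping steps in this abstract can be sketched as a small function. The window base, word size, interleave layout, and per-group stride below are invented constants for illustration; only the three-step structure comes from the abstract.

```python
# Hypothetical sketch of the thread-address-to-virtual-address mapping.
LOCAL_WINDOW_BASE = 0x0100_0000   # assumed base of the local window
THREADS_PER_GROUP = 32            # assumed thread group size
WORD = 4                          # assumed word size in bytes

def map_thread_address(thread_addr, thread_id, group_id):
    # Step 1: effective address = word offset within the local window.
    effective = (thread_addr - LOCAL_WINDOW_BASE) // WORD
    # Step 2: thread group address = interleave per-thread words so
    # consecutive threads occupy consecutive lanes.
    group_addr = effective * THREADS_PER_GROUP * WORD + thread_id * WORD
    # Step 3: virtual address = place each group's space at its own
    # offset determined by the thread group identifier.
    return group_id * 0x10_0000 + group_addr
```

With this layout, the same thread address in different threads of one group maps to adjacent words, which is one common reason to remap per-thread local addresses.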
-
Patent number: 7916146
Abstract: In a processing pipeline having a plurality of units, an interface unit is provided between a first, upstream pipeline unit that needs to be drained prior to a context switch and a second, downstream pipeline unit that might halt prior to a context switch. The interface unit redirects data that are drained from the first pipeline unit and are to be received by the second pipeline unit to a buffer memory provided in the front end of the processing pipeline. The contents of the buffer memory are subsequently dumped into memory reserved for the context that is being stored. When the processing pipeline is restored with this context, the data that were dumped into memory are retrieved back into the buffer memory and provided to the interface unit. The interface unit receives these data and directs them to the second pipeline unit.
Type: Grant
Filed: December 2, 2005
Date of Patent: March 29, 2011
Assignee: NVIDIA Corporation
Inventors: Robert C. Keller, Michael C. Shebanow, Makarand M. Dharmapurikar
-
Publication number: 20110072243
Abstract: One embodiment of the present invention sets forth a technique for collecting operands specified by an instruction. As a sequence of instructions is received, the operands specified by the instructions are assigned to ports, so that each one of the operands specified by a single instruction is assigned to a different port. Reading of the operands from a multi-bank register file is scheduled by selecting an operand from each one of the different ports to produce an operand read request and ensuring that two or more of the selected operands are not stored in the same bank of the multi-bank register file. The operands specified by the operand read request are read from the multi-bank register file in a single clock cycle. Each instruction is then executed as the operands specified by the instruction are read from the multi-bank register file and collected over one or more clock cycles.
Type: Application
Filed: September 3, 2010
Publication date: March 24, 2011
Inventors: Xiaogang Qiu, Ming Y. Siu, Yan Yan Tang, John Erik Lindholm, Michael C. Shebanow, Stuart F. Oberman
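The bank-conflict-free selection step described in this abstract can be sketched directly. The bank function (register number modulo bank count) and the per-port queues are assumptions for the example.

```python
# Sketch of selecting one operand per port such that no two selected
# operands live in the same register-file bank.

NUM_BANKS = 4  # assumed bank count

def bank_of(reg):
    # Assume registers are striped across banks by register number.
    return reg % NUM_BANKS

def schedule_reads(ports):
    """ports: one queue (list) of pending operand register numbers per
    port. Builds one read request: at most one operand per port, with
    all selected operands in distinct banks. Selected operands are
    removed from their queues; conflicting ones wait for a later cycle."""
    request, used_banks = [], set()
    for queue in ports:
        for reg in queue:
            if bank_of(reg) not in used_banks:
                used_banks.add(bank_of(reg))
                request.append(reg)
                queue.remove(reg)
                break                    # one operand per port per cycle
    return request
```

Because the request never contains two operands from one bank, the whole request can be read in a single clock cycle, as the abstract states.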
-
Publication number: 20110072213
Abstract: A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.
Type: Application
Filed: September 22, 2010
Publication date: March 24, 2011
Inventors: John R. Nickolls, Brett W. Coon, Michael C. Shebanow
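A per-instruction cache operations modifier amounts to a dispatch table from modifier to the set of cache levels that may keep the data. The modifier names and policies below are generic assumptions; the patent text does not enumerate them.

```python
# Sketch of dispatching a load/store's caching behavior on its
# cache-operations modifier. Names and policies are illustrative only.

def cache_levels_for(modifier):
    """Return which levels of a two-level hierarchy retain the data."""
    policy = {
        "all": ("L1", "L2"),   # cache at every level of the hierarchy
        "global": ("L2",),     # bypass the per-core L1, cache at L2
        "bypass": (),          # do not retain the data at any level
    }
    return policy[modifier]
```

The execution path then consults this table when the instruction commits, so two loads to the same address can follow different caching policies.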
-
Publication number: 20100095117
Abstract: One embodiment takes the form of a method for authenticating an identity of a first party to a second party, without any prior contact between the parties. Further, the first party may authenticate its identity to the second party while eliminating the ability of the second party to steal the first party's identity. A trusted authority may facilitate authenticating the identity of two or more communicating parties. In one embodiment, the authority may ensure the validity of the identification of a number of parties talking over a communications network. The parties communicating over the secure network trust what the authority states concerning the identities of the other parties in the network. Another embodiment may prevent the authority from monitoring which two parties are communicating to each other through the network.
Type: Application
Filed: October 15, 2008
Publication date: April 15, 2010
Inventor: Michael C. Shebanow
-
Patent number: 7627723
Abstract: Methods, apparatuses, and systems are presented for updating data in memory while executing multiple threads of instructions, involving receiving a single instruction from one of a plurality of concurrently executing threads of instructions; in response to the single instruction received, reading data from a specific memory location, performing an operation involving the data read from the memory location to generate a result, and storing the result to the specific memory location, without requiring separate load and store instructions; and, in response to the single instruction received, precluding another one of the plurality of threads of instructions from altering data at the specific memory location while reading the data from the specific memory location, performing the operation involving the data, and storing the result to the specific memory location.
Type: Grant
Filed: September 21, 2006
Date of Patent: December 1, 2009
Assignee: NVIDIA Corporation
Inventors: Ian A. Buck, John R. Nickolls, Michael C. Shebanow, Lars S. Nyland
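The read-modify-write semantics described here can be modeled in software. This sketch uses a per-location lock only to illustrate the "no intervening thread" guarantee; the patented hardware achieves atomicity without separate load and store instructions, not with locks.

```python
# Software model of a single-instruction atomic read-modify-write.
import threading

class AtomicMemory:
    def __init__(self, size):
        self._data = [0] * size
        self._locks = [threading.Lock() for _ in range(size)]

    def atomic_op(self, addr, op):
        """Read the location, apply op, and write the result back as one
        indivisible step; other threads cannot alter the location in
        between."""
        with self._locks[addr]:          # preclude other threads
            result = op(self._data[addr])
            self._data[addr] = result
        return result
```

For example, four threads each performing 1000 atomic increments on one location always end at exactly 4000, which would not hold if the load, add, and store could interleave.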
-
Patent number: 7512773
Abstract: A halt sequencing protocol permits a context switch to occur in a processing pipeline even before all units of the processing pipeline are idle. The context switch method based on the halt sequencing protocol includes the steps of issuing a halt request signal to the units of a processing pipeline, monitoring the status of each of the units, and freezing the states of all of the units when they are either idle or halted. Then, the states of the units, which pertain to the thread that has been halted, are dumped into memory, and the units are restored with states corresponding to a different thread that is to be executed after the context switch.
Type: Grant
Filed: October 18, 2005
Date of Patent: March 31, 2009
Assignee: NVIDIA Corporation
Inventors: Michael C. Shebanow, Robert C. Keller, Richard A. Silkebakken, Benjamin J. Garlick
-
Patent number: 7293162
Abstract: A scheduling scheme and mechanism for a processor system is disclosed. The scheduling scheme provides a reservation station system that includes a control reservation station and a data reservation station. The reservation station system receives operational entries and, for each operational entry, identifies scheduling state information, operand state information, and operand information. The reservation station system stores the scheduling state information and operand information as a control reservation station entry in the control reservation station, and stores the operand state information and the operand information as a data reservation station entry in the data reservation station. When control reservation station entries are identified as ready, they are scheduled and issued for execution by a functional unit.
Type: Grant
Filed: December 18, 2002
Date of Patent: November 6, 2007
Assignee: Fujitsu Limited
Inventors: Michael C. Shebanow, Michael G. Butler
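The split described in this abstract, one entry fanned out into paired control-side and data-side records, can be sketched as follows. The field names and the shared tag linking the paired entries are assumptions for the example.

```python
# Sketch of splitting an operational entry into a control reservation
# station entry and a data reservation station entry.

def insert_entry(entry, control_rs, data_rs):
    """entry: dict with 'sched_state', 'operand_state', and 'operands'.
    Appends the paired records and returns the tag linking them."""
    tag = len(control_rs)  # assumed: a shared tag links the pair
    control_rs.append({"tag": tag,
                       "sched_state": entry["sched_state"],   # scheduling state
                       "operands": entry["operands"]})        # operand info
    data_rs.append({"tag": tag,
                    "operand_state": entry["operand_state"],  # operand state
                    "operands": entry["operands"]})           # operand info
    return tag
```

Keeping scheduling state separate from operand state lets the scheduler scan only the small control-side records when deciding which entries are ready to issue.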
-
Publication number: 20040123077
Abstract: A scheduling scheme and mechanism for a processor system is disclosed. The scheduling scheme provides a reservation station system that includes a control reservation station and a data reservation station. The reservation station system receives operational entries and, for each operational entry, identifies scheduling state information, operand state information, and operand information. The reservation station system stores the scheduling state information and operand information as a control reservation station entry in the control reservation station, and stores the operand state information and the operand information as a data reservation station entry in the data reservation station. When control reservation station entries are identified as ready, they are scheduled and issued for execution by a functional unit.
Type: Application
Filed: December 18, 2002
Publication date: June 24, 2004
Inventors: Michael C. Shebanow, Michael G. Butler
-
Publication number: 20040123298
Abstract: A system and method for performing dynamic resource allocation. A deallocation block sends batons, representing assigned resources, to an allocation block. The allocation block receives the assigned resources and, if needed, allocates them to an execution machine that performs tasks such as executing instructions. The deallocation block continually sends batons independent of the allocation block's current need for resources. The allocation block returns unused batons, or sends an indication of used batons, to the deallocation block. The deallocation block is physically decoupled from, and distributed relative to, the allocation block.
Type: Application
Filed: June 11, 2003
Publication date: June 24, 2004
Inventor: Michael C. Shebanow
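The baton protocol in this abstract can be modeled as two queues on the allocation side. The queue structure and the integer batons are assumptions for the example; only the flow (batons stream in continually, unused ones flow back) comes from the abstract.

```python
# Toy model of the allocation side of the baton-based resource protocol.
from collections import deque

class AllocationBlock:
    def __init__(self):
        self.free = deque()      # batons received from the deallocation block
        self.returned = deque()  # batons flowing back to the deallocation block

    def receive_baton(self, baton):
        # Batons arrive continually, whether or not anything needs them.
        self.free.append(baton)

    def allocate(self):
        """Hand a resource to the execution machine, or None if starved."""
        return self.free.popleft() if self.free else None

    def release(self, baton):
        # Unused or finished batons are sent back to the deallocation block.
        self.returned.append(baton)
```

Because the sender never waits on the receiver's demand, the two blocks need no shared scheduling state, which is what lets them be physically decoupled.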
-
Patent number: 5966530
Abstract: A high-performance processor is disclosed with structure and methods for: (1) aggressively scheduling long-latency instructions, including load/store instructions, while maintaining precise state; (2) maintaining and restoring state at any instruction boundary; (3) tracking instruction status; (4) checkpointing instructions; (5) creating, maintaining, and using a time-out checkpoint; (6) tracking floating-point exceptions; (7) creating, maintaining, and using a watchpoint for plural, simultaneous, unresolved-branch evaluation; and (8) increasing processor throughput while maintaining precise state. In one embodiment of the invention, a method of restoring machine state in a processor at any instruction boundary is disclosed. For any instruction that may modify control registers, the processor is either synchronized prior to execution or an instruction checkpoint is stored to preserve state; and for any instruction that creates a program counter discontinuity, an instruction checkpoint is stored.
Type: Grant
Filed: June 11, 1997
Date of Patent: October 12, 1999
Assignee: Fujitsu, Ltd.
Inventors: Gene W. Shen, John Szeto, Niteen A. Patkar, Michael C. Shebanow
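The checkpoint-and-restore rule in the abstract's final sentences can be sketched as a snapshot stack. The register-file model and trigger conditions below are simplifications; real checkpoint hardware snapshots far more state than a dictionary of registers.

```python
# Minimal sketch of instruction checkpointing for precise state restore.

class CheckpointingProcessor:
    def __init__(self):
        self.regs = {}           # stand-in for architectural state
        self.checkpoints = []    # stack of (pc, state-snapshot) pairs

    def checkpoint(self, pc):
        """Taken before an instruction that may modify control registers
        or that creates a program counter discontinuity."""
        self.checkpoints.append((pc, dict(self.regs)))

    def restore(self):
        """Roll machine state back to the most recent checkpoint and
        return the program counter to resume from."""
        pc, snapshot = self.checkpoints.pop()
        self.regs = snapshot
        return pc
```

If a checkpointed instruction later faults or a branch resolves the wrong way, popping the snapshot restores precise state at that instruction boundary.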
-
Patent number: 5896526
Abstract: A system and method providing a programmable hardware device within a CPU. The programmable hardware device permits a plurality of instructions to be trapped before they are executed. The instructions that are to be trapped are programmable, to provide flexibility during CPU debugging and to ensure that a variety of application programs can be properly executed by the CPU. The system also provides a means for permitting a trapped instruction to be emulated and/or executed serially.
Type: Grant
Filed: February 18, 1998
Date of Patent: April 20, 1999
Assignee: Fujitsu, Ltd.
Inventors: Sunil Savkar, Gene W. Shen, Farnad Sajjadian, Michael C. Shebanow