Patents by Inventor John H. Edmondson

John H. Edmondson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7830392
    Abstract: The number of crossbars in a graphics processing unit is reduced by assigning each of a plurality of pixels to one of a plurality of pixel shaders based at least in part on a location of each of the plurality of pixels within an image area, generating an attribute value for each of the plurality of pixels using the plurality of pixel shaders, mapping the attribute value of each of the plurality of pixels to one of a plurality of memory partitions, and storing the attribute values in the memory partitions according to the mapping. The attribute value generated by a particular one of the pixel shaders is mapped to the same one of the plurality of memory partitions.
    Type: Grant
    Filed: December 18, 2006
    Date of Patent: November 9, 2010
    Assignee: NVIDIA Corporation
    Inventors: John M. Danskin, Steven E. Molnar, John S. Montrym, Mark French, John H. Edmondson
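The location-based assignment this abstract describes can be illustrated with a minimal Python sketch; the tile size, shader count, and partition count below are assumptions for illustration, not values from the patent.

```python
# Illustrative sketch: pixels are assigned to shaders by screen location,
# and each shader's output always maps to the same memory partition, so no
# full shader-to-partition crossbar is needed. All constants are assumed.
NUM_SHADERS = 4
NUM_PARTITIONS = 4
TILE = 16  # pixels per tile edge (illustrative)

def shader_for_pixel(x, y):
    # Pixels in the same screen tile go to the same shader.
    return ((x // TILE) + (y // TILE)) % NUM_SHADERS

def partition_for_shader(shader_id):
    # Fixed shader-to-partition mapping, as the abstract requires.
    return shader_id % NUM_PARTITIONS

partitions = [[] for _ in range(NUM_PARTITIONS)]
for x, y in [(0, 0), (17, 3), (40, 40)]:
    s = shader_for_pixel(x, y)
    partitions[partition_for_shader(s)].append((x, y))
```

Because the shader-to-partition mapping is fixed, attribute values never need to be routed arbitrarily between shaders and partitions, which is what reduces the crossbar count.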
  • Patent number: 7808507
    Abstract: Systems and methods for determining a compression tag state prior to memory client arbitration may reduce the latency for memory accesses. A compression tag is associated with each portion of a surface stored in memory and indicates whether the data stored in that portion is compressed. A client uses the compression tags to construct memory access requests, and the size of each request is based on whether the portion of the surface to be accessed is compressed. When multiple clients access the same surface, the compression tag reads are interlocked with the pending memory access requests to ensure that the compression tags provided to each client are accurate. This mechanism allows for memory bandwidth optimizations, including reordering memory access requests for efficient access.
    Type: Grant
    Filed: September 18, 2006
    Date of Patent: October 5, 2010
    Assignee: NVIDIA Corporation
    Inventors: James M. Van Dyke, John H. Edmondson, Brian D. Hutsell, Michael F. Harris
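The request-sizing step can be sketched as follows; the request sizes and tag layout are illustrative assumptions, not values from the patent.

```python
# Sketch: a client sizes each memory request from the per-tile compression
# tag, fetching fewer bytes for compressed tiles. Sizes are assumptions.
COMPRESSED_BYTES = 32
UNCOMPRESSED_BYTES = 128

def build_request(tile_index, compression_tags):
    size = COMPRESSED_BYTES if compression_tags[tile_index] else UNCOMPRESSED_BYTES
    return {"tile": tile_index, "bytes": size}

tags = [True, False, True]  # True = tile stored compressed
requests = [build_request(i, tags) for i in range(len(tags))]
```

The interlock described in the abstract would sit around this step, stalling a tag read until pending writes that could change the tag have drained.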
  • Patent number: 7805587
    Abstract: Embodiments of the present invention enable virtual-to-physical memory address translation using optimized bank and partition interleave patterns to improve memory bandwidth by distributing data accesses over multiple banks and multiple partitions. Each virtual page has a corresponding page table entry that specifies the physical address of the virtual page in linear physical address space. The page table entry also includes a data kind field that is used to guide and optimize the mapping process from the linear physical address space to the DRAM physical address space, which is used to directly access one or more DRAM. The DRAM physical address space includes a row, bank and column address. The data kind field is also used to optimize the starting partition number and partition interleave pattern that defines the organization of the selected physical page of memory within the DRAM memory system.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: September 28, 2010
    Assignee: NVIDIA Corporation
    Inventors: James M. Van Dyke, John H. Edmondson
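A minimal sketch of a kind-guided linear-to-DRAM address mapping follows; the field widths, bank count, and swizzle rule are assumptions chosen to illustrate the idea, not details taken from the patent.

```python
# Sketch: the page table entry's "kind" field selects how a linear physical
# address is mapped to DRAM row/bank/column. Constants and kind labels
# ("linear", "swizzled") are illustrative assumptions.
NUM_BANKS = 8
COLS_PER_ROW = 256

def dram_address(linear_addr, kind):
    col = linear_addr % COLS_PER_ROW
    blk = linear_addr // COLS_PER_ROW
    bank = blk % NUM_BANKS
    if kind == "swizzled":
        # XOR in row bits so nearby pages land in different banks,
        # spreading accesses and reducing bank conflicts.
        bank ^= (blk // NUM_BANKS) % NUM_BANKS
    row = blk // NUM_BANKS
    return row, bank, col
```

Spreading consecutive pages across banks and partitions is what lets independent accesses proceed in parallel rather than serializing on one bank.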
  • Patent number: 7685371
    Abstract: A data processing system can establish or maintain data coherency by issuing a data flush operation. The data processing system can be configured as a host executing one or more independent processes using one or more lower level devices. The lower level devices can be viewed as peer devices. Any of the host or the plurality of peer devices can be configured to initiate the flush operation. A device can determine whether the initiator of a flush operation is the host or a peer device. The device can perform a flush limited to local memory, or a subset of all available memory, if a peer device initiates the flush operation.
    Type: Grant
    Filed: April 19, 2006
    Date of Patent: March 23, 2010
    Assignee: NVIDIA Corporation
    Inventors: Samuel Hammond Duncan, Robert A. Alfieri, John H. Edmondson, David William Nuechterlein, Michael A. Woodmansee
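The initiator-dependent flush scope can be reduced to a one-line decision; the function name and range labels below are illustrative assumptions.

```python
# Sketch: a flush initiated by a peer device is limited to the device's
# local memory, while a host-initiated flush covers everything the device
# can reach. Names and the string labels are assumptions for illustration.
def flush_scope(initiator, local_ranges, all_ranges):
    return list(local_ranges) if initiator == "peer" else list(all_ranges)
```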
  • Patent number: 7664905
    Abstract: In some applications, such as video motion compression processing for example, a request pattern or “stream” of requests for accesses to memory (e.g., DRAM) may have, over a large number of requests, a relatively small number of requests to the same page. Due to the small number of requests to the same page, conventional sorting to aggregate page hits may not be very effective. Reordering the stream can be used to “bury” or “hide” much of the necessary precharge/activate time, which can have a highly positive impact on overall throughput. For example, separating accesses to different rows of the same bank by at least a predetermined number of clocks can effectively hide the overhead involved in precharging/activating the rows.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: February 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: David A. Jarosh, Sonny S. Yeoh, Colyn S. Case, John H. Edmondson
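The reordering idea above can be sketched as a greedy scheduler; the minimum spacing and the queue mechanics are simplifying assumptions, not the patent's actual implementation.

```python
# Sketch: greedily issue the next request whose bank is not blocked by a
# recent access to a different row of the same bank; when nothing can
# issue, a bubble (None) is emitted. MIN_SPACING is an assumed value.
MIN_SPACING = 3  # clocks between different-row accesses to one bank

def reorder(requests):
    pending = list(requests)          # each request: (bank, row)
    last_issue = {}                   # bank -> (clock, row)
    schedule, clock = [], 0
    while pending:
        for i, (bank, row) in enumerate(pending):
            last = last_issue.get(bank)
            if last is None or last[1] == row or clock - last[0] >= MIN_SPACING:
                schedule.append(pending.pop(i))
                last_issue[bank] = (clock, row)
                break
        else:
            schedule.append(None)     # bubble: nothing issuable this clock
        clock += 1
    return schedule
```

Interleaving other banks' requests into the spacing gaps is what hides the precharge/activate latency described in the abstract.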
  • Patent number: 7620793
    Abstract: Systems and methods for addressing memory using non-power-of-two virtual memory page sizes improve graphics memory bandwidth by distributing graphics data for efficient access during rendering. Various partition strides may be selected for each virtual memory page to modify the number of sequential addresses mapped to each physical memory partition and change the interleaving granularity. The addressing scheme allows for modification of a bank interleave pattern for each virtual memory page to reduce bank conflicts and improve memory bandwidth utilization. The addressing scheme also allows for modification of a partition interleave pattern for each virtual memory page to distribute accesses amongst multiple partitions and improve memory bandwidth utilization.
    Type: Grant
    Filed: August 28, 2006
    Date of Patent: November 17, 2009
    Assignee: NVIDIA Corporation
    Inventors: John H. Edmondson, Henry P. Moreton
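Per-page partition striding can be shown in a couple of lines; the partition count (deliberately non-power-of-two) and stride values are illustrative assumptions.

```python
# Sketch: a per-page partition stride controls how many sequential bytes
# map to one partition before the next partition is used, changing the
# interleave granularity. NUM_PARTITIONS and strides are assumptions.
NUM_PARTITIONS = 6   # non-power-of-two partition count

def partition_for(addr, stride):
    return (addr // stride) % NUM_PARTITIONS
```

A small stride spreads one surface's accesses across many partitions; a larger stride keeps bursts within one partition. Selecting the stride per virtual page is the knob the abstract describes.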
  • Patent number: 7451259
    Abstract: A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.
    Type: Grant
    Filed: December 6, 2004
    Date of Patent: November 11, 2008
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, Wei-Je Huang, John H. Edmondson
  • Publication number: 20080109613
    Abstract: In some applications, such as video motion compression processing for example, a request pattern or “stream” of requests for accesses to memory (e.g., DRAM) may have, over a large number of requests, a relatively small number of requests to the same page. Due to the small number of requests to the same page, conventional sorting to aggregate page hits may not be very effective. Reordering the stream can be used to “bury” or “hide” much of the necessary precharge/activate time, which can have a highly positive impact on overall throughput. For example, separating accesses to different rows of the same bank by at least a predetermined number of clocks can effectively hide the overhead involved in precharging/activating the rows.
    Type: Application
    Filed: November 3, 2006
    Publication date: May 8, 2008
    Applicant: NVIDIA Corporation
    Inventors: David A. Jarosh, Sonny S. Yeoh, Colyn S. Case, John H. Edmondson
  • Patent number: 7275123
    Abstract: A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.
    Type: Grant
    Filed: December 6, 2004
    Date of Patent: September 25, 2007
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, Wei-Je Huang, John H. Edmondson
  • Patent number: 7257183
    Abstract: A clock recovery circuit includes a sampler for sampling a data signal. Logic determines whether a data edge lags or precedes a clock edge which drives the sampler, and provides early and late indications. A filter filters the early and late indications, and a phase controller adjusts the phase of the clock based on the filtered indications. Based on the filtered indications, a frequency estimator estimates the frequency difference between the data and clock, providing an input to the phase controller to further adjust the phase so as to continually correct for the frequency difference.
    Type: Grant
    Filed: June 21, 2002
    Date of Patent: August 14, 2007
    Assignee: Rambus Inc.
    Inventors: William J. Dally, John H. Edmondson, Ramin Farjad-Rad
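The control loop in this abstract (filtered early/late indications driving a phase adjuster, plus a frequency estimator feeding back a second correction) can be sketched as a second-order update; the gain values are assumptions for illustration.

```python
# Sketch of the described loop: early/late samples drive both a direct
# phase correction and an accumulated frequency estimate that continually
# corrects for the data/clock frequency offset. Gains are assumed values.
PHASE_GAIN = 0.01
FREQ_GAIN = 0.001

def recover(early_late_samples):
    phase, freq_est = 0.0, 0.0
    for sample in early_late_samples:    # +1 = data edge late, -1 = early
        freq_est += FREQ_GAIN * sample           # frequency estimator path
        phase += PHASE_GAIN * sample + freq_est  # proportional + integral
    return phase, freq_est
```

A persistent run of same-sign samples (as when the data clock is slightly fast) builds up `freq_est`, so the phase keeps advancing even between samples, which is the continual frequency correction the abstract describes.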
  • Publication number: 20030086339
    Abstract: A clock recovery circuit includes a sampler for sampling a data signal. Logic determines whether a data edge lags or precedes a clock edge which drives the sampler, and provides early and late indications. A filter filters the early and late indications, and a phase controller adjusts the phase of the clock based on the filtered indications. Based on the filtered indications, a frequency estimator estimates the frequency difference between the data and clock, providing an input to the phase controller to further adjust the phase so as to continually correct for the frequency difference.
    Type: Application
    Filed: June 21, 2002
    Publication date: May 8, 2003
    Applicant: Velio Communications, Inc.
    Inventors: William J. Dally, John H. Edmondson, Ramin Farjad-Rad
  • Patent number: 6272624
    Abstract: The outcome of a plurality of branch instructions in a computer program is predicted by fetching a plurality or group of instructions in a given slot, along with a corresponding prediction. A group global history (gghist) is maintained as an indication of recent program control flow. In addition, a predictor table comprises a plurality of predictions, preferably saturating counters. A particular counter is updated when a branch is encountered. The particular counter is associated with a branch instruction by hashing the fetched instruction group's program counter (PC) with the gghist. To predict multiple branch instruction outcomes, the gghist is hashed with the PC to form an index which is used to access naturally aligned but randomly ordered predictions in the predictor table, which are then reordered based on the value of the lower gghist bits. Preferably, instructions are fetched in blocks of eight instructions.
    Type: Grant
    Filed: April 2, 1999
    Date of Patent: August 7, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Glenn P. Giacalone, John H. Edmondson
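The hashed lookup at the heart of this predictor can be sketched as follows; the table size, group width, and XOR hash are illustrative assumptions (the patent leaves the hash function open).

```python
# Sketch: the fetch group's PC is hashed with the global history (gghist)
# to index a table of 2-bit saturating counters. Table size, group width,
# and the XOR hash are assumptions for illustration.
TABLE_SIZE = 1024
GROUP = 8  # instructions fetched per slot, per the abstract's preference

counters = [2] * TABLE_SIZE  # 2-bit counters, initialized weakly taken

def _index(pc, gghist):
    return ((pc // GROUP) ^ gghist) % TABLE_SIZE

def predict(pc, gghist):
    return counters[_index(pc, gghist)] >= 2  # upper half = predict taken

def update(pc, gghist, taken):
    i = _index(pc, gghist)
    counters[i] = min(3, counters[i] + 1) if taken else max(0, counters[i] - 1)
```

Hashing the history into the index lets the same static branch use different counters along different control-flow paths, which is what makes the prediction path-sensitive.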
  • Patent number: 6108770
    Abstract: A method of scheduling program instructions for execution in a computer processor comprises fetching and holding instructions from an instruction memory and executing the fetched instructions out of program order. When load/store order violations are detected, the effects of the load operation and its dependent instructions are erased and they are re-executed. The load is associated with all stores on whose data the load depends. This collection of stores is called a store set. On a subsequent issuance of the load, its execution is delayed until any store in the load's store set has issued. Two loads may share a store set, and separate store sets are merged when a load from one store set is found to depend on a store from another store set. A preferred embodiment employs two tables. The first is a store set ID table (SSIT) which is indexed by part of, or a hash of, an instruction PC.
    Type: Grant
    Filed: June 24, 1998
    Date of Patent: August 22, 2000
    Assignee: Digital Equipment Corporation
    Inventors: George Z. Chrysos, Joel S. Emer, Bruce E. Edwards, John H. Edmondson
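The SSIT mechanism and the store-set merge on a detected violation can be sketched minimally; the table size, hashing, and merge rule shown are simplified assumptions, not the patent's full design.

```python
# Sketch of store-set dependence prediction: an SSIT maps a hashed PC to a
# store-set ID; when a load is found to depend on a store, both PCs are
# placed in one set, merging any existing sets. Details are assumptions.
SSIT_SIZE = 256
ssit = {}            # hashed PC -> store-set id
next_set_id = [0]

def set_for(pc):
    return ssit.get(pc % SSIT_SIZE)

def on_violation(load_pc, store_pc):
    li, si = load_pc % SSIT_SIZE, store_pc % SSIT_SIZE
    sid = ssit.get(li)
    if sid is None:
        sid = ssit.get(si)
    if sid is None:
        sid = next_set_id[0]
        next_set_id[0] += 1
    ssit[li] = ssit[si] = sid  # merge both entries into one set
```

On a later issue of the load, the scheduler would hold it until stores sharing its set ID have issued, avoiding the repeated squash-and-replay the abstract describes.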
  • Patent number: 5987544
    Abstract: A computer system includes a plurality of processor modules coupled to a system bus with each of said processor modules including a processor interfaced to the system bus. The processor module has a backup cache memory and tag store. An index bus is coupled between the processor and the backup cache and backup cache tag store with said bus carrying only an index portion of a memory address to said backup cache and said tag store. A duplicate tag store is coupled to an interface with the duplicate tag memory including means for storing duplicate tag addresses and duplicate tag valid, shared and dirty bits. The duplicate tag store and the separate index bus provide higher performance from the processor by minimizing external interrupts to the processor to check on cache status and also allows other processors access to the processor's duplicate tag while the processor is processing other transactions.
    Type: Grant
    Filed: September 8, 1995
    Date of Patent: November 16, 1999
    Assignee: Digital Equipment Corporation
    Inventors: Peter J. Bannon, Anil K. Jain, John H. Edmondson, Ruben William Sixtus Castelino
  • Patent number: 5615167
    Abstract: A computer system comprising one or more processor modules. Each processor module comprising a central processing unit comprising a storage element disposed in the central processing unit dedicated for storing a semaphore address lock value and a semaphore lock flag value, a cache memory system for storing data and instruction values used by the central processing unit, a system bus interface for communicating with other processor modules over a system bus, a memory system implemented as a common system resource available to the processor modules for storing data and instructions, an IO system implemented as a common system resource available to the plurality of processor modules for each to communicate with data input devices and data output devices, and a system bus connecting the processor module to the memory system and to the IO system.
    Type: Grant
    Filed: September 8, 1995
    Date of Patent: March 25, 1997
    Assignee: Digital Equipment Corporation
    Inventors: Anil K. Jain, John H. Edmondson, Peter J. Bannon
  • Patent number: 5542058
    Abstract: A macropipelined microprocessor chip adheres to strict read and write ordering by sequentially buffering operands in queues during instruction decode, then removing the operands in order during instruction execution. Any instruction that requires additional access to memory inserts the requests into the queued sequence (in a specifier queue) such that read and write ordering is preserved. A specifier queue synchronization counter captures synchronization points to coordinate memory request operations among the autonomous instruction decode unit, instruction execution unit, and memory sub-system. The synchronization method does not restrict the benefit of overlapped execution in the pipeline. Another feature is treatment of a variable bit field operand type that does not restrict the location of operand data. Instruction execution flows in a pipelined processor having such an operand type are vastly different depending on whether operand data resides in registers or memory.
    Type: Grant
    Filed: October 4, 1994
    Date of Patent: July 30, 1996
    Assignee: Digital Equipment Corporation
    Inventors: John E. Brown, III, G. Michael Uhler, John H. Edmondson, Debra Bernstein
  • Patent number: 5471591
    Abstract: In a pipelined digital computer, an instruction decoder decodes register specifiers from multiple instructions, and stores them in a source queue and a destination queue. An execution unit successively obtains source specifiers of an instruction from the source queue, initiates an operation upon the source specifiers, reads a destination specifier from the destination queue, and retires the result at the specified destination. Read-after-write conflicts may occur because the execution unit may overlap execution of a plurality of instructions. Just prior to beginning execution of a current instruction, the destination queue is checked for conflict between the source specifiers of the current instruction and the destination specifiers of previously issued but not yet retired instructions. When an instruction is issued for execution, its destination specifiers in the destination queue are marked to indicate that they are associated with an executed but not yet retired instruction.
    Type: Grant
    Filed: October 30, 1992
    Date of Patent: November 28, 1995
    Assignee: Digital Equipment Corporation
    Inventors: John H. Edmondson, Larry L. Biro
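The conflict check described above reduces to scanning the destination queue for issued-but-unretired entries that match a source specifier; the queue layout below is an illustrative assumption.

```python
# Sketch: before a new instruction issues, its source register specifiers
# are compared against destination specifiers of instructions that have
# issued but not yet retired (marked True). Layout is an assumption.
dest_queue = [("r1", True), ("r2", False)]  # r1 issued, r2 only decoded

def has_raw_conflict(source_regs):
    return any(issued and dest in source_regs
               for dest, issued in dest_queue)
```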
  • Patent number: 5347648
    Abstract: Writeback transactions from a processor and cache are fed to a main memory through a writeback queue, and non-writeback transactions from the processor and cache are fed to the main memory through a non-writeback queue. When a cache error is detected, an error transition mode (ETM) is entered that provides limited use of the data in the cache; a read or write request for data not owned in the cache is made to the main memory instead of the cache, even when the data is valid in the cache, although owned data is read from the cache. In ETM, when the processor makes a first write request to data not owned in the cache followed by a second write request to data owned in the cache, write data of the first write request is prevented from being received by the main memory after write data of the second request while permitting writeback of the data owned by the cache.
    Type: Grant
    Filed: July 15, 1992
    Date of Patent: September 13, 1994
    Assignee: Digital Equipment Corporation
    Inventors: Rebecca L. Stamm, Ruth I. Bahar, Raymond L. Strouble, Nicholas D. Wade, John H. Edmondson
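The ordering rule in error transition mode can be reduced to a sketch; the two-queue mechanics are heavily simplified here and all names are illustrative assumptions.

```python
# Sketch of the ETM ordering rule: a write to data not owned by the cache
# takes the non-writeback path to memory immediately, while a write to
# owned data goes through the writeback queue, which drains only after
# earlier non-writeback writes have been sent. Heavily simplified.
def order_writes(writes, owned):
    # writes: list of (tag, data) in program order; owned: cache-owned tags
    memory_order, held_writeback = [], []
    for tag, data in writes:
        if tag in owned:
            held_writeback.append((tag, data))   # writeback path
        else:
            memory_order.append((tag, data))     # non-writeback path
    memory_order.extend(held_writeback)          # writebacks drain last
    return memory_order
```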
  • Patent number: 5278783
    Abstract: A carry look-ahead adder obtains high speed with minimum gate fan-in and a regular array of area-efficient logic cells in a datapath by including a first row of propagate-generate bit cells, a second row of block-propagate bit cells generating a hierarchy of block-propagate and block-generate bits, a third row of carry bit cells; and a bottom level of sum bit cells. The second row of block-propagate bit cells supply the block-propagate and block-generate bits to the first carry bit cells in chained segments of carry bit cells. In a preferred embodiment for a 32-bit complementary metal-oxide semiconductor (CMOS) adder, the logic gates are limited to a fan-in of three, and the block-propagate bit cells in the second row are interconnected to form two binary trees, each including fifteen cells, and the carry cells are chained in segments including up to four cells.
    Type: Grant
    Filed: October 30, 1992
    Date of Patent: January 11, 1994
    Assignee: Digital Equipment Corporation
    Inventor: John H. Edmondson
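The propagate/generate/carry structure the abstract builds in hardware can be sketched in software; the carry recurrence is evaluated sequentially below, whereas the patent's contribution is computing it with a tree of block-propagate/generate cells under fan-in limits.

```python
# Sketch of carry look-ahead arithmetic: per-bit propagate (p) and
# generate (g) terms feed the carry recurrence c[i+1] = g[i] | (p[i] & c[i])
# and the sum bits s[i] = p[i] ^ c[i]. The hardware evaluates the
# recurrence via a cell tree; this loop is a behavioral model only.
def cla_add(a, b, width=8):
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]   # propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]   # generate
    carry = [0] * (width + 1)
    for i in range(width):
        carry[i + 1] = g[i] | (p[i] & carry[i])
    s = 0
    for i in range(width):
        s |= (p[i] ^ carry[i]) << i
    return s | (carry[width] << width)  # include the carry-out bit
```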