Patents by Inventor James Michael O'Connor

James Michael O'Connor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reducing coupling and power noise on PAM-4 I/O interface

Patent number: 11966348

Abstract: Methods of operating a serial data bus divide series of data bits into sequences of one or more bits and encode the sequences as N-level symbols, which are then transmitted at multiple discrete voltage levels. These methods may be utilized to communicate over serial data lines to improve bandwidth and reduce crosstalk and other sources of noise.

Type: Grant

Filed: January 28, 2019

Date of Patent: April 23, 2024

Assignee: NVIDIA Corp.

Inventors: Donghyuk Lee, James Michael O'Connor
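
The general idea in the abstract above, splitting a bit series into short sequences and sending each as one multi-level symbol, can be illustrated with a minimal sketch for N = 4 (PAM-4). The Gray-coded mapping and the names `GRAY_PAM4`, `encode_pam4`, and `decode_pam4` are assumptions for illustration only. They are not taken from the patent, which claims more specific encodings.

```python
# Hypothetical mapping from 2-bit sequences to four levels (0..3).
# Gray coding is assumed here; the abstract does not specify a mapping.
GRAY_PAM4 = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def encode_pam4(bits):
    """Split a bit series into 2-bit sequences and map each to a 4-level symbol."""
    if len(bits) % 2 != 0:
        raise ValueError("bit count must be even for 2-bit sequences")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def decode_pam4(symbols):
    """Invert the mapping: each 4-level symbol becomes two bits again."""
    inverse = {level: pair for pair, level in GRAY_PAM4.items()}
    bits = []
    for symbol in symbols:
        bits.extend(inverse[symbol])
    return bits
```

Each symbol carries two bits, so a lane sends half as many symbols as a two-level lane would for the same data.
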
Prefetch kernels on data-parallel processors

Patent number: 11954036

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel where the processing kernel includes accesses to data that is stored into the memory resulting from execution of the prefetch kernel.

Type: Grant

Filed: November 11, 2022

Date of Patent: April 9, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
COMBINED ON-PACKAGE AND OFF-PACKAGE MEMORY SYSTEM

Publication number: 20230393788

Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.

Type: Application

Filed: August 23, 2023

Publication date: December 7, 2023

Inventors: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
Combined on-package and off-package memory system

Patent number: 11789649

Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.

Type: Grant

Filed: April 22, 2021

Date of Patent: October 17, 2023

Assignee: NVIDIA Corporation

Inventors: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
APPLICATION PARTITIONING FOR LOCALITY IN A STACKED MEMORY SYSTEM

Publication number: 20230315651

Abstract: Embodiments of the present disclosure relate to application partitioning for locality in a stacked memory system. In an embodiment, one or more memory dies are stacked on the processor die. The processor die includes multiple processing tiles and each memory die includes multiple memory tiles. Vertically aligned memory tiles are directly coupled to and comprise the local memory block for a corresponding processing tile. An application program that operates on dense multi-dimensional arrays (matrices) may partition the dense arrays into sub-arrays associated with program tiles. Each program tile is executed by a processing tile using the processing tile's local memory block to process the associated sub-array. Data associated with each sub-array is stored in a local memory block and the processing tile corresponding to the local memory block executes the program tile to process the sub-array data.

Type: Application

Filed: March 30, 2022

Publication date: October 5, 2023

Inventors: William James Dally, Carl Thomas Gray, Stephen W. Keckler, James Michael O'Connor
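
As a rough illustration of the partitioning described in the abstract above, the sketch below divides a dense two-dimensional array into sub-arrays, one per processing tile. The function name `partition_for_tiles`, the rectangular tile grid, and the even-split rule are assumptions for illustration, not details from the application.

```python
def partition_for_tiles(n_rows, n_cols, tile_grid):
    """Map each processing tile (ty, tx) to the row and column ranges it owns.

    Ranges are (start, stop) pairs with stop exclusive.
    """
    grid_rows, grid_cols = tile_grid

    def bounds(extent, parts, index):
        # Even split; the first `extra` parts absorb one leftover element each.
        base, extra = divmod(extent, parts)
        start = index * base + min(index, extra)
        return start, start + base + (1 if index < extra else 0)

    return {
        (ty, tx): (bounds(n_rows, grid_rows, ty), bounds(n_cols, grid_cols, tx))
        for ty in range(grid_rows)
        for tx in range(grid_cols)
    }
```

For example, `partition_for_tiles(4, 6, (2, 3))` assigns tile `(1, 2)` the rows 2 to 3 and the columns 4 to 5, which that tile would hold in its local memory block.
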
MEMORY STACKED ON PROCESSOR FOR HIGH BANDWIDTH

Publication number: 20230275068

Abstract: Embodiments of the present disclosure relate to memory stacked on processor for high bandwidth. Systems and methods are disclosed for providing a one-level memory for a processing system by stacking bulk memory on a processor die. In an embodiment, one or more memory dies are stacked on the processor die. The processor die includes multiple processing tiles, where each tile includes a processing unit, mapper, and tile network. Each memory die includes multiple memory tiles. The processing tile is coupled to each memory tile that is above or below the processing tile. The vertically aligned memory tiles comprise the local memory block for the processing tile. The ratio of memory bandwidth (byte) to floating-point operation (B:F) may improve 50× for accessing the local memory block compared with conventional memory. Additionally, the energy consumed to transfer each bit may be reduced by 10×.

Type: Application

Filed: February 28, 2022

Publication date: August 31, 2023

Inventors: William James Dally, Carl Thomas Gray, Stephen W. Keckler, James Michael O'Connor
Techniques for generating and processing hierarchical representations of sparse matrices

Patent number: 11709812

Abstract: One embodiment sets forth a technique for generating a tree structure within a computer memory for storing sparse data. The technique includes dividing a matrix into a first plurality of equally sized regions. The technique also includes dividing at least one region in the first plurality of regions into a second plurality of regions, where the second plurality of regions includes a first region and one or more second regions that have a substantially equal number of nonzero matrix values and are formed within the first region. The technique further includes creating the tree structure within the computer memory by generating a first plurality of nodes representing the first plurality of regions, generating a second plurality of nodes representing the second plurality of regions, and grouping, under a first node representing the first region, one or more second nodes representing the one or more second regions.

Type: Grant

Filed: May 19, 2021

Date of Patent: July 25, 2023

Assignee: NVIDIA Corporation

Inventors: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
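
The two-level division in the abstract above (equally sized regions first, then sub-regions with roughly equal nonzero counts) can be sketched as follows. This is a simplified illustration under stated assumptions: square tiles of side `tile`, sub-regions formed as row bands, and no nodes created for empty regions. `Node`, `count_nnz`, `split_rows_by_nnz`, and `build_tree` are hypothetical names, not the patent's terminology.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    rows: tuple  # (start, stop) row range, stop exclusive
    cols: tuple  # (start, stop) column range, stop exclusive
    nnz: int     # number of nonzero values inside the region
    children: list = field(default_factory=list)


def count_nnz(matrix, rows, cols):
    return sum(1 for r in range(*rows) for c in range(*cols) if matrix[r][c] != 0)


def split_rows_by_nnz(matrix, rows, cols, parts):
    """Cut a region into at most `parts` row bands with roughly equal nonzero counts."""
    total = count_nnz(matrix, rows, cols)
    bands, start, running = [], rows[0], 0
    for r in range(rows[0], rows[1]):
        running += count_nnz(matrix, (r, r + 1), cols)
        target = total * (len(bands) + 1) / parts
        if running >= target and len(bands) < parts - 1:
            bands.append((start, r + 1))
            start = r + 1
    if start < rows[1]:
        bands.append((start, rows[1]))
    return bands


def build_tree(matrix, tile, parts):
    """First level: equal-sized tiles. Second level: nnz-balanced row bands."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    root = Node((0, n_rows), (0, n_cols), count_nnz(matrix, (0, n_rows), (0, n_cols)))
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            rows = (r0, min(r0 + tile, n_rows))
            cols = (c0, min(c0 + tile, n_cols))
            nnz = count_nnz(matrix, rows, cols)
            if nnz == 0:
                continue  # assumption: empty regions get no node
            node = Node(rows, cols, nnz)
            for band in split_rows_by_nnz(matrix, rows, cols, parts):
                band_nnz = count_nnz(matrix, band, cols)
                if band_nnz:
                    node.children.append(Node(band, cols, band_nnz))
            root.children.append(node)
    return root
```

Balancing by nonzero count rather than by area means each second-level node represents a similar amount of work, even when the nonzeros are clustered.
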
PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS

Publication number: 20230076872

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel that includes memory accesses for prefetching data for a processing kernel into a memory, and, subsequent to executing at least a portion of the prefetch kernel, executing the processing kernel where the processing kernel includes accesses to data that is stored into the memory resulting from execution of the prefetch kernel.

Type: Application

Filed: November 11, 2022

Publication date: March 9, 2023

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
MEMORY INTERFACE WITH REDUCED ENERGY TRANSMIT MODE

Publication number: 20230043152

Abstract: PAM encoding techniques leverage idle periods in channels between data transmissions to apply longer but more energy-efficient codes. To improve energy savings, multiple sparse encoding schemes may be utilized selectively to fit different sized gaps in the traffic. These approaches may provide energy reductions, for example with memory READ and WRITE traffic, when transferring 4-bit data using 3-symbol sequences.

Type: Application

Filed: February 9, 2022

Publication date: February 9, 2023

Applicant: NVIDIA Corp.

Inventors: James Michael O'Connor, Donghyuk Lee
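
To make the "4-bit data using 3-symbol sequences" arithmetic in the abstract above concrete: three four-level symbols offer 64 possible sequences, while 4 bits need only 16, so a code can keep just the cheapest sequences. The sketch below assumes, purely for illustration, that transmit energy grows with the sum of the symbol levels. The energy model and the names `build_sparse_codebook`, `encode`, and `decode` are assumptions and do not come from the application.

```python
from itertools import product


def build_sparse_codebook(data_bits=4, symbols=3, levels=4):
    """Pick the 2**data_bits symbol sequences with the lowest summed level."""
    # Assumption: energy cost of a sequence is proportional to sum(levels).
    candidates = sorted(
        product(range(levels), repeat=symbols),
        key=lambda code: (sum(code), code),
    )
    needed = 2 ** data_bits
    if needed > len(candidates):
        raise ValueError("not enough symbol sequences for the data width")
    return candidates[:needed]


def encode(value, codebook):
    """Look up the symbol sequence for a data value."""
    return codebook[value]


def decode(code, codebook):
    """Recover the data value from a symbol sequence."""
    return codebook.index(code)
```

Under this assumed cost model, no selected sequence has a level sum above 3, whereas sending the same 4 bits as two dense PAM-4 symbols can reach a level sum of 6. The trade-off is the extra symbol slot, which is why such codes suit idle gaps in the traffic.
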
TECHNIQUES FOR PERFORMING MATRIX COMPUTATIONS USING HIERARCHICAL REPRESENTATIONS OF SPARSE MATRICES

Publication number: 20220374961

Abstract: One embodiment sets forth a technique for performing matrix operations. The technique includes traversing a tree structure to access one or more non-empty regions within a matrix. The tree structure includes a first plurality of nodes and a second plurality of nodes corresponding to non-empty regions in the matrix. The first plurality of nodes includes a first node representing a first region and one or more second nodes that are children of the first node and represent second region(s) with an equal size formed within the first region. The second plurality of nodes include a third node representing a third region and one or more fourth nodes that are children of the third node and represent fourth region(s) with substantially equal numbers of non-zero matrix values formed within the third region. The technique also includes performing matrix operation(s) based on the non-empty region(s) to generate a matrix operation result.

Type: Application

Filed: May 19, 2021

Publication date: November 24, 2022

Inventors: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
TECHNIQUES FOR ACCELERATING MATRIX MULTIPLICATION COMPUTATIONS USING HIERARCHICAL REPRESENTATIONS OF SPARSE MATRICES

Publication number: 20220374496

Abstract: One embodiment sets forth a technique for performing one or more matrix multiplication operations based on a first matrix and a second matrix. The technique includes receiving data associated with the first matrix from a first traversal engine that accesses nonzero elements included in the first matrix via a first tree structure. The technique also includes performing one or more computations on the data associated with the first matrix and the data associated with the second matrix to produce a plurality of partial results. The technique further includes combining the plurality of partial results into one or more intermediate results and storing the one or more intermediate results in a first buffer memory.

Type: Application

Filed: May 19, 2021

Publication date: November 24, 2022

Inventors: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
TECHNIQUES FOR GENERATING AND PROCESSING HIERARCHICAL REPRESENTATIONS OF SPARSE MATRICES

Publication number: 20220374403

Abstract: One embodiment sets forth a technique for generating a tree structure within a computer memory for storing sparse data. The technique includes dividing a matrix into a first plurality of equally sized regions. The technique also includes dividing at least one region in the first plurality of regions into a second plurality of regions, where the second plurality of regions includes a first region and one or more second regions that have a substantially equal number of nonzero matrix values and are formed within the first region. The technique further includes creating the tree structure within the computer memory by generating a first plurality of nodes representing the first plurality of regions, generating a second plurality of nodes representing the second plurality of regions, and grouping, under a first node representing the first region, one or more second nodes representing the one or more second regions.

Type: Application

Filed: May 19, 2021

Publication date: November 24, 2022

Inventors: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
Prefetch kernels on data-parallel processors

Patent number: 11500778

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

Type: Grant

Filed: March 9, 2020

Date of Patent: November 15, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
COMBINED ON-PACKAGE AND OFF-PACKAGE MEMORY SYSTEM

Publication number: 20220342595

Abstract: A combined on-package and off-package memory system uses a custom base-layer within which are fabricated one or more dedicated interfaces to off-package memories. An on-package processor and on-package memories are also directly coupled to the custom base-layer. The custom base-layer includes memory management logic between the processor and memories (both off and on package) to steer requests. The memories are exposed as a combined memory space having greater bandwidth and capacity compared with either the off-package memories or the on-package memories alone. The memory management logic services requests while maintaining quality of service (QoS) to satisfy bandwidth requirements for each allocation. An allocation may include any combination of the on and/or off package memories. The memory management logic also manages data migration between the on and off package memories.

Type: Application

Filed: April 22, 2021

Publication date: October 27, 2022

Inventors: Niladrish Chatterjee, James Michael O'Connor, Donghyuk Lee, Gaurav Uttreja, Wishwesh Anil Gandhi
System and method for page-conscious GPU instruction

Patent number: 11301256

Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method comprises receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.

Type: Grant

Filed: August 22, 2014

Date of Patent: April 12, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Lisa R. Hsu, James Michael O'Connor
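
One way to picture the classify-then-schedule flow in the abstract above is to group wavefronts by the virtual page they touch and issue each group together, so that a translation fetched for one wavefront can be reused by the next. The sketch below is illustrative only. The single base address per wavefront, the 4 KiB `PAGE_SIZE`, and the names `classify_by_page` and `schedule` are assumptions, not the patent's classification criteria.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes


def classify_by_page(wavefronts):
    """Group wavefront ids by the virtual page of their (hypothetical) base address."""
    groups = defaultdict(list)
    for wf_id, base_addr in wavefronts:
        groups[base_addr // PAGE_SIZE].append(wf_id)
    return dict(groups)


def schedule(wavefronts):
    """Issue wavefronts page by page so each page's translation is reused."""
    groups = classify_by_page(wavefronts)
    order = []
    for page in sorted(groups):
        order.extend(groups[page])
    return order
```
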
Data bus inversion (DBI) on pulse amplitude modulation (PAM) and reducing coupling and power noise on PAM-4 I/O

Patent number: 11159153

Abstract: Mechanisms to reduce noise and/or energy consumption in PAM communication systems, utilizing conditional symbol substitution in each burst interval of a multi-data lane serial data bus.

Type: Grant

Filed: March 7, 2019

Date of Patent: October 26, 2021

Assignee: NVIDIA Corp.

Inventors: Donghyuk Lee, James Michael O'Connor, John Wilson
REDUCING COUPLING AND POWER NOISE ON PAM-4 I/O INTERFACE

Publication number: 20200242062

Abstract: Methods of operating a serial data bus divide series of data bits into sequences of one or more bits and encode the sequences as N-level symbols, which are then transmitted at multiple discrete voltage levels. These methods may be utilized to communicate over serial data lines to improve bandwidth and reduce crosstalk and other sources of noise.

Type: Application

Filed: January 28, 2019

Publication date: July 30, 2020

Inventors: Donghyuk Lee, James Michael O'Connor
PREFETCH KERNELS ON DATA-PARALLEL PROCESSORS

Publication number: 20200210341

Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.

Type: Application

Filed: March 9, 2020

Publication date: July 2, 2020

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
Relaxed 433 encoding to reduce coupling and power noise on PAM-4 data buses

Patent number: 10657094

Abstract: Methods of operating a serial data bus divide series of data bits into sequences of one or more bits and encode the sequences as N-level symbols, which are then transmitted at multiple discrete voltage levels. These methods may be utilized to communicate over serial data lines to improve bandwidth and reduce crosstalk and other sources of noise.

Type: Grant

Filed: March 7, 2019

Date of Patent: May 19, 2020

Assignee: NVIDIA Corp.

Inventors: Donghyuk Lee, James Michael O'Connor, John Wilson
424 encoding schemes to reduce coupling and power noise on PAM-4 data buses

Patent number: 10599606

Abstract: Methods of operating a serial data bus generate two-level bridge symbols to insert between four-level symbols on one or more data lanes of the serial data bus, to reduce voltage deltas on the one or more data lanes during data transmission on the serial data bus.

Type: Grant

Filed: March 21, 2019

Date of Patent: March 24, 2020

Assignee: NVIDIA Corp.

Inventors: Donghyuk Lee, James Michael O'Connor, John Wilson
