Patents by Inventor Alan G. Gara

Alan G. Gara has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Processors, methods, and systems with a configurable spatial accelerator

Patent number: 10558575

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.

Type: Grant

Filed: December 30, 2016

Date of Patent: February 11, 2020

Assignee: Intel Corporation

Inventors: Kermin E. Fleming, Jr., Kent D. Glossop, Simon C. Steely, Jr., Jinjie Tang, Alan G. Gara
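
As an illustration of the firing rule in the abstract above (an operator performs its operation when an incoming operand set arrives), here is a minimal software sketch. It is not the patented hardware; the class name `DataflowOperator`, its `push` method, and the per-port queues are hypothetical names chosen for this example.

```python
from collections import deque


class DataflowOperator:
    """Hypothetical software model of one dataflow node (illustration only)."""

    def __init__(self, func, arity):
        self.func = func
        # One FIFO of pending operands per input port.
        self.inputs = [deque() for _ in range(arity)]

    def push(self, port, value):
        """Deliver one operand to a port; return a result if the node fires."""
        self.inputs[port].append(value)
        # The node fires only when every port holds at least one operand.
        if all(self.inputs):
            args = [queue.popleft() for queue in self.inputs]
            return self.func(*args)
        return None


add = DataflowOperator(lambda a, b: a + b, arity=2)
first = add.push(0, 2)   # only one operand present, so nothing fires
second = add.push(1, 3)  # operand set complete, so the node fires
```

In this sketch `first` is `None` and `second` is the sum of the two operands; a real spatial accelerator would map each such node onto a processing element and route operands over the interconnect network.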
PROCESSORS, METHODS, AND SYSTEMS WITH A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20180189231

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.

Type: Application

Filed: December 30, 2016

Publication date: July 5, 2018

Inventors: KERMIN E. FLEMING, JR., KENT D. GLOSSOP, SIMON C. STEELY, JR., JINJIE TANG, ALAN G. GARA
Mitigating component performance variation

Patent number: 9864423

Abstract: Apparatus and methods may provide for characterizing a plurality of similar components of a distributed computing system based on a maximum safe operation level associated with each component and storing characterization data in a database and allocating non-uniform power to each similar component based at least in part on the characterization data in the database to substantially equalize performance of the components.

Type: Grant

Filed: December 24, 2015

Date of Patent: January 9, 2018

Assignee: Intel Corporation

Inventors: Alan G. Gara, Steve S. Sylvester, Jonathan M. Eastep, Ramkumar Nagappan, Christopher M. Cantalupo
Managing power consumption and performance of computing systems

Patent number: 9857858

Abstract: A method and system for managing power consumption and performance of computing systems are described herein. The method includes monitoring an overall power consumption of the computing systems to determine whether the overall power consumption is above or below an overall power consumption limit, and monitoring a performance of each computing system to determine whether the performance is within a performance tolerance. The method further includes adjusting the power consumption limits for the computing systems or the performances of the computing systems such that the overall power consumption is below the overall power consumption limit and the performance of each computing system is within the performance tolerance.

Type: Grant

Filed: May 17, 2012

Date of Patent: January 2, 2018

Assignee: Intel Corporation

Inventors: Devadatta V. Bodas, John H. Crawford, Alan G. Gara
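
The monitoring and adjustment loop in the abstract above can be sketched as a single rebalancing step. This is a simplified illustration under stated assumptions, not the claimed method: the function name `rebalance`, the proportional scaling policy, and the choice to measure tolerance as distance from the mean performance are all assumptions made for the example.

```python
def rebalance(power, perf, overall_limit, tolerance):
    """Return (new per-system power limits, systems outside tolerance).

    power: dict of system name -> current power draw in watts
    perf:  dict of system name -> measured performance (arbitrary units)
    """
    total = sum(power.values())
    # Assumed policy: if the total exceeds the overall limit, scale every
    # system's limit by the same factor so the limits sum to the overall limit.
    scale = min(1.0, overall_limit / total) if total > 0 else 1.0
    limits = {name: watts * scale for name, watts in power.items()}

    # Assumed tolerance test: distance from the mean performance.
    mean_perf = sum(perf.values()) / len(perf)
    outliers = sorted(
        name for name, value in perf.items() if abs(value - mean_perf) > tolerance
    )
    return limits, outliers


limits, outliers = rebalance(
    power={"a": 100.0, "b": 300.0},
    perf={"a": 10.0, "b": 20.0},
    overall_limit=200.0,
    tolerance=4.0,
)
```

With these inputs the total draw of 400 W is twice the 200 W limit, so each limit is halved, and both systems are reported because each sits 5 units from the mean performance of 15.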
Mitigating component performance variation

Publication number: 20170185129

Abstract: Apparatus and methods may provide for characterizing a plurality of similar components of a distributed computing system based on a maximum safe operation level associated with each component and storing characterization data in a database and allocating non-uniform power to each similar component based at least in part on the characterization data in the database to substantially equalize performance of the components.

Type: Application

Filed: December 24, 2015

Publication date: June 29, 2017

Applicant: Intel Corporation

Inventors: Alan G. Gara, Steve S. Sylvester, Jonathan M. Eastep, Ramkumar Nagappan, Christopher M. Cantalupo
MANAGING POWER CONSUMPTION AND PERFORMANCE OF COMPUTING SYSTEMS

Publication number: 20150169026

Abstract: A method and system for managing power consumption and performance of computing systems are described herein. The method includes monitoring an overall power consumption of the computing systems to determine whether the overall power consumption is above or below an overall power consumption limit, and monitoring a performance of each computing system to determine whether the performance is within a performance tolerance. The method further includes adjusting the power consumption limits for the computing systems or the performances of the computing systems such that the overall power consumption is below the overall power consumption limit and the performance of each computing system is within the performance tolerance.

Type: Application

Filed: May 17, 2012

Publication date: June 18, 2015

Inventors: Devadatta V. Bodas, John H. Crawford, Alan G. Gara
Shared performance monitor in a multiprocessor system

Patent number: 8904392

Abstract: A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU is further programmed to monitor event signals issued from non-processor devices.

Type: Grant

Filed: May 31, 2012

Date of Patent: December 2, 2014

Assignee: International Business Machines Corporation

Inventors: George Chiu, Alan G. Gara, Valentina Salapura
Method and apparatus for efficiently tracking queue entries relative to a timestamp

Patent number: 8756350

Abstract: An apparatus and method for tracking coherence event signals transmitted in a multiprocessor system. The apparatus comprises a coherence logic unit, each unit having a plurality of queue structures with each queue structure associated with a respective sender of event signals transmitted in the system. A timing circuit associated with a queue structure controls enqueuing and dequeuing of received coherence event signals, and, a counter tracks a number of coherence event signals remaining enqueued in the queue structure and dequeued since receipt of a timestamp signal. A counter mechanism generates an output signal indicating that all of the coherence event signals present in the queue structure at the time of receipt of the timestamp signal have been dequeued. In one embodiment, the timestamp signal is asserted at the start of a memory synchronization operation and, the output signal indicates that all coherence events present when the timestamp signal was asserted have completed.

Type: Grant

Filed: June 26, 2007

Date of Patent: June 17, 2014

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Valentina Salapura, Pavlos Vranas
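
The counting idea in the abstract above can be modeled in software: on a timestamp, record how many entries are in the queue, decrement on each dequeue, and signal once the count reaches zero. The sketch below assumes a strict FIFO queue; the class `TimestampedQueue` and its method names are hypothetical and do not come from the patent.

```python
from collections import deque


class TimestampedQueue:
    """Hypothetical FIFO that reports when pre-timestamp entries have drained."""

    def __init__(self):
        self._queue = deque()
        self._pending = 0    # entries present at the timestamp, not yet dequeued
        self._armed = False  # True once a timestamp has been received

    def enqueue(self, event):
        self._queue.append(event)

    def timestamp(self):
        """Snapshot the current occupancy instead of tagging each entry."""
        self._pending = len(self._queue)
        self._armed = True

    def dequeue(self):
        event = self._queue.popleft()
        # In a FIFO, the oldest entries leave first, so every dequeue while
        # the counter is non-zero retires one pre-timestamp entry.
        if self._pending > 0:
            self._pending -= 1
        return event

    def drained(self):
        """True when all entries present at the timestamp have been dequeued."""
        return self._armed and self._pending == 0
```

A single counter is enough here because FIFO order guarantees that entries enqueued after the timestamp cannot leave before the older ones.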
Snoop filter for filtering snoop requests

Patent number: 8677073

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

Type: Grant

Filed: August 16, 2012

Date of Patent: March 18, 2014

Assignee: Intel Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Massively parallel supercomputer

Patent number: 8667049

Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency.

Type: Grant

Filed: August 3, 2012

Date of Patent: March 4, 2014

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Write-through cache optimized for dependence-free parallel regions

Patent number: 8627010

Abstract: An apparatus and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.

Type: Grant

Filed: September 5, 2012

Date of Patent: January 7, 2014

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Alan G. Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
Write-through cache optimized for dependence-free parallel regions

Patent number: 8516197

Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.

Type: Grant

Filed: February 11, 2011

Date of Patent: August 20, 2013

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Alan G. Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
Combined group ECC protection and subgroup parity protection

Patent number: 8468416

Abstract: A method and system are disclosed for providing combined error code protection and subgroup parity protection for a given group of n bits. The method comprises the steps of identifying a number, m, of redundant bits for said error protection; and constructing a matrix P, wherein multiplying said given group of n bits with P produces m redundant error correction code (ECC) protection bits, and two columns of P provide parity protection for subgroups of said given group of n bits. In the preferred embodiment of the invention, the matrix P is constructed by generating permutations of m bit wide vectors with three or more, but an odd number of, elements with value one and the other elements with value zero; and assigning said vectors to rows of the matrix P.

Type: Grant

Filed: June 26, 2007

Date of Patent: June 18, 2013

Assignee: International Business Machines Corporation

Inventors: Alan G. Gara, Dong Chen, Philip Heidelberger, Martin Ohmacht
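
The matrix construction described in the abstract above can be illustrated with a short sketch: enumerate m-bit vectors with an odd number of ones (three or more), use them as rows of P, and multiply the data bits by P over GF(2) to obtain the m check bits. The function names are hypothetical, and the sketch does not enforce the additional property that two columns of P provide subgroup parity; that would need a further constraint on which vectors are chosen and in what order.

```python
from itertools import combinations


def odd_weight_vectors(m):
    """Yield m-bit tuples whose number of ones is odd and at least three."""
    for weight in range(3, m + 1, 2):
        for ones in combinations(range(m), weight):
            yield tuple(1 if i in ones else 0 for i in range(m))


def build_p(n, m):
    """Use the first n odd-weight vectors as the n rows of P (n x m)."""
    rows = []
    for vector in odd_weight_vectors(m):
        rows.append(vector)
        if len(rows) == n:
            return rows
    raise ValueError("not enough odd-weight vectors for n data bits")


def check_bits(data, p):
    """Multiply the 1 x n data vector by P over GF(2), giving m check bits."""
    m = len(p[0])
    return [sum(d & row[j] for d, row in zip(data, p)) % 2 for j in range(m)]


p = build_p(n=8, m=5)
bits = check_bits([1, 0, 0, 0, 0, 0, 0, 0], p)
```

With only the first data bit set, the check bits equal the first row of P, which is the lowest-indexed weight-three vector in this enumeration order.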
NOVEL MASSIVELY PARALLEL SUPERCOMPUTER

Publication number: 20120311299

Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency.

Type: Application

Filed: August 3, 2012

Publication date: December 6, 2012

Applicant: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
NOVEL SNOOP FILTER FOR FILTERING SNOOP REQUESTS

Publication number: 20120311272

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

Type: Application

Filed: August 16, 2012

Publication date: December 6, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
SHARED PERFORMANCE MONITOR IN A MULTIPROCESSOR SYSTEM

Publication number: 20120304020

Abstract: A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system. The PMU is further programmed to monitor event signals issued from non-processor devices.

Type: Application

Filed: May 31, 2012

Publication date: November 29, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: George Chiu, Alan G. Gara, Valentina Salapura
Snoop filter for filtering snoop requests

Patent number: 8255638

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

Type: Grant

Filed: May 1, 2008

Date of Patent: August 28, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Massively parallel supercomputer

Patent number: 8250133

Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements each of which consists of a central processing unit (CPU) and plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximizes packet communications throughput and minimizes latency.

Type: Grant

Filed: June 26, 2009

Date of Patent: August 21, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken
Shared performance monitor in a multiprocessor system

Patent number: 8230433

Abstract: A performance monitoring unit (PMU) and method for monitoring performance of events occurring in a multiprocessor system. The multiprocessor system comprises a plurality of processor devices units, each processor device for generating signals representing occurrences of events in the processor device, and, a single shared counter resource for performance monitoring. The performance monitor unit is shared by all processor cores in the multiprocessor system.

Type: Grant

Filed: June 26, 2007

Date of Patent: July 24, 2012

Assignee: International Business Machines Corporation

Inventors: George Chiu, Alan G. Gara, Valentina Salapura
Simplifying and speeding the management of intra-node cache coherence

Patent number: 8161248

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activities required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Grant

Filed: November 24, 2010

Date of Patent: April 17, 2012

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht