Patents by Inventor Steven R. Kunkel

Steven R. Kunkel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9280465
    Abstract: A technique of operating a data processing system includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system.
    Type: Grant
    Filed: October 8, 2013
    Date of Patent: March 8, 2016
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Guy Lynn Guthrie, Steven R. Kunkel, Hien Minh Le, Geraint North, William J. Starke
  • Patent number: 9274952
    Abstract: A technique of operating a data processing system includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system.
    Type: Grant
    Filed: January 31, 2014
    Date of Patent: March 1, 2016
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Guy Lynn Guthrie, Steven R. Kunkel, Hien Minh Le, Geraint North, William J. Starke
  • Publication number: 20150100731
    Abstract: A technique of operating a data processing system includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system.
    Type: Application
    Filed: October 8, 2013
    Publication date: April 9, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guy Lynn Guthrie, Steven R. Kunkel, Hien Minh Le, Geraint North, William J. Starke
  • Publication number: 20150100732
    Abstract: A technique of operating a data processing system includes logging addresses for cache lines modified by a producer core in a data array of a producer cache to create a high-availability (HA) log for the producer core. The technique also includes moving the HA log directly from the producer cache to a consumer cache of a consumer core and moving HA data associated with the addresses of the HA log directly from the producer cache to the consumer cache. The HA log corresponds to a cache line that includes multiple of the addresses. Finally, the technique includes processing, by the consumer core, the HA log and the HA data for the data processing system.
    Type: Application
    Filed: January 31, 2014
    Publication date: April 9, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guy Lynn Guthrie, Steven R. Kunkel, Hien Minh Le, Geraint North, William J. Starke
  • Patent number: 8874853
    Abstract: A method, circuit arrangement, and design structure utilize broadcast prediction data to determine whether to globally broadcast a memory request in a computing system of the type that includes a plurality of nodes, each node including a plurality of processing units. The method includes updating broadcast prediction data for a cache line associated with a first memory request within a hardware-based broadcast prediction data structure in turn associated with a first processing unit in response to the first memory request, the broadcast prediction data for the cache line including data associated with a history of ownership of the cache line. The method further comprises accessing the broadcast prediction data structure and determining whether to perform an early broadcast of a second memory request to a second node based on broadcast prediction data within the broadcast prediction data structure in response to that second memory request associated with the cache line.
    Type: Grant
    Filed: June 4, 2010
    Date of Patent: October 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Patent number: 8782646
    Abstract: In a NUMA-topology computer system that includes multiple nodes and multiple logical partitions, some of which may be dedicated and others of which are shared, NUMA optimizations are enabled in shared logical partitions. This is done by specifying a home node parameter in each virtual processor assigned to a logical partition. When a task is created by an operating system in a shared logical partition, a home node is assigned to the task, and the operating system attempts to assign the task to a virtual processor that has a home node that matches the home node for the task. The partition manager then attempts to assign virtual processors to their corresponding home nodes. If this can be done, NUMA optimizations may be performed without the risk of reducing the performance of the shared logical partition.
    Type: Grant
    Filed: November 21, 2012
    Date of Patent: July 15, 2014
    Assignee: International Business Machines Corporation
    Inventors: Vaijayanthimala K. Anand, Mark R. Funk, Steven R. Kunkel, Mysore S. Srinivas, Randal C. Swanberg, Ronald D. Young
  • Patent number: 8490094
    Abstract: In a NUMA-topology computer system that includes multiple nodes and multiple logical partitions, some of which may be dedicated and others of which are shared, NUMA optimizations are enabled in shared logical partitions. This is done by specifying a home node parameter in each virtual processor assigned to a logical partition. When a task is created by an operating system in a shared logical partition, a home node is assigned to the task, and the operating system attempts to assign the task to a virtual processor that has a home node that matches the home node for the task. The partition manager then attempts to assign virtual processors to their corresponding home nodes. If this can be done, NUMA optimizations may be performed without the risk of reducing the performance of the shared logical partition.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Vaijayanthimala K. Anand, Mark R. Funk, Steven R. Kunkel, Mysore S. Srinivas, Randal C. Swanberg, Ronald D. Young
  • Patent number: 8397030
    Abstract: A system and method of a region coherence protocol for use in Region Coherence Arrays (RCAs) deployed in clustered shared-memory multiprocessor systems which optimize cache-to-cache transfers by allowing broadcast memory requests to be provided to only a portion of a clustered shared-memory multiprocessor system. Interconnect hierarchy levels can be devised for logical groups of processors, processors on the same chip, processors on chips aggregated into a multichip module, multichip modules on the same printed circuit board, and for processors on other printed circuit boards or in other cabinets.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: March 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Patent number: 8370584
    Abstract: A method, circuit arrangement, and design structure utilize a lock prediction data structure to control ownership of a cache line in a shared memory computing system. In a first node among the plurality of nodes, lock prediction data in a hardware-based lock prediction data structure for a cache line associated with a first memory request is updated in response to that first memory request, wherein at least a portion of the lock prediction data is predictive of whether the cache line is associated with a release operation. The lock prediction data is then accessed in response to a second memory request associated with the cache line and issued by a second node and a determination is made as to whether to transfer ownership of the cache line from the first node to the second node based at least in part on the accessed lock prediction data.
    Type: Grant
    Filed: June 22, 2012
    Date of Patent: February 5, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Publication number: 20120265942
    Abstract: A method, circuit arrangement, and design structure utilize a lock prediction data structure to control ownership of a cache line in a shared memory computing system. In a first node among the plurality of nodes, lock prediction data in a hardware-based lock prediction data structure for a cache line associated with a first memory request is updated in response to that first memory request, wherein at least a portion of the lock prediction data is predictive of whether the cache line is associated with a release operation. The lock prediction data is then accessed in response to a second memory request associated with the cache line and issued by a second node and a determination is made as to whether to transfer ownership of the cache line from the first node to the second node based at least in part on the accessed lock prediction data.
    Type: Application
    Filed: June 22, 2012
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Patent number: 8244988
    Abstract: A method, circuit arrangement, and design structure utilize a lock prediction data structure to control ownership of a cache line in a shared memory computing system. In a first node among the plurality of nodes, lock prediction data in a hardware-based lock prediction data structure for a cache line associated with a first memory request is updated in response to that first memory request, wherein at least a portion of the lock prediction data is predictive of whether the cache line is associated with a release operation. The lock prediction data is then accessed in response to a second memory request associated with the cache line and issued by a second node and a determination is made as to whether to transfer ownership of the cache line from the first node to the second node based at least in part on the accessed lock prediction data.
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: August 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Patent number: 8161493
    Abstract: An aspect of the present invention improves the accuracy of measuring processor utilization of multi-threaded cores by providing a calibration facility that derives utilization in the context of the overall dynamic operating state of the core by assigning weights to idle threads and assigning weights to run threads, depending on the status of the core. From previous chip designs it has been established in a Simultaneous Multi Thread (SMT) core that not all idle cycles in a hardware thread can be equally converted into useful work. Competition for core resources reduces the conversion efficiency of one thread's idle cycles when any other thread is running on the same core.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Michael S. Floyd, Steven R. Kunkel, Aaron C. Sawdey, Philip L. Vitale
  • Patent number: 8112587
    Abstract: A method, circuit arrangement, and design structure for prefetching data for responding to a memory request, in a shared memory computing system of the type that includes a plurality of nodes, is provided. Prefetching data comprises receiving, in response to a first memory request by a first node, presence data for a memory region associated with the first memory request from a second node that sources data requested by the first memory request, and selectively prefetching at least one cache line from the memory region based on the received presence data. Responding to a memory request comprises tracking presence data associated with memory regions associated with cached cache lines in the first node, and, in response to a memory request by a second node, forwarding the tracked presence data for a memory region associated with the memory request to the second node.
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: February 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Publication number: 20110302374
    Abstract: A method, circuit arrangement, and design structure utilize broadcast prediction data to determine whether to globally broadcast a memory request in a computing system of the type that includes a plurality of nodes, each node including a plurality of processing units. The method includes updating broadcast prediction data for a cache line associated with a first memory request within a hardware-based broadcast prediction data structure in turn associated with a first processing unit in response to the first memory request, the broadcast prediction data for the cache line including data associated with a history of ownership of the cache line. The method further comprises accessing the broadcast prediction data structure and determining whether to perform an early broadcast of a second memory request to a second node based on broadcast prediction data within the broadcast prediction data structure in response to that second memory request associated with the cache line.
    Type: Application
    Filed: June 4, 2010
    Publication date: December 8, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Publication number: 20100287561
    Abstract: An aspect of the present invention improves the accuracy of measuring processor utilization of multi-threaded cores by providing a calibration facility that derives utilization in the context of the overall dynamic operating state of the core by assigning weights to idle threads and assigning weights to run threads, depending on the status of the core. From previous chip designs it has been established in a Simultaneous Multi Thread (SMT) core that not all idle cycles in a hardware thread can be equally converted into useful work. Competition for core resources reduces the conversion efficiency of one thread's idle cycles when any other thread is running on the same core.
    Type: Application
    Filed: July 15, 2008
    Publication date: November 11, 2010
    Applicant: International Business Machines Corporation
    Inventors: Michael S. Floyd, Steven R. Kunkel, Aaron C. Sawdey, Philip L. Vitale
  • Publication number: 20100281220
    Abstract: A method, circuit arrangement, and design structure utilize a lock prediction data structure to control ownership of a cache line in a shared memory computing system. In a first node among the plurality of nodes, lock prediction data in a hardware-based lock prediction data structure for a cache line associated with a first memory request is updated in response to that first memory request, wherein at least a portion of the lock prediction data is predictive of whether the cache line is associated with a release operation. The lock prediction data is then accessed in response to a second memory request associated with the cache line and issued by a second node and a determination is made as to whether to transfer ownership of the cache line from the first node to the second node based at least in part on the accessed lock prediction data.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Publication number: 20100281221
    Abstract: A method, circuit arrangement, and design structure for prefetching data for responding to a memory request, in a shared memory computing system of the type that includes a plurality of nodes, is provided. Prefetching data comprises receiving, in response to a first memory request by a first node, presence data for a memory region associated with the first memory request from a second node that sources data requested by the first memory request, and selectively prefetching at least one cache line from the memory region based on the received presence data. Responding to a memory request comprises tracking presence data associated with memory regions associated with cached cache lines in the first node, and, in response to a memory request by a second node, forwarding the tracked presence data for a memory region associated with the memory request to the second node.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: International Business Machines Corporation
    Inventors: Jason F. Cantin, Steven R. Kunkel
  • Publication number: 20100223622
    Abstract: In a NUMA-topology computer system that includes multiple nodes and multiple logical partitions, some of which may be dedicated and others of which are shared, NUMA optimizations are enabled in shared logical partitions. This is done by specifying a home node parameter in each virtual processor assigned to a logical partition. When a task is created by an operating system in a shared logical partition, a home node is assigned to the task, and the operating system attempts to assign the task to a virtual processor that has a home node that matches the home node for the task. The partition manager then attempts to assign virtual processors to their corresponding home nodes. If this can be done, NUMA optimizations may be performed without the risk of reducing the performance of the shared logical partition.
    Type: Application
    Filed: February 27, 2009
    Publication date: September 2, 2010
    Applicant: International Business Machines Corporation
    Inventors: Vaijayanthimala K. Anand, Mark R. Funk, Steven R. Kunkel, Mysore S. Srinivas, Randal C. Swanberg, Ronald D. Young
  • Patent number: 7747826
    Abstract: A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory, and the second coherency domain includes a coherent second cache memory. The first cache memory within the first coherency domain of the data processing system holds a memory block in a storage location associated with an address tag and a coherency state field. The coherency state field is set to a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is likely cached only within the first coherency domain.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: June 29, 2010
    Assignee: International Business Machines Corporation
    Inventors: Jason F. Cantin, James S. Fields, Jr., Steven R. Kunkel, William J. Starke
  • Patent number: 7747825
    Abstract: A method, apparatus, and computer program product are disclosed for reducing the number of unnecessarily broadcast local requests to reduce the latency to access data from remote nodes in an SMP computer system. A shared invalid cache coherency protocol state is defined that predicts whether a memory read request to read data in a shared cache line can be satisfied within a local node. When a cache line is in the shared invalid state, a valid copy of the data is predicted to be located in the local node. When a cache line is in the invalid state and not in the shared invalid state, a valid copy of the data is predicted to be located in one of the remote nodes. Memory read requests to read data in a cache line that is not currently in the shared invalid state are broadcast first to remote nodes.
    Type: Grant
    Filed: April 22, 2008
    Date of Patent: June 29, 2010
    Assignee: International Business Machines Corporation
    Inventors: Jason Frederick Cantin, Steven R. Kunkel