Patents by Inventor Chen-Yong Cher
Chen-Yong Cher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9436552Abstract: According to an aspect, a method for triggering creation of a checkpoint in a computer system includes executing a task in a processing node of the computer system and determining whether it is time to read a monitor associated with a metric of the task. The monitor is read to determine a value of the metric based on determining that it is time to read the monitor. A threshold for triggering creation of the checkpoint is determined based on the value of the metric. Based on determining that the value of the metric has crossed the threshold, the checkpoint including state data of the task is created to enable restarting execution of the task upon a restart operation.Type: GrantFiled: June 12, 2014Date of Patent: September 6, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Chen-Yong Cher
-
Patent number: 9417882Abstract: There is provided a processor implemented method for controlling a lock-stepped cohort. The method includes receiving instructions for each of a first lane and a second lane. The first lane is for the lock-stepped cohort and the second lane is for another cohort. The method further includes detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane. The method also includes setting an indicator indicating where the first lane encountered the first instruction. The method additionally includes setting the first lane to inactive, while keeping the second lane active. The method further includes setting the first lane to active on a subsequent opportunity to execute said first instruction.Type: GrantFiled: December 23, 2013Date of Patent: August 16, 2016Assignee: International Business Machines CorporationInventors: Jose R. Brunheroto, Chen-Yong Cher, Hubertus Franke, Jamin Naghmouchi
-
Patent number: 9317350Abstract: A method for faulty memory utilization in a memory system includes: obtaining information regarding memory health status of at least one memory page in the memory system; determining an error tolerance of the memory page when the information regarding memory health status indicates that a failure is predicted to occur in an area of the memory system affecting the memory page; initiating a migration of data stored in the memory page when it is determined that the data stored in the memory page is non-error-tolerant; notifying at least one application regarding a predicted operating system failure and/or a predicted application failure when it is determined that data stored in the memory page is non-error-tolerant and cannot be migrated; and notifying at least one application regarding the memory failure predicted to occur when it is determined that data stored in the memory page is error-tolerant.Type: GrantFiled: September 9, 2013Date of Patent: April 19, 2016Assignee: International Business Machines CorporationInventors: Chen-Yong Cher, Carlos H. Andrade Costa, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Publication number: 20160103736Abstract: Processor register protection management is disclosed. In embodiments, a method of processor register protection management can include determining a sensitive logical register for executable code generated by a compiler, generating an error-correction table identifying the sensitive logical register, and storing the error-correction table in a memory accessible by a processor. The processor can be configured to generate a duplicate register of the sensitive logical register identified by the error-correction table.Type: ApplicationFiled: October 9, 2014Publication date: April 14, 2016Inventors: Pradip Bose, Chen-Yong Cher, Meeta S. Gupta
-
Patent number: 9298553Abstract: A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.Type: GrantFiled: February 8, 2014Date of Patent: March 29, 2016Assignee: International Business Machines CorporationInventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Publication number: 20160085640Abstract: A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.Type: ApplicationFiled: December 2, 2015Publication date: March 24, 2016Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Patent number: 9280383Abstract: According to an aspect, a method for checkpointing in a hybrid computing node includes executing a task in a processing accelerator of the hybrid computing node. A checkpoint is created in a local memory of the processing accelerator. The checkpoint includes state data to restart execution of the task in the processing accelerator upon a restart operation. Execution of the task is resumed in the processing accelerator after creating the checkpoint. The state data of the checkpoint are transferred from the processing accelerator to a main processor of the hybrid computing node while the processing accelerator is executing the task.Type: GrantFiled: June 12, 2014Date of Patent: March 8, 2016Assignee: International Business Machines CorporationInventor: Chen-Yong Cher
-
Publication number: 20160042280Abstract: A method for managing a network queue memory includes receiving sensor information about the network queue memory, predicting a memory failure in the network queue memory based on the sensor information, and outputting a notification through a plurality of nodes forming a network and using the network queue memory, the notification configuring communications between the nodes.Type: ApplicationFiled: August 7, 2014Publication date: February 11, 2016Inventors: CARLOS H. ANDRADE COSTA, CHEN-YONG CHER, YOONHO PARK, BRYAN S. ROSENBURG, KYUNG D. RYU
-
Patent number: 9251340Abstract: A mechanism is provided for detecting malicious activity in a functional unit. For a current activity value associated with a functional unit, a determination is made as to whether a thermal level associated with the functional unit differs from a verified thermal level beyond a first predetermined threshold. Responsive to the thermal level associated with the functional unit differing from the verified thermal level beyond the first predetermined threshold, a determination is made as to whether there is a known profile of thread activity levels that substantially matches current thread activity levels. Responsive to identifying the known profile that substantially matches the current thread activity levels, thread activity levels are compared to the known profile of thread activity levels. Responsive to the thread activity levels differing from the known profile beyond a second predetermined threshold, an indication of suspected abnormal activity associated with the given functional unit is sent.Type: GrantFiled: September 19, 2013Date of Patent: February 2, 2016
-
Publication number: 20160011996Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.Type: ApplicationFiled: April 30, 2015Publication date: January 14, 2016Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
-
Patent number: 9218488Abstract: A mechanism is provided for detecting malicious activity in a functional unit. For a current activity value associated with a functional unit, a determination is made as to whether a thermal level associated with the functional unit differs from a verified thermal level beyond a first predetermined threshold. Responsive to the thermal level associated with the functional unit differing from the verified thermal level beyond the first predetermined threshold, a determination is made as to whether there is a known profile of thread activity levels that substantially matches current thread activity levels. Responsive to identifying the known profile that substantially matches the current thread activity levels, thread activity levels are compared to the known profile of thread activity levels. Responsive to the thread activity levels differing from the known profile beyond a second predetermined threshold, an indication of suspected abnormal activity associated with the given functional unit is sent.Type: GrantFiled: August 28, 2013Date of Patent: December 22, 2015
-
Publication number: 20150363225Abstract: According to an aspect, a method for checkpointing in a hybrid computing node includes executing a task in a processing accelerator of the hybrid computing node. A checkpoint is created in a local memory of the processing accelerator. The checkpoint includes state data to restart execution of the task in the processing accelerator upon a restart operation. Execution of the task is resumed in the processing accelerator after creating the checkpoint. The state data of the checkpoint are transferred from the processing accelerator to a main processor of the hybrid computing node while the processing accelerator is executing the task.Type: ApplicationFiled: June 12, 2014Publication date: December 17, 2015Inventor: Chen-Yong Cher
-
Publication number: 20150363277Abstract: According to an aspect, a method for triggering creation of a checkpoint in a computer system includes executing a task in a processing node of the computer system and determining whether it is time to read a monitor associated with a metric of the task. The monitor is read to determine a value of the metric based on determining that it is time to read the monitor. A threshold for triggering creation of the checkpoint is determined based on the value of the metric. Based on determining that the value of the metric has crossed the threshold, the checkpoint including state data of the task is created to enable restarting execution of the task upon a restart operation.Type: ApplicationFiled: June 12, 2014Publication date: December 17, 2015Inventor: Chen-Yong Cher
-
Patent number: 9172714Abstract: A mechanism is provided for detecting malicious activity in a functional unit of a data processing system. A set of activity values associated with a set of functional units and a set of thermal levels associated with the set of functional units are monitored. For a current activity value associated with the functional unit in the set of functional units, a determination is made as to whether a thermal level associated with the functional unit differs from a verified thermal level beyond a predetermined threshold. Responsive to the thermal level associated with the functional unit differing from the verified thermal level beyond the predetermined threshold, sending an indication of suspected abnormal activity associated with the given functional unit.Type: GrantFiled: August 28, 2013Date of Patent: October 27, 2015
-
Publication number: 20150268856Abstract: There is provided a method for managing a solid state storage system with hybrid storage technologies. The method includes monitoring one or more storage request streams to identify operating mode characteristics therein from among a set of possible operating mode characteristics. The set of possible operating mode characteristics correspond to a set of available operating modes of the hybrid storage technologies. The method further includes identifying a current operating mode from among the set of available operating modes responsive to the identified operating mode characteristics. The method also includes predicting a likely future operating mode responsive to variations in workload requirements to generate at least one future operating mode prediction. The method additionally includes controlling at least one of data placement, wear leveling, and garbage collection, responsive to the at least one future operating mode prediction.Type: ApplicationFiled: March 20, 2014Publication date: September 24, 2015Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Chen-Yong Cher, Michele M. Franceschini, Ashish Jagmohan
-
Publication number: 20150227426Abstract: A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.Type: ApplicationFiled: February 8, 2014Publication date: August 13, 2015Applicant: International Business Machines CorporationInventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Patent number: 9088597Abstract: A mechanism is provided for detecting malicious activity in a functional unit of a data processing system. A set of activity values associated with a set of functional units and a set of thermal levels associated with the set of functional units are monitored. For a current activity value associated with the functional unit in the set of functional units, a determination is made as to whether a thermal level associated with the functional unit differs from a verified thermal level beyond a predetermined threshold. Responsive to the thermal level associated with the functional unit differing from the verified thermal level beyond the predetermined threshold, sending an indication of suspected abnormal activity associated with the given functional unit.Type: GrantFiled: September 19, 2013Date of Patent: July 21, 2015
-
Patent number: 9081501Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).Type: GrantFiled: January 10, 2011Date of Patent: July 14, 2015Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
-
Publication number: 20150178089Abstract: There is provided a processor implemented method for controlling a lock-stepped cohort. The method includes receiving instructions for each of a first lane and a second lane. The first lane is for the lock-stepped cohort and the second lane is for another cohort. The method further includes detecting a condition in which a first instruction at the first lane will have a higher latency than a second instruction at the second lane. The method also includes setting an indicator indicating where the first lane encountered the first instruction. The method additionally includes setting the first lane to inactive, while keeping the second lane active. The method further includes setting the first lane to active on a subsequent opportunity to execute said first instruction.Type: ApplicationFiled: December 23, 2013Publication date: June 25, 2015Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jose R. Brunheroto, Chen-Yong Cher, Hubertus Franke, Jamin Naghmouchi
-
Publication number: 20150074367Abstract: A method for faulty memory utilization in a memory system includes: obtaining information regarding memory health status of at least one memory page in the memory system; determining an error tolerance of the memory page when the information regarding memory health status indicates that a failure is predicted to occur in an area of the memory system affecting the memory page; initiating a migration of data stored in the memory page when it is determined that the data stored in the memory page is non-error-tolerant; notifying at least one application regarding a predicted operating system failure and/or a predicted application failure when it is determined that data stored in the memory page is non-error-tolerant and cannot be migrated; and notifying at least one application regarding the memory failure predicted to occur when it is determined that data stored in the memory page is error-tolerant.Type: ApplicationFiled: September 9, 2013Publication date: March 12, 2015Applicant: International Business Machines CorporationInventors: Chen-Yong Cher, Carlos H. Andrade Costa, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu