Patents by Inventor Gheorghe C. Cascaval

Gheorghe C. Cascaval has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8255913
    Abstract: In a global shared memory (GSM) environment, a method provides local notification of completion of a GSM operation processed by a first task executing at a local node of a distributed system. The system includes multiple nodes on which different tasks of a single job execute and perform GSM operations that are received from a second task via a host fabric interface (HFI) and an associated HFI window assigned to the first task. The local task initiates execution of a GSM operation on the local node. The task then monitors for and detects completion of the execution of the GSM operation on the local node. When the task detects completion of the execution of the GSM operation, the task issues an internal notification to inform the locally executing tasks of the completion of the GSM operation.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
  • Patent number: 8255626
    Abstract: Mechanisms for performing predicated atomic commits based on consistency of watches are provided. These mechanisms include executing, by a thread executing on a processor of a data processing system, an atomic release instruction. A determination is made as to whether a speculative store has been lost, due to an eviction of a memory block to which the speculative store is performed, since a previous atomic release instruction was processed. In response to the speculative store having been lost, the processor invalidates the speculative stores that have been performed since the previous atomic release instruction was processed. In response to the speculative store not having been lost, the processor commits the speculative stores that have been performed since the previous atomic release instruction was processed.
    Type: Grant
    Filed: December 9, 2009
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Colin B. Blundell, Harold W. Cain, III, Gheorghe C. Cascaval, Maged M. Michael
  • Patent number: 8250307
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8239879
    Abstract: A method for providing global notification of completion of a global shared memory (GSM) operation during processing by a target task executing at a target node of a distributed system. The distributed system has at least one other node on which an initiating task that generated the GSM operation is homed. The target task receives the GSM operation from the initiating task, via a host fabric interface (HFI) window assigned to the target task. The task initiates execution of the GSM operation on the target node. The task detects completion of the execution of the GSM operation on the target node, and issues a global notification to at least the initiating task. The global notification indicates the completion of the execution of the GSM operation to one or more tasks of a single job distributed across multiple processing nodes.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
  • Patent number: 8136103
    Abstract: A method for combined static and dynamic compilation of program code to remove delinquent loads can include statically compiling source code into executable code with instrumented sections each being suspected of including a delinquent load, and also into a separate intermediate language representation with annotated portions each corresponding to one of the instrumented sections. The method also can include executing the instrumented sections repeatedly and monitoring cache misses for each execution. Finally, the method can include dynamically recompiling selected ones of the instrumented sections using corresponding ones of the annotated portions of the separate intermediate language representation only after a threshold number of executions of the selected ones of the instrumented sections, each recompilation including a pre-fetch directive at a pre-fetch distance tuned to avoid the delinquent load.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: March 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Gheorghe C. Cascaval, Yaoqing Gao, Allan H. Kielstra, Kevin A. Stoodley
  • Patent number: 8122439
    Abstract: A method and computer product for dynamically and precisely discovering delinquent memory operations through integration of compilers, performance monitoring tools, and analysis tools are provided. The method includes compiling an application and linking the application with a tracing library to generate an executable, compiler-annotated information, and linker mapping information. The application is executed to obtain runtime trace information that includes hardware performance counters and tracing library instrumentation events. The trace information, the compiler-annotated information, and the linker mapping information are analyzed to produce a delinquent memory operation file containing delinquent memory operation information. The delinquent memory operation information of the delinquent memory operation file is read by the compiler to perform memory reference mapping to guide static analysis and memory hierarchy optimization.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: February 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Gheorghe C. Cascaval, Yaoqing C. Gao, Kamen Y. Yotov
  • Publication number: 20110258532
    Abstract: Methods and devices for accelerating webpage rendering by a browser store document object model (DOM) tree structures and computations of rendered pages, and compare portions of a DOM tree of pages being rendered to determine whether portions of the DOM tree structures match. If a DOM tree of a webpage to be rendered matches a DOM tree stored in memory, the computations associated with the matching DOM tree may be recalled from memory, obviating the need to perform the calculations to render the page. A tree isomorphism algorithm may be used to recognize DOM trees stored in memory that match the DOM tree of the webpage to be rendered. Reusing rendering computations may significantly reduce the time and resources required for rendering web pages. Identifying reusable portions of calculation results based on DOM tree isomorphism enables the browser to reuse stored webpage rendering calculations even when URLs do not match.
    Type: Application
    Filed: April 28, 2011
    Publication date: October 20, 2011
    Inventors: Luis Ceze, Gheorghe C. Cascaval, Bin Wang, Michael P. Mahan, Chetan S. Dhillon, Wendell Ruotsi, Vikram Mandyam
  • Publication number: 20110138126
    Abstract: Mechanisms for performing predicated atomic commits based on consistency of watches are provided. These mechanisms include executing, by a thread executing on a processor of a data processing system, an atomic release instruction. A determination is made as to whether a speculative store has been lost, due to an eviction of a memory block to which the speculative store is performed, since a previous atomic release instruction was processed. In response to the speculative store having been lost, the processor invalidates the speculative stores that have been performed since the previous atomic release instruction was processed. In response to the speculative store not having been lost, the processor commits the speculative stores that have been performed since the previous atomic release instruction was processed.
    Type: Application
    Filed: December 9, 2009
    Publication date: June 9, 2011
    Applicant: International Business Machines Corporation
    Inventors: Colin B. Blundell, Harold W. Cain, III, Gheorghe C. Cascaval, Maged M. Michael
  • Publication number: 20110066831
    Abstract: A method, system and computer program product for issuing one or more software initiated operations for creating a checkpoint of a register file and memory, and for restoring a register file and memory to the checkpointed state. At the execution of a checkpoint operation, the system returns a condition code indicating success or failure. When the condition code is set equal to one, one or more checkpoints are initiated. Contents of the register file and gated store buffer are stored each time the one or more checkpoints are initiated. When the checkpoint is created, the system notifies software when a hardware checkpoint capacity has been reached. One or more of the software checkpoint, hardware checkpoint, and handler checkpoint are utilized to provide a more precise point of restoration. During software execution, the register file and gated store buffer can be restored as defined by the one or more previous checkpoints.
    Type: Application
    Filed: September 15, 2009
    Publication date: March 17, 2011
    Applicant: IBM Corporation
    Inventors: Colin B. Blundell, Harold W. Cain, III, Gheorghe C. Cascaval, Maged M. Michael
  • Publication number: 20110066820
    Abstract: A method, a system and a computer program product for handling speculative stores. The system determines when a speculative store buffer is not full. An indicator is generated when the speculative store buffer is not full, and the speculative stores are input into the speculative store buffer. When the speculative store buffer is full, a full buffer indicator is generated. Speculative stores prevented from entering the speculative store buffer are overflow stores. The overflow list is searched to determine whether one or more addresses of the overflow stores are present in the overflow list. When one or more addresses of the overflow stores are not present in the overflow list, the overflow stores are stored in the overflow list.
    Type: Application
    Filed: September 15, 2009
    Publication date: March 17, 2011
    Applicant: IBM Corporation
    Inventors: Colin B. Blundell, Harold W. Cain, III, Gheorghe C. Cascaval, Maged M. Michael
  • Publication number: 20110016470
    Abstract: Mechanisms are provided for handling conflicts in a transactional memory system. The mechanisms execute threads in a data processing system in a first conflict resolution mode of operation in which threads execute conflicting transactional blocks speculatively. The mechanisms determine, for a transactional block, if the first conflict resolution mode of operation is to be transitioned to a second conflict resolution mode of operation in which threads accessing conflicting transactional blocks are executed serially and non-speculatively. Moreover, the mechanisms execute a thread that accesses the transactional block using the second conflict resolution mode of operation in response to the determination indicating that the first conflict resolution mode of operation is to be transitioned to the second conflict resolution mode of operation.
    Type: Application
    Filed: July 17, 2009
    Publication date: January 20, 2011
    Applicant: International Business Machines Corporation
    Inventors: Harold W. Cain, III, Gheorghe C. Cascaval, Maged M. Michael
  • Publication number: 20100293339
    Abstract: A method of data processing in a processor includes maintaining a usage history indicating demand usage of prefetched data retrieved into cache memory. An amount of data to prefetch by a data prefetch request is selected based upon the usage history. The data prefetch request is transmitted to a memory hierarchy to prefetch the selected amount of data into cache memory.
    Type: Application
    Filed: February 1, 2008
    Publication date: November 18, 2010
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 7610266
    Abstract: A method for vertical integrated performance and environment monitoring includes steps, or acts, of: defining one or more events to provide a unified specification; registering one or more events to be detected; detecting an occurrence of at least one of the registered event or events; generating a monitoring entry each time one of the registered events is detected; and entering each of the monitoring entries generated into a single logical entity.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: October 27, 2009
    Assignee: International Business Machines Corporation
    Inventors: Gheorghe C. Cascaval, Evelyn Duesterwald, Peter F. Sweeney, Robert W. Wisniewski
  • Publication number: 20090249316
    Abstract: A method for combined static and dynamic compilation of program code to remove delinquent loads can include statically compiling source code into executable code with instrumented sections each being suspected of including a delinquent load, and also into a separate intermediate language representation with annotated portions each corresponding to one of the instrumented sections. The method also can include executing the instrumented sections repeatedly and monitoring cache misses for each execution. Finally, the method can include dynamically recompiling selected ones of the instrumented sections using corresponding ones of the annotated portions of the separate intermediate language representation only after a threshold number of executions of the selected ones of the instrumented sections, each recompilation including a pre-fetch directive at a pre-fetch distance tuned to avoid the delinquent load.
    Type: Application
    Filed: March 28, 2008
    Publication date: October 1, 2009
    Applicant: International Business Machines Corporation
    Inventors: Gheorghe C. Cascaval, Yaoqing Gao, Allan H. Kielstra, Kevin A. Stoodley
  • Patent number: 7596680
    Abstract: A system and method to extend the number of architecturally visible registers in a processor while preserving the number of bits of the instruction encoding. The system comprises: an indirection table that encodes register patterns for the registers used in an instruction; instructions to load and store such table entries; a mechanism to identify instructions that use the indirection table; and a mechanism to identify a set of bits in instructions that are used to index into the indirection table. According to another embodiment, a method of encoding registers in a computer instruction comprises constructing a table having a plurality of entries. Each entry specifies a combination of a plurality of registers. The method also comprises generating an instruction having a reference to one of the entries in the table. The method then comprises accessing the plurality of registers specified by the referenced table entry.
    Type: Grant
    Filed: September 15, 2003
    Date of Patent: September 29, 2009
    Assignee: International Business Machines Corporation
    Inventors: Gheorghe C. Cascaval, Siddhartha Chatterjee
  • Publication number: 20090198965
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090199191
    Abstract: In a global shared memory (GSM) environment, a method provides local notification of completion of a GSM operation processed by a first task executing at a local node of a distributed system. The system includes multiple nodes on which different tasks of a single job execute and perform GSM operations that are received from a second task via a host fabric interface (HFI) and an associated HFI window assigned to the first task. The local task initiates execution of a GSM operation on the local node. The task then monitors for and detects completion of the execution of the GSM operation on the local node. When the task detects completion of the execution of the GSM operation, the task issues an internal notification to inform the locally executing tasks of the completion of the GSM operation.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
  • Publication number: 20090199182
    Abstract: A method for providing global notification of completion of a global shared memory (GSM) operation during processing by a target task executing at a target node of a distributed system. The distributed system has at least one other node on which an initiating task that generated the GSM operation is homed. The target task receives the GSM operation from the initiating task, via a host fabric interface (HFI) window assigned to the target task. The task initiates execution of the GSM operation on the target node. The task detects completion of the execution of the GSM operation on the target node, and issues a global notification to at least the initiating task. The global notification indicates the completion of the execution of the GSM operation to one or more tasks of a single job distributed across multiple processing nodes.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Lakshminarayana B. Arimilli, Robert S. Blackmore, Gheorghe C. Cascaval, Ramakrishnan Rajamony
  • Publication number: 20090198910
    Abstract: According to a method of data processing in a multiprocessor data processing system, in response to a processor touch request targeting a target granule of a cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a partial touch request that requests a copy of only the target granule for subsequent query access. In response to a combined response to the partial touch request indicating success, the combined response representing a system-wide response to the partial touch request, the processing unit receives the target granule of the target cache line and updates a coherency state of the target granule while retaining a coherency state of at least one other granule of the cache line.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090198903
    Abstract: In at least one embodiment, a processor detects during execution of program code whether a load instruction within the program code is associated with a hint. In response to detecting that the load instruction is not associated with a hint, the processor retrieves a full cache line of data from the memory hierarchy into the processor in response to the load instruction. In response to detecting that the load instruction is associated with a hint, the processor retrieves a partial cache line of data into the processor from the memory hierarchy in response to the load instruction.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
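
The hint-controlled load described in the last abstract above (publication 20090198903) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the 128-byte line size, the 32-byte partial size, and the names `Memory` and `load` are all hypothetical and do not come from the filing.

```python
from dataclasses import dataclass

LINE_SIZE = 128     # bytes in a full cache line (assumed value)
GRANULE_SIZE = 32   # bytes in a partial fetch (assumed value)


@dataclass
class Memory:
    """Toy memory hierarchy that counts the bytes it hands to the processor."""
    data: bytes
    bytes_transferred: int = 0

    def read(self, start: int, length: int) -> bytes:
        self.bytes_transferred += length
        return self.data[start:start + length]


def load(memory: Memory, address: int, hinted: bool) -> bytes:
    """Return the bytes retrieved for a load at `address`.

    A load without a hint retrieves the full enclosing cache line.
    A load with a hint retrieves only the enclosing granule.
    """
    size = GRANULE_SIZE if hinted else LINE_SIZE
    base = (address // size) * size  # align down to the fetch unit
    return memory.read(base, size)
```

With `Memory(data=bytes(range(256)))`, an unhinted load at address 70 would retrieve bytes 0 through 127, while a hinted load at the same address would retrieve only bytes 64 through 95, so the hinted load moves a quarter of the data across the memory interface in this model.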