Patents by Inventor Partha Pal Tirumalai

Partha Pal Tirumalai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8726251
    Abstract: Embodiments of the invention provide systems and methods for automatically parallelizing loops with non-speculative pipelined execution of chunks of iterations with pre-computation of selected values. Non-DOALL loops are identified and divided the loops into chunks. The chunks are assigned to separate logical threads, which may be further assigned to hardware threads. As a thread performs its runtime computations, subsequent threads attempt to pre-compute their respective chunks of the loop. These pre-computations may result in a set of assumed initial values and pre-computed final variable values associated with each chunk. As subsequent pre-computed chunks are reached at runtime, those assumed initial values can be verified to determine whether to proceed with runtime computation of the chunk or to avoid runtime execution and instead use the pre-computed final variable values.
    Type: Grant
    Filed: March 29, 2011
    Date of Patent: May 13, 2014
    Assignee: Oracle International Corporation
    Inventors: Spiros Kalogeropulos, Partha Pal Tirumalai
  • Publication number: 20120254888
    Abstract: Embodiments of the invention provide systems and methods for automatically parallelizing loops with non-speculative pipelined execution of chunks of iterations with pre-computation of selected values. Non-DOALL loops are identified and divided the loops into chunks. The chunks are assigned to separate logical threads, which may be further assigned to hardware threads. As a thread performs its runtime computations, subsequent threads attempt to pre-compute their respective chunks of the loop. These pre-computations may result in a set of assumed initial values and pre-computed final variable values associated with each chunk. As subsequent pre-computed chunks are reached at runtime, those assumed initial values can be verified to determine whether to proceed with runtime computation of the chunk or to avoid runtime execution and instead use the pre-computed final variable values.
    Type: Application
    Filed: March 29, 2011
    Publication date: October 4, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: SPIROS KALOGEROPULOS, PARTHA PAL TIRUMALAI
  • Patent number: 8166486
    Abstract: Methods and apparatus provide for a workload adjuster to estimate the startup cost of one or more non-main threads of loop execution and to estimate the amount of workload to be migrated between different threads. Upon deciding to parallelize the execution of a loop, the workload adjuster creates a scheduling policy with a workload for a main thread and workloads for respective non-main threads. The scheduling policy distributes iterations of a parallelized loop to the workload of the main thread and iterations of the parallelized loop to the workloads of the non-main threads. The workload adjuster evaluates a start-up cost of the workload of a non-main thread and, based on the start-up cost, migrates a portion of the workload for that non-main thread to the main thread's workload.
    Type: Grant
    Filed: December 4, 2007
    Date of Patent: April 24, 2012
    Assignee: Oracle America, Inc.,
    Inventors: Yonghong Song, Spiros Kalogeropulos, Partha Pal Tirumalai
  • Publication number: 20090144746
    Abstract: Methods and apparatus provide for a workload adjuster to estimate the startup cost of one or more non-main threads of loop execution and to estimate the amount of workload to be migrated between different threads. Upon deciding to parallelize the execution of a loop, the workload adjuster creates a scheduling policy with a workload for a main thread and workloads for respective non-main threads. The scheduling policy distributes iterations of a parallelized loop to the workload of the main thread and iterations of the parallelized loop to the workloads of the non-main threads. The workload adjuster evaluates a start-up cost of the workload of a non-main thread and, based on the start-up cost, migrates a portion of the workload for that non-main thread to the main thread's workload.
    Type: Application
    Filed: December 4, 2007
    Publication date: June 4, 2009
    Inventors: Yonghong Song, Spiros Kalogeropulos, Partha Pal Tirumalai
  • Patent number: 6678796
    Abstract: A method and apparatus for scheduling instructions to provide adequate prefetch latency is disclosed during compilation of a program code in to a program. The prefetch scheduler component of the present invention selects a memory operation within the program code as a “martyr load” and removes the prefetch associated with the martyr load, if any. The prefetch scheduler takes advantage of the latency associated with the martyr load to schedule prefetches for memory operations which follow the martyr load. The prefetches are scheduled “behind” (i.e., prior to) the martyr load to allow the prefetches to complete before the associated memory operations are carried out. The prefetch schedule component continues this process throughout the program code to optimize prefetch scheduling and overall program operation.
    Type: Grant
    Filed: October 3, 2000
    Date of Patent: January 13, 2004
    Assignee: Sun Microsystems, Inc.
    Inventors: Nicolai Kosche, Peter C. Damron, Joseph Chamdani, Partha Pal Tirumalai
  • Patent number: 6634024
    Abstract: The present invention integrates data prefetching into a modulo scheduling technique to provide for the generation of assembly code having improved performance. Modulo scheduling can produce optimal steady state code for many important cases by sufficiently separating defining instructions (producers) from using instructions (consumers), thereby avoiding machine stall cycles and simultaneously maximizing processor utilization. Integrating data prefetching within modulo scheduling yields high performance assembly code by prefetching data from memory while at the same time using modulo scheduling to efficiently schedule the remaining operations. The invention integrates data prefetching into modulo scheduling by postponing prefetch insertion until after modulo scheduling is complete. Actual insertion of the prefetch instructions occurs in a postpass after the generation of appropriate prologue-kernel-epilogue code.
    Type: Grant
    Filed: June 27, 2001
    Date of Patent: October 14, 2003
    Assignee: Sun Microsystems, Inc.
    Inventors: Partha Pal Tirumalai, Rajagopalan Mahadevan
  • Patent number: 6341370
    Abstract: The present invention integrates data prefetching into a modulo scheduling technique to provide for the generation of assembly code having improved performance. Modulo scheduling can produce optimal steady state code for many important cases by sufficiently separating defining instructions (producers) from using instructions (consumers), thereby avoiding machine stall cycles and simultaneously maximizing processor utilization. Integrating data prefetching within modulo scheduling yields high performance assembly code by prefetching data from memory while at the same time using modulo scheduling to efficiently schedule the remaining operations. The invention integrates data prefetching into modulo scheduling by postponing prefetch insertion until after modulo scheduling is complete. Actual insertion of the prefetch instructions occurs in a postpass after the generation of appropriate prologue-kernel-epilogue code.
    Type: Grant
    Filed: April 24, 1998
    Date of Patent: January 22, 2002
    Assignee: Sun Microsystems, Inc.
    Inventors: Partha Pal Tirumalai, Rajagopalan Mahadevan
  • Publication number: 20020007484
    Abstract: The present invention integrates data prefetching into a modulo scheduling technique to provide for the generation of assembly code having improved performance. Modulo scheduling can produce optimal steady state code for many important cases by sufficiently separating defining instructions (producers) from using instructions (consumers), thereby avoiding machine stall cycles and simultaneously maximizing processor utilization. Integrating data prefetching within modulo scheduling yields high performance assembly code by prefetching data from memory while at the same time using modulo scheduling to efficiently schedule the remaining operations. The invention integrates data prefetching into modulo scheduling by postponing prefetch insertion until after modulo scheduling is complete. Actual insertion of the prefetch instructions occurs in a postpass after the generation of appropriate prologue-kernel-epilogue code.
    Type: Application
    Filed: June 27, 2001
    Publication date: January 17, 2002
    Inventors: Partha Pal Tirumalai, Rajagopalan Mahadevan