Patents by Inventor Dz Ching Ju

Dz Ching Ju has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100250854
    Abstract: An efficient and effective compiler data prefetching technique is disclosed in which memory accesses that may be prefetched are represented as linear induction expressions. Furthermore, indirect memory accesses indexed by other memory accesses of linear induction expressions in scalar loops may be prefetched.
    Type: Application
    Filed: March 16, 2010
    Publication date: September 30, 2010
    Inventor: Dz-ching Ju
  • Publication number: 20090133023
    Abstract: Systems and methods provide a single reader single writer (SRSW) queue structure having entries that can be concurrently accessed in an atomic manner with a single memory access. The SRSW queues may be combined to create more complicated queues, including multiple reader single writer (MRSW), single reader multiple writer (SRMW), and multiple reader multiple writer (MRMW) queues.
    Type: Application
    Filed: December 29, 2005
    Publication date: May 21, 2009
    Inventors: Xiao-Feng Li, Dz-ching Ju
  • Patent number: 7529888
    Abstract: In some embodiments, the invention involves a system and method relating to software caching with bounded-error delayed updates. Embodiments of the present invention describe a delayed-update software-controlled cache, which may be used to reduce memory access latencies and improve throughput for domain specific applications that are tolerant of errors caused by delayed updates of cached values. In at least one embodiment of the present invention, software caching may be implemented by using a compiler to automatically generate caching code in application programs that must access and/or update memory. Cache is accessed for a period of time, even if global data has been updated, to delay costly memory accesses. Other embodiments are described and claimed.
    Type: Grant
    Filed: November 19, 2004
    Date of Patent: May 5, 2009
    Assignee: Intel Corporation
    Inventors: Michael Kerby Chen, Dz-ching Ju
  • Patent number: 7512738
    Abstract: Provided are a method, system, and program for allocating call stack frame entries at different memory levels to functions in a program. Functions in a program accessing state information stored in call stack frame entries are processed. Call stack frame entries are allocated to the state information for each function, wherein the call stack frame entries span multiple memory levels, and wherein one function is capable of being allocated stack entries in multiple memory levels.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: March 31, 2009
    Assignee: Intel Corporation
    Inventors: Vinod K. Balakrishnan, Ruiqi Lian, Junchao Zhang, Dz-ching Ju
  • Patent number: 7350024
    Abstract: A method for applying software controlled caching and ordered thread optimizations in network applications includes collecting statistics for program variables, selecting program variable candidates for ordered synchronization and/or software controlled cache optimization, performing a safety check to ensure candidates can be properly optimized, and generating code for selected optimization candidates.
    Type: Grant
    Filed: December 16, 2004
    Date of Patent: March 25, 2008
    Assignee: Intel Corporation
    Inventors: Michael Chen, Dz-ching Ju
  • Publication number: 20080065872
    Abstract: Methods and apparatus for preserving precise exceptions in code reordering by using control speculation are disclosed. A disclosed system uses a control speculation module to reorder instructions within an application program and preserve precise exceptions. Instructions, excepting and non-excepting, can be reordered by the control speculation module if the instructions meet certain conditions. When an excepting instruction is reordered, a check instruction is inserted into the program execution path and a recovery block is generated. The check instruction determines if the reordered excepting instruction actually needs to generate an exception. The recovery block contains instructions to revert the effects of code reordering. If the check instruction detects the need for an exception, the recovery block is executed to restore the architectural state of the processor and the exception is handled.
    Type: Application
    Filed: November 8, 2007
    Publication date: March 13, 2008
    Inventor: Dz-ching Ju
  • Patent number: 7313790
    Abstract: Methods and apparatus for preserving precise exceptions in code reordering by using control speculation are disclosed. A disclosed system uses a control speculation module to reorder instructions within an application program and preserve precise exceptions. Instructions, excepting and non-excepting, can be reordered by the control speculation module if the instructions meet certain conditions. When an excepting instruction is reordered, a check instruction is inserted into the program execution path and a recovery block is generated. The check instruction determines if the reordered excepting instruction actually needs to generate an exception. The recovery block contains instructions to revert the effects of code reordering. If the check instruction detects the need for an exception, the recovery block is executed to restore the architectural state of the processor and the exception is handled.
    Type: Grant
    Filed: June 23, 2003
    Date of Patent: December 25, 2007
    Assignee: Intel Corporation
    Inventor: Dz-ching Ju
  • Publication number: 20070226740
    Abstract: A system that concurrently executes threads of a multi-threaded application pauses the execution of one thread, then pauses the execution of another thread before the second thread alters a shared memory state. Chipsets and software to implement embodiments of the invention are also described and claimed.
    Type: Application
    Filed: February 28, 2006
    Publication date: September 27, 2007
    Inventors: Xiao-Feng Li, Dz-ching Ju
  • Publication number: 20070226720
    Abstract: A system and method are described for register optimization during code translation that utilize a technique which removes the time overhead for analyzing register usage and eliminates fixed restraints on the compiler register usage. The present invention for register optimization utilizes a compiler to produce a register usage bit vector in a NOP instruction within each basic block (i.e., subroutine, function, and/or procedure). Each bit in the bit vector represents a particular caller-saved register. A bit is set if, at the location of the NOP instruction, the compiler uses the corresponding register within the basic block containing that NOP instruction to hold information to be used at a later time. During the translation, the translator examines the register usage bit vector to very quickly determine which registers are free and can therefore be used during the register optimization without the need to save and restore the register values.
    Type: Application
    Filed: May 31, 2007
    Publication date: September 27, 2007
    Inventors: Ding-Kai Chen, Dz-Ching Ju
  • Patent number: 7257806
    Abstract: A system and method are described for register optimization during code translation that utilize a technique which removes the time overhead for analyzing register usage and eliminates fixed restraints on the compiler register usage. The present invention for register optimization utilizes a compiler to produce a register usage bit vector in a NOP instruction within each basic block (i.e., subroutine, function, and/or procedure). Each bit in the bit vector represents a particular caller-saved register. A bit is set if, at the location of the NOP instruction, the compiler uses the corresponding register within the basic block containing that NOP instruction to hold information to be used at a later time. During the translation, the translator examines the register usage bit vector to very quickly determine which registers are free and can therefore be used during the register optimization without the need to save and restore the register values.
    Type: Grant
    Filed: October 21, 1999
    Date of Patent: August 14, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Ding-Kai Chen, Dz-Ching Ju
  • Publication number: 20070130114
    Abstract: Methods and apparatus to optimize the processing throughput of data structures in programs are disclosed. A disclosed method to automatically optimize processing throughput of a data structure in a program comprises recording information representative of at least one access of the data structure, analyzing the representative information, and modifying the program to optimize the at least one access of the data structure based on the analysis, wherein modifying the program includes modifying at least one instruction of the program to translate one of the at least one access of the data structure from a first memory to a second memory.
    Type: Application
    Filed: October 16, 2006
    Publication date: June 7, 2007
    Inventors: Xiao-Feng Li, Lixia Liu, Dz-ching Ju
  • Publication number: 20070079079
    Abstract: There is provided a method and apparatus to reduce access to shared data storage. The apparatus analyzes a multithreaded application and generates metadata that is utilized to optimize the multithreaded application that executes on multiple processing elements.
    Type: Application
    Filed: September 30, 2005
    Publication date: April 5, 2007
    Inventors: Xiao-Feng Li, Haibo Lin, Dz-ching Ju
  • Publication number: 20070061286
    Abstract: A method and system to optimize throughput of executable program code are provided. The system comprises a profiler to receive a representation of a plurality of functions, an aggregator, and a mapper to map the plurality of aggregates to a plurality of processors. The aggregator may be configured to create an aggregate for each function from the plurality of functions thereby creating a plurality of aggregates, choose an optimization action between grouping and duplication based on the number of aggregates in the plurality of aggregates, the number of available processing elements (PEs), and execution time of each aggregate, and perform the chosen optimization action.
    Type: Application
    Filed: September 1, 2005
    Publication date: March 15, 2007
    Inventors: Lixia Liu, Dz-ching Ju, Michael Chen
  • Publication number: 20060282707
    Abstract: Techniques that may be utilized in a multiprocessor system are described. In one embodiment, one or more signals are generated to indicate that a breakpoint instruction is executed by one of the plurality of processors in the multiprocessor system.
    Type: Application
    Filed: June 9, 2005
    Publication date: December 14, 2006
    Inventors: Mark Rosenbluth, Xiao-Feng Li, Dz-ching Ju, Aaron Kunze
  • Publication number: 20060168399
    Abstract: A method for applying software controlled caching and ordered thread optimizations in network applications includes collecting statistics for program variables, selecting program variable candidates for ordered synchronization and/or software controlled cache optimization, performing a safety check to ensure candidates can be properly optimized, and generating code for selected optimization candidates.
    Type: Application
    Filed: December 16, 2004
    Publication date: July 27, 2006
    Inventors: Michael Chen, Dz-ching Ju
  • Patent number: 7082602
    Abstract: We disclose a function unit based finite state automata data structure for use in computer program compilers. According to an aspect of an embodiment, the data structure comprises a function unit vector, having no more used bits than there are issue ports for any particular microprocessor, and a plurality of valid template assignments for each function unit vector. In a preferred embodiment, the template assignments are constructed so as to account for dispersal rules associated with the particular microprocessor. Further, the template assignments can be sorted according to priority data.
    Type: Grant
    Filed: April 12, 2002
    Date of Patent: July 25, 2006
    Assignee: Intel Corporation
    Inventors: Chen Fu, Dong-Yuan Chen, Chengyong Wu, Dz-Ching Ju
  • Patent number: 7058937
    Abstract: A compiler comprising an integrated instruction scheduler and resource management system is provided. According to an aspect of an embodiment, the resource management system includes a function unit based finite state automata system. Instructions to be compiled are modeled through the function unit based finite state automata system based on their function unit usage, before they are emitted as compiled computer code. We also disclose a function unit based finite state automata data structure and computer implemented methods for making the same.
    Type: Grant
    Filed: April 12, 2002
    Date of Patent: June 6, 2006
    Assignee: Intel Corporation
    Inventors: Chen Fu, Dong-Yuan Chen, Chengyong Wu, Dz-Ching Ju
  • Publication number: 20060112237
    Abstract: In some embodiments, the invention involves a system and method relating to software caching with bounded-error delayed updates. Embodiments of the present invention describe a delayed-update software-controlled cache, which may be used to reduce memory access latencies and improve throughput for domain specific applications that are tolerant of errors caused by delayed updates of cached values. In at least one embodiment of the present invention, software caching may be implemented by using a compiler to automatically generate caching code in application programs that must access and/or update memory. Cache is accessed for a period of time, even if global data has been updated, to delay costly memory accesses. Other embodiments are described and claimed.
    Type: Application
    Filed: November 19, 2004
    Publication date: May 25, 2006
    Inventors: Michael Chen, Dz-ching Ju
  • Publication number: 20060070046
    Abstract: Provided are a method, system, and program for allocating call stack frame entries at different memory levels to functions in a program. Functions in a program accessing state information stored in call stack frame entries are processed. Call stack frame entries are allocated to the state information for each function, wherein the call stack frame entries span multiple memory levels, and wherein one function is capable of being allocated stack entries in multiple memory levels.
    Type: Application
    Filed: September 30, 2004
    Publication date: March 30, 2006
    Inventors: Vinod Balakrishnan, Ruiqi Lian, Junchao Zhang, Dz-ching Ju
  • Publication number: 20060002224
    Abstract: Operands may be assigned to physical registers within partitioned register banks by identifying possible candidate register banks for an operand. Prior to allocation of the operand to a candidate register bank, conflicts between candidate register banks, if any, may be identified and resolved.
    Type: Application
    Filed: June 30, 2004
    Publication date: January 5, 2006
    Applicant: Intel Corporation
    Inventors: Junchao Zhang, Dz-ching Ju, Ruiqi Lian, Guei-Yuan Lueh, Zhaoqing Zhang
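
Illustrative sketch (not taken from any of the filings above): the abstract of publication 20090133023 describes a single reader single writer (SRSW) queue whose entries can each be accessed with a single memory access. One common way to get that property, assumed here, is to let each slot carry its own empty/full state, so the reader and writer never share an index. The class name SRSWQueue, its method names, and the use of None as the empty-slot marker are hypothetical choices made for this example, and the code is not the claimed method.

```python
class SRSWQueue:
    """Bounded single-reader single-writer queue (illustrative sketch only)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # Assumption: None marks an empty slot, so None cannot be stored.
        self._slots = [None] * capacity
        self._head = 0  # private to the reader
        self._tail = 0  # private to the writer

    def try_enqueue(self, value):
        """Writer side: return True if stored, False if the queue is full."""
        if value is None:
            raise ValueError("None is reserved as the empty-slot marker")
        if self._slots[self._tail] is not None:
            return False  # the reader has not yet emptied this slot
        self._slots[self._tail] = value  # one store publishes the entry
        self._tail = (self._tail + 1) % len(self._slots)
        return True

    def try_dequeue(self):
        """Reader side: return the oldest value, or None if the queue is empty."""
        value = self._slots[self._head]  # one load inspects the entry
        if value is None:
            return None
        self._slots[self._head] = None  # hand the slot back to the writer
        self._head = (self._head + 1) % len(self._slots)
        return value


q = SRSWQueue(2)
q.try_enqueue("a")
q.try_enqueue("b")
first = q.try_dequeue()  # oldest entry comes out first
```

Because the writer only advances its own tail index and the reader only advances its own head index, the two sides communicate solely through the slot contents. On real hardware the slot store and load would also need to be atomic and suitably ordered; this sketch does not model that.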