Patents by Inventor Youfeng Wu

Youfeng Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for recovering data values in dynamic runtime systems

Patent number: 7308682

Abstract: An arrangement is provided for data value recovery in an optimized program by precisely allocating predicate registers to guard branching instructions in the optimized program at compilation time. At execution time, an execution path leading to a recovery point is determined based on values of predicate registers guarding branching blocks. The values of non-current and non-resident data may be recovered at the recovery point according to the determined execution path. Optimization annotations may also be utilized for data value recovery.

Type: Grant

Filed: April 25, 2003

Date of Patent: December 11, 2007

Assignee: Intel Corporation

Inventor: Youfeng Wu

Performing dynamic information flow tracking

Publication number: 20070240141

Abstract: In one embodiment, the present invention includes a method for instrumenting a code block with code to perform dynamic information flow tracking. Then during execution, it may be determined whether a pattern of input data to the code block has been previously received by the code block. If so, the code block may be executed, otherwise the instrumented code block may be executed. Other embodiments are described and claimed.

Type: Application

Filed: March 30, 2006

Publication date: October 11, 2007

Inventors: Feng Qin, Cheng Wang, Ho-Seop Kim, Yuanyuan Zhou, Youfeng Wu

Apparatus and method for redundant software thread computation

Publication number: 20070174837

Abstract: An apparatus and method for redundant software thread computation. In one embodiment, the method includes the replication of an application into two communicating threads, a leading thread and a trailing thread. In one embodiment, the trailing thread repeats computations performed by the leading thread to detect transient faults, referred to herein as “soft errors.” A first in, first out (FIFO) buffer of shared memory is reserved for passing data between the leading thread and the trailing thread. The FIFO buffer may include a buffer head variable to write data to the FIFO buffer and a buffer tail variable to read data from the FIFO buffer. In one embodiment, data passing between the leading thread and the trailing thread is restricted according to a data unit size, and thread synchronization between the leading thread and the trailing thread is limited to buffer overflow/underflow detection. Other embodiments are described and claimed.

Type: Application

Filed: December 30, 2005

Publication date: July 26, 2007

Inventors: Cheng Wang, Youfeng Wu
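
The leading/trailing thread scheme in the abstract above (publication 20070174837) can be illustrated with a small sketch. This is not the patented implementation: `RingBuffer`, `run_redundant`, and the `faulty_index` fault-injection parameter are hypothetical names, and the sketch guards every access with a condition variable for safety in Python, which is heavier synchronization than the abstract describes.

```python
import threading


class RingBuffer:
    """Fixed-size FIFO: head counts writes, tail counts reads."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0  # advanced by the leading thread
        self.tail = 0  # advanced by the trailing thread
        self.cond = threading.Condition()

    def put(self, value):
        with self.cond:
            # Wait only when the buffer would overflow.
            while self.head - self.tail == self.capacity:
                self.cond.wait()
            self.slots[self.head % self.capacity] = value
            self.head += 1
            self.cond.notify_all()

    def get(self):
        with self.cond:
            # Wait only when the buffer would underflow.
            while self.head == self.tail:
                self.cond.wait()
            value = self.slots[self.tail % self.capacity]
            self.tail += 1
            self.cond.notify_all()
            return value


def run_redundant(inputs, compute, faulty_index=None):
    """Run compute() in two threads; return indices where results disagree."""
    buf = RingBuffer(capacity=4)
    mismatches = []

    def leading():
        for i, x in enumerate(inputs):
            result = compute(x)
            if i == faulty_index:
                result += 1  # simulated soft error (assumption for the demo)
            buf.put(result)

    def trailing():
        for i, x in enumerate(inputs):
            if compute(x) != buf.get():
                mismatches.append(i)

    threads = [threading.Thread(target=leading), threading.Thread(target=trailing)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mismatches
```

Calling `run_redundant(range(10), lambda x: x * x, faulty_index=3)` reports the index of the injected fault, while a run without a fault reports no mismatches.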

Genetic algorithm for microcode compression

Publication number: 20070094164

Abstract: A method to compress microcode utilizing a genetic algorithm includes generating a population of chromosomes, each chromosome including one or more elements that indicate a cluster to which a portion of microcode memory belongs. The method further includes determining a fitness value of each chromosome and modifying the population of chromosomes based on the fitness values of the chromosomes to generate a new population of chromosomes. In addition, the method includes compressing the microcode memory using a cluster-based compression technique, wherein clusters are selected according to a chromosome from the new population with the best fitness value. Other embodiments are also disclosed.

Type: Application

Filed: September 27, 2005

Publication date: April 26, 2007

Inventors: Youfeng Wu, Mauricio Breternitz

Two-pass MRET trace selection for dynamic optimization

Publication number: 20070079293

Abstract: A first potential hot trace of a program is determined. A second potential hot trace of the program is determined. A common path from the first potential hot trace and the second potential hot trace is selected as the selected hot trace of the program.

Type: Application

Filed: September 30, 2005

Publication date: April 5, 2007

Inventors: Cheng Wang, Bixia Zheng, Ho-seop Kim, Mauricio Breternitz, Youfeng Wu
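
The abstract above (publication 20070079293) does not define how the "common path" of the two potential hot traces is computed. As a rough sketch only, the example below assumes each trace is a list of basic-block labels starting at the same trace head, and takes the shared prefix as the common path. The function name `common_path` and the block labels are hypothetical.

```python
def common_path(first_trace, second_trace):
    """Return the blocks both traces share before they first diverge."""
    path = []
    for a, b in zip(first_trace, second_trace):
        if a != b:
            break  # the traces diverge here
        path.append(a)
    return path


# Two passes over the same trace head that diverge after block "B".
first = ["A", "B", "C", "D"]
second = ["A", "B", "E"]
selected = common_path(first, second)
```

Under this assumption, only the blocks that both passes agree on end up in the selected hot trace.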

Compressing "warm" code in a dynamic binary translation environment

Publication number: 20070079296

Abstract: Selected regions of native instructions translated in a DBT environment from non-native instructions are compressed based on the independent compression of different fields of selected instructions using compression tables to reduce a length of selected fields. The regions of compressed instructions are stored and de-compressed into the native instructions during subsequent execution using de-compression tables. Specifically, for native instructions of a selected region, selected types of opcodes and/or operands may be compressed independently. The types may be selected by profiling the opcodes using benchmark programs and creating an opcode conversion table prior to compression, and scanning of the operands and creating an operand conversion table during compression of the opcodes.

Type: Application

Filed: September 30, 2005

Publication date: April 5, 2007

Inventors: Zhiyuan Li, Youfeng Wu

Apparatus and method for dynamic binary translator to support precise exceptions with minimal optimization constraints

Publication number: 20070079304

Abstract: A method and apparatus for dynamic binary translator to support precise exceptions with minimal optimization constraints. In one embodiment, the method includes the translation of a source binary application generated for a source instruction set architecture (ISA) into a sequential, intermediate representation (IR) of the source binary application. In one embodiment, the sequential IR is modified to incorporate exception recovery information for each of the exception instructions identified from the source binary application to enable a dynamic binary translator (DBT) to represent exception recovery values as regular values used by IR instructions. In one embodiment, the sequential IR may be optimized with a constraint on movement of an exception instruction downward past an irreversible instruction to form a non-sequential IR. In one embodiment, the non-sequential IR is optimized to form a translated binary application for a target ISA. Other embodiments are described and claimed.

Type: Application

Filed: September 30, 2005

Publication date: April 5, 2007

Inventors: Bixia Zheng, Cheng Wang, Ho-seop Kim, Mauricio Breternitz, Youfeng Wu

Run-ahead program execution with value prediction

Patent number: 7188234

Abstract: A data processing apparatus, a computer, an article including a machine-accessible medium, and a method of processing data are disclosed. The data processing apparatus may include a pair of pipelines sharing an instruction cache, data cache, and a branch predictor with the second pipeline running ahead of the first pipeline using a data value prediction module. The pipelines may be included in one or more processors and coupled to a memory to form a computer. The method includes executing a plurality of instructions using the pipeline pair, such that when a cache miss is encountered by the second pipeline during execution of a LOAD instruction, the data value prediction module supplies a predicted load value in lieu of a cached value, enabling continued execution of the plurality of instructions by the second pipeline without waiting for the return of the cached value.

Type: Grant

Filed: December 12, 2001

Date of Patent: March 6, 2007

Assignee: Intel Corporation

Inventors: Youfeng Wu, Tin-Fook Ngai

Compressing and accessing a microcode ROM

Publication number: 20070022279

Abstract: An arrangement is provided for compressing microcode ROM (“uROM”) in a processor and for efficiently accessing a compressed uROM. A clustering-based approach may be used to effectively compress a uROM. The approach groups similar columns of microcode into different clusters and identifies unique patterns within each cluster. Only unique patterns identified in each cluster are stored in a pattern storage. Indices, which help map an address of a microcode word (“uOP”) to be fetched from a uROM to unique patterns required for the uOP, may be stored in an index storage. Typically it takes a longer time to fetch a uOP from a compressed uROM than from an uncompressed uROM. The compressed uROM may be so designed that the process of fetching a uOP (or uOPs) from a compressed uROM may be fully pipelined to reduce the access latency.

Type: Application

Filed: July 20, 2005

Publication date: January 25, 2007

Inventors: Youfeng Wu, Sangwook Kim, Mauricio Breternitz, Herbert Hum

Cache mechanism

Patent number: 7120749

Abstract: According to one embodiment, a system is disclosed. The system includes a central processing unit (CPU), a first cache memory coupled to the CPU to store only data for vital loads that are to be immediately processed at the CPU, a second cache memory coupled to the CPU to store data for semi-vital loads to be processed at the CPU, and a third cache memory coupled to the CPU, the first cache memory and the second cache memory to store non-vital loads to be processed at the CPU.

Type: Grant

Filed: March 18, 2004

Date of Patent: October 10, 2006

Assignee: Intel Corporation

Inventors: Ryan Rakvic, Youfeng Wu, Bryan Black, John Shen

Method and system for reducing program code size

Publication number: 20060206886

Abstract: In a method for reducing code size, replaceable subsets of instructions at first locations in areas of infrequently executed instructions in a set of instructions and target subsets of instructions at second locations in the set of instructions are identified, wherein each replaceable subset matches at least one target subset. If multiple target subsets of instructions match one replaceable subset of instructions, one of the multiple matching target subsets is chosen as the matching target subset for the one replaceable subset based on whether the multiple target subsets are located in regions of frequently executed code. For each of at least some of the replaceable subsets of instructions, the replaceable subset of instructions is replaced with an instruction to cause the matching target subset of instructions at the second location to be executed.

Type: Application

Filed: December 22, 2004

Publication date: September 14, 2006

Applicant: INTEL CORPORATION

Inventors: Youfeng Wu, Mauricio Breternitz

Software set-value profiling and code reuse

Patent number: 7100155

Abstract: An apparatus and method for profiling candidate reuse regions and candidate load instructions aids in the selection of computation reuse regions and computation reuse instructions with good reuse qualities. Registers holding input values for candidate reuse regions are sampled periodically when the candidate reuse region is encountered. The register contents are combined into set-values. When a relatively small number of set-values account for a large percentage of occurrences, the candidate reuse region may be a good computation reuse region. Load instructions are profiled for the location accessed and the value loaded. The location and value are combined into location-values. The relative occurrence frequency of location-values can be used to evaluate load instructions as candidate instructions for reuse.

Type: Grant

Filed: March 10, 2000

Date of Patent: August 29, 2006

Assignee: Intel Corporation

Inventor: Youfeng Wu
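
The set-value test described in the abstract above (patent 7100155) can be sketched as follows. The sketch assumes each sample is a tuple of the input register contents captured on entry to a candidate region. The function name and the `top_k` and `threshold` parameters are hypothetical; the abstract gives no concrete cutoffs.

```python
from collections import Counter


def is_good_reuse_region(samples, top_k=2, threshold=0.8):
    """Check whether a few set-values account for most sampled entries.

    samples: one tuple of input register values per sampled region entry.
    top_k and threshold are illustrative cutoffs, not from the patent.
    """
    if not samples:
        return False
    counts = Counter(tuple(s) for s in samples)  # combine registers into set-values
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(samples) >= threshold


# Nine of ten samples share one set-value, so reuse looks promising.
skewed = [(1, 2)] * 9 + [(3, 4)]
# Every sample differs, so cached results would rarely be reused.
spread = [(i, i) for i in range(10)]
```

Here `is_good_reuse_region(skewed, top_k=1)` is true, while the same call on `spread` is false.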

Compressing microcode

Patent number: 7095342

Abstract: In one embodiment, the present invention includes a method to compress data stored in a memory to reduce size and power consumption. The method includes segmenting each word of a code portion into multiple fields, forming tables having unique entries for each of the fields, and assigning a pointer to each of the unique entries in each of the tables. Other embodiments are described and claimed.

Type: Grant

Filed: March 31, 2005

Date of Patent: August 22, 2006

Assignee: Intel Corporation

Inventors: Herbert Hum, Mauricio Breternitz, Jr., Youfeng Wu, Sangwook Kim
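
The three steps named in the abstract above (patent 7095342), segmenting words into fields, building tables of unique entries, and assigning pointers, can be sketched as below. The sketch represents microcode words as bit strings and uses fixed field widths. Both choices, and the function names, are assumptions made for illustration.

```python
def compress_words(words, field_widths):
    """Split each word into fields and replace each field with a table pointer.

    Returns (tables, pointers): one table of unique entries per field, and
    one row of table indices per original word.
    """
    tables = [[] for _ in field_widths]
    lookups = [{} for _ in field_widths]  # field value -> index in its table
    pointers = []
    for word in words:
        row, pos = [], 0
        for f, width in enumerate(field_widths):
            field = word[pos:pos + width]
            pos += width
            if field not in lookups[f]:
                lookups[f][field] = len(tables[f])
                tables[f].append(field)
            row.append(lookups[f][field])
        pointers.append(row)
    return tables, pointers


def decompress_words(tables, pointers):
    """Rebuild the original words by following each pointer into its table."""
    return ["".join(tables[f][p] for f, p in enumerate(row)) for row in pointers]


words = ["10100011", "10101111", "00000011"]
tables, pointers = compress_words(words, field_widths=[4, 4])
```

With these three 8-bit words, each 4-bit field has only two unique values, so every pointer needs a single bit instead of four.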

Method and system for reducing program code size

Publication number: 20060136678

Abstract: In a method for reducing code size, a replaceable subset of instructions at a first location within a set of instructions and a matching target subset of instructions at a second location within the set of instructions are identified. A base offset and a relative offset are determined. The base offset and the relative offset indicate an absolute offset from the first location to the second location. An instruction to cause a base offset storage element to be loaded with the base offset is inserted prior to the first location. The replaceable subset of instructions is replaced with a second instruction to cause a program counter to be modified based on the relative offset and a value in the base offset register so that the modified program counter indicates the second location.

Type: Application

Filed: December 22, 2004

Publication date: June 22, 2006

Applicant: INTEL CORPORATION

Inventors: Youfeng Wu, Mauricio Breternitz
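
The base-plus-relative addressing in the abstract above (publication 20060136678) can be illustrated with simple arithmetic. The abstract does not say how the absolute offset is divided, so the sketch assumes the relative offset is an unsigned value of `rel_bits` bits and the base offset holds the remainder. `split_offset`, `jump_target`, and `rel_bits` are hypothetical names.

```python
def split_offset(absolute, rel_bits=8):
    """Split an absolute offset into (base, relative) with 0 <= relative < 2**rel_bits."""
    span = 1 << rel_bits
    base = (absolute // span) * span  # floor division also handles backward jumps
    relative = absolute - base
    return base, relative


def jump_target(pc, base, relative):
    """New program counter computed from the base register and the short relative field."""
    return pc + base + relative


base, relative = split_offset(1000)        # forward jump of 1000
back_base, back_rel = split_offset(-300)   # backward jump of 300
```

The short relative field is what lets the replacing instruction stay small, while one base load can be shared by several nearby replacements under this assumed split.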

Method and apparatus for performing compiler transformation of software code using fastforward regions and value specialization

Patent number: 7039909

Abstract: A method and apparatus for providing compiler transformation of code using regions with simplified data and control flow and value specialization are described. In one embodiment, the method includes identifying in the code a plurality of potential candidates for value specialization, selecting a group of candidates from the plurality of potential candidates based on a value profile associated with each potential candidate, and determining specialized data for each selected candidate using a corresponding value profile. The method further includes forming a plurality of optimized regions based on corresponding specialized data. Each optimized region includes one or more selected candidates.

Type: Grant

Filed: September 29, 2001

Date of Patent: May 2, 2006

Assignee: Intel Corporation

Inventors: Youfeng Wu, Li-Ling Chen

Method and system for collaborative profiling for continuous detection of profile phase transitions

Patent number: 7032217

Abstract: A method and system for collaborative profiling for continuous detection of profile phase transitions is disclosed. In one embodiment, the method comprises using hardware and software to perform continuous edge profiling on a program; detecting profile phase transitions continuously; and optimizing the program based upon the profile phase transitions and edge profile.

Type: Grant

Filed: March 26, 2001

Date of Patent: April 18, 2006

Assignee: Intel Corporation

Inventor: Youfeng Wu

Efficient execution and emulation of bit scan operations

Publication number: 20050289203

Abstract: Methods are disclosed to implement bit scan operations using properties of two's complement arithmetic and compute zero index instructions. A data value may be provided and the most-significant or least-significant bit may be determined using the methods set forth herein.

Type: Application

Filed: June 24, 2004

Publication date: December 29, 2005

Inventors: Mauricio Breternitz, Youfeng Wu, Tal Abir
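
The two's complement property mentioned in the abstract above (publication 20050289203) can be shown in a few lines. This sketch is not the claimed method: it uses the well-known identity that `x & -x` isolates the lowest set bit, and Python's `int.bit_length()` stands in for the hardware index computation. The function names are hypothetical.

```python
def bit_scan_forward(x):
    """Index of the least-significant set bit of a positive integer."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    lowest = x & -x  # two's complement negation keeps only the lowest set bit
    return lowest.bit_length() - 1


def bit_scan_reverse(x):
    """Index of the most-significant set bit of a positive integer."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    return x.bit_length() - 1


value = 0b101000
lsb_index = bit_scan_forward(value)
msb_index = bit_scan_reverse(value)
```

For `0b101000`, the lowest set bit is at index 3 and the highest at index 5.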

Method, apparatus, and system to optimize frequently executed code and to use compiler transformation and hardware support to handle infrequently executed code

Patent number: 6964043

Abstract: The present invention relates to a method, apparatus, and system to optimize frequently executed code and to use compiler transformation and hardware support to handle infrequently executed code. The method includes compiling a computer program. The method further includes improving performance of the computer program by optimizing frequently executed code and using compiler transformation to handle infrequently executed code with hardware support. The method also includes storing temporarily the results produced during execution of a region to improve performance of the computer program. The method additionally includes committing the results produced when the execution of the region is completed successfully.

Type: Grant

Filed: October 30, 2001

Date of Patent: November 8, 2005

Assignee: Intel Corporation

Inventors: Youfeng Wu, Li-Ling Chen

Continuous trip count profiling for loop optimizations in two-phase dynamic binary translators

Publication number: 20050240896

Abstract: A method, machine readable medium, and system are disclosed. In one embodiment the method comprises collecting a loop trip count continuously during runtime of a region of code being executed that contains a loop, categorizing the trip count to identify one or more code modification techniques applicable to the loop, and dynamically applying the one or more applicable code modification techniques to alter the code that relates to the loop.

Type: Application

Filed: March 31, 2004

Publication date: October 27, 2005

Inventors: Youfeng Wu, Mauricio Breternitz
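
The collect, categorize, and apply flow in the abstract above (publication 20050240896) can be sketched as below. The abstract names neither the categories nor the code modification techniques, so the thresholds, the category labels, and the `TECHNIQUES` table are illustrative assumptions, as are all class and function names.

```python
from collections import defaultdict


class TripCountProfiler:
    """Accumulates loop trip counts as a (hypothetical) translator runs code."""

    def __init__(self):
        self.total_iterations = defaultdict(int)
        self.executions = defaultdict(int)

    def record(self, loop_id, trip_count):
        """Called each time a loop finishes, with its iteration count."""
        self.total_iterations[loop_id] += trip_count
        self.executions[loop_id] += 1

    def average(self, loop_id):
        return self.total_iterations[loop_id] / self.executions[loop_id]

    def category(self, loop_id, low=4, high=64):
        """Bucket the average trip count; the cutoffs are assumed values."""
        avg = self.average(loop_id)
        if avg < low:
            return "short"
        if avg < high:
            return "medium"
        return "long"


# Assumed mapping from category to candidate loop optimizations.
TECHNIQUES = {
    "short": ["loop peeling"],
    "medium": ["loop unrolling"],
    "long": ["software pipelining", "data prefetching"],
}


def applicable_techniques(profiler, loop_id):
    return TECHNIQUES[profiler.category(loop_id)]


profiler = TripCountProfiler()
profiler.record("inner", 2)
profiler.record("inner", 3)
profiler.record("outer", 1000)
```

Because the counts keep accumulating at runtime, the category of a loop, and therefore the suggested techniques, can change as the program moves between phases.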

Compiler-directed speculative approach to resolve performance-degrading long latency events in an application

Patent number: 6959435

Abstract: A compiler-directed speculative approach to resolve performance-degrading long latency events in an application is described. One or more performance-degrading instructions are identified from multiple instructions to be executed in a program. A set of instructions prefetching the performance-degrading instruction is defined within the program. Finally, at least one speculative bit of each instruction of the identified set of instructions is marked to indicate a predetermined execution of the instruction.

Type: Grant

Filed: September 28, 2001

Date of Patent: October 25, 2005

Assignee: Intel Corporation

Inventors: Dz-Ching Ju, Youfeng Wu
