Patents by Inventor Youfeng Wu

Youfeng Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160116963
    Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
    Type: Application
    Filed: January 2, 2016
    Publication date: April 28, 2016
    Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
  • Publication number: 20160116964
    Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
    Type: Application
    Filed: January 2, 2016
    Publication date: April 28, 2016
    Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
  • Publication number: 20160092223
    Abstract: A processor of an aspect includes a decode unit to decode a persistent store fence instruction. The processor also includes a memory subsystem module coupled with the decode unit. The memory subsystem module, in response to the persistent store fence instruction, is to ensure that a given data corresponding to the persistent store fence instruction is stored persistently in a persistent storage before data of all subsequent store instructions is stored persistently in the persistent storage. The subsequent store instructions occur after the persistent store fence instruction in original program order. Other processors, methods, systems, and articles of manufacture are also disclosed.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Applicant: Intel Corporation
    Inventors: Cheng Wang, Youfeng Wu, Rajesh M Sankaran
  • Publication number: 20160092234
    Abstract: An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
    Type: Application
    Filed: September 26, 2014
    Publication date: March 31, 2016
    Inventors: Nalini Vasudevan, Cheng Wang, Youfeng Wu, Albert Hartono, Sara S. Baghsorkhi
  • Publication number: 20160092285
    Abstract: A computer-implemented method for managing loop code in a compiler includes using a conflict detection procedure that detects across-iteration dependency for arrays of single memory addresses to determine whether a potential across-iteration dependency exists for arrays of memory addresses for ranges of memory accessed by the loop code.
    Type: Application
    Filed: September 25, 2014
    Publication date: March 31, 2016
    Inventors: Albert Hartono, Nalini Vasudevan, Sara S. Baghsorkhi, Cheng Wang, Youfeng Wu
  • Patent number: 9292221
    Abstract: Embodiments of the present disclosure describe a processor, which may include copy circuitry coupled to a shadow register file and a control register. The copy circuitry may be configured to copy content from a range of a number of registers to a shadow range of the shadow register file in a forward or backward direction. The forward or backward direction may be based at least in part on a value stored in the control register.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: March 22, 2016
    Assignee: Intel Corporation
    Inventors: Cheng Wang, Youfeng Wu, Jaewoong Chung
  • Publication number: 20160019038
    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
    Type: Application
    Filed: September 28, 2015
    Publication date: January 21, 2016
    Inventors: Mauricio Breternitz, Jr., Youfeng Wu, Cheng Wang, Edson Borin, Shiliang Hu, Craig B. Zilles
  • Patent number: 9239712
    Abstract: Apparatuses and methods may provide for determining a level of performance for processing one or more loops by a dynamic compiler and executing code optimizations to generate a pipelined schedule for the one or more loops that achieves the determined level of performance within a prescribed time period. In one example, a dependence graph may be established for the one or more loops, and each dependence graph may be partitioned into stages based on the level of performance.
    Type: Grant
    Filed: March 29, 2013
    Date of Patent: January 19, 2016
    Assignee: Intel Corporation
    Inventors: Hongbo Rong, Hyunchul Park, Youfeng Wu
  • Patent number: 9223714
    Abstract: A system, processor, and method to predict with high accuracy and retain instruction boundaries for previously executed instructions in order to decode variable length instructions are disclosed. In at least one embodiment, a disclosed processor includes an instruction fetch unit, an instruction cache, a boundary byte predictor, and an instruction decoder. In some embodiments, the instruction fetch unit provides an instruction address and the instruction cache produces an instruction tag and instruction cache content corresponding to the instruction address. The instruction decoder, in some embodiments, includes boundary byte logic to determine an instruction boundary in the instruction cache content.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: December 29, 2015
    Assignee: Intel Corporation
    Inventors: Mauricio Breternitz, Youfeng Wu, Peter Sassone, James Mason, Aashish Phansalkar, Balaji Vijayan
  • Publication number: 20150363306
    Abstract: Methods and systems to identify threads responsible for causing a concurrency bug in a computer program having a plurality of concurrently executing threads are disclosed. An example method disclosed herein includes defining, with a processor, a data type. The data type includes a first predicate, the first predicate being invoked using a first program instruction inserted in a first thread of the plurality of threads, a second predicate, the second predicate being invoked using a second program instruction inserted in a second thread of the plurality of threads, and an expression defining a relationship between the first predicate and the second predicate. The method further includes, in response to determining the relationship is satisfied during execution of the computer program, identifying the first thread and the second thread as responsible for the concurrency bug.
    Type: Application
    Filed: August 26, 2015
    Publication date: December 17, 2015
    Inventors: Youfeng Wu, Justin Gottschlich, Gilles Pokam, Shiliang Hu, Ali-Reza Adl-Tabatabai
  • Publication number: 20150363242
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to manage concurrent predicate expressions. An example method discloses inserting a first condition hook into a first thread, the first condition hook associated with a first condition, inserting a second condition hook into a second thread, the second condition hook associated with a second condition, preventing the second thread from executing until the first condition is satisfied, and identifying a concurrency violation when the second condition is satisfied.
    Type: Application
    Filed: August 24, 2015
    Publication date: December 17, 2015
    Inventors: Justin E. Gottschlich, Cristiano Ligieri Pereira, Gilles Pokam, Youfeng Wu
  • Patent number: 9170792
    Abstract: In an embodiment, a system includes a processor including at least one core to execute operations of a loop that includes S stages. The system also includes stage insertion means for adding a delay stage to the loop to increase a lifetime of a corresponding register associated with a first variable of the loop and to delay storage of contents of the register. The system also includes a dynamic random access memory (DRAM). Other embodiments are described and claimed.
    Type: Grant
    Filed: May 30, 2013
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventors: Hyunchul Park, Hongbo Rong, Youfeng Wu
  • Patent number: 9152417
    Abstract: Embodiments of apparatus, computer-implemented methods, systems, and computer-readable media are described herein for expediting execution time memory alias checking. A sequence of instructions targeted for execution on an execution processor may be received or retrieved. The execution processor may include a plurality of alias registers and circuitry configured to check entries in the alias register for memory aliasing. One or more optimizations may be performed on the received or retrieved sequence of instructions to optimize execution performance of the received or retrieved sequence of instructions. This may include a reorder of a plurality of memory instructions in the received or retrieved sequence of instructions. After the optimization, one or more move instructions may be inserted in the optimized sequence of instructions to move one or more entries among the alias registers during execution, to expedite alias checking at execution time. Other embodiments may be described and/or claimed.
    Type: Grant
    Filed: September 27, 2011
    Date of Patent: October 6, 2015
    Assignee: Intel Corporation
    Inventors: Cheng Wang, Youfeng Wu
  • Publication number: 20150277866
    Abstract: In an embodiment, a processor includes at least one core and a dynamic language accelerator to execute a bytecode responsive to a memory mapped input/output (MMIO) operation on a file descriptor associated with the dynamic language accelerator. The processor may block execution of native code while the dynamic language accelerator executes the bytecode. Other embodiments are described and claimed.
    Type: Application
    Filed: March 26, 2014
    Publication date: October 1, 2015
    Inventors: Cheng Wang, Youfeng Wu, Hongbo Rong, Hyunchul Park
  • Publication number: 20150277968
    Abstract: A system is disclosed that includes a processor and a dynamic random access memory (DRAM). The processor includes a hybrid transactional memory (HyTM) that includes hardware transactional memory (HTM), and a program debugger to replay a program that includes an HTM instruction and that has been executed using the HyTM. The program debugger includes a software emulator that is to replay the HTM instruction by emulation of the HTM. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: March 26, 2014
    Publication date: October 1, 2015
    Inventors: Justin E. Gottschlich, Gilles A. Pokam, Shiliang Hu, Rolf Kassa, Youfeng Wu, Irina Calciu
  • Patent number: 9146844
    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
    Type: Grant
    Filed: May 13, 2013
    Date of Patent: September 29, 2015
    Assignee: Intel Corporation
    Inventors: Mauricio Breternitz, Youfeng Wu, Cheng Wang, Edson Borin, Shiliang Hu, Craig B. Zilles
  • Publication number: 20150268940
    Abstract: Technologies for automatic loop vectorization include a computing device with an optimizing compiler. During an optimization pass, the compiler identifies a loop and generates a transactional code segment including a vectorized implementation of the loop body including one or more vector memory read instructions capable of generating an exception. The compiler also generates a non-transactional fallback code segment including a scalar implementation of the loop body that is executed in response to an exception generated within the transactional code segment. The compiler may detect whether the loop contains a memory read dependent on a condition that may be updated in a previous iteration or whether the loop contains a potential data dependence between two iterations. The compiler may generate a dynamic check for an actual data dependence and an explicit transactional abort instruction to be executed when an actual data dependence exists. Other embodiments are described and claimed.
    Type: Application
    Filed: March 21, 2014
    Publication date: September 24, 2015
    Inventors: Sara S. Baghsorkhi, Albert Hartono, Youfeng Wu, Nalini Vasudevan, Cheng Wang
  • Patent number: 9135139
    Abstract: Methods and systems to identify and reproduce concurrency bugs in multi-threaded programs are disclosed. An example method disclosed herein includes defining a data type. The data type includes a first predicate associated with a first thread of a multi-threaded program that is associated with a first condition, a second predicate that is associated with a second thread of the multi-threaded program, the second predicate being associated with a second condition, and an expression that defines a relationship between the first predicate and the second predicate. The relationship, when satisfied, causes the concurrency bug to be detected. A concurrency bug detector conforming to the data type is used to detect the concurrency bug in the multi-threaded program.
    Type: Grant
    Filed: June 27, 2012
    Date of Patent: September 15, 2015
    Assignee: Intel Corporation
    Inventors: Youfeng Wu, Justin E. Gottschlich, Gilles Pokam, Shiliang Hu, Ali-Reza Adl-Tabatabai, Cristiano L. Pereira
  • Patent number: 9117021
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to manage concurrent predicate expressions. An example method discloses inserting a first condition hook into a first thread, the first condition hook associated with a first condition, inserting a second condition hook into a second thread, the second condition hook associated with a second condition, preventing the second thread from executing until the first condition is satisfied, and identifying a concurrency violation when the second condition is satisfied.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: August 25, 2015
    Assignee: Intel Corporation
    Inventors: Justin E. Gottschlich, Cristiano Ligieri Pereira, Gilles Pokam, Youfeng Wu
  • Publication number: 20150212836
    Abstract: Methods and apparatus relating to conjugate code generation for efficient dynamic optimizations are described. In an embodiment, a binary code and an intermediate representation (IR) code are generated based at least partially on a source program. The binary code and the intermediate code are transmitted to a virtual machine logic. The binary code and the IR code each include a plurality of regions that are in one-to-one correspondence. Other embodiments are also claimed and described.
    Type: Application
    Filed: October 24, 2013
    Publication date: July 30, 2015
    Inventors: Hongbo Rong, Hyunchul Park, Cheng Wang, Youfeng Wu