Patents by Inventor Youfeng Wu

Youfeng Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10387296
    Abstract: Methods and systems to identify threads responsible for causing a concurrency bug in a computer program having a plurality of concurrently executing threads are disclosed. An example method disclosed herein includes defining, with a processor, a data type. The data type including a first predicate, the first predicate being invoked using a first program instruction inserted in a first thread of the plurality of threads, a second predicate, the second predicate being invoked using a second program instruction inserted in a second thread of the plurality of threads, and an expression defining a relationship between the first predicate and the second predicate. The method further includes, in response to determining the relationship is satisfied during execution of the computer program, identifying the first thread and the second thread as responsible for the concurrency bug.
    Type: Grant
    Filed: August 26, 2015
    Date of Patent: August 20, 2019
    Assignee: Intel corporation
    Inventors: Youfeng Wu, Justin Gottschlich, Gilles Pokam, Shiliang Hu, Ali-Reza Adl-Tabatabai, Cristiano Pereira
  • Patent number: 10324768
    Abstract: Embodiments described herein utilize restricted transactional memory (RTM) instructions to implement speculative compile time optimizations that will be automatically rolled back by hardware in the event of a missed speculation. In one embodiment, a lightweight version of RTM for speculative compiler optimization is described to provide lower operational overhead in comparison to conventional RTM implementations used when performing SLE.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: June 18, 2019
    Assignee: Intel Corporation
    Inventors: Cheng Wang, Youfeng Wu, Sara S. Baghsorkhi, Albert Hartono, Robert Valentine
  • Patent number: 10303525
    Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar, Hideki Ido, Youfeng Wu, Cheng Wang
  • Patent number: 10296343
    Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
  • Patent number: 10268497
    Abstract: Methods and apparatus relating to conjugate code generation for efficient dynamic optimizations are described. In an embodiment, a binary code and an intermediate representation (IR) code are generated based at least partially on a source program. The binary code and the intermediate code are transmitted to a virtual machine logic. The binary code and the IR code each include a plurality of regions that are in one-to-one correspondence. Other embodiments are also claimed and described.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: April 23, 2019
    Assignee: Intel Corporation
    Inventors: Hongbo Rong, Hyunchul Park, Cheng Wang, Youfeng Wu
  • Patent number: 10235177
    Abstract: In an example, an apparatus includes a binary translator (BT) including circuitry to: analyze a code block; determine that an architectural register mapped to a physical register in the physical register file is available for early reclamation; and insert a reclamation hint into the code block. In another example, a processor reclaims the physical register based at least in part on the reclamation hint.
    Type: Grant
    Filed: July 2, 2016
    Date of Patent: March 19, 2019
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Janghaeng Lee, Youfeng Wu
  • Patent number: 10191745
    Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
  • Patent number: 10120686
    Abstract: A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
    Type: Grant
    Filed: June 7, 2016
    Date of Patent: November 6, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Oleg Margulis, Ching-Tsun Chou, Youfeng Wu
  • Publication number: 20180285113
    Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
    Type: Application
    Filed: March 31, 2017
    Publication date: October 4, 2018
    Applicant: Intel Corporation
    Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
  • Publication number: 20180285112
    Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
    Type: Application
    Filed: March 30, 2017
    Publication date: October 4, 2018
    Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
  • Patent number: 10078357
    Abstract: In one embodiment, the present invention includes an apparatus having a core including functional units each to execute instructions of a target instruction set architecture (ISA) and a power controller to control a power mode of a first functional unit responsive to a power identification field of a power instruction of a power region of a code block to be executed on the core. Other embodiments are described and claimed.
    Type: Grant
    Filed: January 12, 2015
    Date of Patent: September 18, 2018
    Assignee: Intel Corporation
    Inventors: Jaewoong Chung, Hanjun Kim, Youfeng Wu
  • Patent number: 9996356
    Abstract: Apparatus and method for detecting and recovering from incorrect memory dependence speculation in an out-of-order processor are described herein. For example, one embodiment of a method comprises: executing a first load instruction; detecting when the first load instruction experiences a bad store-to-load forwarding event during execution; tracking the occurrences of bad store-to-load forwarding event experienced by the first load instruction during execution; controlling enablement of an S-bit in the first load instruction based on the tracked occurrences; generating a plurality of load operations responsive to an enabled S-bit in first load instruction, wherein execution of the plurality of load operations produces a result equivalent to that from the execution of the first load instruction.
    Type: Grant
    Filed: December 26, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Vineeth Mekkat, Oleg Margulis, Jason M. Agron, Ethan Schuchman, Sebastian Winkel, Youfeng Wu, Gisle Dankel
  • Patent number: 9971703
    Abstract: Technologies for persistent memory pointer access include a computing device having a persistent memory including one or more nonvolatile regions. The computing device may load a persistent memory pointer having a static region identifier, a segment identifier, and an offset from the persistent memory. The computing device may map the static region identifier to a dynamic region identifier and determine a virtual memory address of the persistent memory pointer target based on the dynamic region identifier, the segment identifier, and the offset. The computing device may load an in-storage representation of a persistent-export pointer from the persistent memory, map the in-storage representation to a runtime representation, and determine a target address of a persistent external data object based on the runtime representation. The computing device may include a compiler to generate output code including persistent memory pointer and/or persistent-export pointer accesses.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: May 15, 2018
    Assignee: Intel Corporation
    Inventors: Marcelo S. Cintra, Cheng Wang, Youfeng Wu, Alexandre Xavier DuChateau
  • Patent number: 9940229
    Abstract: Technologies for persistent memory programming include a computing device having a persistent memory including one or more nonvolatile regions. The computing device may assign a virtual memory address of a target location in persistent memory to a persistent memory pointer using persistent pointer strategy, and may dereference the pointer using the same strategy. Persistent pointer strategies include off-holder, ID-in-value, optimistic rectification, and pessimistic rectification. The computing device may log changes to persistent memory during the execution of a data consistency section, and commit changes to the persistent memory when the last data consistency section ends. Data consistency sections may be grouped by log group identifier. Using type metadata stored in the nonvolatile region, the computing device may identify the type of a root object within the nonvolatile region and then recursively identify the type of all objects referenced by the root object. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: April 10, 2018
    Assignee: Intel Corporation
    Inventors: Xipeng Shen, Youfeng Wu, Cheng Wang, Hyunchul Park, Hongbo Rong
  • Publication number: 20180074827
    Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
    Type: Application
    Filed: September 14, 2016
    Publication date: March 15, 2018
    Inventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
  • Patent number: 9910650
    Abstract: A computer-implemented method for managing loop code in a compiler includes using a conflict detection procedure that detects across-iteration dependency for arrays of single memory addresses to determine whether a potential across-iteration dependency exists for arrays of memory addresses for ranges of memory accessed by the loop code.
    Type: Grant
    Filed: September 25, 2014
    Date of Patent: March 6, 2018
    Assignee: Intel Corporation
    Inventors: Albert Hartono, Nalini Vasudevan, Sara S. Baghsorkhi, Cheng Wang, Youfeng Wu
  • Publication number: 20180060049
    Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
    Type: Application
    Filed: June 6, 2017
    Publication date: March 1, 2018
    Inventors: DAVID J. SAGER, RUCHIRA SASANKA, RON GABOR, SHLOMO RAIKIN, JOSEPH NUZMAN, LEEOR PELED, JASON A. DOMER, HO-SEOP KIM, YOUFENG WU, KOICHI YAMADA, TIN-FOOK NGAI, HOWARD H. CHEN, JAYARAM BOBBA, JEFFREY J. COOK, OMAR M. SHAIKH, SURESH SRINIVAS
  • Publication number: 20180018177
    Abstract: An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
    Type: Application
    Filed: July 18, 2017
    Publication date: January 18, 2018
    Inventors: NALINI VASUDEVAN, CHENG WANG, YOUFENG WU, ALBERT HARTONO, SARA S. BAGHSORKHI
  • Publication number: 20180004524
    Abstract: In an example, there is disclosed an apparatus, including a binary translator (BT) including circuitry to: analyze a code block; determine that an architectural register mapped to a physical register in the physical register file is available for early reclamation; and insert a reclamation hint into the code block. There is also disclosed a processor to reclaim the physical register based at least in part on the reclamation hint.
    Type: Application
    Filed: July 2, 2016
    Publication date: January 4, 2018
    Applicant: Intel Corporation
    Inventors: Vineeth Mekkat, Janghaeng Lee, Youfeng Wu
  • Publication number: 20170351516
    Abstract: A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
    Type: Application
    Filed: June 7, 2016
    Publication date: December 7, 2017
    Inventors: Vineeth Mekkat, Oleg Margulis, Ching-Tsun Chou, Youfeng Wu