Patents by Inventor Youfeng Wu
Youfeng Wu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11755099
Abstract: Example methods and apparatus to facilitate dynamic core selection are disclosed. An example apparatus includes a first processor core of a first type; a second processor core of a second type different from the first type; and software to: access a user-supplied hint indicative of a user preference to execute program code on the first processor core, the user-supplied hint including a user-defined attribute of the program code; monitor performance of the program code on the first processor core; determine, based on the user-defined attribute of the program code, a predicted performance of the program code on the second processor core is better than the performance of the program code on the first processor core; and ignore the user preference by migrating the program code from the first processor core for execution on the second processor core.
Type: Grant
Filed: June 28, 2022
Date of Patent: September 12, 2023
Assignee: Intel Corporation
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
-
Publication number: 20220326756
Abstract: Example methods and apparatus to facilitate dynamic core selection are disclosed.
Type: Application
Filed: June 28, 2022
Publication date: October 13, 2022
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
-
Patent number: 10725755
Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
Type: Grant
Filed: June 6, 2017
Date of Patent: July 28, 2020
Assignee: Intel Corporation
Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffrey J. Cook, Omar M. Shaikh, Suresh Srinivas
-
Patent number: 10540178
Abstract: A processor for redundant stores includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, a binary translator, and a memory execution unit. The binary translator includes circuitry to identify a first region of the instruction stream including a redundant store, mark a first starting instruction of the first region with a protection designator, mark a first ending instruction of the first region with a clear designator, and store an amended instruction stream with the markings. The memory execution unit includes circuitry to track the first redundant store based on the protection designator and the clear designator to eliminate the first redundant store.
Type: Grant
Filed: September 14, 2016
Date of Patent: January 21, 2020
Assignee: Intel Corporation
Inventors: Vineeth Mekkat, Youfeng Wu, Sebastian Winkel, Oleg Margulis
-
Patent number: 10534424
Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Type: Grant
Filed: January 2, 2016
Date of Patent: January 14, 2020
Assignee: Intel Corporation
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
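The decision rule in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `CoreSwitcher`, the event strings, and the assumption that a larger metric value means "better" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CoreSwitcher:
    """Toy model of the switch-or-stay decision (hypothetical names)."""

    reference_metric: float  # the "previously determined core performance metric"
    active_core: str = "first"
    events: list = field(default_factory=list)

    def step(self, first_core_metric: float) -> str:
        # The second core is signaled to power up while the metric is collected.
        self.events.append("power_up_second")
        # Assumption: a larger metric value means better performance.
        if first_core_metric > self.reference_metric:
            # The first core is doing better: power the second core back down.
            self.events.append("power_down_second")
            self.active_core = "first"
        else:
            # Not better: move execution over to the second core.
            self.events.append("switch_to_second")
            self.active_core = "second"
        return self.active_core
```

Calling `CoreSwitcher(reference_metric=1.0).step(1.5)` keeps execution on the first core, while `step(1.0)` switches, because an equal metric is "not better" than the reference.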
-
Patent number: 10489158
Abstract: A processor of an aspect includes a decode unit to decode a persistent store fence instruction. The processor also includes a memory subsystem module coupled with the decode unit. The memory subsystem module, in response to the persistent store fence instruction, is to ensure that a given data corresponding to the persistent store fence instruction is stored persistently in a persistent storage before data of all subsequent store instructions is stored persistently in the persistent storage. The subsequent store instructions occur after the persistent store fence instruction in original program order. Other processors, methods, systems, and articles of manufacture are also disclosed.
Type: Grant
Filed: September 26, 2014
Date of Patent: November 26, 2019
Assignee: Intel Corporation
Inventors: Cheng Wang, Youfeng Wu, Rajesh M Sankaran
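The ordering guarantee described in this abstract can be modeled in software with a small Python sketch. This is a toy model under stated assumptions, not the hardware mechanism: the class `PersistModel` and its method names are hypothetical, and ordinary stores are assumed to sit in a pending buffer that may drain in any order.

```python
class PersistModel:
    """Toy model of durability ordering (hypothetical, for illustration only)."""

    def __init__(self):
        self.pending = []  # ordinary stores that are not yet durable
        self.durable = []  # order in which data actually became durable

    def store(self, addr, value):
        # Ordinary stores are buffered; their drain order is unconstrained.
        self.pending.append((addr, value))

    def persistent_store_fence(self, addr, value):
        # The fenced data becomes durable immediately, so it is durable
        # before any store issued after this call can become durable.
        self.durable.append((addr, value))

    def drain(self):
        # Drain in reverse order to stress that pending stores are unordered.
        while self.pending:
            self.durable.append(self.pending.pop())
```

In this model, data passed to `persistent_store_fence` always appears in `durable` ahead of any store issued afterwards, which is the only ordering the abstract requires; stores issued before the fence are left unconstrained.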
-
Publication number: 20190332158
Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Type: Application
Filed: July 11, 2019
Publication date: October 31, 2019
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
-
Patent number: 10437319
Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Type: Grant
Filed: January 2, 2016
Date of Patent: October 8, 2019
Assignee: Intel Corporation
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
-
Patent number: 10437318
Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.
Type: Grant
Filed: January 2, 2016
Date of Patent: October 8, 2019
Assignee: Intel Corporation
Inventors: Youfeng Wu, Shiliang Hu, Edson Borin, Cheng Wang
-
Patent number: 10387296
Abstract: Methods and systems to identify threads responsible for causing a concurrency bug in a computer program having a plurality of concurrently executing threads are disclosed. An example method disclosed herein includes defining, with a processor, a data type. The data type including a first predicate, the first predicate being invoked using a first program instruction inserted in a first thread of the plurality of threads, a second predicate, the second predicate being invoked using a second program instruction inserted in a second thread of the plurality of threads, and an expression defining a relationship between the first predicate and the second predicate. The method further includes, in response to determining the relationship is satisfied during execution of the computer program, identifying the first thread and the second thread as responsible for the concurrency bug.
Type: Grant
Filed: August 26, 2015
Date of Patent: August 20, 2019
Assignee: Intel Corporation
Inventors: Youfeng Wu, Justin Gottschlich, Gilles Pokam, Shiliang Hu, Ali-Reza Adl-Tabatabai, Cristiano Pereira
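A rough Python sketch of the idea in this abstract follows: a data type holds two predicates, each invoked from a different thread, plus an expression relating them. The class `PredicatePair`, its method names, and the choice of a plain callable as the "expression" are assumptions for illustration and are not taken from the patent.

```python
import threading


class PredicatePair:
    """Two predicates plus a relation between them (hypothetical sketch)."""

    def __init__(self, relation):
        self.relation = relation  # callable(first_value, second_value) -> bool
        self._lock = threading.Lock()
        self._first = None   # (thread name, value) from the first predicate
        self._second = None  # (thread name, value) from the second predicate
        self.culprits = None  # set once the relation is satisfied

    def first(self, value):
        # Call inserted into the first thread.
        self._record("_first", value)

    def second(self, value):
        # Call inserted into the second thread.
        self._record("_second", value)

    def _record(self, slot, value):
        with self._lock:
            setattr(self, slot, (threading.current_thread().name, value))
            if self._first and self._second and self.culprits is None:
                if self.relation(self._first[1], self._second[1]):
                    # Both threads are reported as responsible for the bug.
                    self.culprits = (self._first[0], self._second[0])
```

For example, with `relation=lambda a, b: a == b`, two threads that both report the same resource name are flagged together, whichever of them runs first.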
-
Patent number: 10324768
Abstract: Embodiments described herein utilize restricted transactional memory (RTM) instructions to implement speculative compile time optimizations that will be automatically rolled back by hardware in the event of a missed speculation. In one embodiment, a lightweight version of RTM for speculative compiler optimization is described to provide lower operational overhead in comparison to conventional RTM implementations used when performing SLE.
Type: Grant
Filed: December 17, 2014
Date of Patent: June 18, 2019
Assignee: Intel Corporation
Inventors: Cheng Wang, Youfeng Wu, Sara S. Baghsorkhi, Albert Hartono, Robert Valentine
-
Patent number: 10303525
Abstract: Systems, methods, and apparatuses for data speculation execution (DSX) are described. In some embodiments, a hardware apparatus for performing DSX comprises a hardware decoder to decode an instruction, the instruction to include an opcode and an operand to store a portion of a fallback address, execution hardware to execute the decoded instruction to initiate a data speculative execution (DSX) region by activating DSX tracking hardware to track speculative memory accesses and detect ordering violations in the DSX region, and storing the fallback address.
Type: Grant
Filed: December 24, 2014
Date of Patent: May 28, 2019
Assignee: Intel Corporation
Inventors: Elmoustapha Ould-Ahmed-Vall, Christopher J. Hughes, Robert Valentine, Milind B. Girkar, Hideki Ido, Youfeng Wu, Cheng Wang
-
Patent number: 10296343
Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
Type: Grant
Filed: March 30, 2017
Date of Patent: May 21, 2019
Assignee: Intel Corporation
Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
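The two-level checkpointing described in this abstract can be sketched in Python as a simple interpreter over an instruction sequence. The function `run_with_commit_markers`, the marker strings, and the `("write", register, value)` tuple encoding are hypothetical; the sketch only shows how a local and a global commit marker each snapshot the speculative register state into a separate shadow copy.

```python
def run_with_commit_markers(sequence):
    """Return (speculative, local_shadow, global_shadow) register states.

    `sequence` mixes ("write", reg, value) tuples with the marker strings
    "LOCAL_COMMIT" and "GLOBAL_COMMIT" (an encoding assumed for illustration).
    """
    speculative = {}    # register state produced by speculative execution
    local_shadow = {}   # first shadow register set
    global_shadow = {}  # second shadow register set
    for item in sequence:
        if item == "LOCAL_COMMIT":
            # Snapshot the speculative state into the first shadow copy.
            local_shadow = dict(speculative)
        elif item == "GLOBAL_COMMIT":
            # Snapshot the speculative state into the second shadow copy.
            global_shadow = dict(speculative)
        else:
            _, reg, value = item
            speculative[reg] = value
    return speculative, local_shadow, global_shadow
```

Keeping two snapshots means a rollback can restore either the most recent locally committed state or the most recent globally committed one, depending on which kind of speculation failed; that use of the shadows is an inference from the abstract rather than something it states.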
-
Patent number: 10268497
Abstract: Methods and apparatus relating to conjugate code generation for efficient dynamic optimizations are described. In an embodiment, a binary code and an intermediate representation (IR) code are generated based at least partially on a source program. The binary code and the intermediate code are transmitted to a virtual machine logic. The binary code and the IR code each include a plurality of regions that are in one-to-one correspondence. Other embodiments are also claimed and described.
Type: Grant
Filed: October 24, 2013
Date of Patent: April 23, 2019
Assignee: Intel Corporation
Inventors: Hongbo Rong, Hyunchul Park, Cheng Wang, Youfeng Wu
-
Patent number: 10235177
Abstract: In an example, an apparatus includes a binary translator (BT) including circuitry to: analyze a code block; determine that an architectural register mapped to a physical register in the physical register file is available for early reclamation; and insert a reclamation hint into the code block. In another example, a processor reclaims the physical register based at least in part on the reclamation hint.
Type: Grant
Filed: July 2, 2016
Date of Patent: March 19, 2019
Assignee: Intel Corporation
Inventors: Vineeth Mekkat, Janghaeng Lee, Youfeng Wu
-
Patent number: 10191745
Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
Type: Grant
Filed: March 31, 2017
Date of Patent: January 29, 2019
Assignee: Intel Corporation
Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
-
Patent number: 10120686
Abstract: A processor includes a front end including circuitry to decode instructions from an instruction stream, a data cache unit including circuitry to cache data for the processor, and a binary translator. The binary translator includes circuitry to identify a redundant store in the instruction stream, mark the start and end of a region of the instruction stream with the redundant store, remove the redundant store, and store an amended instruction stream with the redundant store removed.
Type: Grant
Filed: June 7, 2016
Date of Patent: November 6, 2018
Assignee: Intel Corporation
Inventors: Vineeth Mekkat, Oleg Margulis, Ching-Tsun Chou, Youfeng Wu
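As a rough software analogue of removing a redundant store from a region, the Python sketch below drops any store that is overwritten by a later store to the same address before that address is loaded. The abstract does not define "redundant store", so this dead-store reading is an assumption, as are the function name `remove_redundant_stores` and the tuple encoding of instructions.

```python
def remove_redundant_stores(region):
    """Drop stores overwritten before being read, within one straight-line region.

    Instructions are ("store", addr, value) or ("load", addr) tuples
    (a hypothetical encoding used only for this illustration).
    """
    unread = {}   # addr -> index of the latest store not yet observed by a load
    dead = set()  # indices of stores found to be redundant
    for i, ins in enumerate(region):
        op, addr = ins[0], ins[1]
        if op == "load":
            # The latest store to this address is observed, so it must stay.
            unread.pop(addr, None)
        elif op == "store":
            if addr in unread:
                # The earlier store is overwritten without ever being read.
                dead.add(unread[addr])
            unread[addr] = i
    return [ins for i, ins in enumerate(region) if i not in dead]
```

Applied to `[("store", "a", 1), ("store", "a", 2), ("load", "a"), ("store", "a", 3)]`, only the first store is removed, because the second is read by the load and the third may still be observed after the region ends.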
-
Publication number: 20180285112
Abstract: A processing device including a first shadow register, a second shadow register, and an instruction execution circuit, communicatively coupled to the first shadow register and the second shadow register, to receive a sequence of instructions comprising a first local commit marker, a first global commit marker, and a first register access instruction referencing an architectural register, speculatively execute the first register access instruction to generate a speculative register state value associated with a physical register, responsive to identifying the first local commit marker, store, in the first shadow register, the speculative register state value, and responsive to identifying the first global commit marker, store, in the second shadow register, the speculative register state value.
Type: Application
Filed: March 30, 2017
Publication date: October 4, 2018
Inventors: Vineeth Mekkat, Jason M. Agron, Youfeng Wu
-
Publication number: 20180285113
Abstract: In one example a processor includes a region formation engine to identify a region of code for translation from a guest instruction set architecture to a native instruction set architecture. The processor also includes a binary translator to translate the region of code. The region formation engine is to perform aggressive region formation, which includes forming a region across a boundary of a return instruction. The translated region of code is to prevent a side entry into the translated region of code at a translated return target instruction included in the translated region of code. In more specific examples, performing aggressive region formation includes a region formation grow phase and a region formation cleanup phase. In the grow phase priority may be given to growing complete paths from a call target to a corresponding return. The region formation cleanup phase may comprise eliminating call targets that are not reachable.
Type: Application
Filed: March 31, 2017
Publication date: October 4, 2018
Applicant: Intel Corporation
Inventors: Hou-Jen Ko, Girish Venkatasubramanian, Jason Agron, Tyler Sondag, Youfeng Wu
-
Patent number: 10078357
Abstract: In one embodiment, the present invention includes an apparatus having a core including functional units each to execute instructions of a target instruction set architecture (ISA) and a power controller to control a power mode of a first functional unit responsive to a power identification field of a power instruction of a power region of a code block to be executed on the core. Other embodiments are described and claimed.
Type: Grant
Filed: January 12, 2015
Date of Patent: September 18, 2018
Assignee: Intel Corporation
Inventors: Jaewoong Chung, Hanjun Kim, Youfeng Wu