Patents by Inventor Perry Wang

Perry Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8868887
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
    Type: Grant
    Filed: November 5, 2004
    Date of Patent: October 21, 2014
    Assignee: Intel Corporation
    Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Piyush Desai
  • Patent number: 8843728
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer, detecting the assertion signal in the accelerators and communicating a request for a lock, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer. Other embodiments are described and claimed.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: September 23, 2014
    Assignee: Intel Corporation
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Patent number: 8612949
    Abstract: Methods and apparatuses for compiler-created helper thread for multi-threading are described herein. In one embodiment, exemplary process includes identifying a region of a main thread that likely has one or more delinquent loads, the one or more delinquent loads representing loads which likely suffer cache misses during an execution of the main thread, analyzing the region for one or more helper threads with respect to the main thread, and generating code for the one or more helper threads, the one or more helper threads being speculatively executed in parallel with the main thread to perform one or more tasks for the region of the main thread. Other methods and apparatuses are also described.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: December 17, 2013
    Assignee: Intel Corporation
    Inventors: Shih-wei Liao, Xinmin Tian, Gerolf F. Hoflehner, Hong Wang, Daniel M. Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John P. Shen
  • Publication number: 20130080746
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer, detecting the assertion signal in the accelerators and communicating a request for a lock, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer. Other embodiments are described and claimed.
    Type: Application
    Filed: November 20, 2012
    Publication date: March 28, 2013
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Patent number: 8380963
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer, detecting the assertion signal in the accelerators and communicating a request for a lock, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer. Other embodiments are described and claimed.
    Type: Grant
    Filed: January 31, 2011
    Date of Patent: February 19, 2013
    Assignee: Intel Corporation
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Patent number: 8296743
    Abstract: Presented are embodiments of methods and systems for library-based compilation and dispatch to automatically spread computations of a program across heterogeneous cores in a processing system. The source program contains a parallel-programming keyword, such as mapreduce, from a high-level, library-oriented parallel programming language. The compiler inserts one or more calls for a generic function, associated with the parallel-programming keyword, into the compiled code. A runtime library provides a predicate-based library system that includes multiple hardware specific implementations (“variants”) of the generic function. A runtime dispatch engine dynamically selects the best-available (e.g., most specific) variant, from a bundle of hardware-specific variants, for a given input and machine configuration.
    Type: Grant
    Filed: December 17, 2007
    Date of Patent: October 23, 2012
    Assignee: Intel Corporation
    Inventors: Michael D. Linderman, Jamison D. Collins, Perry Wang, Hong Wang
  • Patent number: 8074274
    Abstract: In one embodiment, the present invention includes a method for receiving a request from a user-level agent for programming of a user-level privilege for at least one architectural resource of an application-managed sequencer (AMS) and programming the user-level privilege for the at least one architectural resource using an operating system-managed sequencer (OMS) coupled to the AMS. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 29, 2006
    Date of Patent: December 6, 2011
    Assignee: Intel Corporation
    Inventors: Hong Wang, Gautham Chinya, Perry Wang, Jamison Collins, Richard A. Hankins, Per Hammarlund, John Shen
  • Patent number: 8037465
    Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 11, 2011
    Assignee: Intel Corporation
    Inventors: Xinmin Tian, Milind Girkar, David C. Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen
  • Publication number: 20110125985
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer, detecting the assertion signal in the accelerators and communicating a request for a lock, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer. Other embodiments are described and claimed.
    Type: Application
    Filed: January 31, 2011
    Publication date: May 26, 2011
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Patent number: 7941791
    Abstract: Compiling a source code program for a heterogeneous multi-core processor having a first instruction sequencer, having a first instruction set architecture, an accelerator to the first instruction sequencer, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer having a second instruction set architecture, the source code program having specified therein a region of source code for the first instruction set architecture of the processor and a region of source code for the second instruction set architecture of the processor.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: May 10, 2011
    Inventors: Perry Wang, Jamison Collins, Gautham Chinya, Hong Jiang, Hong Wang, Xinmin Tian, Guei-Yuan Lueh
  • Patent number: 7904696
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer via a dedicated interconnect, detecting the assertion signal in the accelerators and communicating a request for a lock on a second interconnect coupled to the first instruction sequencer and the accelerators, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer via the second interconnect. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 14, 2007
    Date of Patent: March 8, 2011
    Assignee: Intel Corporation
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Publication number: 20100281471
    Abstract: Methods and apparatuses for compiler-created helper thread for multi-threading are described herein. In one embodiment, exemplary process includes identifying a region of a main thread that likely has one or more delinquent loads, the one or more delinquent loads representing loads which likely suffer cache misses during an execution of the main thread, analyzing the region for one or more helper threads with respect to the main thread, and generating code for the one or more helper threads, the one or more helper threads being speculatively executed in parallel with the main thread to perform one or more tasks for the region of the main thread. Other methods and apparatuses are also described.
    Type: Application
    Filed: December 31, 2009
    Publication date: November 4, 2010
    Inventors: Shih-Wei Liao, Xinmin Tian, Gerolf F. Hoflehner, Hong Wang, Daniel M. Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John P. Shen
  • Patent number: 7768518
    Abstract: Embodiments described herein disclose a system for enabling emulation of a MIMD ISA extension which supports user-level sequencer management and control, and a set of privileged code executed by both operating system managed sequencers and application managed sequencers, including different sets of persistent per-CPU and per-thread data. In one embodiment, a lightweight code layer executes beneath the operating system. This code layer is invoked in response to particular monitored events, such as the need for communication between an operating system managed sequencer and an application managed sequencer. Control is transferred to this code layer, for execution of special operations, after which control returns back to originally executing code. The code layer is normally dormant and can be invoked at any time when either a user application or the operating system is executing.
    Type: Grant
    Filed: September 27, 2006
    Date of Patent: August 3, 2010
    Assignee: Intel Corporation
    Inventors: Jamison Collins, Perry Wang, Bernard Lint, Koichi Yamada, Asit Mallick, Richard A. Hankins, Gautham Chinya
  • Patent number: 7657880
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.
    Type: Grant
    Filed: August 1, 2003
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao
  • Patent number: 7603546
    Abstract: Embodiments of the present invention provide a method, apparatus and system which may include splitting a dependency chain into a set of reduced-width dependency chains; mapping one or more dependency chains onto one or more clustered dependency chain processors, wherein an issue-width of one or more of the clusters is adapted to accommodate a size of the dependency chains; and/or processing in parallel a plurality of dependency chains of a trace. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: October 13, 2009
    Assignee: Intel Corporation
    Inventors: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang
  • Patent number: 7587584
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and an event detector to detect a long latency event associated with a synchronization object. The event detector can cause a first thread switch in response to the long latency event associated with the synchronization object. The apparatus may also include a spin detector to detect that the synchronization object is a contended synchronization object. The spin detector can cause a second thread switch in response to the detection of the contended synchronization object to enable a spin detect response.
    Type: Grant
    Filed: March 2, 2005
    Date of Patent: September 8, 2009
    Assignee: Intel Corporation
    Inventors: Natalie D. Enright, Jamison D. Collins, Perry Wang, Hong Wang, Xinmin Tran, John Shen, Gad Sheaffer, Per Hammarlund
  • Publication number: 20090158248
    Abstract: Presented are embodiments of methods and systems for library-based compilation and dispatch to automatically spread computations of a program across heterogeneous cores in a processing system. The source program contains a parallel-programming keyword, such as mapreduce, from a high-level, library-oriented parallel programming language. The compiler inserts one or more calls for a generic function, associated with the parallel-programming keyword, into the compiled code. A runtime library provides a predicate-based library system that includes multiple hardware specific implementations (“variants”) of the generic function. A runtime dispatch engine dynamically selects the best-available (e.g., most specific) variant, from a bundle of hardware-specific variants, for a given input and machine configuration.
    Type: Application
    Filed: December 17, 2007
    Publication date: June 18, 2009
    Inventors: Michael D. Linderman, Jamison D. Collins, Perry Wang, Hong Wang
  • Publication number: 20090077348
    Abstract: In one embodiment, the present invention includes a method for communicating an assertion signal from a first instruction sequencer to a plurality of accelerators coupled to the first instruction sequencer via a dedicated interconnect, detecting the assertion signal in the accelerators and communicating a request for a lock on a second interconnect coupled to the first instruction sequencer and the accelerators, and registering an accelerator that achieves the lock by communication of a registration message for the accelerator to the first instruction sequencer via the second interconnect. Other embodiments are described and claimed.
    Type: Application
    Filed: September 14, 2007
    Publication date: March 19, 2009
    Inventors: Perry Wang, Jamison Collins, Hong Wang
  • Patent number: 7487502
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
    Type: Grant
    Filed: February 19, 2003
    Date of Patent: February 3, 2009
    Assignee: Intel Corporation
    Inventors: Hong Wang, Per Hammarlund, Xiang Zou, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Piyush Desai
  • Publication number: 20080256330
    Abstract: Compiling a source code program for a heterogeneous multi-core processor having a first instruction sequencer, having a first instruction set architecture, an accelerator to the first instruction sequencer, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer having a second instruction set architecture, the source code program having specified therein a region of source code for the first instruction set architecture of the processor and a region of source code for the second instruction set architecture of the processor.
    Type: Application
    Filed: April 13, 2007
    Publication date: October 16, 2008
    Inventors: Perry Wang, Jamison Collins, Gautham Chinya, Hong Jiang, Hong Wang, Xinmin Tian, Guei-Yuan Lueh