Patents by Inventor Xinmin Tian

Xinmin Tian has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8607235
    Abstract: Method, apparatus and system embodiments to schedule OS-independent “shreds” without intervention of an operating system. For at least one embodiment, the shred is scheduled for execution by a scheduler routine rather than the operating system. A scheduler routine may run on each enabled sequencer. The schedulers may retrieve shred descriptors from a queue system. The sequencer associated with the scheduler may then execute the shred described by the descriptor. Other embodiments are also described and claimed.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: December 10, 2013
    Assignee: Intel Corporation
    Inventors: Richard A. Hankins, Hong Wang, Gautham N. Chinya, Trung A. Diep, Shivnandan D. Kaushik, Bryant E. Bigbee, John P. Shen, Asit K. Mallick, Baiju V. Patel, James Paul Held, Milind B. Girkar, Prashant Sethi, Xinmin Tian
  • Publication number: 20130318511
    Abstract: Methods and apparatuses associated with vectorization of scalar callee functions are disclosed herein. In various embodiments, compiling a first program may include generating one or more vectorized versions of a scalar callee function of the first program, based at least in part on vectorization annotations of the first program. Additionally, compiling may include generating one or more vectorized function signatures respectively associated with the one or more vectorized versions of the scalar callee function. The one or more vectorized function signatures may enable an appropriate vectorized version of the scalar callee function to be matched and invoked for a generic call from a caller function of a second program to a vectorized version of the scalar callee function.
    Type: Application
    Filed: April 1, 2011
    Publication date: November 28, 2013
    Inventors: Xinmin Tian, Sergey Stanislavoich Kozhukhov, Sergey Victorovich Preis, Robert Yehuda Geva, Konstantin Anatolyevich Pyjov, Hideki Sato, Milind Baburao Girkar, Aleksei Gurievich Chersaba, Nikolay Vladimirovich Panchenko
  • Patent number: 8589901
    Abstract: A system and method are configured to apply region level optimizations to a selected region of source code rather than loop level optimizations to a loop or loop nest. The region may include an outer loop, a plurality of inner loops and at least one control code. If the region includes an exceptional control flow statement and/or a procedure call, speculative region-level multi-versioning may be applied.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: November 19, 2013
    Inventors: Jin Lin, John L. Ng, Robert J. Cox, Xinmin Tian
  • Publication number: 20130219399
    Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.
    Type: Application
    Filed: March 15, 2013
    Publication date: August 22, 2013
    Inventors: Hong Wang, John Shen, Edward Grochowski, Richard Hankins, Gautham Chinya, Bryant Bigbee, Shivnandan Kaushik, Xiang Chris Zou, Per Hammarlund, Scott Dion Rodgers, Xinmin Tian, Anil Aggawal, Prashant Sethi, Baiju Patel, James Held
  • Publication number: 20130219096
    Abstract: Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
    Type: Application
    Filed: March 15, 2013
    Publication date: August 22, 2013
    Inventors: HONG WANG, PER HAMMARLUND, XIANG ZOU, JOHN P. SHEN, XINMIN TIAN, MILIND GIRKAR, PERRY H. WANG, PIYUSH N. DESAI
  • Publication number: 20130054940
    Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.
    Type: Application
    Filed: September 10, 2012
    Publication date: February 28, 2013
    Inventors: Hong Wang, John Shen, Ed Grochowski, James Paul Held, Bryant Bigbee, Shivnandan D. Kaushik, Gautham Chinya, Xiang Zou, Per Hammarlund, Xinmin Tian, Anil Aggarwal, Scott Dion Rodgers, Prashant Sethi, Baiju V. Patel, Richard Andrew Hankins
  • Patent number: 8245244
    Abstract: Device, system, and method of executing a call to a routine within a transaction. In some embodiments an apparatus may include a memory having stored thereon compiled code corresponding to a transaction, wherein the transaction includes at least one call to a first routine of a pair of first and second mutually inverse routines, and wherein the compiled code includes a call to a first wrapped routine replacing the call to the first routine; and a runtime library including wrapper code, wherein the wrapper code, when executed in response to the call to the first wrapped routine, results in executing the call to the first routine within the transaction and undoing the call to the first routine responsive to abort of the transaction. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 26, 2008
    Date of Patent: August 14, 2012
    Assignee: Intel Corporation
    Inventors: James H. Cownie, Ravi Narayanaswamy, Jeffrey V. Olivier, Serguei V. Preis, Xinmin Tian, Ali-Reza Adl-Tabatabai
  • Publication number: 20120167069
    Abstract: Methods and apparatus to provide loop parallelization based on loop splitting and/or index array are described. In one embodiment, one or more split loops, corresponding to an original loop, are generated based on the mis-speculation information. In another embodiment, a plurality of subloops are generated from an original loop based on an index array. Other embodiments are also described.
    Type: Application
    Filed: December 24, 2010
    Publication date: June 28, 2012
    Inventors: Jin Lin, Nishkam Ravi, Xinmin Tian, John L. Ng, Renat V. Valiullin
  • Publication number: 20120167068
    Abstract: A system and method are configured to apply region level optimizations to a selected region of source code rather than loop level optimizations to a loop or loop nest. The region may include an outer loop, a plurality of inner loops and at least one control code. If the region includes an exceptional control flow statement and/or a procedure call, speculative region-level multi-versioning may be applied.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 28, 2012
    Inventors: Jin Lin, John L. Ng, Robert J. Cox, Xinmin Tian
  • Patent number: 8205200
    Abstract: Method, apparatus and system embodiments to schedule user-level OS-independent “shreds” without intervention of an operating system. For at least one embodiment, the shred is scheduled for execution by a scheduler routine rather than the operating system. The scheduler routine may receive compiler-generated hints from a compiler. The compiler hints may be generated by the compiler without user-provided pragmas, and may be passed to the scheduler routine via an API-like interface. The interface may include a scheduling hint data structure that is maintained by the compiler. Other embodiments are also described and claimed.
    Type: Grant
    Filed: November 29, 2005
    Date of Patent: June 19, 2012
    Assignee: Intel Corporation
    Inventors: Shih-wei Liao, Ryan N. Rakvic, Richard A. Hankins, Hong Wang, Gansha Wu, Guei-Yuan Lueh, Xinmin Tian, Paul M. Petersen, Sanjiv Shah, Trung Diep, John Shen, Gautham Chinya
  • Publication number: 20120117552
    Abstract: Methods to improve optimization of compilation are presented. In one embodiment, a method includes identifying one or more optimization speculations with respect to a code region and speculatively performing transformation on an intermediate representation of the code region in accordance with an optimization speculation. The method includes generating an advice message corresponding to the optimization speculation and displaying the advice message if the optimization speculation results in an improved compilation result.
    Type: Application
    Filed: November 9, 2010
    Publication date: May 10, 2012
    Inventors: Rakesh Krishnaiyer, Hideki Saito Ido, Ernesto Su, John L. Ng, Jin Lin, Xinmin Tian, Robert Y. Geva
  • Patent number: 8037465
    Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 11, 2011
    Assignee: Intel Corporation
    Inventors: Xinmin Tian, Milind Girkar, David C. Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen
  • Patent number: 7984431
    Abstract: According to one example embodiment, there is disclosed herein uses partial recurrence relaxation for parallelizing DOACROSS loops on multi-core computer architectures. By one example definition, a DOACROSS may be a loop that allows successive iterations executing by overlapping; that is, all iterations must impose a partial execution order. According to one embodiment, the inventive subject matter may be used to transform the dependence structure of a given loop with recurrences for maximal degree of thread-level parallelism (TLP), where the threads can be mapped on to either different logical processors (in a hyperthreaded processor) or can be mapped onto different physical cores (or processors) in a multi-core processor.
    Type: Grant
    Filed: March 31, 2007
    Date of Patent: July 19, 2011
    Assignee: Intel Corporation
    Inventors: Arun Kejariwal, Xinmin Tian, Wei Li, Milind B. Girkar
  • Patent number: 7941791
    Abstract: Compiling a source code program for a heterogeneous multi-core processor having a first instruction sequencer, having a first instruction set architecture, an accelerator to the first instruction sequencer, wherein the accelerator comprises a heterogeneous resource with respect to the first instruction sequencer having a second instruction set architecture, the source code program having specified therein a region of source code for the first instruction set architecture of the processor and a region of source code for the second instruction set architecture of the processor.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: May 10, 2011
    Inventors: Perry Wang, Jamison Collins, Gautham Chinya, Hong Jiang, Hong Wang, Xinmin Tian, Guei-Yuan Lueh
  • Publication number: 20110067011
    Abstract: In one embodiment a thread management method identifies in a main program a set of instructions that can be dynamically activated as speculative precomputation threads. A wait/sleep operation is performed on the speculative precomputation threads between thread creation and activation, and progress of non-speculative threads is gauged through monitoring a set of global variables, allowing the speculative precomputation threads to determine its relative progress with respect to non-speculative threads.
    Type: Application
    Filed: November 18, 2010
    Publication date: March 17, 2011
    Inventors: Hong Wang, Perry H. Wang, Ross David Weldon, Scott M. Ettinger, Hideki Saito, Milind B. Girkar, Steve Shih-wei Liao, Mohammad R. Haghighat, Xinmin Tian, John P. Shen, Oren Gershon
  • Patent number: 7882498
    Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine a dependency of the statements. Multiple groups of statements are determined from the determined dependency of the statements, wherein statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. Resulting threaded code is generated including the inserted at least one directive. The group of statements to which the directive in the resulting threaded code applies are processed as a separate task. Each group of statements designated by the directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.
    Type: Grant
    Filed: March 31, 2006
    Date of Patent: February 1, 2011
    Assignee: Intel Corporation
    Inventors: Guilherme D. Ottoni, Xinmin Tian, Hong Wang, Richard A. Hankins, Wei Li, John Shen
  • Publication number: 20100281471
    Abstract: Methods and apparatuses for compiler-created helper thread for multi-threading are described herein. In one embodiment, exemplary process includes identifying a region of a main thread that likely has one or more delinquent loads, the one or more delinquent loads representing loads which likely suffer cache misses during an execution of the main thread, analyzing the region for one or more helper threads with respect to the main thread, and generating code for the one or more helper threads, the one or more helper threads being speculatively executed in parallel with the main thread to perform one or more tasks for the region of the main thread. Other methods and apparatuses are also described.
    Type: Application
    Filed: December 31, 2009
    Publication date: November 4, 2010
    Inventors: Shih-Wei Liao, Xinmin Tian, Gerolf F. Hoflehner, Hong Wang, Daniel M. Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John P. Shen
  • Publication number: 20100122073
    Abstract: A method and apparatus for handling exceptions during execution of a transaction is herein described. A compiler associates a transaction exception handler (TEH) with a transaction in program code, such as through insertion of a call to the TEH. The TEH is also associated with an exception data structure, such as an unwind table, that is utilized during runtime to call an appropriate handler in response to an exception. Additionally, the TEH code is generated by the compiler and inserted into the program code. Upon encountering an exception during execution of the transaction, the TEH is capable of dynamically resizing the transaction to the point of the exception through an attempted commit.
    Type: Application
    Filed: November 10, 2008
    Publication date: May 13, 2010
    Inventors: Ravi Narayanaswamy, Xinmin Tian, Bratin Saha, Ali-Reza Adl-Tabatabai, Robert Geva, Clark Nelson, Sergey Preis, Sergey Kozhukhov, Aleksei G. Cherkasov
  • Publication number: 20100058362
    Abstract: Device, system, and method of executing a call to a routine within a transaction. In some embodiments an apparatus may include a memory having stored thereon compiled code corresponding to a transaction, wherein the transaction includes at least one call to a first routine of a pair of first and second mutually inverse routines, and wherein the compiled code includes a call to a first wrapped routine replacing the call to the first routine; and a runtime library including wrapper code, wherein the wrapper code, when executed in response to the call to the first wrapped routine, results in executing the call to the first routine within the transaction and undoing the call to the first routine responsive to abort of the transaction. Other embodiments are described and claimed.
    Type: Application
    Filed: August 26, 2008
    Publication date: March 4, 2010
    Inventors: James H. Cownie, Ravi Narayanaswamy, Jeffrey V. Olivier, Serguei V. Preis, Xinmin Tian, Ali-Reza Adl-Tabatabai
  • Patent number: 7657880
    Abstract: The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is permitted to execute Store instructions. Store blocker logic operates to prevent data associated with a Store instruction in a helper thread from being committed to memory. Dependence blocker logic operates to prevent data associated with a Store instruction in a speculative helper thread from being bypassed to a Load instruction in a non-speculative thread.
    Type: Grant
    Filed: August 1, 2003
    Date of Patent: February 2, 2010
    Assignee: Intel Corporation
    Inventors: Hong Wang, Tor Aamodt, Per Hammarlund, John Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao