Patents by Inventor Sanjiv M. Shah
Sanjiv M. Shah has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8887174
Abstract: A technique to monitor software thread performance and update software that issues or uses the thread(s) to reduce performance-inhibiting events. At least one embodiment of the invention uses hardware and/or software timers or counters to monitor various events associated with executing user-level threads and report these events back to a user-level software program, which can use the information to avoid or at least reduce performance-inhibiting events associated with the user-level threads.
Type: Grant
Filed: July 26, 2011
Date of Patent: November 11, 2014
Assignee: Intel Corporation
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, Shivnandan D. Kaushik, Bryant E. Bigbee, John P. Shen, Trung A. Diep, Xiang Zou, Baiju V. Patel, Paul M. Petersen, Sanjiv M. Shah, Ryan N. Rakvic, Prashant Sethi
-
Patent number: 8689215
Abstract: Methods, data structures, instructions, and techniques for structured exception handling for user-level threads in a multi-threading system are provided. Registered filter routines may be dispatched to a thread unit not managed by the operating system (OS). The dispatch may occur by allowing an OS-managed thread unit (proxy) to invoke the OS-provided structured exception handling service (including dispatcher) on behalf of the sequestered thread unit. Alternatively, an OS-managed thread unit may include dispatch code and may, without OS intervention, dispatch the filter routine to the sequestered thread unit. Other embodiments are also described and claimed.
Type: Grant
Filed: December 19, 2006
Date of Patent: April 1, 2014
Assignee: Intel Corporation
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, David K. Poulsen, Shirish Aundhe, Baiju V. Patel, Sanjiv M. Shah
-
Publication number: 20120017221
Abstract: A technique to monitor software thread performance and update software that issues or uses the thread(s) to reduce performance-inhibiting events. At least one embodiment of the invention uses hardware and/or software timers or counters to monitor various events associated with executing user-level threads and report these events back to a user-level software program, which can use the information to avoid or at least reduce performance-inhibiting events associated with the user-level threads.
Type: Application
Filed: July 26, 2011
Publication date: January 19, 2012
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, Shivnandan D. Kaushik, Bryant E. Bigbee, John P. Shen, Trung A. Diep, Xiang Zou, Baiju V. Patel, Paul M. Petersen, Sanjiv M. Shah, Ryan N. Rakvic, Prashant Sethi
-
Patent number: 8079035
Abstract: Data structure creation, organization and management techniques for data local to user-level threads are provided. In one embodiment, a method includes generating, for a user-level thread (“shred”) to run on a thread unit that is not managed by an operating system (“OS”), a storage area for local data and maintaining state in the storage area across a context switch from the thread unit that is not managed by the OS to a second thread unit that is managed by the OS. Other embodiments are also described and claimed.
Type: Grant
Filed: December 27, 2005
Date of Patent: December 13, 2011
Assignee: Intel Corporation
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, David K. Poulsen, Shirish Aundhe, John P. Shen, Sanjiv M. Shah, Baiju V. Patel
-
Patent number: 8010969
Abstract: A technique to monitor software thread performance and update software that issues or uses the thread(s) to reduce performance-inhibiting events. At least one embodiment of the invention uses hardware and/or software timers or counters to monitor various events associated with executing user-level threads and report these events back to a user-level software program, which can use the information to avoid or at least reduce performance-inhibiting events associated with the user-level threads.
Type: Grant
Filed: June 13, 2005
Date of Patent: August 30, 2011
Assignee: Intel Corporation
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, Shivnandan D. Kaushik, Bryant E. Bigbee, John P. Shen, Trung A. Diep, Xiang Zou, Baiju V. Patel, Paul M. Petersen, Sanjiv M. Shah, Ryan N. Rakvic, Prashant Sethi
-
Patent number: 7743233
Abstract: Disclosed are embodiments of a system, methods and mechanism for management and translation of mapping between logical sequencer addresses and physical or logical sequencers in a multi-sequencer multithreading system. A mapping manager may manage assignment and mapping of logical sequencer addresses or pages to actual sequencers or frames of the system. Rationing logic associated with the mapping manager may take into account sequencer attributes when such mapping is performed. Relocation logic associated with the mapping manager may manage spill and fill of context information to/from a backing store when re-mapping actual sequencers. Sequencers may be allocated singly, or may be allocated as part of partitioned blocks. The mapping manager may also include translation logic that provides an identifier for the mapped sequencer each time a logical sequencer address is used in a user program. Other embodiments are also described and claimed.
Type: Grant
Filed: April 5, 2005
Date of Patent: June 22, 2010
Assignee: Intel Corporation
Inventors: Hong Wang, Gautham N. Chinya, Richard A. Hankins, Shivnandan D. Kaushik, Bryant Bigbee, John Shen, Per Hammarlund, Xiang Zou, Jason W. Brandt, Prashant Sethi, Douglas M. Carmean, Baiju V. Patel, Scott Dion Rodgers, Ryan N. Rakvic, John L. Reid, David K. Poulsen, Sanjiv M. Shah, James Paul Held, James Charles Abel
-
Patent number: 7703094
Abstract: A method and apparatus for adaptive and dynamic filtering of threaded programs. An embodiment of a method comprises analyzing the operation of a computer program, the computer program comprising a plurality of program threads; tracking overhead for the computer program; observing program events for the computer program; rationing overhead between program threads in inter-thread program events; and filtering program events based on a dynamic threshold.
Type: Grant
Filed: December 30, 2004
Date of Patent: April 20, 2010
Assignee: Intel Corporation
Inventors: Ekarat T Mongkolsmai, Douglas R Armstrong, Sanjiv M Shah
-
Patent number: 7500242
Abstract: The present disclosure relates to acquiring and releasing a shared resource via a lock semaphore and, more particularly, to acquiring and releasing a shared resource via a lock semaphore utilizing a state machine.
Type: Grant
Filed: September 8, 2003
Date of Patent: March 3, 2009
Assignee: Intel Corporation
Inventors: Sanjiv M. Shah, Paul M. Petersen, Grant E. Haab
-
Publication number: 20080148259
Abstract: Methods, data structures, instructions, and techniques for structured exception handling for user-level threads in a multi-threading system are provided. Registered filter routines may be dispatched to a thread unit not managed by the operating system (OS). The dispatch may occur by allowing an OS-managed thread unit (proxy) to invoke the OS-provided structured exception handling service (including dispatcher) on behalf of the sequestered thread unit. Alternatively, an OS-managed thread unit may include dispatch code and may, without OS intervention, dispatch the filter routine to the sequestered thread unit. Other embodiments are also described and claimed.
Type: Application
Filed: December 19, 2006
Publication date: June 19, 2008
Inventors: Richard A. Hankins, Gautham N. Chinya, Hong Wang, David K. Poulsen, Shirish Aundhe, Baiju V. Patel, Sanjiv M. Shah
-
Patent number: 7069556
Abstract: A method and apparatus for implementing a parallel construct comprised of a single task is described. A method comprises receiving a first code segment, the first code segment having a set of instances of a parallel construct, each of the set of instances of the parallel construct comprised of a task, and translating the first code segment to a second code segment, the second code segment, when being executed to perform operations comprising: allocating a shared value, the shared value to indicate a most current one of the set of instances encountered by one of a team of threads, allocating a private value for each of the team of threads, the private value to indicate one of the set of instances encountered by the private value's corresponding thread of the team of threads, maintaining the shared value with the team of threads, and maintaining the private value of each of the team of threads with the private value's corresponding thread of the team of threads.
Type: Grant
Filed: September 27, 2001
Date of Patent: June 27, 2006
Assignee: Intel Corporation
Inventors: Sanjiv M. Shah, Paul M. Petersen
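The shared-value and private-value bookkeeping described in this abstract can be illustrated with a short sketch. This is not taken from the patent's claims or from any Intel runtime: the class `SingleTracker`, the function `run_team`, and the use of a lock to guard the shared value are all assumptions made for illustration. Each thread counts the instances it has encountered in a private value, and the first thread whose private count exceeds the shared value claims that instance and runs its task.

```python
import threading


class SingleTracker:
    """Hypothetical tracker: one shared counter plus one private counter per thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._shared = 0                     # most recent instance claimed by any thread
        self._private = threading.local()    # per-thread count of instances encountered

    def try_claim(self):
        """Return True only for the first thread to reach the current instance."""
        mine = getattr(self._private, "count", 0) + 1
        self._private.count = mine           # maintain this thread's private value
        with self._lock:
            if mine > self._shared:
                self._shared = mine          # maintain the shared value
                return True
        return False


def run_team(num_threads, num_instances):
    """Run a team of threads over a sequence of single-task instances."""
    tracker = SingleTracker()
    executed = [0] * num_instances

    def worker():
        for i in range(num_instances):
            if tracker.try_claim():
                executed[i] += 1             # only the claiming thread reaches this line

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed
```

Under these assumptions, `run_team(4, 5)` reports that every instance was executed exactly once, regardless of which thread arrived first.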
-
Patent number: 6792599
Abstract: A method and apparatus for an atomic operation is described. A method comprises receiving a first program unit in a parallel computing environment, the first program unit including a memory update operation to be performed atomically, the memory update operation having an operand, the operand being of a data-type and of a data size, and translating the first program unit into a second program unit, the second program unit to associate the memory update operation with a set of one or more low-level instructions upon determining that the data size of the operand is supported by the set of low-level instructions, the set of low-level instructions to ensure atomicity of the memory update operation.
Type: Grant
Filed: October 15, 2001
Date of Patent: September 14, 2004
Assignee: Intel Corporation
Inventors: David K. Poulsen, Sanjiv M. Shah, Paul M. Petersen, Grant E. Haab
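The size check at the heart of this abstract can be sketched as a small lowering function. Everything named here is hypothetical: the set `SUPPORTED_SIZES`, the instruction mnemonics, and in particular the lock-based fallback for unsupported sizes, which the abstract does not describe and which is assumed only to make the example complete.

```python
# Hypothetical operand sizes (in bytes) for which the target has atomic instructions.
SUPPORTED_SIZES = {1, 2, 4, 8}


def lower_atomic_update(op, size_bytes):
    """Return an illustrative instruction sequence for an atomic memory update.

    op         -- name of the update operation, e.g. "add"
    size_bytes -- data size of the operand
    """
    if size_bytes in SUPPORTED_SIZES:
        # Size is supported: associate the update with a single atomic instruction.
        return [f"atomic_{op}_{size_bytes * 8}"]
    # Assumed fallback (not from the abstract): guard a plain update with a lock.
    return ["acquire_lock", op, "release_lock"]
```

For instance, a 4-byte add lowers to the single made-up instruction `atomic_add_32`, while a 16-byte add takes the assumed lock-guarded path.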
-
Publication number: 20030135535
Abstract: In some embodiments of the present invention, a parallel computer system provides a plurality of threads that execute code structures. A method and apparatus may be provided to copy data from one thread to another thread.
Type: Application
Filed: January 11, 2002
Publication date: July 17, 2003
Inventors: Jay P. Hoeflinger, Sanjiv M. Shah, Paul M. Petersen, David K. Poulsen
-
Publication number: 20030126589
Abstract: A method and apparatus for a reduction operation is described. A method may be utilized that includes receiving a first program unit in a parallel computing environment, the first program unit may include a reduction operation to be performed and translating the first program unit into a second program unit, the second program unit may associate the reduction operation with a set of one or more low-level instructions that may, in part, perform the reduction operation.
Type: Application
Filed: January 2, 2002
Publication date: July 3, 2003
Inventors: David K. Poulsen, Sanjiv M. Shah, Paul M. Petersen, Grant E. Haab, Jay P. Hoeflinger
-
Publication number: 20030088856
Abstract: A method and apparatus for generating source code to return the memory address of a descriptor are described. In an embodiment, the method includes generating a function having an argument. The function is expressed in a high-level programming language. The function includes a set of one or more instructions that instruct a compiler unit to return a memory address of the argument as a result of the function. The method also includes generating a call to the function. The call is expressed in the high-level programming language. The call causes the compiler unit to pass a descriptor as the argument.
Type: Application
Filed: November 8, 2001
Publication date: May 8, 2003
Inventors: Jay P. Hoeflinger, Sanjiv M. Shah, David K. Poulsen
-
Publication number: 20030074649
Abstract: A method and apparatus for an atomic operation is described. A method comprises receiving a first program unit in a parallel computing environment, the first program unit including a memory update operation to be performed atomically, the memory update operation having an operand, the operand being of a data-type and of a data size, and translating the first program unit into a second program unit, the second program unit to associate the memory update operation with a set of one or more low-level instructions upon determining that the data size of the operand is supported by the set of low-level instructions, the set of low-level instructions to ensure atomicity of the memory update operation.
Type: Application
Filed: October 15, 2001
Publication date: April 17, 2003
Inventors: David K. Poulsen, Sanjiv M. Shah, Paul M. Petersen, Grant E. Haab
-
Publication number: 20030066056
Abstract: In an embodiment, a method includes receiving a first source code having a number of global storage objects, wherein the number of global storage objects are to be accessed by a number of threads during execution. The method also includes translating the first source code into a second source code. The translating includes adding initialization logic for each of the number of global storage objects. The initialization logic includes generating private copies of each of the number of global storage objects during execution of the second source code. The initialization logic also includes generating at least one cache object during the execution of the second source code, wherein the private copies of each of the number of global storage objects are accessed through the at least one cache object during execution of the second source code.
Type: Application
Filed: September 28, 2001
Publication date: April 3, 2003
Inventors: Paul M. Petersen, Sanjiv M. Shah, David K. Poulsen
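A rough sketch of the idea in this abstract, private copies of global objects reached through a cache object, is given below. It is an illustration under stated assumptions rather than the method of the application: the class `PrivatizedGlobals`, the function `demo`, and the choice of Python's `threading.local` to stand in for the cache object are all hypothetical.

```python
import copy
import threading


class PrivatizedGlobals:
    """Hypothetical wrapper giving each thread private copies of global objects."""

    def __init__(self, initial_values):
        self._initial = dict(initial_values)
        self._cache = threading.local()      # stands in for the "cache object"

    def _copies(self):
        copies = getattr(self._cache, "copies", None)
        if copies is None:
            # Initialization logic: the first access in a thread builds its private copies.
            copies = copy.deepcopy(self._initial)
            self._cache.copies = copies
        return copies

    def get(self, name):
        return self._copies()[name]

    def set(self, name, value):
        self._copies()[name] = value


def demo(num_threads=3):
    """Each thread increments its own copy of 'counter' a different number of times."""
    shared = PrivatizedGlobals({"counter": 0})
    results = {}

    def worker(tid):
        for _ in range(tid + 1):
            shared.set("counter", shared.get("counter") + 1)
        results[tid] = shared.get("counter")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, shared.get("counter")
```

In this sketch each worker sees only its own increments, and the calling thread's copy of `counter` is left at its initial value.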
-
Publication number: 20030061255
Abstract: A method and apparatus for implementing a parallel construct comprised of a single task is described. A method comprises receiving a first code segment, the first code segment having a set of instances of a parallel construct, each of the set of instances of the parallel construct comprised of a task, and translating the first code segment to a second code segment, the second code segment, when being executed to perform operations comprising: allocating a shared value, the shared value to indicate a most current one of the set of instances encountered by one of a team of threads, allocating a private value for each of the team of threads, the private value to indicate one of the set of instances encountered by the private value's corresponding thread of the team of threads, maintaining the shared value with the team of threads, and maintaining the private value of each of the team of threads with the private value's corresponding thread of the team of threads.
Type: Application
Filed: September 27, 2001
Publication date: March 27, 2003
Inventors: Sanjiv M. Shah, Paul M. Petersen
-
Patent number: 6286130
Abstract: A software-implemented method for validating the correctness of parallel computer programs, written in various programming languages, with respect to these programs' corresponding sequential computer programs. Validation detects errors that could cause parallel computer programs to behave incorrectly or to produce incorrect results, and is accomplished by transforming these parallel computer programs under the control of a general purpose computer and sequentially executing the resulting transformed programs. The validation method is system-independent and is portable across various computer architectures and platforms since validation is accomplished via program transformation; thus, the method does not depend on the features of a particular hardware architecture or configuration, operating system, compiler, linker, or thread environment. The input to the validation method is a parallel computer program.
Type: Grant
Filed: August 5, 1997
Date of Patent: September 4, 2001
Assignee: Intel Corporation
Inventors: David K. Poulsen, Paul M. Petersen, Sanjiv M. Shah
-
Patent number: 5812852
Abstract: A software-implemented method for dynamically and statically privatizing global storage objects in parallel computer programs written in various programming languages. Privatization is accomplished via transformation of these parallel computer programs under the control of a general purpose computer. The privatization method is system-independent and is portable across various computer architectures and platforms since privatization is accomplished via program transformation; thus, the method does not depend on the features of a particular hardware architecture or configuration, operating system, compiler, linker, or thread environment. The inputs to the method are a parallel computer program, comprising parallel regions of execution and global storage objects, and a privatization specification describing the global storage objects to be privatized and the particular parallel regions, and manner, in which each of these objects is to be privatized.
Type: Grant
Filed: November 14, 1996
Date of Patent: September 22, 1998
Assignee: Kuck & Associates, Inc.
Inventors: David K. Poulsen, Paul M. Petersen, Sanjiv M. Shah