Patents by Inventor Trishul A. Chilimbi
Trishul A. Chilimbi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10592252Abstract: Efficient instruction processing for sparse data includes extensions to a processor pipeline to identify zero-optimizable instructions that include at least one zero input operand, and bypass the execute stage of the processor pipeline, determining the result of the operation without executing the instruction. When possible, the extensions also bypass the writeback stage of the processor pipeline.Type: GrantFiled: December 31, 2015Date of Patent: March 17, 2020Assignee: Microsoft Technology Licensing, LLCInventors: Trishul A. Chilimbi, Olatunji Ruwase, Vivek Seshadri
-
Patent number: 10459727Abstract: Loop code processor optimizations are implemented as a loop optimizer extension to a processor pipeline. The loop optimizer generates optimized code associated with code loops that include at least one zero-optimizable instruction. The loop optimizer may generate multiple versions of optimized code associated with a particular code loop, where each of the multiple version of optimized code has a different associated condition under which the optimized code can be safely executed.Type: GrantFiled: December 31, 2015Date of Patent: October 29, 2019Assignee: Microsoft Technology Licensing, LLCInventors: Trishul A Chilimbi, Olatunji Ruwase, Vivek Seshadri
-
Publication number: 20170192793Abstract: Efficient instruction processing for sparse data includes extensions to a processor pipeline to identify zero-optimizable instructions that include at least one zero input operand, and bypass the execute stage of the processor pipeline, determining the result of the operation without executing the instruction. When possible, the extensions also bypass the writeback stage of the processor pipeline.Type: ApplicationFiled: December 31, 2015Publication date: July 6, 2017Inventors: Trishul A. Chilimbi, Olatunji Ruwase, Vivek Seshadri
-
Publication number: 20170193361Abstract: A neural network training tool selects from a plurality of parallelizing techniques and selects from a plurality of forward-propagation computation techniques. The neural network training tool performs a forward-propagation phase to train a neural network using the selected parallelizing technique and the selected forward-propagation computation technique based on one or more inputs. Additionally, the neural network training tool selects from a plurality computation techniques and from a plurality of parallelizing techniques for a backward-propagation phase. The neural network training tool performs a backward-propagation phase of training the neural network using the selected backward-propagation parallelizing technique and the selected backward-propagation computation technique to generate error gradients and weight deltas and to update weights associated with one or more layers of the neural network.Type: ApplicationFiled: December 31, 2015Publication date: July 6, 2017Inventors: Trishul A. Chilimbi, Olatunji Ruwase, Samyam Rajbhandari, Michael Carbin, Yuxiong He
-
Publication number: 20170192787Abstract: Loop code processor optimizations are implemented as a loop optimizer extension to a processor pipeline. The loop optimizer generates optimized code associated with code loops that include at least one zero-optimizable instruction. The loop optimizer may generate multiple versions of optimized code associated with a particular code loop, where each of the multiple version of optimized code has a different associated condition under which the optimized code can be safely executed.Type: ApplicationFiled: December 31, 2015Publication date: July 6, 2017Inventors: Trishul A. Chilimbi, Olatunji Ruwase, Vivek Seshadri
-
Publication number: 20170192896Abstract: A zero cache memory system extension includes a zero cache to store cache tags associated with zero cache lines, while a corresponding data cache stores cache tags and data bytes associated with non-zero cache lines. As non-zero data is written to the cache, cache lines may be moved from the zero cache to the data cache. Similarly, as zero data is written to the cache, cache lines may be moved from the data cache to the zero cache.Type: ApplicationFiled: December 31, 2015Publication date: July 6, 2017Inventors: Trishul A Chilimbi, Olatunji Ruwase, Vivek Seshadri
-
Patent number: 9329876Abstract: The described implementations relate to resource aware programming. In one case a program is obtained that is configured to perform a task in accordance with one or more quantitative metrics. An approximate version can be generated from the program. The approximate version is configured to perform the task in a manner that satisfies the one or more quantitative metrics while using fewer computer resources than the program.Type: GrantFiled: May 20, 2009Date of Patent: May 3, 2016Assignee: Microsoft Technology Licensing, LLCInventors: Trishul A. Chilimbi, Woongki Baek
-
Publication number: 20150324690Abstract: Training large neural network models by providing training input to model training machines organized as multiple replicas that asynchronously update a shared model via a global parameter server is described herein. In at least one embodiment, a system including a model module storing a portion of a model and a deep learning training module that communicates with the model module are configured for asynchronously sending updates to shared parameters associated with the model. The techniques herein describe receiving and processing a batch of data items to calculate updates. Replicas of training machines communicate asynchronously with a global parameter server to provide updates to a shared model and return updated weight values. The model may be modified to reflect the updated weight values. The techniques described herein include computation and communication optimizations that improve system efficiency and scaling of large neural networks.Type: ApplicationFiled: September 22, 2014Publication date: November 12, 2015Inventors: Trishul A. Chilimbi, Yutaka Suzue, Johnson R. Apacible, Karthik Kalyanaraman
-
Patent number: 8959442Abstract: A “Memory Allocation Visualizer” provides a dynamic visualization that animates memory allocation event trace information over a time period of execution of a program. Consequently, the Memory Allocation Visualizer provides a visualization and understanding of a program's memory system behavior. Various modes of display with custom color mappings and zooming allow the user to see how heaps are used over time (e.g., by allocation type, age, size, thread id, etc.). Custom displays also allow the user to detect potential memory leaks and fragmentation problems. Composable filters enable the user to focus on specific issues. Various techniques are used to enable processing of a very large numbers of trace events while enabling rapid response to visualization view changes.Type: GrantFiled: June 11, 2010Date of Patent: February 17, 2015Assignee: Microsoft CorporationInventors: Trishul A. Chilimbi, Bongshin Lee, George G. Robertson
-
Patent number: 8914781Abstract: Described is predicting cache locality in a multicore/multithreaded processing environment including when threads share cache data in a non-uniform interleaving manner. Thread execution traces are analyzed to compute a set of per-thread parameters that can then be used to predict cache miss rates for other cache sizes. In one aspect, a model is based upon a probability that the cache reuse distance will increase because of accesses by other threads, and another probability that the reuse distance will decrease because of intercept accesses by other threads to shared data blocks. Estimates of the number of shared data blocks, possibly shared data blocks and private data blocks are used in the computations.Type: GrantFiled: October 24, 2008Date of Patent: December 16, 2014Assignee: Microsoft CorporationInventors: Trishul A. Chilimbi, Chen Ding
-
Patent number: 8464221Abstract: A system and method for identifying a root cause of a wait in a computer system are provided. Given the identity of a thread of interest and time window, a longest wait period for the thread of interest within the time window is identified. The longest wait period is used as a starting node to generate a ready tree by walking backwards through the data in a system trace to construct a tree of readying events that ready threads for running on a processor. A potentially anomalous chain of events is automatically identified and highlighted in the ready tree. A visualization of the ready tree is presented to a user so that the user can explore the events in the tree and annotate the automatically generated tree to aid in problem diagnosis.Type: GrantFiled: June 16, 2009Date of Patent: June 11, 2013Assignee: Microsoft CorporationInventors: Alice X. Zheng, Trishul A Chilimbi, Shuo-Hsien Hsiao, Danyel A. Fisher, David M. Andrzejewski
-
Patent number: 8397221Abstract: Bounding resource consumption of code that processes recursive data structures and collections includes making use of quantitative functions (based on user input) that are associated with a tuple of data-structures and whose semantics is specified by describing the effect of various data-structure methods on the relevant quantitative functions. Counter variables are incorporated into source code to count loop iterations (and number of recursive procedure call invocations). Relevant quantitative functions are incorporated into the source code to allow computation of invariants (and hence bounds) on the incorporated counter variables in terms of the quantitative functions.Type: GrantFiled: October 7, 2008Date of Patent: March 12, 2013Assignee: Microsoft CorporationInventors: Sumit Gulwani, Krishna Kumar Mehra, Trishul A Chilimbi
-
Patent number: 8266598Abstract: Bounding resource consumption of code using abstract interpretation includes a static analysis to estimate a code's resource consumption in terms of units of resources utilized at any point during execution, expressed as a function of its scalar inputs. An instrumentation mechanism and an abstract interpretation mechanism are employed to compute bounds on the code resource consumption. The instrumentation mechanism includes incorporating one or more counter variables in the source code to count the number of loop iterations and recursive procedure call invocations. The abstract interpretation mechanism includes computing invariants on the instrumented counter variables and scalar program variables to obtain bounds on the number of loop iterations and recursive procedure call invocations, which are then composed together to obtain resource bounds for the entire program.Type: GrantFiled: May 5, 2008Date of Patent: September 11, 2012Assignee: Microsoft CorporationInventors: Sumit Gulwani, Krishna Kumar Mehra, Trishul A Chilimbi
-
Publication number: 20110307828Abstract: A “Memory Allocation Visualizer” provides a dynamic visualization that animates memory allocation event trace information over a time period of execution of a program. Consequently, the Memory Allocation Visualizer provides a visualization and understanding of a program's memory system behavior. Various modes of display with custom color mappings and zooming allow the user to see how heaps are used over time (e.g., by allocation type, age, size, thread id, etc.). Custom displays also allow the user to detect potential memory leaks and fragmentation problems. Composable filters enable the user to focus on specific issues. Various techniques are used to enable processing of a very large numbers of trace events while enabling rapid response to visualization view changes.Type: ApplicationFiled: June 11, 2010Publication date: December 15, 2011Applicant: MICROSOFT CORPORATIONInventors: Trishul A. Chilimbi, Bongshin Lee, George G. Robertson
-
Publication number: 20100318852Abstract: A system and method for identifying a root cause of a wait in a computer system are provided. Given the identity of a thread of interest and time window, a longest wait period for the thread of interest within the time window is identified. The longest wait period is used as a starting node to generate a ready tree by walking backwards through the data in a system trace to construct a tree of readying events that ready threads for running on a processor. A potentially anomalous chain of events is automatically identified and highlighted in the ready tree. A visualization of the ready tree is presented to a user so that the user can explore the events in the tree and annotate the automatically generated tree to aid in problem diagnosis.Type: ApplicationFiled: June 16, 2009Publication date: December 16, 2010Applicant: Microsoft CorporationInventors: Alice X. Zheng, Trishul A. Chilimbi, Shuo-Hsien Hsiao, Danyel A. Fisher, David M. Andrzejewski
-
Publication number: 20100299662Abstract: The described implementations relate to resource aware programming. In one case a program is obtained that is configured to perform a task in accordance with one or more quantitative metrics. An approximate version can be generated from the program. The approximate version is configured to perform the task in a manner that satisfies the one or more quantitative metrics while using fewer computer resources than the program.Type: ApplicationFiled: May 20, 2009Publication date: November 25, 2010Applicant: Microsoft CorporationInventors: Trishul A. Chilimbi, Woongki Baek
-
Patent number: 7788637Abstract: Described herein is an implementation of a technology for the construction, identification, and/or optimization of operating-system processes. At least one implementation, described herein, constructs an operating-system process having the contents as defined by a process manifest. Once constructed, the operating-system process is unalterable.Type: GrantFiled: April 29, 2005Date of Patent: August 31, 2010Assignee: Microsoft CorporationInventors: Galen C. Hunt, James R. Larus, John D. DeTreville, Edward P Wobber, Martin Abadi, Michael B. Jones, Trishul A. Chilimbi
-
Publication number: 20100107142Abstract: Described is predicting cache locality in a multicore/multithreaded processing environment including when threads share cache data in a non-uniform interleaving manner. Thread execution traces are analyzed to compute a set of per-thread parameters that can then be used to predict cache miss rates for other cache sizes. In one aspect, a model is based upon a probability that the cache reuse distance will increase because of accesses by other threads, and another probability that the reuse distance will decrease because of intercept accesses by other threads to shared data blocks. Estimates of the number of shared data blocks, possibly shared data blocks and private data blocks are used in the computations.Type: ApplicationFiled: October 24, 2008Publication date: April 29, 2010Applicant: MICROSOFT CORPORATIONInventors: Trishul A. Chilimbi, Chen Ding
-
Publication number: 20100088684Abstract: Bounding resource consumption of code that processes recursive data structures and collections includes making use of quantitative functions (based on user input) that are associated with a tuple of data-structures and whose semantics is specified by describing the effect of various data-structure methods on the relevant quantitative functions. Counter variables are incorporated into source code to count loop iterations (and number of recursive procedure call invocations). Relevant quantitative functions are incorporated into the source code to allow computation of invariants (and hence bounds) on the incorporated counter variables in terms of the quantitative functions.Type: ApplicationFiled: October 7, 2008Publication date: April 8, 2010Applicant: Microsoft CorporationInventors: Sumit Gulwani, Krishna Kumar Mehra, Trishul A. Chilimbi
-
Patent number: 7694300Abstract: Described herein is an implementation of a technology for the construction, identification, and/or optimization of operating-system processes. At least one implementation, described herein, constructs an operating-system process having the contents as defined by a process manifest. Once constructed, the operating-system process is unalterable.Type: GrantFiled: April 29, 2005Date of Patent: April 6, 2010Assignee: Microsoft CorporationInventors: Galen C. Hunt, James R. Larus, John D. DeTreville, Michael B. Jones, Trishul A. Chilimbi