Patents by Inventor Brian T. Lewis
Brian T. Lewis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10186007
Abstract: An example system for adaptive scheduling of task assignment among heterogeneous processor cores may include any number of CPUs, a graphics processing unit (GPU) and memory configured to store a pool of work items to be shared by the CPUs and GPU. The system may also include a GPU proxy profiling module associated with one of the CPUs to profile execution of a first portion of the work items on the GPU. The system may further include profiling modules, each associated with one of the CPUs, to profile execution of a second portion of the work items on each of the CPUs. The measured profiling information from the CPU profiling modules and the GPU proxy profiling module is used to calculate a distribution ratio for execution of a remaining portion of the work items between the CPUs and the GPU.
Type: Grant
Filed: December 26, 2014
Date of Patent: January 22, 2019
Assignee: Intel Corporation
Inventors: Rajkishore Barik, Tatiana Shpeisman, Brian T. Lewis, Rashid Kaleem
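The profiling-then-split idea in this abstract can be sketched as follows. This is an illustrative sketch only; the function names and the use of items-per-second throughput as the profiling metric are our assumptions, not details taken from the patent.

```python
# Sketch: after profiling a first portion of the work items on the GPU and a
# second portion on the CPUs, split the remaining pool in proportion to the
# measured throughputs (illustrative metric, not the patented method).

def distribution_ratio(gpu_items, gpu_seconds, cpu_items, cpu_seconds):
    """Fraction of the remaining work items to assign to the GPU,
    based on measured throughput (items/second) from profiling."""
    gpu_rate = gpu_items / gpu_seconds
    cpu_rate = cpu_items / cpu_seconds
    return gpu_rate / (gpu_rate + cpu_rate)

def split_remaining(n_remaining, ratio):
    """Partition the remaining pool between the GPU and the CPUs."""
    gpu_share = round(n_remaining * ratio)
    return gpu_share, n_remaining - gpu_share
```

With these definitions, a GPU that processed 300 items in the time the CPUs processed 100 would receive 75% of the remaining pool.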
-
Publication number: 20180314936
Abstract: A mechanism is described for facilitating smart distribution of resources for deep learning autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and introducing a library to a neural network application to determine an optimal point at which to apply frequency scaling without degrading performance of the neural network application at a computing device.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Applicant: Intel Corporation
Inventors: Rajkishore Barik, Brian T. Lewis, Murali Sundaresan, Jeffrey Jackson, Feng Chen, Xiaoming Chen, Mike Macpherson
-
Publication number: 20180314250
Abstract: A mechanism is described for facilitating smart collection of data and smart management of autonomous machines. A method of embodiments, as described herein, includes detecting one or more sets of data from one or more sources over one or more networks, and combining a first computation directed to be performed locally at a local computing device with a second computation directed to be performed remotely at a remote computing device in communication with the local computing device over the one or more networks, where the first computation consumes low power and the second computation consumes high power.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Applicant: Intel Corporation
Inventors: Brian T. Lewis, Feng Chen, Jeffrey R. Jackson, Justin E. Gottschlich, Rajkishore Barik, Xiaoming Chen, Prasoonkumar Surti, Mike B. Macpherson, Murali Sundaresan
-
Publication number: 20180314935
Abstract: A mechanism is described for facilitating efficient training of neural networks at computing devices. A method of embodiments, as described herein, includes detecting one or more inputs for training of a neural network, and introducing randomness in floating point (FP) numbers to prevent overtraining of the neural network, where introducing randomness includes replacing less-significant low-order bits of operand and result values with new low-order bits during the training of the neural network.
Type: Application
Filed: April 28, 2017
Publication date: November 1, 2018
Applicant: Intel Corporation
Inventors: Brian T. Lewis, Rajkishore Barik, Murali Sundaresan, Leonard Truong
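The low-order-bit replacement described in this abstract can be illustrated on a single 32-bit float. This is a toy sketch, not the patented implementation: the function name, the choice of 8 bits, and the use of `struct` bit-twiddling are all our assumptions.

```python
# Sketch: replace the n_bits lowest mantissa bits of a 32-bit float with
# random bits, a stochastic perturbation in the spirit of the abstract.
import random
import struct

def randomize_low_bits(x, n_bits=8):
    """Return x with its n_bits lowest mantissa bits replaced at random."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))  # float -> raw bits
    mask = (1 << n_bits) - 1
    bits = (bits & ~mask) | random.getrandbits(n_bits)   # swap low bits
    (y,) = struct.unpack("<f", struct.pack("<I", bits))  # raw bits -> float
    return y
```

Because only the least-significant mantissa bits change, the perturbed value stays very close to the original (for 1.0 with 8 bits, within about 3e-5).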
-
Publication number: 20180300964
Abstract: One embodiment provides for a computing device within an autonomous vehicle, the computing device comprising a wireless network device to enable a wireless data connection with an autonomous vehicle network, a set of multiple processors including a general-purpose processor and a general-purpose graphics processor, the set of multiple processors to execute a compute manager to manage execution of compute workloads associated with the autonomous vehicle, the compute workloads associated with autonomous operations of the autonomous vehicle, and offload logic configured to execute on the set of multiple processors, the offload logic to determine to offload one or more of the compute workloads to one or more autonomous vehicles within range of the wireless network device.
Type: Application
Filed: April 17, 2017
Publication date: October 18, 2018
Applicant: Intel Corporation
Inventors: Barath Lakshamanan, Linda L. Hurd, Ben J. Ashbaugh, Elmoustapha Ould-Ahmed-Vall, Liwei Ma, Jingyi Jin, Justin E. Gottschlich, Chandrasekaran Sakthivel, Michael S. Strickland, Brian T. Lewis, Lindsey Kuper, Altug Koker, Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Balaji Vembu, Javier S. Turek, Naila Farooqui
-
Publication number: 20180300845
Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a memory, one or more execution units (EUs) to execute a plurality of processing threads and prefetch logic to prefetch pages of data from the memory to assist in the execution of the plurality of processing threads.
Type: Application
Filed: April 17, 2017
Publication date: October 18, 2018
Applicant: Intel Corporation
Inventors: Adam T. Lake, Guei-Yuan Lueh, Balaji Vembu, Murali Ramadoss, Prasoonkumar Surti, Abhishek R. Appu, Altug Koker, Subramaniam M. Maiyuran, Eric C. Samson, David J. Cowperthwaite, Zhi Wang, Kun Tian, David Puffer, Brian T. Lewis
-
Publication number: 20180267844
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for implementing function callback requests between a first processor (e.g., a GPU) and a second processor (e.g., a CPU). The system may include a shared virtual memory (SVM) coupled to the first and second processors, the SVM configured to store at least one double-ended queue (Deque). An execution unit (EU) of the first processor may be associated with a first of the Deques and configured to push the callback requests to that first Deque. A request handler thread executing on the second processor may be configured to: pop one of the callback requests from the first Deque; execute a function specified by the popped callback request; and generate a completion signal to the EU in response to completion of the function.
Type: Application
Filed: November 24, 2015
Publication date: September 20, 2018
Applicant: Intel Corporation
Inventors: Brian T. Lewis, Rajkishore Barik, Tatiana Shpeisman
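The push/pop/complete cycle in this abstract can be sketched in a few lines. This is a minimal single-threaded illustration of the queueing pattern, assuming invented names (`CallbackDeque`, `push_request`, `handle_one`); it does not model the shared virtual memory or the GPU hardware.

```python
# Sketch: the "EU" side pushes callback requests onto a double-ended queue;
# a request-handler loop pops a request, runs the requested function, and
# records a completion signal. Illustrative only.
from collections import deque

class CallbackDeque:
    def __init__(self):
        self.requests = deque()
        self.completed = []          # stands in for completion signals to the EU

    def push_request(self, func, *args):
        # EU side: enqueue a callback request (function plus arguments)
        self.requests.append((func, args))

    def handle_one(self):
        # Handler side: pop one request, execute it, signal completion
        func, args = self.requests.popleft()
        result = func(*args)
        self.completed.append(result)
        return result
```

A usage example: pushing `lambda a, b: a + b` with arguments `(2, 3)` and then calling `handle_one()` executes the function and records its result as the completion signal.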
-
Patent number: 9342384
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for implementing function callback requests between a first processor (e.g., a GPU) and a second processor (e.g., a CPU). The system may include a shared virtual memory (SVM) coupled to the first and second processors, the SVM configured to store at least one double-ended queue (Deque). An execution unit (EU) of the first processor may be associated with a first of the Deques and configured to push the callback requests to that first Deque. A request handler thread executing on the second processor may be configured to: pop one of the callback requests from the first Deque; execute a function specified by the popped callback request; and generate a completion signal to the EU in response to completion of the function.
Type: Grant
Filed: December 18, 2014
Date of Patent: May 17, 2016
Assignee: Intel Corporation
Inventors: Brian T. Lewis, Rajkishore Barik, Tatiana Shpeisman
-
Publication number: 20160055612
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for adaptive scheduling of task assignment among heterogeneous processor cores. The system may include any number of CPUs, a graphics processing unit (GPU) and memory configured to store a pool of work items to be shared by the CPUs and GPU. The system may also include a GPU proxy profiling module associated with one of the CPUs to profile execution of a first portion of the work items on the GPU. The system may further include profiling modules, each associated with one of the CPUs, to profile execution of a second portion of the work items on each of the CPUs. The measured profiling information from the CPU profiling modules and the GPU proxy profiling module is used to calculate a distribution ratio for execution of a remaining portion of the work items between the CPUs and the GPU.
Type: Application
Filed: December 26, 2014
Publication date: February 25, 2016
Applicant: Intel Corporation
Inventors: Rajkishore Barik, Tatiana Shpeisman, Brian T. Lewis, Rashid Kaleem
-
Publication number: 20150220340
Abstract: Various embodiments are generally directed to techniques for assigning instances of blocks of instructions of a routine to one of multiple types of core of a heterogeneous set of cores of a processor component. An apparatus to select types of cores includes a processor component; a core selection component for execution by the processor component to select a core of multiple cores to execute an initial subset of multiple instances of an instruction block in parallel based on characteristics of instructions of the instruction block, and to select a core of the multiple cores to execute remaining instances of the multiple instances of the instruction block in parallel based on characteristics of execution of the initial subset stored in an execution database; and a monitoring component for execution by the processor component to record the characteristics of execution of the initial subset in the execution database. Other embodiments are described and claimed.
Type: Application
Filed: October 4, 2013
Publication date: August 6, 2015
Inventors: Rajkishore Barik, Brian T. Lewis, Tatiana Shpeisman
-
Patent number: 8719828
Abstract: An apparatus and method are described herein for adaptive thread scheduling in a transactional memory environment. The number of conflicts in a thread over time is tracked; if the conflicts exceed a threshold, the thread may be delayed (adaptively scheduled) to avoid conflicts between competing threads. Moreover, a more complex version may track the number of transaction aborts within a first thread that are caused by a second thread over a period, as well as the total number of transactions executed by the first thread over the period. From this tracking, a conflict ratio is determined for the first thread with regard to the second thread. When the first thread is to be scheduled, it may be delayed if the second thread is running and the conflict ratio is over a conflict ratio threshold.
Type: Grant
Filed: October 14, 2011
Date of Patent: May 6, 2014
Assignee: Intel Corporation
Inventors: Brian T. Lewis, Bratin Saha
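The conflict-ratio bookkeeping in this abstract can be sketched directly. The class and method names below are our illustrative assumptions; the sketch implements only the ratio computation and the delay decision, not an actual transactional memory system.

```python
# Sketch: track aborts of this thread caused by each other thread against
# this thread's total transactions; delay scheduling when the other thread
# is running and the conflict ratio exceeds a threshold.

class ConflictTracker:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.aborts_caused_by = {}   # other thread id -> abort count
        self.total_transactions = 0

    def record_transaction(self, aborted_by=None):
        """Record one transaction; aborted_by names the thread that
        caused an abort, or None if the transaction committed."""
        self.total_transactions += 1
        if aborted_by is not None:
            self.aborts_caused_by[aborted_by] = (
                self.aborts_caused_by.get(aborted_by, 0) + 1)

    def conflict_ratio(self, other):
        if self.total_transactions == 0:
            return 0.0
        return self.aborts_caused_by.get(other, 0) / self.total_transactions

    def should_delay(self, other, other_running):
        # Delay only if the conflicting thread is actually running now
        return other_running and self.conflict_ratio(other) > self.threshold
```

For example, with a threshold of 0.5, a thread aborted by "B" in two of its three transactions would be delayed while "B" runs, but not once "B" stops.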
-
Publication number: 20130097607
Abstract: An apparatus and method are described herein for adaptive thread scheduling in a transactional memory environment. The number of conflicts in a thread over time is tracked; if the conflicts exceed a threshold, the thread may be delayed (adaptively scheduled) to avoid conflicts between competing threads. Moreover, a more complex version may track the number of transaction aborts within a first thread that are caused by a second thread over a period, as well as the total number of transactions executed by the first thread over the period. From this tracking, a conflict ratio is determined for the first thread with regard to the second thread. When the first thread is to be scheduled, it may be delayed if the second thread is running and the conflict ratio is over a conflict ratio threshold.
Type: Application
Filed: October 14, 2011
Publication date: April 18, 2013
Inventors: Brian T. Lewis, Bratin Saha
-
Patent number: 7424705
Abstract: Disclosed are a method, apparatus and system for dynamically managing the layout of compiled code in a managed runtime environment. Profile feedback is generated during runtime, based on hardware event data gathered during execution. A code manager dynamically relocates compiled code to reduce miss events based on the profile feedback. The code manager may also relocate virtual method tables in a virtual table region in order to reduce data miss events. The method does not require a prior run of an application program because profile feedback is based on event data tracked by hardware during execution of the software application rather than on instrumented code.
Type: Grant
Filed: March 11, 2004
Date of Patent: September 9, 2008
Assignee: Intel Corporation
Inventors: Brian T. Lewis, James M. Stichnoth, Dong-Yuan Chen
-
Patent number: 7237064
Abstract: Disclosed are a method, apparatus and system for managing a shared heap and compiled code cache in a managed runtime environment. Based on feedback generated during runtime, a runtime storage manager dynamically allocates storage space, from a shared storage region, between a compiled code cache and a heap. For at least one embodiment, the size of the shared storage region may be increased if a growth need is identified for both the compiled code cache and the heap during a single iteration of runtime storage manager processing.
Type: Grant
Filed: October 10, 2003
Date of Patent: June 26, 2007
Assignee: Intel Corporation
Inventor: Brian T. Lewis
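One iteration of the storage manager described in this abstract can be sketched as a pure function. The function name, the fixed growth step, and the order in which the code cache and heap are served are all illustrative assumptions, not details from the patent.

```python
# Sketch: one iteration of a runtime storage manager for a shared region
# holding a compiled-code cache and a heap. If both need to grow in the
# same iteration, the shared region itself is enlarged first.

def manage_storage(region_size, code_size, heap_size,
                   code_needs_growth, heap_needs_growth, step=4096):
    """Return (new_region_size, new_code_size, new_heap_size)."""
    free = region_size - code_size - heap_size
    if code_needs_growth and heap_needs_growth:
        region_size += 2 * step      # grow the shared region itself
        free += 2 * step
    if code_needs_growth and free >= step:
        code_size += step
        free -= step
    if heap_needs_growth and free >= step:
        heap_size += step
        free -= step
    return region_size, code_size, heap_size
```

For instance, starting from a full 8 KiB region split evenly, a growth need from both consumers enlarges the region to 16 KiB and gives each consumer one step.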
-
Patent number: 6718438
Abstract: The present invention uses feedback to determine the size of an object cache. The size of the cache (i.e., its budget) varies and is determined based on feedback from the persistent object system. Persistent objects are evicted from the cache if the storage for persistent objects exceeds the budget. If the storage is less than the budget, then persistent objects in the heap are retained while new persistent objects are added to the cache.
Type: Grant
Filed: December 13, 2000
Date of Patent: April 6, 2004
Assignee: Sun Microsystems, Inc.
Inventors: Brian T. Lewis, Bernd J. W. Mathiske, Neal M. Gafter, Michael J. Jordan
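The budget-driven retention/eviction policy in this abstract can be sketched with a small cache class. The names, the byte-size accounting, and the oldest-first eviction order are our illustrative assumptions; the patent's feedback mechanism for adjusting the budget is not modeled here.

```python
# Sketch: retain cached persistent objects while their total storage is
# within the budget; evict (oldest first, here) once the budget is exceeded.
from collections import OrderedDict

class ObjectCache:
    def __init__(self, budget):
        self.budget = budget          # adjusted via feedback elsewhere
        self.objects = OrderedDict()  # object id -> size in bytes

    def used(self):
        return sum(self.objects.values())

    def add(self, obj_id, size):
        self.objects[obj_id] = size
        # Evict oldest entries until we are back within budget,
        # always keeping the newly added object.
        while self.used() > self.budget and len(self.objects) > 1:
            self.objects.popitem(last=False)
```

For example, with a 100-byte budget, adding two 60-byte objects evicts the first; adding a further 30-byte object then fits without eviction.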
-
Patent number: 6493730
Abstract: One embodiment of the present invention provides a system for allocating storage space for objects within a persistent object system. The persistent object system includes an object heap that is organized into a young generation region and an old generation region. The system uses the young generation region for newly created objects and uses the old generation region for objects that have not been removed by several garbage collection cycles. The system allocates storage space for new (transient) objects in the young generation region of the object heap. Periodically, the system copies the transient objects from the object heap to a stable store to form a checkpoint of the system state. Transient objects become persistent objects when they are copied to the stable store. Persistent objects are removed from the object heap when the system is stopped and when room is needed in the object heap for additional objects.
Type: Grant
Filed: October 10, 2000
Date of Patent: December 10, 2002
Assignee: Sun Microsystems, Inc.
Inventors: Brian T. Lewis, Bernd J. W. Mathiske, Antonios Printezis, Malcolm P. Atkinson
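The generational layout and checkpointing flow in this abstract can be sketched with a toy heap model. All names below are our illustrative assumptions; real garbage collection, stable-store I/O, and eviction on shutdown are not modeled.

```python
# Sketch: new objects land in the young generation; survivors of several
# collection cycles are promoted to the old generation; a checkpoint copies
# objects to the stable store, making them persistent.

class PersistentHeap:
    def __init__(self):
        self.young = {}         # newly created (transient) objects
        self.old = {}           # survivors of several GC cycles
        self.stable_store = {}  # checkpointed (persistent) objects

    def allocate(self, obj_id, value):
        # New objects always start in the young generation
        self.young[obj_id] = value

    def promote_survivors(self, survivor_ids):
        # Objects that outlive several collection cycles move to old gen
        for obj_id in survivor_ids:
            self.old[obj_id] = self.young.pop(obj_id)

    def checkpoint(self):
        # Copy transient objects to the stable store; they become persistent
        self.stable_store.update(self.young)
        self.stable_store.update(self.old)
```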
-
Publication number: 20020073283
Abstract: The present invention uses feedback to determine the size of an object cache. The size of the cache (i.e., its budget) varies and is determined based on feedback from the persistent object system. Persistent objects are evicted from the cache if the storage for persistent objects exceeds the budget. If the storage is less than the budget, then persistent objects in the heap are retained while new persistent objects are added to the cache.
Type: Application
Filed: December 13, 2000
Publication date: June 13, 2002
Inventors: Brian T. Lewis, Bernd J. W. Mathiske, Neal M. Gafter, Michael J. Jordan
-
Patent number: 5815712
Abstract: A system for providing a user or agent control over functions defined by an object in a target application. The object is a new type of object called a controllable object, which publishes its functions for use by a control application. When target application execution is commenced, it generates predefined controllable objects, and then execution of the control application is commenced. The control application obtains a handle on the controllable object and is then able to set any of a number of predefined values in the controllable object, such as individual variables or parameters, ranges of values, a list of choices from which the user can select, and others. In this way, the user can manipulate, test and optimize the target application even during its execution, by virtue of the pre-programmed controllable object functions.
Type: Grant
Filed: July 18, 1997
Date of Patent: September 29, 1998
Assignee: Sun Microsystems, Inc.
Inventors: David M. Bristor, Brian T. Lewis, Graham Hamilton
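The publish-and-set interaction in this abstract can be sketched with a minimal class. The names are our illustrative assumptions; the sketch shows only the pattern of a controllable object publishing named values that a control application can read and set at runtime.

```python
# Sketch: a "controllable object" publishes named, settable parameters
# that a separate control application can manipulate during execution.

class ControllableObject:
    def __init__(self, **params):
        self._params = dict(params)   # published controllable values

    def published(self):
        """Names of the values this object exposes to a controller."""
        return sorted(self._params)

    def set_value(self, name, value):
        if name not in self._params:
            raise KeyError(name)      # only published values may be set
        self._params[name] = value

    def get_value(self, name):
        return self._params[name]
```

A control application would obtain a handle on such an object, list its published values, and adjust them while the target application keeps running.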
-
Patent number: 5748881
Abstract: A method and apparatus are disclosed which provide solutions to the problems encountered in an object-oriented, distributed computer system when attempts are made to monitor and display performance characteristics of objects in the system where no prior knowledge of the objects exists. The invention disclosed herein is a generic monitoring and display system which can obtain performance data from and about objects and display the data in an appropriate manner without having to create special one-time data acquisition and display programs, and which can select an appropriate display type based upon a display indicator contained in the captured data. Additionally, a tabular object is disclosed which can be used by operating objects to facilitate operating data collection and reporting.
Type: Grant
Filed: November 3, 1995
Date of Patent: May 5, 1998
Assignee: Sun Microsystems, Inc.
Inventors: Brian T. Lewis, Graham Hamilton
-
Patent number: 5590331
Abstract: A method and apparatus for generating a platform-standard object file containing machine-independent abstract code. Source code which defines a procedure is converted into abstract code which makes no assumptions about the platform on which the procedure will be executed. An abstract code platform-standard object file is generated based on the abstract code. The abstract code platform-standard object file includes a list of definitions of any global variables defined in the abstract code, a list of symbol references indicative of any external variables or external procedures referenced in the abstract code, a sequence of machine instructions for calling an execution routine when a client calls the procedure, and the abstract code which defines the procedure. The abstract code is preferably compressed before it is stored in the abstract code platform-standard object file. When a program including the abstract code platform-standard object file is executed, it is dynamically linked to the execution routine.
Type: Grant
Filed: December 23, 1994
Date of Patent: December 31, 1996
Assignee: Sun Microsystems, Inc.
Inventors: Brian T. Lewis, Theodore C. Goldstein
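The four-part file layout enumerated in this abstract can be summarized as a data structure. The field names below are ours, not the patent's, and `zlib` merely stands in for "the abstract code is preferably compressed before it is stored".

```python
# Sketch: the sections of the abstract-code platform-standard object file
# described above, with compression of the stored abstract code.
from dataclasses import dataclass, field
from typing import List
import zlib

@dataclass
class AbstractCodeObjectFile:
    global_definitions: List[str] = field(default_factory=list)  # globals defined
    symbol_references: List[str] = field(default_factory=list)   # external refs
    trampoline: bytes = b""       # machine code that calls the execution routine
    abstract_code: bytes = b""    # stored compressed

    def store_code(self, code: bytes):
        self.abstract_code = zlib.compress(code)

    def load_code(self) -> bytes:
        return zlib.decompress(self.abstract_code)
```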