Patents by Inventor Scott A. Mahlke

Scott A. Mahlke has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11321604
    Abstract: Subject matter disclosed herein may relate to storage and/or processing of signals and/or states representative of neural network parameters in a computing device, and may relate more particularly to compressing signals and/or states representative of neural network nodes in a computing device.
    Type: Grant
    Filed: June 21, 2017
    Date of Patent: May 3, 2022
    Assignees: ARM Ltd., The Regents of the University of Michigan
    Inventors: Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, Scott Mahlke
  • Patent number: 11275996
    Abstract: Subject matter disclosed herein may relate to storage of signals and/or states representative of parameters in a computing device, and may relate more particularly to storage of signals and/or states representative of neural network parameters in a computing device.
    Type: Grant
    Filed: June 21, 2017
    Date of Patent: March 15, 2022
    Assignees: ARM Ltd., The Regents of the University of Michigan
    Inventors: Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, Scott Mahlke
  • Patent number: 10613866
    Abstract: A method, apparatus, and non-transitory computer-readable medium detect repetition of an out-of-order execution schedule for a group of instructions executed by an out-of-order processor. Data indicative of at least one performance metric for an instance of execution of the group of instructions by the out-of-order processor is determined. The determined data is compared with previous data of the at least one performance metric for at least one previous instance of execution of the group of instructions by the out-of-order processor. Repetition of the out-of-order execution schedule is detected dependent on the comparison.
    Type: Grant
    Filed: April 4, 2017
    Date of Patent: April 7, 2020
    Assignee: The Regents of the University of Michigan
    Inventors: Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott Mahlke
  • Patent number: 10585701
    Abstract: A technique is provided for processing thread groups, each thread group having associated program code comprising a plurality of regions that each require access to an associated plurality of registers providing operand values for the instructions of that region. Capacity management circuitry is arranged, for a thread group having a region of the associated program code that is ready to be executed, to perform an operand setup process to reserve sufficient storage elements within an operand staging unit to provide the associated plurality of registers, and to cause the operand value for any input register to be preloaded into a reserved storage element allocated for that input register, an input register being a register whose operand value is required before the region can be executed. Scheduling circuitry selects for processing a thread group for which the operand setup process has been performed in respect of the region to be executed.
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: March 10, 2020
    Assignee: The Regents of the University of Michigan
    Inventors: John Kloosterman, Jonathan Beaumont, Davoud Anoushe Jamshidi, Jonathan Bailey, Trevor Mudge, Scott Mahlke
  • Patent number: 10310858
    Abstract: Apparatus and a corresponding method for controlling a transition between use of first processing circuitry and second processing circuitry to execute program instructions are provided. Transition monitoring storage selects an entry for a load program instruction executed during the transition in dependence on a memory address from which a value is to be loaded and stores a program order timestamp for the load program instruction, unless a valid previously stored program order timestamp in the entry precedes the program order timestamp. Thus the oldest timestamp of a load instruction executed in the transition is held. At either the start or end (or both) of the transition the content of the transition monitoring storage is cleared.
    Type: Grant
    Filed: March 8, 2016
    Date of Patent: June 4, 2019
    Assignee: The Regents of the University of Michigan
    Inventors: Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Scott Mahlke, Jiecao Yu
  • Publication number: 20190114205
    Abstract: An apparatus and method are provided for processing thread groups, where each thread group has associated program code and comprises one or more threads. Scheduling circuitry is used to select thread groups from a plurality of thread groups, and then thread processing circuitry is responsive to the scheduling circuitry to process one or more threads of a selected thread group by executing instructions of the associated program code. The associated program code comprises a plurality of regions that each require access to an associated plurality of registers providing operand values for the instructions of that region. An operand staging unit is provided that has a plurality of storage elements that are dynamically allocated to provide the associated plurality of registers for one or more of the regions.
    Type: Application
    Filed: October 12, 2017
    Publication date: April 18, 2019
    Inventors: John KLOOSTERMAN, Jonathan BEAUMONT, Davoud Anoushe JAMSHIDI, Jonathan BAILEY, Trevor MUDGE, Scott MAHLKE
  • Publication number: 20180373975
    Abstract: Subject matter disclosed herein may relate to storage and/or processing of signals and/or states representative of neural network parameters in a computing device, and may relate more particularly to compressing signals and/or states representative of neural network nodes in a computing device.
    Type: Application
    Filed: June 21, 2017
    Publication date: December 27, 2018
    Inventors: Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, Scott Mahlke
  • Publication number: 20180373978
    Abstract: Subject matter disclosed herein may relate to storage of signals and/or states representative of parameters in a computing device, and may relate more particularly to storage of signals and/or states representative of neural network parameters in a computing device.
    Type: Application
    Filed: June 21, 2017
    Publication date: December 27, 2018
    Inventors: Jiecao Yu, Andrew Lukefahr, David Palframan, Ganesh Dasika, Reetuparna Das, Scott Mahlke
  • Publication number: 20180285111
    Abstract: A method of detecting repetition of an out-of-order execution schedule for a group of instructions executed by an out-of-order processor, said method comprising: determining data indicative of at least one performance metric for an instance of execution of said group of instructions by said out-of-order processor; performing a comparison of said determined data with previous data of said at least one performance metric for at least one previous instance of execution of said group of instructions by said out-of-order processor; and detecting repetition of said out-of-order execution schedule dependent on said comparison, a corresponding apparatus and non-transitory computer-readable medium.
    Type: Application
    Filed: April 4, 2017
    Publication date: October 4, 2018
    Inventors: Shruti PADMANABHA, Andrew LUKEFAHR, Reetuparna DAS, Scott MAHLKE
  • Patent number: 9965279
    Abstract: An apparatus for processing data includes first execution circuitry, such as an out-of-order processor, and second execution circuitry, such as an in-order processor. The first execution circuitry is of higher performance but uses more energy than the second execution circuitry. Control circuitry switches between the first execution circuitry being active and the second execution circuitry being active. The control circuitry includes prediction circuitry which is configured to predict a predicted identity of a next sequence of program instructions to be executed in dependence upon a most recently executed sequence of program instructions and then in dependence upon this predicted identity to predict a predicted execution target corresponding to whether the next sequence of program instructions should be executed by the first execution circuitry or the second execution circuitry.
    Type: Grant
    Filed: November 29, 2013
    Date of Patent: May 8, 2018
    Assignee: The Regents of the University of Michigan
    Inventors: Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott Mahlke
  • Patent number: 9898409
    Abstract: A multithreaded data processing system performs processing using resource circuitry which is a finite resource. A saturation signal is generated to indicate when the resource circuitry is no longer able to perform processing operations issued to it. This saturation signal may be used to select a scheduling algorithm to be used for further scheduling, such as switching to scheduling from a single thread as opposed to round-robin scheduling from all of the threads. Re-execution queue circuitry is used to queue processing operations which have been enabled to be issued so as to permit other processing operations which may not be blocked by the lack of use of circuitry to attempt issue.
    Type: Grant
    Filed: October 9, 2014
    Date of Patent: February 20, 2018
    Assignee: The Regents of the University of Michigan
    Inventors: Ankit Sethia, Scott Mahlke
  • Patent number: 9898333
    Abstract: Provided are a method and apparatus for selecting a preemption technique for at least one computation unit included in a processor to execute a second task before the at least one computation unit finishes executing a first task. The method includes receiving a preemption request, predicting a cost of preemption techniques based on a progress of the first task until receipt of the preemption request, and selecting one of the preemption techniques based on the predicted cost.
    Type: Grant
    Filed: August 7, 2015
    Date of Patent: February 20, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jason Jong Kyu Park, Scott Mahlke, Donghoon Yoo
  • Patent number: 9870226
    Abstract: A data processing apparatus includes a first execution mechanism, such as out-of-order processing circuitry, and a second execution mechanism, such as in-order processing circuitry. Switching control circuitry controls switching between which of the first execution circuitry and the second execution circuitry is active at a given time. Latency indicating signals indicative of the latency associated with a candidate switching operation to be performed are supplied to the switching control circuitry and used to control the switching operation. The control of the switching operation may be to accelerate the switching operation, prevent the switching operation, perform early architectural state data transfer or other possibilities.
    Type: Grant
    Filed: July 3, 2014
    Date of Patent: January 16, 2018
    Assignee: The Regents of the University of Michigan
    Inventors: Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott Mahlke
  • Publication number: 20170262285
    Abstract: Apparatus and a corresponding method for controlling a transition between use of first processing circuitry and second processing circuitry to execute program instructions are provided. Transition monitoring storage selects an entry for a load program instruction executed during the transition in dependence on a memory address from which a value is to be loaded and stores a program order timestamp for the load program instruction, unless a valid previously stored program order timestamp in the entry precedes the program order timestamp. Thus the oldest timestamp of a load instruction executed in the transition is held. At either the start or end (or both) of the transition the content of the transition monitoring storage is cleared.
    Type: Application
    Filed: March 8, 2016
    Publication date: September 14, 2017
    Inventors: Andrew LUKEFAHR, Shruti PADMANABHA, Reetuparna DAS, Scott MAHLKE, Jiecao YU
  • Patent number: 9652262
    Abstract: A data processing system comprising multiple GPUs includes instruction queue circuitry storing data specifying program instructions for threads awaiting issue for execution. Instruction characterization circuitry determines one or more characteristics of the program instructions awaiting issue within the instruction queue circuitry and supplies this to operating parameter control circuitry. The operating parameter control circuitry alters one or more operating parameters of the system in response to the one or more characteristics of the program instructions awaiting issue.
    Type: Grant
    Filed: October 9, 2014
    Date of Patent: May 16, 2017
    Assignee: The Regents of the University of Michigan
    Inventors: Ankit Sethia, Scott Mahlke
  • Patent number: 9639363
    Abstract: A processor core includes a front end, and first and second back ends, the front end including a fetch engine configured to retrieve the sequence of data processing instructions for both the first back end and the second back end from a memory, and the first and second back ends are each configured to execute the sequence of program instructions. The core operates in a first mode in which the first back end is active and receives the sequence of data processing instructions from the fetch engine and the second back end is inactive, and a second mode in which the first back end is inactive and the second back end is active and receives the sequence of data processing instructions from the fetch engine, where the cycles-per-instruction rate is lower and energy consumption is higher for the first mode than the second mode.
    Type: Grant
    Filed: November 29, 2013
    Date of Patent: May 2, 2017
    Assignee: The Regents of the University of Michigan
    Inventors: Andrew Lukefahr, Reetuparna Das, Shruti Padmanabha, Scott Mahlke
  • Publication number: 20160179572
    Abstract: Provided are a method and apparatus for selecting a preemption technique for at least one computation unit included in a processor to execute a second task before the at least one computation unit finishes executing a first task. The method includes receiving a preemption request, predicting a cost of preemption techniques based on a progress of the first task until receipt of the preemption request, and selecting one of the preemption techniques based on the predicted cost.
    Type: Application
    Filed: August 7, 2015
    Publication date: June 23, 2016
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jason Jong Kyu PARK, Scott MAHLKE, Donghoon YOO
  • Patent number: 9342478
    Abstract: Disclosed is configuration memory access technology in a processor with a reconfigurable architecture. The processor with the reconfigurable architecture includes an array of processing elements (PEs), a configuration memory and a token network. The configuration memory stores configuration data associated with controlling data flow of the respective PEs. The token network reads the configuration data from the configuration memory, estimates data flow of the PEs from the read configuration data, reads required configuration data from the configuration memory based on the estimated data flow, and supplies the required configuration data to corresponding PEs. By reducing configuration memory access frequency through a token network, power consumption may be reduced.
    Type: Grant
    Filed: October 19, 2009
    Date of Patent: May 17, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Heejun Shim, Sukjin Kim, Hyunchul Park, Scott Mahlke, Yongjun Park
  • Patent number: 9329846
    Abstract: Cooperative program code transformation includes receiving a transformation hint request, obtaining a suitable transformation hint, and providing the suitable transformation hint such that it is used to transform at least a portion of the program code and generate optimized code.
    Type: Grant
    Filed: November 22, 2010
    Date of Patent: May 3, 2016
    Assignee: Parakinetics Inc.
    Inventors: David I. August, Kevin C. Fan, Jae Wook Lee, Scott A. Mahlke, Mojtaba Mehrara
  • Publication number: 20160103715
    Abstract: A multithreaded data processing system performs processing using resource circuitry which is a finite resource. A saturation signal is generated to indicate when the resource circuitry is no longer able to perform processing operations issued to it. This saturation signal may be used to select a scheduling algorithm to be used for further scheduling, such as switching to scheduling from a single thread as opposed to round-robin scheduling from all of the threads. Re-execution queue circuitry is used to queue processing operations which have been enabled to be issued so as to permit other processing operations which may not be blocked by the lack of use of circuitry to attempt issue.
    Type: Application
    Filed: October 9, 2014
    Publication date: April 14, 2016
    Inventors: Ankit SETHIA, Scott MAHLKE