Patents by Inventor Matthew C. Merten

Matthew C. Merten has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10152401
    Abstract: Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: December 11, 2018
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, Bret L. Toll, Konrad K. Lai, Matthew C. Merten, Martin G. Dixon
  • Publication number: 20180253370
    Abstract: A processor is to execute and retire instructions for a virtual machine. A reload register coupled to the core is to store a reload value. A performance monitoring counter (PMC) register is coupled to the reload register, and an event-based sampler is operatively coupled to the reload register and the PMC register. The event-based sampler includes circuitry to load the reload value into the PMC register and increment the PMC register after detecting each occurrence of an event of a certain type as a result of execution of the instructions. Upon detecting an occurrence of the event after the PMC register reaches a predetermined trigger value, the event-based sampler is to execute microcode to generate field data for elements within a sampling record, wherein the field data relates to a current processor state of execution, and reload the reload value from the reload register into the PMC register.
    Type: Application
    Filed: May 7, 2018
    Publication date: September 6, 2018
    Inventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
  • Patent number: 9965375
    Abstract: A core includes a memory buffer and executes an instruction within a virtual machine. A processor tracer captures trace data and formats the trace data as trace data packets. An event-based sampler generates field data for a sampling record in response to occurrence of an event of a certain type as a result of execution of the instruction. The processor tracer, upon receipt of the field data: formats the field data into elements of the sampling record as a group of record packets; inserts the group of record packets between the trace data packets as a combined packet stream; and stores the combined packet stream in the memory buffer as a series of output pages. The core, when in guest profiling mode, executes a virtual machine monitor to map output pages of the memory buffer to host physical pages of main memory using multilevel page tables.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: May 8, 2018
    Assignee: Intel Corporation
    Inventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
  • Patent number: 9904553
    Abstract: A processor and method are described for scheduling operations for execution within a reservation station. For example, a method in accordance with one embodiment of the invention includes the operations of: classifying a plurality of operations based on the execution ports usable to execute those operations; allocating the plurality of operations into groups within a reservation station based on the classification, wherein each group is serviced by one or more execution ports corresponding to the classification, and wherein two or more entries within a group share a common read port and a common write port; dynamically scheduling two or more operations in a group for concurrent execution based on the ports capable of executing those operations and a relative age of the operations.
    Type: Grant
    Filed: May 19, 2016
    Date of Patent: February 27, 2018
    Assignee: Intel Corporation
    Inventors: Bambang Sutanto, Srikanth T. Srinivasan, Matthew C. Merten, Chia Yin Kevin Lai, Ammon J. Christiansen, Justin M. Deinlein
  • Publication number: 20180004628
    Abstract: There is disclosed in an example a processor, having: a front end including circuitry to decode instructions from an instruction stream; a data cache unit including circuitry to cache data for the processor; and a core triggering block (CTB) to provide integration between two or more different debug capabilities.
    Type: Application
    Filed: July 2, 2016
    Publication date: January 4, 2018
    Applicant: Intel Corporation
    Inventors: Beeman C. Strong, Matthew C. Merten, Lee W. Baugh
  • Publication number: 20170371769
    Abstract: A core includes a memory buffer and executes an instruction within a virtual machine. A processor tracer captures trace data and formats the trace data as trace data packets. An event-based sampler generates field data for a sampling record in response to occurrence of an event of a certain type as a result of execution of the instruction. The processor tracer, upon receipt of the field data: formats the field data into elements of the sampling record as a group of record packets; inserts the group of record packets between the trace data packets as a combined packet stream; and stores the combined packet stream in the memory buffer as a series of output pages. The core, when in guest profiling mode, executes a virtual machine monitor to map output pages of the memory buffer to host physical pages of main memory using multilevel page tables.
    Type: Application
    Filed: June 28, 2016
    Publication date: December 28, 2017
    Inventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
  • Patent number: 9753832
    Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for minimizing bandwidth to compress an output stream of an instruction tracing system. For example, the method may include identifying a current instruction in a trace of the IT module as a conditional branch (CB) instruction. The method includes executing one of generating a CB packet including a byte pattern with an indication of outcome of the CB instruction, or adding an indication of the outcome of the CB instruction to the byte pattern of an existing CB packet. The method includes generating a packet when a subsequent instruction in the trace is not the CB instruction. The packet is different from the CB packet. The method also includes adding the packet into a deferred queue when the packet is deferrable. The method further includes outputting the CB packet followed by the deferred packet into a packet log.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: September 5, 2017
    Assignee: Intel Corporation
    Inventors: Ilya Wagner, Matthew C. Merten, Frank Binns, Christine E. Wang, Mayank Bomb, Tong Li, Thilo Schmitt, Md A. Rahman
  • Patent number: 9746903
    Abstract: Some implementations provide techniques and arrangements for adjusting a rate at which operations are performed by a processor based on a comparison of a first indication of power consumed by the processor as a result of performing a first set of operations and a second indication of power consumed by the processor as a result of performing a second set of operations. The rate at which operations are performed by the processor may be adjusted when the comparison indicates that a difference between the first indication of power consumed by the processor and the second indication of power consumed by the processor is greater than a threshold value.
    Type: Grant
    Filed: August 11, 2015
    Date of Patent: August 29, 2017
    Assignee: Intel Corporation
    Inventors: Anupama Suryanarayanan, Matthew C. Merten, Ryan L. Carlson, Stephen H. Gunther
  • Patent number: 9733939
    Abstract: A processor includes a processing unit including a storage module having stored thereon a physical reference list for storing identifications of physical registers that have been referenced by multiple logical registers, and a reclamation module for reclaiming physical registers to a free list based on a count of each of the physical registers on the physical reference list.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: August 15, 2017
    Assignee: Intel Corporation
    Inventors: Vijaykumar Balaram Kadgi, James D. Hadley, Avinash Sodani, Matthew C. Merten, Morris Marden, Joseph A. McMahon, Grace C. Lee, Laura A. Knauth, Robert S. Chappell, Fariborz Tabesh
  • Patent number: 9733937
    Abstract: A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 15, 2017
    Assignee: Intel Corporation
    Inventors: Bratin Saha, Matthew C. Merten, Per Hammarlund
  • Publication number: 20170212825
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: January 10, 2017
    Publication date: July 27, 2017
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 9652236
    Abstract: A processor includes logic to execute a first instruction and a second instruction. The first instruction is ordered before the second instruction. Each instruction references a respective logical register assigned to a respective physical register. The processor also includes logic to reassign a physical register of the second instruction to another logical register before retirement of the first instruction.
    Type: Grant
    Filed: December 23, 2013
    Date of Patent: May 16, 2017
    Assignee: Intel Corporation
    Inventors: Srikanth T. Srinivasan, Mark J. Dechene, Yury N. Ilin, Justin M. Deinlein, Christine E. Wang, Matthew C. Merten
  • Publication number: 20170132004
    Abstract: In one embodiment, a processor includes a performance monitor including a last branch record (LBR) stack to store a call stack to an event of interest, where the call stack is collected responsive to a trigger for the event. The processor further includes logic to control the LBR stack to operate in a call stack mode such that an entry to a call instruction for a leaf function is cleared on return from the leaf function. Other embodiments are described and claimed.
    Type: Application
    Filed: January 18, 2017
    Publication date: May 11, 2017
    Inventors: Michael W. Chynoweth, Peggy J. Irelan, Matthew C. Merten, Seung-Woo Kim, Laura A. Knauth, Stanislav Bratanov
  • Patent number: 9612938
    Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for providing status of a processing device with a periodic synchronization point in an instruction tracing system. For example, the method may include generating a boundary packet based on a unique byte pattern in a packet log. The boundary packet provides a starting point for packet decode. The method may also include generating a plurality of state packets based on status information of the processor. The plurality of state packets follows the boundary packet when outputted into the packet log.
    Type: Grant
    Filed: May 16, 2013
    Date of Patent: April 4, 2017
    Assignee: Intel Corporation
    Inventors: Frank Binns, Matthew C. Merten, Mayank Bomb, Beeman C. Strong, Peter Lachner, Jason W. Brandt, Itamar Kazachinsky, Ofer Levy, Md A. Rahman
  • Patent number: 9606602
    Abstract: In an embodiment, a processor includes at least one core including a first core. The first core includes memory execution logic to execute one or more memory instructions, memory dispatch logic to output a plurality of memory instructions to the memory execution logic, and reactive memory instruction tracking logic. The reactive memory instruction tracking logic is to detect an onset of a memory instruction high power event associated with execution of at least one of the memory instructions, and to indicate to the memory dispatch logic to throttle output of the memory instructions to the memory execution logic responsive to detection of the onset of the memory instruction high power event. The processor also includes cache memory coupled to the at least one core. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: March 28, 2017
    Assignee: Intel Corporation
    Inventors: Anupama Suryanarayanan, Matthew C. Merten, Ryan L. Carlson
  • Patent number: 9582275
    Abstract: In one embodiment, a processor includes a performance monitor including a last branch record (LBR) stack to store a call stack to an event of interest, where the call stack is collected responsive to a trigger for the event. The processor further includes logic to control the LBR stack to operate in a call stack mode such that an entry to a call instruction for a leaf function is cleared on return from the leaf function. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 31, 2011
    Date of Patent: February 28, 2017
    Assignee: Intel Corporation
    Inventors: Michael W. Chynoweth, Peggy J. Irelan, Matthew C. Merten, Seung-Woo Kim, Laura A. Knauth, Stanislav Bratanov
  • Publication number: 20170024213
    Abstract: A processor and method are described for scheduling operations for execution within a reservation station. For example, a method in accordance with one embodiment of the invention includes the operations of: classifying a plurality of operations based on the execution ports usable to execute those operations; allocating the plurality of operations into groups within a reservation station based on the classification, wherein each group is serviced by one or more execution ports corresponding to the classification, and wherein two or more entries within a group share a common read port and a common write port; dynamically scheduling two or more operations in a group for concurrent execution based on the ports capable of executing those operations and a relative age of the operations.
    Type: Application
    Filed: May 19, 2016
    Publication date: January 26, 2017
    Inventors: Bambang Sutanto, Srikanth T. Srinivasan, Matthew C. Merten, Chia Yin Kevin Lai, Ammon J. Christiansen, Justin M. Deinlein
  • Patent number: 9542191
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: January 10, 2017
    Assignee: Intel Corporation
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 9535744
    Abstract: A processor, system, and method are described for continued retirement of operations during a commit of a speculative region of program code. For example, one embodiment of a method comprises the operations of identifying a plurality of transactional memory regions in program code, including a first transactional memory region; and retiring one or more of a plurality of operations which follow the first transactional memory region even when a commit operation associated with the first transactional memory region is waiting to complete.
    Type: Grant
    Filed: June 29, 2013
    Date of Patent: January 3, 2017
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, Matthew C. Merten, Christine E. Wang, Vijaykumar B. Kadgi, Rajesh S. Parthasarathy
  • Patent number: 9495159
    Abstract: In response to detecting that one or more conditions are met, a checkpoint of a current state of a thread may be created. One or more incomplete instructions may be moved from a first level of a re-order buffer to a second level of the re-order buffer. Each incomplete instruction may be currently executing or awaiting execution.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Mark J. Dechene, Srikanth T. Srinivasan, Matthew C. Merten, Tong Li, Christine E. Wang
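
The call stack mode described in the abstracts for patent 9582275 and publication 20170132004 can be pictured with a small software model. The sketch below is only an illustration of the idea that the entry for a call is cleared when the called function returns, so that the remaining entries form the call stack leading to an event. The class name `LbrCallStack`, its methods, the default depth of 16, and the addresses are all hypothetical and are not taken from the patent text or from any real hardware interface.

```python
from collections import deque


class LbrCallStack:
    """Toy model of a last branch record (LBR) stack in call-stack mode.

    Hypothetical illustration only: names and the depth are assumptions,
    not the patented hardware design.
    """

    def __init__(self, depth=16):
        # Bounded buffer: when full, the oldest entry is dropped.
        self.entries = deque(maxlen=depth)

    def on_call(self, from_addr, to_addr):
        # Record the call branch as a (source, target) pair.
        self.entries.append((from_addr, to_addr))

    def on_return(self):
        # Clear the entry for the most recent call when it returns.
        if self.entries:
            self.entries.pop()

    def snapshot(self):
        # Entries still present approximate the current call stack.
        return list(self.entries)


stack = LbrCallStack(depth=4)
stack.on_call(0x100, 0x200)  # caller -> function f
stack.on_call(0x210, 0x300)  # f -> leaf function
stack.on_return()            # leaf returns, its entry is cleared
print(stack.snapshot())
```

After the leaf function returns, only the call into `f` remains in the snapshot, which is the behavior the abstract attributes to call stack mode.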