Patents by Inventor Matthew C. Merten
Matthew C. Merten has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10152401Abstract: Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction.Type: GrantFiled: December 22, 2015Date of Patent: December 11, 2018Assignee: Intel CorporationInventors: Ravi Rajwar, Bret L. Toll, Konrad K. Lai, Matthew C. Merten, Martin G. Dixon
-
Publication number: 20180253370Abstract: A processor is to execute and retire instructions for a virtual machine. A reload register is coupled to the core is to store a reload value. A performance monitoring counter (PMC) register is coupled to the reload register and an event-based sampler operatively is coupled to the reload register and the PMC register. The event-based sampler includes circuitry to load the reload value into the PMC register and increment the PMC register after detecting each occurrence of an event of a certain type as a result of execution of the instructions. Upon detecting an occurrence of the event after the PMC register reaches a predetermined trigger value, the event-based sampler is to execute microcode to generate field data for elements within a sampling record, wherein the field data relates to a current processor state of execution, and reload the reload value from the reload register into the PMC register.Type: ApplicationFiled: May 7, 2018Publication date: September 6, 2018Inventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
-
Patent number: 9965375Abstract: A core includes a memory buffer and executes an instruction within a virtual machine. A processor tracer captures trace data and formats the trace data as trace data packets. An event-based sampler generates field data for a sampling record in response to occurrence of an event of a certain type as a result of execution of the instruction. The processor tracer, upon receipt of the field data: formats the field data into elements of the sampling record as a group of record packets; inserts the group of record packets between the trace data packets as a combined packet stream; and stores the combined packet stream in the memory buffer as a series of output pages. The core, when in guest profiling mode, executes a virtual machine monitor to map output pages of the memory buffer to host physical pages of main memory using multilevel page tables.Type: GrantFiled: June 28, 2016Date of Patent: May 8, 2018Assignee: Intel CorporationInventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
-
Patent number: 9904553Abstract: A processor and method are described for scheduling operations for execution within a reservation station. For example, a method in accordance with one embodiment of the invention includes the operations of: classifying a plurality of operations based on the execution ports usable to execute those operations; allocating the plurality of operations into groups within a reservation station based on the classification, wherein each group is serviced by one or more execution ports corresponding to the classification, and wherein two or more entries within a group share a common read port and a common write port; dynamically scheduling two or more operations in a group for concurrent execution based on the ports capable of executing those operations and a relative age of the operations.Type: GrantFiled: May 19, 2016Date of Patent: February 27, 2018Assignee: INTEL CORPORATIONInventors: Bambang Sutanto, Srikanth T. Srinivasan, Matthew C. Merten, Chia Yin Kevin Lai, Ammon J Christiansen, Justin M. Deinlein
-
Publication number: 20180004628Abstract: There is disclosed in an example a processor, having: a front end including circuitry to decode instructions from an instruction stream; a data cache unit including circuitry to cache data for the processor; and a core triggering block (CTB) to provide integration between two or more different debug capabilities.Type: ApplicationFiled: July 2, 2016Publication date: January 4, 2018Applicant: Intel CorporationInventors: Beeman C. Strong, Matthew C. Merten, Lee W. Baugh
-
Publication number: 20170371769Abstract: A core includes a memory buffer and executes an instruction within a virtual machine. A processor tracer captures trace data and formats the trace data as trace data packets. An event-based sampler generates field data for a sampling record in response to occurrence of an event of a certain type as a result of execution of the instruction. The processor tracer, upon receipt of the field data: formats the field data into elements of the sampling record as a group of record packets; inserts the group of record packets between the trace data packets as a combined packet stream; and stores the combined packet stream in the memory buffer as a series of output pages. The core, when in guest profiling mode, executes a virtual machine monitor to map output pages of the memory buffer to host physical pages of main memory using multilevel page tables.Type: ApplicationFiled: June 28, 2016Publication date: December 28, 2017Inventors: Matthew C. Merten, Beeman C. Strong, Michael W. Chynoweth, Grant G. Zhou, Andreas Kleen, Kimberly C. Weier, Angela D. Schmid, Stanislav Bratanov, Seth Abraham, Jason W. Brandt, Ahmad Yasin
-
Patent number: 9753832Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for minimizing bandwidth to compress an output stream of an instruction tracing system. For example, the method may include identifying a current instruction in a trace of the IT module as a conditional branch (CB) instruction. The method includes executing one of generating a CB packet including a byte pattern with an indication of outcome of the CB instruction, or adding an indication of the outcome of the CB instruction to the byte pattern of an existing CB packet. The method includes generating a packet when a subsequent instruction in the trace is not the CB instruction. The packet is different from the CB packet. The method also includes adding the packet into a deferred queue when the packet is deferrable. The method further includes outputting the CB packet followed by the deferred packet into a packet log.Type: GrantFiled: June 28, 2013Date of Patent: September 5, 2017Assignee: Intel CorporationInventors: Ilya Wagner, Matthew C. Merten, Frank Binns, Christine E. Wang, Mayank Bomb, Tong Li, Thilo Schmitt, M D A. Rahman
-
Patent number: 9746903Abstract: Some implementations provide techniques and arrangements for adjusting a rate at which operations are performed by a processor based on a comparison of a first indication of power consumed by the processor as a result of performing a first set of operations and a second indication of power consumed by the processor as a result of performing a second set of operations. The rate at which operations are performed by the processor may be adjusted when the comparison indicates that a difference between the first indication of power consumed by the processor and the second indication of power consumed by the processor is greater than a threshold value.Type: GrantFiled: August 11, 2015Date of Patent: August 29, 2017Assignee: Intel CorporationInventors: Anupama Suryanarayanan, Matthew C. Merten, Ryan L. Carlson, Stephen H. Gunther
-
Patent number: 9733939Abstract: A processor includes a processing unit including a storage module having stored thereon a physical reference list for storing identifications of physical registers that have been referenced by multiple logical registers, and a reclamation module for reclaiming physical registers to a free list based on a count of each of the physical registers on the physical reference list.Type: GrantFiled: September 28, 2012Date of Patent: August 15, 2017Assignee: Intel CorporationInventors: Vijaykumar Balaram Kadgi, James D. Hadley, Avinash Sodani, Matthew C. Merten, Morris Marden, Joseph A. McMahon, Grace C. Lee, Laura A. Knauth, Robert S. Chappell, Fariborz Tabesh
-
Patent number: 9733937Abstract: A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.Type: GrantFiled: March 15, 2013Date of Patent: August 15, 2017Assignee: Intel CorporationInventors: Bratin Saha, Matthew C. Merten, Per Hammarlund
-
Publication number: 20170212825Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.Type: ApplicationFiled: January 10, 2017Publication date: July 27, 2017Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
-
Patent number: 9652236Abstract: A processor includes a logic to execute a first instruction and a second instruction. The first instruction is ordered before the second instruction. Each instruction references a respective logical register assigned to a respective physical register. The processor also includes logic to reassign a physical register of the second instruction to another logical register before retirement of the first instruction.Type: GrantFiled: December 23, 2013Date of Patent: May 16, 2017Assignee: Intel CorporationInventors: Srikanth T. Srinivasan, Mark J. Dechene, Yury N. Ilin, Justin M. Deinlein, Christine E. Wang, Matthew C. Merten
-
Publication number: 20170132004Abstract: In one embodiment, a processor includes a performance monitor including a last branch record (LBR) stack to store a call stack to an event of interest, where the call stack is collected responsive to a trigger for the event. The processor further includes logic to control the LBR stack to operate in a call stack mode such that an entry to a call instruction for a leaf function is cleared on return from the leaf function. Other embodiments are described and claimed.Type: ApplicationFiled: January 18, 2017Publication date: May 11, 2017Inventors: Michael W. Chynoweth, Peggy J. Irelan, Matthew C. Merten, Seung-Woo Kim, Laura A. Knauth, Stanislav Bratanov
-
Patent number: 9612938Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for providing status of a processing device with a periodic synchronization point in an instruction tracing system. For example, the method may include generating a boundary packet based on a unique byte pattern in a packet log. The boundary packet provides a starting point for packet decode. The method may also include generating a plurality of state packets based on status information of the processor. The plurality of state packets follows the boundary packet when outputted into the packet log.Type: GrantFiled: May 16, 2013Date of Patent: April 4, 2017Assignee: Intel CorporationInventors: Frank Binns, Matthew C. Merten, Mayank Bomb, Beeman C. Strong, Peter Lachner, Jason W. Brandt, Itamar Kazachinsky, Ofer Levy, Md A. Rahman
-
Patent number: 9606602Abstract: In an embodiment, a processor includes at least one core including a first core. The first core includes memory execution logic to execute one or more memory instructions, memory dispatch logic to output a plurality of memory instructions to the memory execution logic, and reactive memory instruction tracking logic. The reactive memory instruction tracking logic is to detect an onset of a memory instruction high power event associated with execution of at least one of the memory instructions, and to indicate to the memory dispatch logic to throttle output of the memory instructions to the memory execution logic responsive to detection of the onset of the memory instruction high power event. The processor also includes cache memory coupled to the at least one core. Other embodiments are described and claimed.Type: GrantFiled: June 30, 2014Date of Patent: March 28, 2017Assignee: Intel CorporationInventors: Anupama Suryanarayanan, Matthew C. Merten, Ryan L. Carlson
-
Patent number: 9582275Abstract: In one embodiment, a processor includes a performance monitor including a last branch record (LBR) stack to store a call stack to an event of interest, where the call stack is collected responsive to a trigger for the event. The processor further includes logic to control the LBR stack to operate in a call stack mode such that an entry to a call instruction for a leaf function is cleared on return from the leaf function. Other embodiments are described and claimed.Type: GrantFiled: May 31, 2011Date of Patent: February 28, 2017Assignee: Intel CorporationInventors: Michael W. Chynoweth, Peggy J. Irelan, Matthew C. Merten, Seung-Woo Kim, Laura A. Knauth, Stanislav Bratanov
-
Publication number: 20170024213Abstract: A processor and method are described for scheduling operations for execution within a reservation station. For example, a method in accordance with one embodiment of the invention includes the operations of: classifying a plurality of operations based on the execution ports usable to execute those operations; allocating the plurality of operations into groups within a reservation station based on the classification, wherein each group is serviced by one or more execution ports corresponding to the classification, and wherein two or more entries within a group share a common read port and a common write port; dynamically scheduling two or more operations in a group for concurrent execution based on the ports capable of executing those operations and a relative age of the operations.Type: ApplicationFiled: May 19, 2016Publication date: January 26, 2017Inventors: Bambang SUTANTO, Srikanth T. SRINIVASAN, Matthew C. MERTEN, Chia Yin Kevin LAI, Ammon J. CHRISTIANSEN, Justin M. DEINLEIN
-
Patent number: 9542191Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.Type: GrantFiled: March 30, 2012Date of Patent: January 10, 2017Assignee: Intel CorporationInventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
-
Patent number: 9535744Abstract: A processor, system, and method are described for continued retirement of operations during a commit of a speculative region of program code. For example, one embodiment of a method comprises the operations of identifying a plurality of transactional memory regions in program code, including a first transactional memory region; and retiring one or more of a plurality of operations which follow the first transactional memory region even when a commit operation associated with the first transactional memory region is waiting to complete.Type: GrantFiled: June 29, 2013Date of Patent: January 3, 2017Assignee: INTEL CORPORATIONInventors: Ravi Rajwar, Matthew C. Merten, Christine E. Wang, Vijaykumar B. Kadgi, Rajesh S. Parthasarathy
-
Patent number: 9495159Abstract: In response to detecting one or more conditions are met, a checkpoint of a current state of a thread may be created. One or more incomplete instructions may be moved from a first level of a re-order buffer to a second level of the re-order buffer. Each incomplete instruction may be currently executing or awaiting execution.Type: GrantFiled: September 27, 2013Date of Patent: November 15, 2016Assignee: Intel CorporationInventors: Mark J. Dechene, Srikanth T. Srinivasan, Matthew C. Merten, Tong Li, Christine E. Wang