Patents by Inventor Matthew C. Merten

Matthew C. Merten has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150006868
    Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for minimizing bandwidth to compress an output stream of an instruction tracing system. For example, the method may include identifying a current instruction in a trace of the IT module as a conditional branch (CB) instruction. The method includes executing one of generating a CB packet including a byte pattern with an indication of outcome of the CB instruction, or adding an indication of the outcome of the CB instruction to the byte pattern of an existing CB packet. The method includes generating a packet when a subsequent instruction in the trace is not the CB instruction. The packet is different from the CB packet. The method also includes adding the packet into a deferred queue when the packet is deferrable. The method further includes outputting the CB packet followed by the deferred packet into a packet log.
    Type: Application
    Filed: June 28, 2013
    Publication date: January 1, 2015
    Inventors: Ilya Wagner, Matthew C. Merten, Frank Binns, Christine E. Wang, Mayank Bomb, Tong Li, Thilo Schmitt, MD A. Rahman
  • Publication number: 20140344552
    Abstract: In accordance with embodiments disclosed herein, there is provided systems and methods for providing status of a processing device with a periodic synchronization point in an instruction tracing system. For example, the method may include generating a boundary packet based on a unique byte pattern in a packet log. The boundary packet provides a starting point for packet decode. The method may also include generating a plurality of state packets based on status information of the processor. The plurality of state packets follows the boundary packet when outputted into the packet log.
    Type: Application
    Filed: May 16, 2013
    Publication date: November 20, 2014
    Inventors: Frank Binns, Matthew C. Merten, Mayank Bomb, Beeman C. Strong, Peter Lachner, Jason W. Brandt, Itamar Kazachinsky, Ofer Levy, MD A. Rahman
  • Publication number: 20140337604
    Abstract: A processing device implementing minimizing bandwidth to track return targets by an instruction tracing system is disclosed. A processing device of the disclosure an instruction fetch unit comprising a return stack buffer (RSB) to predict a target address of a return (RET) instruction corresponding to a call (CALL) instruction. The processing device further includes a retirement unit comprising an instruction tracing module to initiate instruction tracing for instructions executed by the processing device, determine whether the target address of the RET instruction was mispredicted, determine a value of call depth counter (CDC) maintained by the instruction tracing module, and when the target address of the RET instruction was not mispredicted and when the value of the CDC is greater than zero, generate an indication that the RET instruction branches to a next linear instruction after the corresponding CALL instruction.
    Type: Application
    Filed: May 9, 2013
    Publication date: November 13, 2014
    Inventors: Beeman C. Strong, Matthew C. Merten, Tong Li
  • Publication number: 20140310504
    Abstract: Systems and methods for flag tracking in data manipulation operations involving move elimination. An example processing system comprises a first data structure including a plurality of physical register values; a second data structure including a plurality of pointers referencing elements of the first data structure; a third data structure including a plurality of move elimination sets, each move elimination set comprising two or more bits representing two or more logical data registers, the third data structure further comprising at least one bit associated with each move elimination set, the at least one bit representing one or more logical flag registers; a fourth data structure including an identifier of a data register sharing an element of the first data structure with a flag register; and a move elimination logic configured to perform a move elimination operation.
    Type: Application
    Filed: April 11, 2013
    Publication date: October 16, 2014
    Inventors: VIJAYKUMAR B. KADGI, JEREMY R. ANDERSON, JAMES D. HADLEY, TONG LI, MATTHEW C. MERTEN
  • Patent number: 8812792
    Abstract: A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.
    Type: Grant
    Filed: September 21, 2013
    Date of Patent: August 19, 2014
    Assignee: Intel Corporation
    Inventors: Quinn A. Jacobson, Anne W. Bracy, Hong Wang, John P. Shen, Per Hammarlund, Matthew C. Merten, Suresh Srinivas, Kshitij A. Doshi, Gautham Shinya, Bratin Saha, Ali-Reza Adi-Tabatabai, Gad Sheaffer
  • Publication number: 20140201505
    Abstract: A processor includes one or more execution units to execute instructions of a plurality of threads and thread control logic coupled to the execution units to predict whether a first of the plurality of threads is ready for selection in a current cycle based on readiness of instructions of the first thread in one or more previous cycles, to predict whether a second of the plurality of threads is ready for selection in the current cycle based on readiness of instructions of the second thread in the one or more previous cycles, and to select one of the first and second threads in the current cycle based on the predictions.
    Type: Application
    Filed: March 30, 2012
    Publication date: July 17, 2014
    Inventors: Matthew C. Merten, Tong Li, Vijaykumar B. Kadgi, Srikanth T. Srinivasan, Christine E. Wang
  • Publication number: 20140195790
    Abstract: A secondary jump execution unit (JEU) is incorporated in a micro-processor to operate concurrently with a primary JEU, enabling the execution of simultaneous branch operations with possible detection of multiple branch mispredicts. When branch operations are executed on both JEUs in a same instruction cycle, mispredict processing for the secondary JEU is skidded into the primary JEU's dispatch pipeline such that the branch processing for the secondary JEU occurs after processing of the branch for the primary JEU and while the primary JEU is not processing a branch. Moreover, in cases when a nuke command is also received from a reorder buffer of the processor, the branch processing for the secondary JEU is further delayed to accommodate processing of the nuke on the primary JEU. Further embodiments support the promotion of the secondary JEU to have access to the mispredict mechanisms of the primary JEU in certain circumstances.
    Type: Application
    Filed: December 28, 2011
    Publication date: July 10, 2014
    Inventors: Matthew C. Merten, Avinash Sodani, Sean P. Mirkes, Vijaykumar B. Kadgi, Bambang Sutanto, Chia Yin Kevin Lai, Morris Marden, Alexandre J. Farcy
  • Publication number: 20140189306
    Abstract: An enhanced loop streaming detection mechanism is provided in a processor to reduce power consumption. The processor includes a decoder to decode instructions in a loop into micro-operations, and a loop streaming detector to detect the presence of the loop in the micro-operations. The processor also includes a loop characteristic tracker unit to identify hardware components downstream from the decoder that are not to be used by the micro-operations in the loop, and to disable the identified hardware components. The processor also includes execution circuitry to execute the micro-operations in the loop with the identified hardware components disabled.
    Type: Application
    Filed: December 27, 2012
    Publication date: July 3, 2014
    Inventors: Matthew C. Merten, Justin M. Deinlein, Yury N. Ilin, Alexandre J. Farcy, Tong Li, Srikanth T. Srinivasan
  • Publication number: 20140181476
    Abstract: A scheduler implementing a dependency matrix having restricted entries is disclosed. A processing device of the disclosure includes a decode unit to decode an instruction and a scheduler communicably coupled to the decode unit. In one embodiment, the scheduler is configured to receive the decoded instruction, determine that the decoded instruction qualifies for allocation as a restricted reservation station (RS) entry type in a dependency matrix maintained by the scheduler, identify RS entries in the dependency matrix that are free for allocation, allocate one of the identified free RS entries with information of the decoded instruction in the dependency matrix, and update a row of the dependency matrix corresponding to the claimed RS entry with source dependency information of the decoded instruction.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Inventors: Srikanth T. Srinivasan, Matthew C. Merten, Bambang Sutanto, Rahul R. Kulkarni, Justin M. Deinlein, James D. Hadley
  • Publication number: 20140156977
    Abstract: Techniques are described for enabling and/or disabling a secondary jump execution unit (JEU) in a micro-processor. The secondary JEU is incorporated in the micro-processor to operate concurrently with a primary JEU, and to enable the handling of simultaneous branch mispredicts on multiple branches. Activation and deactivation of the secondary JEU may be controlled by a pressure counter or a confidence counter. A pressure counter mechanism increments a count for each branch operation executed within the processor and decrements the count by a decay value during each cycle. A confidence counter mechanism increments a count for each correctly predicted branch, and decrements the count for each mispredict. Each counter signals an activation component, such as a port binding hardware component, to begin binding micro-operations to the secondary JEU when the counter exceeds an activation threshold. The counter mechanism may be thread-agnostic or thread-specific.
    Type: Application
    Filed: December 28, 2011
    Publication date: June 5, 2014
    Inventors: Mark J. Dechene, Matthew C. Merten, Sean P. Mirkes
  • Publication number: 20140095814
    Abstract: A processor includes a processing unit including a storage module having stored thereon a table for tracking physical registers in which each store operation stores source data and a memory renaming module for register renaming load operations based on the table.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: MORRIS MARDEN, VIJAYKUMAR VIJAY KADGI, JAMES D. HADLEY, MATTHEW C. MERTEN, GRACE C. LEE, JOSEPH A. MCMAHON, ROBERT S. CHAPPELL, LAURA A. KNAUTH, FARIBORZ TABESH
  • Publication number: 20140095838
    Abstract: A processor includes a processing unit including a storage module having stored thereon a physical reference list for storing identifications of physical registers that have been referenced by multiple logical registers, and a reclamation module for reclaiming physical registers to a free list based on a count of each of the physical registers on the physical reference list.
    Type: Application
    Filed: September 28, 2012
    Publication date: April 3, 2014
    Inventors: VIJAYKUMAR VIJAY KADGI, JAMES D. HADLEY, AVINASH SODANI, MATTHEW C. MERTEN, MORRIS MARDEN, JOSEPH A. MCMAHON, GRACE C. LEE, LAURA A. KNAUTH, ROBERT S. CHAPPELL, FARIBORZ TABESH
  • Publication number: 20140025901
    Abstract: A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.
    Type: Application
    Filed: September 21, 2013
    Publication date: January 23, 2014
    Inventors: Quinn A. Jacobson, Anne C. Bracy, Hong Wang, John P. Shen, Per Hammarlund, Matthew C. Merten, Suresh Srinivas, Kshitij A. Doshi, Gautham Chinya, Bratin Saha, Ali-Reza Adi-Tabatabai, Gad Sheaffer
  • Publication number: 20140013086
    Abstract: A number of addition instructions are provided that have no data dependency between each other. A first addition instruction stores its carry output in a first flag of a flags register without modifying a second flag in the flags register. A second addition instruction stores its carry output in the second flag of the flags register without modifying the first flag in the flags register.
    Type: Application
    Filed: December 22, 2011
    Publication date: January 9, 2014
    Inventors: Vinodh Gopal, James D. Guilford, Gilbert M. Wolrich, Wajdi K. Feghali, Erdinc Ozturk, Martin G. Dixon, Sean P. Mirkes, Matthew C. Merten, Tong Li, Bret T. Toll, I
  • Patent number: 8607241
    Abstract: A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: December 10, 2013
    Assignee: Intel Corporation
    Inventors: Bratin Saha, Matthew C. Merten, Per Hammarlund
  • Publication number: 20130311758
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: March 30, 2012
    Publication date: November 21, 2013
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
  • Patent number: 8560781
    Abstract: A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.
    Type: Grant
    Filed: July 4, 2011
    Date of Patent: October 15, 2013
    Assignee: Intel Corporation
    Inventors: Quinn A. Jacobson, Anne Weinberger Bracy, Hong Wang, John P. Shen, Per Hammarlund, Matthew C. Merten, Suresh Srinivas, Kshitij A. Doshi, Gautham N. Chinya, Bratin Saba, Ali-Reza Adl-Tabatabai, Gad S. Sheaffer
  • Publication number: 20130262837
    Abstract: A processor includes an execution unit to execute instructions, where each operand of each executed instruction has one or more elements of an element size and at least one operand of the instruction corresponds to a register of a register size. The processor further includes a counter configured to count a number of instructions that have been executed by the execution unit associated with a particular combination of register size and element size.
    Type: Application
    Filed: March 29, 2012
    Publication date: October 3, 2013
    Applicant: INTEL CORPORATION
    Inventors: Laura A. Knauth, Matthew C. Merten, Ronak Singhal, Hugh M. Caffey
  • Publication number: 20130232499
    Abstract: A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 5, 2013
    Inventors: Bratin Saha, Matthew C. Merten, Per Hammarlund
  • Publication number: 20130205119
    Abstract: Novel instructions, logic, methods and apparatus are disclosed to test transactional execution status. Embodiments include decoding a first instruction to start a transactional region. Responsive to the first instruction, a checkpoint for a set of architecture state registers is generated and memory accesses from a processing element in the transactional region associated with the first instruction are tracked. A second instruction to detect transactional execution of the transactional region is then decoded. An operation is executed, responsive to decoding the second instruction, to determine if an execution context of the second instruction is within the transactional region. Then responsive to the second instruction, a first flag is updated. In some embodiments, a register may optionally be updated and/or a second flag may optionally be updated responsive to the second instruction.
    Type: Application
    Filed: June 29, 2012
    Publication date: August 8, 2013
    Inventors: Ravi Rajwar, Bret L. Toll, Konrad K. Lai, Matthew C. Merten, Martin G. Dixon