Commitment Control Or Register Bypass Patents (Class 712/218)

Loop streaming detector for standard and complex instruction types

Patent number: 9367317

Abstract: A processor includes a microcode storage comprising a plurality of microcode flows and a decode logic coupled to the microcode storage. The decode logic is configured to receive a first instruction, decode the first instruction into an entry point vector to a first microcode flow in the microcode storage, the entry point vector comprising a first indicator specifying a number of clock cycles associated with the first microcode flow, initiate the microcode storage, wherein the microcode storage inserts microinstructions of the first microcode flow into an instruction queue, count clock cycles after initiating the microcode storage, and decode a second instruction without first receiving a return from the microcode storage, wherein the second instruction is decoded at a particular clock cycle based on the number of clock cycles associated with the first microcode flow.

Type: Grant

Filed: July 3, 2013

Date of Patent: June 14, 2016

Assignee: Intel Corporation

Inventors: Jonathan D. Combs, Jonathan Y. Tong

Systems and methods for determining instruction execution error by comparing an operand of a reference instruction to a result of a subsequent cross-check instruction

Patent number: 9361104

Abstract: In a data processing system having execution circuitry, a method includes providing a cross-check instruction and a reference instruction to the execution circuitry, where the reference instruction has an operand. The method also includes executing the reference instruction to obtain a first result. Residual information is derived from execution of the reference instruction, and the method also includes executing the cross-check instruction using the residual information to obtain a second result. The second result obtained from execution of the cross-check instruction is compared to the operand of the reference instruction to determine whether an error occurred during execution of the reference instruction or the cross-check instruction.
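As a rough illustration of the cross-check idea in this abstract, the sketch below assumes the reference instruction is a 32-bit integer add and the cross-check instruction is a subtract. Those choices, the shape of the residual information, and the names `reference_add`, `cross_check_sub`, and `detect_error` are hypothetical and are not taken from the patent.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit datapath


def reference_add(a, b):
    """Reference instruction: return the sum plus residual information."""
    result = (a + b) & MASK32
    residual = (result, b)  # assumed residual: the result and the second operand
    return result, residual


def cross_check_sub(residual):
    """Cross-check instruction: recompute the first operand from the residual."""
    result, b = residual
    return (result - b) & MASK32


def detect_error(a, b, fault_mask=0):
    """Return True if the cross-check result disagrees with operand `a`.

    `fault_mask` flips bits in the reference result to simulate a fault.
    """
    result, residual = reference_add(a, b)
    result ^= fault_mask
    residual = (result, residual[1])
    return cross_check_sub(residual) != (a & MASK32)
```

With no injected fault the recomputed operand matches `a`, so `detect_error(5, 7)` reports no error; flipping any bit of the reference result makes the comparison fail.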

Type: Grant

Filed: August 13, 2010

Date of Patent: June 7, 2016

Assignee: FREESCALE SEMICONDUCTOR, INC.

Inventors: Gary R. Morrison, William C. Moyer

Apparatus and method for implementing instruction support for the camellia cipher algorithm

Patent number: 9317286

Abstract: A processor including instruction support for implementing the Camellia block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more Camellia instructions defined within the ISA. In addition, the Camellia instructions may be executable by the cryptographic unit to implement portions of a Camellia cipher that is compliant with Internet Engineering Task Force (IETF) Request For Comments (RFC) 3713. In response to receiving a Camellia F( )-operation instruction defined within the ISA, the cryptographic unit may perform an F( ) operation, as defined by the Camellia cipher, upon a data input operand and a subkey operand, in which the data input operand and subkey operand may be specified by the Camellia F( )-operation instruction.

Type: Grant

Filed: March 31, 2009

Date of Patent: April 19, 2016

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence A. Spracklen

Fault-aware mapping for shared last level cache (LLC)

Patent number: 9286172

Abstract: Embodiments of systems, apparatuses, and methods for utilizing a faulty cache line in a cache are described. In some embodiments, a graphics processing unit is allowed to access a faulty cache line in the cache. A cache access request to access a faulty cache line from a central processing unit core is remapped to access a fault-free cache line.

Type: Grant

Filed: December 22, 2011

Date of Patent: March 15, 2016

Assignee: Intel Corporation

Inventors: Tanausu Ramirez, Javier Carretero Casado, Enric Herrero, Matteo Monchiero, Xavier Vera

Method and apparatus for context aware intelligent message handling

Patent number: 9232010

Abstract: A method, apparatus and computer program product are provided for context aware intelligent message handling. In the context of a method, the method includes receiving a message that includes a message identifier; determining if the message is dependent on an unprocessed message; causing a message status, including a message identifier, to be stored in a memory; causing the message to be stored in the memory in an instance in which the message is dependent on an unprocessed message; receiving a message process completion indication from a downstream system and updating the message status based on the message process completion indication; determining if the message stored in the memory continues to be dependent on an unprocessed message based on the updated message status; and causing the transmission of the message to the downstream system in an instance in which the message is no longer dependent on an unprocessed message.

Type: Grant

Filed: August 4, 2014

Date of Patent: January 5, 2016

Assignee: HERE Global B.V.

Inventor: Kottorage Buddika Gajapala

Compiling method and apparatus for scheduling block in pipeline

Patent number: 9207922

Abstract: Provided is a compiling method and apparatus for scheduling a block in a pipeline. The compiling method for scheduling a block in a pipeline may include profiling, using a processor, an access count of a block in a control flow of a program code, determining that the block is an important block, in response to an edge count of an edge entering the block being greater than or equal to a predetermined value, the edge count being included in the access count of the block, and scheduling the important block based on the access count to prevent a register writeback conflict.

Type: Grant

Filed: December 19, 2013

Date of Patent: December 8, 2015

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Taisong Jin

Data processing apparatus and method for performing register renaming for certain data processing operations without additional registers

Patent number: 9201656

Abstract: The data processing apparatus (and method) has processing circuitry for performing data processing operations in response to data processing instructions, the data processing instructions referencing logical registers. A set of physical registers are provided for storing data values for access by the processing circuitry when performing the data processing operations. Register renaming storage stores a one-to-one mapping between the logical registers and the physical registers, with the register renaming storage being accessed by the processing circuitry when performing the data processing operations in order to map the referenced logical registers to corresponding physical registers. Update circuitry is arranged to identify the physical registers corresponding to those multiple logical registers in the register renaming storage. Altered one-to-one mapping between multiple logical registers and identified physical registers is employed when performing the current data processing operation.

Type: Grant

Filed: December 2, 2011

Date of Patent: December 1, 2015

Assignee: ARM Limited

Inventors: Jean-Baptiste Brelot, Cédric Denis Robert Airaud

Dynamic designation of retirement order in out-of-order store queue

Patent number: 9170962

Abstract: A method, system and processing device for retiring data entries held within a store queue (STQ). The STQ of a processor cache is modified to receive and process several types of data entries including: non-synchronized (non-sync), thread of execution synchronized (thread-sync), and all thread of execution synchronized (all-thread-sync). The task of storing data entries, from the STQ out to memory or an input/output device, is modified to increase the effectiveness of the cache. The modified STQ allows non-sync, thread-sync, and all-thread-sync instructions to coexist in the STQ regardless of the thread of execution. Stored data entries, or stores are deterministically selected for retirement, according to the data entry type.

Type: Grant

Filed: December 21, 2007

Date of Patent: October 27, 2015

Assignee: International Business Machines Corporation

Inventor: Eric F. Robinson

Analyzing data flow graph to detect data for copying from central register file to local register file used in different execution modes in reconfigurable processing array

Patent number: 9164769

Abstract: A reconfigurable array is provided. The reconfigurable array includes a Very Long Instruction Word (VLIW) mode and a Coarse-Grained Array (CGA) mode. When the VLIW mode is converted to the CGA mode, instead of sharing a central register file between the VLIW mode and the CGA mode, live data to be used in the CGA mode is copied from the central register file to local register files.

Type: Grant

Filed: December 8, 2010

Date of Patent: October 20, 2015

Assignee: Samsung Electronics Co., Ltd.

Inventors: Won-Sub Kim, Tai-Song Jin, Dong-Hoon Yoo, Bernhard Egger, Jin-Seok Lee

System, method, and computer program product for copying data between memory locations

Patent number: 9164690

Abstract: A system, method, and computer program product are provided for copying data between memory locations. In use, a memory copy instruction is implemented. Additionally, data is copied from a first memory location to a second memory location, utilizing the memory copy instruction.

Type: Grant

Filed: July 27, 2012

Date of Patent: October 20, 2015

Assignee: NVIDIA Corporation

Inventors: Brucek Kurdo Khailany, Sean Jeffrey Treichler

System and method for balancing instruction loads between multiple execution units using assignment history

Patent number: 9122487

Abstract: A system and method for balancing instruction loads between multiple execution units are disclosed. One or more execution units may be represented by a slot configured to accept instructions on behalf of the execution unit(s). A decode unit may assign instructions to a particular slot for subsequent scheduling for execution. Slot assignments may be made based on an instruction's type and/or on a history of previous slot assignments. A cumulative slot assignment history may be maintained in a bias counter, the value of which reflects the bias of previous slot assignments. Slot assignments may be determined based on the value of the bias counter, in order to balance the instruction load across all slots, and all execution units. The bias counter may reflect slot assignments made only within a desired historical window. A separate data structure may store data reflecting the actual slot assignments made during the desired historical window.
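The bias-counter scheme described in this abstract can be sketched as follows, assuming only two slots and a fixed-length historical window. The class name `SlotAssigner`, the `window` parameter, and the rule that a zero bias defaults to slot 0 are illustrative assumptions rather than details from the patent.

```python
from collections import deque


class SlotAssigner:
    """Two-slot assigner balanced by a windowed bias counter (illustrative)."""

    def __init__(self, window=8):
        self.bias = 0            # > 0: slot 0 used more in the window; < 0: slot 1
        self.history = deque()   # actual slot assignments inside the window
        self.window = window

    def _record(self, slot):
        # Add the new assignment to the bias, then expire the oldest one
        # once the history grows past the window.
        self.history.append(slot)
        self.bias += 1 if slot == 0 else -1
        if len(self.history) > self.window:
            old = self.history.popleft()
            self.bias -= 1 if old == 0 else -1

    def assign(self, required_slot=None):
        """Assign by instruction type if it forces a slot, else by bias."""
        if required_slot is not None:
            slot = required_slot
        else:
            slot = 1 if self.bias > 0 else 0
        self._record(slot)
        return slot
```

Keeping the actual assignments in a separate `history` structure is what lets the counter forget assignments that fall outside the window, so an old burst of slot-0 instructions stops influencing new decisions.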

Type: Grant

Filed: June 23, 2009

Date of Patent: September 1, 2015

Assignee: Oracle America, Inc.

Inventors: Robert T. Golla, Gregory F. Grohoski

Predictive retirement toolset

Patent number: 9104991

Abstract: A system assesses one or more applications for retirement. The system includes a processing device configured for receiving attribute data corresponding to one or more of a plurality of applications. The processing device is further configured for determining one or more of the plurality of applications to assess for retirement, translating at least some of the received attribute data into two or more translated values based at least in part on one or more predetermined values, and summing two or more of the translated values, thereby resulting in one or more combined values. The processing device is further configured for calculating one or more cumulative values based at least in part on the one or more combined values and converting the one or more cumulative values, thereby resulting in one or more probability values each indicating the probability of retirement of one of the one or more applications.

Type: Grant

Filed: July 30, 2010

Date of Patent: August 11, 2015

Assignee: Bank of America Corporation

Inventors: Erin Kristin Collins, Marianna E. Chandler, Patty M. Curtner, Sherrill Jean Massingham, Joni DeVoe McKeen, Darryl Alan Sansbury, Siroos Shahnizadeh, Anthony Simoes, Rajaraman Viswanathan

Accessing private data about the state of a data processing machine from storage that is publicly accessible

Patent number: 9087000

Abstract: According to an embodiment of the invention, a method for operating a data processing machine is described in which data about a state of the machine is written to a location in storage. The location is one that is accessible to software that may be written for the machine. The state data as written is encoded. This state data may be recovered from the storage according to a decoding process. Other embodiments are also described and claimed.

Type: Grant

Filed: March 15, 2013

Date of Patent: July 21, 2015

Assignee: Intel Corporation

Inventors: Scott H. Robinson, Gustavo P. Espinosa, Steven M. Bennett

Virtual register mode by designation of dedicated register as destination operand with switch connecting execution result feedback path and turning clock off to register file

Patent number: 9086887

Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The control stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.

Type: Grant

Filed: September 9, 2011

Date of Patent: July 21, 2015

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Marko Krüger, Steven Bartling, Markus Kösler

Size mis-match hazard detection

Patent number: 9081581

Abstract: An out-of-order processor groups program instructions together to control their commitment to complete processing. If an instruction within a group has a source operand dependent upon a plurality of destination operands of other instructions then this is identified as a size mismatch hazard. When the program instruction having the size mismatch hazard reaches a commit point within the processor, then it is flushed together with any speculatively executed succeeding program instructions. Furthermore, the group of program instructions containing the program instruction having the size mismatch is divided into a plurality of groups of program instructions each containing a single program instruction which are then replayed through the processing mechanisms.

Type: Grant

Filed: November 16, 2010

Date of Patent: July 14, 2015

Assignee: ARM Limited

Inventors: James Nolan Hardage, Conrado Blasco Allue, Glen Andrew Harris

Confirming store-to-load forwards

Patent number: 9003225

Abstract: A processor includes a store queue that stores information representing store instructions. In response to retirement of a store instruction, the processor invalidates the corresponding entry in the store queue, thereby indicating that the entry is available to store a subsequent store instruction. The store address is not removed from the queue until the subsequent store instruction is stored. Accordingly, the store address is available for comparison to a dependent load address.

Type: Grant

Filed: October 17, 2012

Date of Patent: April 7, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Matthew A. Rafacz, Matthew M. Crum, Michael E. Tuuk

SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR IMPLEMENTING MULTI-CYCLE REGISTER FILE BYPASS

Publication number: 20150089202

Abstract: A system, method, and computer program product are provided for implementing a multi-cycle register file bypass mechanism. The method includes the steps of receiving a set of control bits, combining the set of control bits with a set of valid bits associated with previously issued instructions, and enabling a bypass path for each thread based on the set of control bits and the set of valid bits. Each valid bit in the set of valid bits indicates whether execution of an instruction of the previously issued instructions was enabled for a thread in a thread block.
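One possible reading of the combining step is sketched below. It assumes each control bit says whether the current operand matches the destination of one previously issued instruction (index 0 being the most recent), and that each valid mask holds one bit per thread. The function name `bypass_select` and this encoding are assumptions made for illustration, not the encoding used in the application.

```python
def bypass_select(control_bits, valid_masks, num_threads):
    """Pick, per thread, which earlier instruction (if any) to bypass from.

    control_bits: list of bools, one per previously issued instruction.
    valid_masks:  list of ints; bit t is set if that instruction actually
                  executed for thread t.
    Returns a list with one entry per thread: the index of the selected
    earlier instruction, or None to read the register file instead.
    """
    selected = []
    for t in range(num_threads):
        choice = None
        for stage, (ctrl, valid) in enumerate(zip(control_bits, valid_masks)):
            # Enable the bypass only if the operand matches AND the producer
            # was enabled for this particular thread.
            if ctrl and (valid >> t) & 1:
                choice = stage  # most recent matching producer wins
                break
        selected.append(choice)
    return selected
```

For example, with two earlier instructions that both match and valid masks `0b0101` and `0b0011`, thread 1 skips the most recent producer (which was disabled for it) and takes the older one, while thread 3 falls back to the register file.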

Type: Application

Filed: September 26, 2013

Publication date: March 26, 2015

Applicant: NVIDIA Corporation

Inventors: Xiaogang Qiu, Ian Chi Yan Kwong, Ming Yiu Siu, Jack H. Choquette, Michael Alan Fetterman

MFENCE and LFENCE micro-architectural implementation method and system

Patent number: 8959314

Abstract: A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.

Type: Grant

Filed: July 15, 2013

Date of Patent: February 17, 2015

Assignee: Intel Corporation

Inventors: Salvador Palanca, Stephen Fischer, Subramaniam Maiyuran, Shekoufeh Qawami

TECHNIQUES FOR INCREASING INSTRUCTION ISSUE RATE AND REDUCING LATENCY IN AN OUT-OF-ORDER PROCESSOR

Publication number: 20150039862

Abstract: A technique for operating a processor includes storing a first result to a writeback buffer, in response to a first execution unit of the processor attempting to write the first result of a first completed instruction to a register file of the processor at a same processor time as a second execution unit of the processor is attempting to write a second result of a second completed instruction to the register file. The writeback buffer is positioned in a dataflow between the first execution unit and the register file. A buffer full indicator logic is used to detect that the writeback buffer is unavailable. A buffer unavailable signal is transmitted, from the buffer full indicator logic, in response to detecting the writeback buffer is unavailable. In response to receiving the buffer unavailable signal, a buffer retrieving logic writes the first result from the writeback buffer to the register file.

Type: Application

Filed: July 31, 2014

Publication date: February 5, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: HARRY BAROWSKI, TIM NIGGEMEIER

Method and apparatus for improving cache efficiency

Patent number: 8943273

Abstract: Aspects of the disclosure provide methods for cache efficiency. A method for cache efficiency can include storing data in a buffer entry in association with a cache array in response to a first store instruction that hits the cache array before the first store instruction is committed. Further, when a dependent load instruction is subsequent to the first store instruction, the method can include providing the data from the buffer entry in response to the first dependent load instruction. When a second store instruction overlaps an address of the first store instruction, the method can include coalescing data of the second store instruction in the buffer entry before the second store instruction is committed. When the second store instruction is followed by a second dependent load instruction, the method can include providing the coalesced data from the buffer entry in response to the second dependent load instruction.

Type: Grant

Filed: August 14, 2009

Date of Patent: January 27, 2015

Assignee: Marvell International Ltd.

Inventors: Sujat Jamil, R. Frank O'Bleness, Russell Robideau, Tom Hameenanttila, Joseph Delgross, David Miner

COMPACT LINKED-LIST-BASED MULTI-THREADED INSTRUCTION GRADUATION BUFFER

Publication number: 20150012730

Abstract: A processor and instruction graduation unit for a processor. In one embodiment, a processor or instruction graduation unit according to the present invention includes a linked-list-based multi-threaded graduation buffer and a graduation controller. The graduation buffer stores identification values generated by an instruction decode and dispatch unit of the processor as part of one or more linked-list data structures. Each linked-list data structure formed is associated with a particular program thread running on the processor. The number of linked-list data structures formed is variable and related to the number of program threads running on the processor. The graduation controller includes linked-list head identification registers and linked-list tail identification registers that facilitate reading and writing identification values to linked-list data structures associated with particular program threads.
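A minimal software model of such a buffer might look like the following, assuming one shared array of entries, a per-entry next pointer, and per-thread head and tail registers. The class name `GraduationBuffer`, the free list, and the method names are hypothetical and chosen only to illustrate the data structure.

```python
class GraduationBuffer:
    """Shared entry array holding one linked list per thread (illustrative)."""

    def __init__(self, size, num_threads):
        self.ids = [None] * size           # identification value per entry
        self.next = [None] * size          # link to the thread's next entry
        self.free = list(range(size))      # indices of unused entries
        self.head = [None] * num_threads   # per-thread head registers
        self.tail = [None] * num_threads   # per-thread tail registers

    def push(self, thread, ident):
        """Append an identification value to the given thread's list."""
        if not self.free:
            raise RuntimeError("graduation buffer full")
        idx = self.free.pop()
        self.ids[idx] = ident
        self.next[idx] = None
        if self.head[thread] is None:
            self.head[thread] = idx
        else:
            self.next[self.tail[thread]] = idx
        self.tail[thread] = idx

    def graduate(self, thread):
        """Remove and return the oldest value for the thread, or None."""
        idx = self.head[thread]
        if idx is None:
            return None
        ident = self.ids[idx]
        self.head[thread] = self.next[idx]
        if self.head[thread] is None:
            self.tail[thread] = None
        self.free.append(idx)
        return ident
```

Because every thread draws entries from the same pool, a single busy thread can use most of the buffer while idle threads consume none, which is the compactness the title refers to.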

Type: Application

Filed: September 23, 2014

Publication date: January 8, 2015

Inventor: Kjeld Svendsen

Sync-ID for multiple concurrent sync dependencies in an out-of-order store queue

Patent number: 8930680

Abstract: A method, system and process for retiring data entries held within a store queue (STQ). The STQ of a processor cache is modified to receive and process multiple synchronized groups (sync-groups). Sync groups comprise thread of execution synchronized (thread-sync) entries, all thread of execution synchronized (all-thread-sync) entries, and regular store entries (non-thread-sync and non-all-thread-sync). The task of storing data entries, from the STQ out to memory or an input/output device, is modified to increase the effectiveness of the cache. Sync-groups are created for each thread and tracked within the STQ via a synchronized identification (SID). An entry is eligible for retirement when the entry is within a currently retiring sync-group as identified by the SID.

Type: Grant

Filed: December 21, 2007

Date of Patent: January 6, 2015

Assignee: International Business Machines Corporation

Inventor: Eric F. Robinson

Prefetching load data in lookahead mode and invalidating architectural registers instead of writing results for retiring instructions

Patent number: 8918626

Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.

Type: Grant

Filed: November 10, 2011

Date of Patent: December 23, 2014

Assignee: Oracle International Corporation

Inventors: Yuan C. Chou, Eric W. Mahurin

Instruction tracking system for processors

Patent number: 8874880

Abstract: Instructions are tracked in a processor. A completion unit in the processor receives an instruction group to add to a table to form a received instruction group. In response to receiving the received instruction group, the completion unit determines whether an entry is present that contains a previously stored instruction group in a first location and has space for storing the received instruction group. In response to the entry being present, the completion unit stores the received instruction group in a second location in the entry to form a stored instruction group.

Type: Grant

Filed: August 26, 2013

Date of Patent: October 28, 2014

Assignee: International Business Machines Corporation

Inventors: Christopher M. Abernathy, Hung Q. Le, Dung Q. Nguyen, Benjamin W. Stolt

Vector processing circuit, command issuance control method, and processor system

Patent number: 8874879

Abstract: A vector processing circuit includes a vector register file including a plurality of array elements, a command issuance control circuit, and a plurality of pipeline arithmetic units. Each pipeline arithmetic unit performs arithmetic processing of data stored in the array elements indicated as a source by one command in parts through a plurality of cycles and stores the result in the array elements indicated as a destination by the one command through a plurality of cycles. When data word length of a preceding command is longer than that of a subsequent command, the command issuance control circuit changes data sizes of the array elements in accordance with data word length of the command and determines whether there is register interference between the array element to be processed at a non-head cycle of the preceding command, and the array element to be processed at a head cycle of the subsequent command.

Type: Grant

Filed: October 24, 2011

Date of Patent: October 28, 2014

Assignee: Fujitsu Limited

Inventors: Yi Ge, Yoshimasa Takebe, Hiromasa Takahashi

Processor with increased efficiency via early instruction completion

Patent number: 8769247

Abstract: Methods and apparatuses are provided for increased efficiency in a processor via early instruction completion. An apparatus is provided for increased efficiency in a processor via early instruction completion. The apparatus comprises an execution unit for processing instructions and determining whether a later issued instruction is ready for completion or an earlier issued instruction is ready for completion, and a retire unit for retiring the later issued instruction when the later issued instruction is ready for completion or retiring the earlier issued instruction when the later issued instruction is not ready for completion and the earlier issued instruction has a known good completion status. A method is provided for increased efficiency in a processor via early instruction completion. The method comprises completing an earlier issued instruction having a known good completion status ahead of a later issued instruction when the later issued instruction is not ready for completion.

Type: Grant

Filed: April 15, 2011

Date of Patent: July 1, 2014

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael D Estlick, Kevin Hurd, Jay Fleischman

Sharing a fault-status register when processing vector instructions

Patent number: 8683178

Abstract: The described embodiments provide a processor that executes vector instructions. In the described embodiments, the processor initializes an architectural fault-status register (FSR) and a shadow copy of the architectural FSR by setting each of N bit positions in the architectural FSR and the shadow copy of the architectural FSR to a first predetermined value. The processor then executes a first first-faulting or non-faulting (FF/NF) vector instruction. While executing the first vector instruction, the processor also executes one or more subsequent FF/NF instructions. In these embodiments, when executing the first vector instruction and the subsequent vector instructions, the processor updates one or more bit positions in the shadow copy of the architectural FSR to a second predetermined value upon encountering a fault condition.

Type: Grant

Filed: April 20, 2011

Date of Patent: March 25, 2014

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion

Method and apparatus for cooperative software multitasking in a processor system with a partitioned register file

Patent number: 8677101

Abstract: A processor system executes multiple applet programs within a software application program in an information handling system. The information handling system includes operating system software that manages processor system hardware and software in a multi-tasking environment. In particular, the operating system software manages partitioning of a register file in the processor system to achieve a cooperative relationship among multiple applet programs within respective partitions of the register file. In one embodiment, the operating system software manages unique applet ID's to modify register file partition sizes and locations during applet program instruction text execution. In one embodiment, applet ID masking hardware provides sharing of register file space among multiple copies of applet program code.

Type: Grant

Filed: June 7, 2007

Date of Patent: March 18, 2014

Assignee: International Business Machines Corporation

Inventors: Brian Flachs, Harm Peter Hofstee, Brad William Michael

Multithreaded processor architecture with operational latency hiding

Publication number: 20140075159

Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.

Type: Application

Filed: July 12, 2011

Publication date: March 13, 2014

Applicant: International Business Machines Corporation

Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen

Processor with support for nested speculative sections with different transactional modes

Patent number: 8621183

Abstract: A system and method are disclosed wherein a processor of a plurality of processors coupled to shared memory, is configured to initiate execution of a section of code according to a first transactional mode of the processor. The processor is configured to execute a plurality of protected memory access operations to the shared memory within the section of code as a single atomic transaction with respect to the plurality of processors. The processor is further configured to initiate, within the section of code, execution of a subsection of the section of code according to a second transactional mode of the processor, wherein the first and second transactional modes are each associated with respective recovery actions that the processor is configured to perform in response to detecting an abort condition.

Type: Grant

Filed: July 28, 2009

Date of Patent: December 31, 2013

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael P. Hohmuth, David S. Christie, Stephan Diestelhorst

Register renaming system using multi-bank physical register mapping table and method thereof

Patent number: 8583901

Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
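The banked search can be approximated in software as below, assuming the "first state" means a free physical register and that each bank contributes at most one free register per rename request. The class name `BankedRenamer`, the `FREE`/`BUSY` encoding, and the one-candidate-per-bank limit are illustrative assumptions; hardware would search the banks in parallel, whereas this sketch visits them in turn.

```python
FREE, BUSY = 0, 1  # assumed encoding of the per-register state


class BankedRenamer:
    """Physical register mapping table split into equal banks (illustrative)."""

    def __init__(self, num_physical, num_banks):
        if num_physical % num_banks != 0:
            raise ValueError("banks must be equal-sized")
        self.state = [FREE] * num_physical
        self.bank_size = num_physical // num_banks
        self.num_banks = num_banks
        self.mapping = {}  # architectural register -> physical register

    def _first_free_in_bank(self, bank):
        start = bank * self.bank_size
        for p in range(start, start + self.bank_size):
            if self.state[p] == FREE:
                return p
        return None

    def rename(self, arch_regs):
        """Map each architectural register to a free register from a distinct bank."""
        found = (self._first_free_in_bank(b) for b in range(self.num_banks))
        candidates = [p for p in found if p is not None]
        if len(candidates) < len(arch_regs):
            raise RuntimeError("not enough free physical registers")
        for arch, phys in zip(arch_regs, candidates):
            self.state[phys] = BUSY
            self.mapping[arch] = phys
        return {arch: self.mapping[arch] for arch in arch_regs}
```

Splitting the table this way means a request for several registers never has to find the second, third, and later free entries within one long scan; each bank only ever has to report its first free entry.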

Type: Grant

Filed: February 4, 2010

Date of Patent: November 12, 2013

Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.

Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
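
The banked search described in the abstract of patent 8583901 can be pictured with a small software model. This is only a sketch under stated assumptions: the names `FREE`, `BUSY`, `find_free_per_bank`, and `rename` are hypothetical, the "first state" is taken to mean "free", and the per-bank loop stands in for what hardware would do in parallel.

```python
# Hypothetical model: each bank lists the state of its own physical registers.
FREE, BUSY = "free", "busy"

def find_free_per_bank(banks):
    """Search every bank (conceptually in parallel) for one free entry.

    Returns (bank_index, entry_index) pairs, one per bank that has a free entry.
    """
    found = []
    for b, bank in enumerate(banks):
        for e, state in enumerate(bank):
            if state == FREE:
                found.append((b, e))
                break  # at most one candidate per bank
    return found

def rename(banks, arch_regs):
    """Map architectural registers to free physical registers and mark them busy."""
    candidates = find_free_per_bank(banks)
    mapping = {}
    for arch, (b, e) in zip(arch_regs, candidates):
        banks[b][e] = BUSY
        mapping[arch] = (b, e)
    return mapping

# Three non-overlapping banks of two entries each (assumed sizes).
banks = [[BUSY, FREE], [FREE, FREE], [BUSY, BUSY]]
mapping = rename(banks, ["r1", "r2"])
```

Because each bank contributes at most one candidate per search, the number of registers renamed in one pass is bounded by the number of banks in this model.
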
Load request scheduling in a cache hierarchy

Patent number: 8521982

Abstract: A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry is allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. The CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, the CIU may issue stored load requests in a specific priority order.

Type: Grant

Filed: April 15, 2009

Date of Patent: August 27, 2013

Assignee: International Business Machines Corporation

Inventors: Robert A. Cargnoni, Guy L. Guthrie, Thomas L. Jeremiah, Stephen J. Powell, William J. Starke, Jeffrey A. Stuecheli
Supporting partial recycle in a pipelined microprocessor

Patent number: 8516228

Abstract: A computer processing system is provided. The computer processing system includes a first datastore that stores a subset of information associated with an instruction. A first stage of a processor pipeline writes the subset of information to the first datastore based on an execution of an operation associated with the instruction. A second stage of the pipeline initiates reprocessing of the operation associated with the instruction based on the subset of information stored in the first datastore.

Type: Grant

Filed: March 19, 2008

Date of Patent: August 20, 2013

Assignee: International Business Machines Corporation

Inventors: Khary J. Alexander, Michael Billeci, Fadi Y. Busaba, Bruce C. Giamei
PROCESSOR SIMULATION ENVIRONMENT

Publication number: 20130173887

Abstract: In a method of simulating a processor system by running code that simulates the system on a host processor, code is translated at run time to a form required by the host processor. All instructions are mapped to a native instruction set of the host using two or more different code dictionaries: the translated instructions are mapped to multiple and different dictionaries dependent on the execution privilege level or mode of the simulated processor. If an instruction is encountered during runtime that changes the mode of the processor the code dictionary is switched to use the dictionary associated with the new mode. The different modes require different instruction mappings to the native instruction set of the host using different models that more accurately represent the behaviour of the system code and hardware in the system being simulated.

Type: Application

Filed: February 19, 2013

Publication date: July 4, 2013

Applicant: Imperas Software Ltd.

Inventor: Imperas Software Ltd.
Setting a flag bit to defer event handling to one of multiple safe points in an instruction stream

Patent number: 8464033

Abstract: Methods and systems thereof for exception handling are described. An event to be handled is identified during execution of a code sequence. A bit is set to indicate that handling of the event is to be deferred. An exception corresponding to the event is generated if the bit is set.

Type: Grant

Filed: August 30, 2011

Date of Patent: June 11, 2013

Inventors: Guillermo J. Rozas, Alexander Klaiber
Multiprocessor

Patent number: 8433884

Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads, upon receiving a request for writing a value resulting from the processing from the second processor, judges whether the one thread is being executed, and when judging negatively, performs control such that the obtained value is written into one of the areas allocated to the one thread.

Type: Grant

Filed: June 16, 2009

Date of Patent: April 30, 2013

Assignee: Panasonic Corporation

Inventor: Hiroyuki Morishita
Dynamic tag allocation in a multithreaded out-of-order processor

Patent number: 8429386

Abstract: Various techniques for dynamically allocating instruction tags and using those tags are disclosed. These techniques may apply to processors supporting out-of-order execution and to architectures that support multiple threads. A group of instructions may be assigned a tag value from a pool of available tag values. A tag value may be usable to determine the program order of a group of instructions relative to other instructions in a thread. After the group of instructions has been (or is about to be) committed, the tag value may be freed so that it can be re-used on a second group of instructions. Tag values are dynamically allocated between threads; accordingly, a particular tag value or range of tag values is not dedicated to a particular thread.

Type: Grant

Filed: June 30, 2009

Date of Patent: April 23, 2013

Assignee: Oracle America, Inc.

Inventors: Paul J. Jordan, Robert T. Golla, Jama I. Barreh
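
The shared tag pool in the abstract of patent 8429386 can be sketched as follows. The class `TagPool` and its methods are hypothetical names chosen for illustration, and the sketch models only allocation and release, not how a tag encodes program order.

```python
class TagPool:
    """Shared pool of tag values; no tag is dedicated to any one thread."""

    def __init__(self, num_tags):
        self.free = list(range(num_tags))
        self.owner = {}  # tag -> id of the thread currently holding it

    def allocate(self, thread_id):
        """Give the next free tag to a group of instructions from thread_id."""
        if not self.free:
            return None  # assumed behavior: the caller stalls until a tag is freed
        tag = self.free.pop(0)
        self.owner[tag] = thread_id
        return tag

    def release(self, tag):
        """Free a tag once its instruction group has committed."""
        del self.owner[tag]
        self.free.append(tag)

pool = TagPool(2)
t0 = pool.allocate(thread_id=0)
t1 = pool.allocate(thread_id=1)
stalled = pool.allocate(thread_id=1)  # pool is empty at this point
pool.release(t0)                      # thread 0's group commits
reused = pool.allocate(thread_id=1)   # the same tag value now serves thread 1
```

The last two lines show the point made in the abstract: a tag value freed by one thread can be handed to a different thread.
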
Completion arbitration for more than two threads based on resource limitations

Patent number: 8386753

Abstract: A mechanism is provided for thread completion arbitration. The mechanism comprises executing more than two threads of instructions simultaneously in the processor, selecting a first thread from a first subset of threads, in the more than two threads, for completion of execution within the processor, and selecting a second thread from a second subset of threads, in the more than two threads, for completion of execution within the processor. The mechanism further comprises completing execution of the first and second threads by committing results of the execution of the first and second threads to a storage device associated with the processor. At least one of the first subset of threads or the second subset of threads comprises two or more threads from the more than two threads. The first subset of threads and second subset of threads have different threads from one another.

Type: Grant

Filed: April 14, 2009

Date of Patent: February 26, 2013

Assignee: International Business Machines Corporation

Inventors: Susan E. Eisen, Dung Q. Nguyen, Balaram Sinharoy, Benjamin W. Stolt
Cache rollback acceleration via a bank based versioning cache ciruit

Patent number: 8370576

Abstract: An embodiment of the present invention includes a circuit for tracking memory operations with trace-based execution. Each trace includes a sequence of operations that includes zero or more of the memory operations. At least some of the active memory operations access the memory in an execution order that is different from the program order. The circuit includes a first memory that caches data accessed by the memory operations. This memory is partitioned into N banks. Checkpoint entries, which are stored in a second memory also partitioned into N banks, are associated with each trace. Each entry refers to a checkpoint location in the first memory. A sub-circuit receives rollback requests and responds by overwriting checkpoint locations. Each of the N memory units consisting of a bank in the first memory and the corresponding bank in the second memory may be rolled back independently and concurrently with other memory units.

Type: Grant

Filed: February 13, 2008

Date of Patent: February 5, 2013

Assignee: Oracle America, Inc.

Inventors: John Gregory Favor, Paul G. Chan, Graham Ricketson Murphy, Joseph Byron Rowlands
Apparatus and method for local operand bypassing for cryptographic instructions

Patent number: 8356185

Abstract: A processor may include a hardware instruction fetch unit configured to issue instructions for execution, and a hardware functional unit configured to receive instructions for execution, where the instructions include cryptographic instruction(s) and non-cryptographic instruction(s). The functional unit may include a cryptographic execution pipeline configured to execute the cryptographic instructions with a corresponding cryptographic execution latency, and a non-cryptographic execution pipeline configured to execute the non-cryptographic instructions with a corresponding non-cryptographic execution latency that is longer than the cryptographic execution latency.

Type: Grant

Filed: October 8, 2009

Date of Patent: January 15, 2013

Assignee: Oracle America, Inc.

Inventors: Christopher H. Olson, Gregory F. Grohoski, Robert T. Golla
Method for scheduling a network packet processor

Patent number: 8284772

Abstract: A method is provided for scheduling a network packet processor. A textual language specification is input of the processing of network packets by the network packet processor. The textual language specification includes memory read actions and modification actions. Each memory read action reads a stored value from a memory of the network packet processor. Each modification action modifies a field of the network packets. An availability is determined for each field read from the network packets for the memory read and modification actions. An availability is determined for each stored value read from the memory for the memory read actions. A look-ahead interval is determined from the availabilities. A respective storage class is determined for the fields for the memory read and modification actions. The respective storage class is one of a bus, a register, and a register with bypass.

Type: Grant

Filed: May 3, 2007

Date of Patent: October 9, 2012

Assignee: XILINX, Inc.

Inventors: Philip B. James-Roxby, Eric R. Keller
Intentionally delaying execution of a copy instruction to achieve simultaneous execution with a subsequent, non-adjacent write instruction

Patent number: 8271766

Abstract: An information processing device including registers (105) for holding data and an operation device (102) for executing arithmetic and logic operations on input/output data held in the register. The information processing device can issue an inter-register copy instruction for instructing data held in one register to be copied to another register. The information processing device further includes a copy information holding device (113) for reserving for execution of a data copy operation by the inter-register copy instruction from a control unit (108) so as to execute the actual copy operation simultaneously with the succeeding instruction to hide the execution time of the copy operation. Thus, in the inter-register copy instruction execution phase, a reservation for a data copy operation is stored in the copy information holding device so that the execution phase is completed without performing the actual data copy operation.

Type: Grant

Filed: May 18, 2006

Date of Patent: September 18, 2012

Assignee: NEC Corporation

Inventor: Noritaka Hoshi
Instruction set architecture with instruction characteristic bit indicating a result is not of architectural importance

Patent number: 8266411

Abstract: Instead of having a processor with an instruction set architecture (ISA) that includes fixed architected operands, an improved processor supports additional characteristic bits for computing instructions (e.g., a multiply-add, load/store instructions). Such additional bits for the certain instructions influence the processing of these instructions by the processor. Also, a new instruction is introduced for further usage of the proposed method. Typically these additional characteristic bits as well as the instruction can be automatically generated by compilers to provide relatively well-suited instruction sequences for the processor.

Type: Grant

Filed: February 5, 2009

Date of Patent: September 11, 2012

Assignee: International Business Machines Corporation

Inventors: Tobias Gemmeke, Markus Kaltenbach, Nicolas Maeding
Efficient load queue snooping

Patent number: 8214602

Abstract: In one embodiment, a processor comprises a data cache and a load/store unit (LSU). The LSU comprises a queue and a control unit, and each entry in the queue is assigned to a different load that has accessed the data cache but has not retired. The control unit is configured to update the data cache hit status of each load represented in the queue as a content of the data cache changes. The control unit is configured to detect a snoop hit on a first load in a first entry of the queue responsive to: the snoop index matching a load index stored in the first entry, the data cache hit status of the first load indicating hit, the data cache detecting a snoop hit for the snoop operation, and a load way stored in the first entry matching a first way of the data cache in which the snoop operation is a hit.

Type: Grant

Filed: June 23, 2008

Date of Patent: July 3, 2012

Assignee: Advanced Micro Devices, Inc.

Inventors: Ashutosh S. Dhodapkar, Michael G. Butler
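
The four conditions that the abstract of patent 8214602 lists for a snoop hit on a load can be written as a single predicate. This is a minimal sketch: `LoadEntry` and `snoop_hits_load` are hypothetical names, and the queue entry is reduced to the three fields the abstract mentions.

```python
from dataclasses import dataclass

@dataclass
class LoadEntry:
    index: int  # cache index the load accessed
    way: int    # cache way the load hit in
    hit: bool   # current data cache hit status of the load

def snoop_hits_load(entry, snoop_index, cache_snoop_hit, snoop_way):
    """Return True only when all four conditions from the abstract hold."""
    return (
        entry.index == snoop_index   # snoop index matches the stored load index
        and entry.hit                # the load's hit status indicates a hit
        and cache_snoop_hit          # the data cache detected a snoop hit
        and entry.way == snoop_way   # the load way matches the way the snoop hit
    )

entry = LoadEntry(index=12, way=1, hit=True)
same_line = snoop_hits_load(entry, snoop_index=12, cache_snoop_hit=True, snoop_way=1)
other_way = snoop_hits_load(entry, snoop_index=12, cache_snoop_hit=True, snoop_way=0)
```
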
Suspending write back of valid data in pipeline register to register file and cancelling when overwritten by subsequent instruction

Patent number: 8209519

Abstract: A pipeline processor has a first stage to read data from a general purpose register unit, a second stage to execute instruction, and a third stage to write back the data into the general purpose register unit. A pipeline register retains data obtained by executing the second stage. The pipeline register stores a data validity flag. A WRITE suspension unit suspends execution of writing the retained data into a general purpose register of the general purpose register unit until the retained data is rewritten by a subsequent instruction, even if the data validity flag indicates “valid.” A data invalidation unit cancels the execution of writing the data retained in the pipeline register into the general purpose register into which the data is to be written by a preceding instruction and invalidates the retained data, when data is written into the general purpose register by the subsequent instruction.

Type: Grant

Filed: July 21, 2011

Date of Patent: June 26, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventor: Jun Tanabe
Processing bypass directory tracking system and method

Patent number: 8209518

Abstract: A processing bypass directory system and method are disclosed. In one embodiment, a bypass directory tracking process includes setting bits in a bypass directory when a corresponding architectural register is written. The bits are selectively cleared in the bypass directory each cycle. The configuration of the bits is utilized to determine which stage of a bypass path processing information is at.

Type: Grant

Filed: March 28, 2011

Date of Patent: June 26, 2012

Inventors: Alexander Klaiber, Guillermo Rozas
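
One way to picture the bypass directory in the abstract of patent 8209518 is a small bit vector per architectural register. The encoding below is an assumption, not taken from the patent: a single bit is set on a write, shifted by one position each cycle, and cleared once it passes the last stage. `BYPASS_STAGES` and `BypassDirectory` are hypothetical names.

```python
BYPASS_STAGES = 3  # assumed depth of the bypass path

class BypassDirectory:
    def __init__(self, num_regs):
        self.bits = [0] * num_regs  # one small bit vector per register

    def write(self, reg):
        """Set the stage-0 bit when the architectural register is written."""
        self.bits[reg] = 1

    def tick(self):
        """Advance one cycle: shift every vector, dropping bits past the last stage."""
        mask = (1 << BYPASS_STAGES) - 1
        self.bits = [(v << 1) & mask for v in self.bits]

    def stage_of(self, reg):
        """Return the bypass stage holding the value, or None once it has left the path."""
        v = self.bits[reg]
        return v.bit_length() - 1 if v else None

directory = BypassDirectory(num_regs=4)
directory.write(2)
stages = [directory.stage_of(2)]
for _ in range(BYPASS_STAGES):
    directory.tick()
    stages.append(directory.stage_of(2))
```

In this model, `None` means the value is no longer in the bypass path and would be read from the register file instead.
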
METHOD AND APPARATUS FOR PROVIDING EARLY BYPASS DETECTION TO REDUCE POWER CONSUMPTION WHILE READING REGISTER FILES OF A PROCESSOR

Publication number: 20120159217

Abstract: A method and apparatus are described for reducing power consumption in a processor. A micro-operation is selected for execution, and a destination physical register tag of the selected micro-operation is compared to a plurality of source physical register tags of micro-operations dependent upon the selected micro-operation. If there is a match between the destination physical register tag and one of the source physical register tags, a corresponding physical register file (PRF) read operation is disabled. The comparison may be performed by a wakeup content-addressable memory (CAM) of a scheduler. The wakeup CAM may send a read control signal to the PRF to disable the read operation. Disabling the corresponding PRF read operation may include shutting off power in the PRF and related logic.

Type: Application

Filed: December 16, 2010

Publication date: June 21, 2012

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Ganesh Venkataramanan, Emil Talpes
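
The tag comparison in the abstract of publication 20120159217 amounts to a per-source equality check. The function `prf_read_enables` is a hypothetical name for this sketch, which models only the comparison and the resulting read-enable signals, not the CAM hardware or power gating.

```python
def prf_read_enables(dest_tag, dependents):
    """Decide which source reads of each dependent micro-op still need the PRF.

    dependents maps a micro-op name to its list of source physical register tags.
    A source whose tag equals dest_tag matches the selected micro-op's
    destination, so its PRF read is disabled (False).
    """
    return {
        uop: [src != dest_tag for src in srcs]
        for uop, srcs in dependents.items()
    }

enables = prf_read_enables(dest_tag=7, dependents={"add": [7, 3], "sub": [4, 7]})
```

Here the first source of `add` and the second source of `sub` match tag 7, so only their other operands keep the register-file read enabled.
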
Vector unit in a processor enabled to replicate data on a first portion of a data bus to primary and secondary registers

Patent number: 8200945

Abstract: A microprocessor includes a branch unit, a load/store unit (LSU), an arithmetic logic unit (ALU), and a vector unit to execute a vector instruction. The vector unit includes a vector register file having a primary vector register and a secondary vector register. The processor preferably further includes a first data bus and a second data bus wherein the first and second data busses couple the vector unit to the data memory. The vector unit includes a first input multiplexer enabling data on the first data bus to be provided to the primary register file or the secondary register file and a second input multiplexer, independent of the first input multiplexer enabling data on the second data bus to be provided to the second data bus. The first and second data busses may comprise first and second portions of a data memory bus.

Type: Grant

Filed: November 7, 2003

Date of Patent: June 12, 2012

Assignee: International Business Machines Corporation

Inventors: Siddhartha Chatterjee, Kenneth Dockser, Fred Gehrung Gustavson, Manish Gupta
Image forming apparatus and management system utilizing counter and job log information for usage tracking

Patent number: 8179540

Abstract: An image forming apparatus is provided that holds counter information obtained by integrating a consumption of a consumable that depends on usage of service provided by the image forming apparatus. A log corresponding to the usage of the service is set in job log information with a synchronization flag set off. The log in the job log information, for which the synchronization flag is set off, is set on. The counter information and the job log information are output after the synchronization flag for the log having the synchronization flag set off has been set on.

Type: Grant

Filed: October 29, 2008

Date of Patent: May 15, 2012

Assignee: Canon Kabushiki Kaisha

Inventors: Junichi Hiruma, Nobuyuki Tonegawa
System and method of data forwarding within an execution unit

Patent number: 8145874

Abstract: In an embodiment, a method is disclosed that includes comparing, during a write back stage at an execution unit, a write identifier associated with a result to be written to a register file from execution of a first instruction to a read identifier associated with a second instruction at an execution pipeline within an interleaved multi-threaded (IMT) processor having multiple execution units. When the write identifier matches the read identifier, the method further includes storing the result at a local memory of the execution unit for use by the execution unit in the subsequent read stage.

Type: Grant

Filed: February 26, 2008

Date of Patent: March 27, 2012

Assignee: QUALCOMM Incorporated

Inventors: Suresh Venkumahanti, Lucian Codrescu, Lin Wang
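
The forwarding check in the abstract of patent 8145874 can be modeled in a few lines. This is a sketch under assumptions: `ExecutionUnit`, `write_back`, and `read` are hypothetical names, the register file is a plain dictionary, and the result is assumed to be written to the register file as well as kept locally.

```python
class ExecutionUnit:
    def __init__(self):
        self.local = {}  # local memory holding forwarded results

    def write_back(self, write_id, result, next_read_ids, register_file):
        """Write the result; keep a local copy if the next instruction reads it."""
        register_file[write_id] = result
        if write_id in next_read_ids:  # write identifier matches a read identifier
            self.local[write_id] = result

    def read(self, reg_id, register_file):
        """Use the locally stored result when present, else the register file."""
        if reg_id in self.local:
            return self.local.pop(reg_id)
        return register_file[reg_id]

register_file = {}
unit = ExecutionUnit()
unit.write_back(write_id=5, result=42, next_read_ids={5, 1}, register_file=register_file)
forwarded = 5 in unit.local
value = unit.read(5, register_file)
```
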
