Commitment Control Or Register Bypass Patents (Class 712/218)
  • Patent number: 9367317
    Abstract: A processor includes a microcode storage comprising a plurality of microcode flows and a decode logic coupled to the microcode storage. The decode logic is configured to receive a first instruction, decode the first instruction into an entry point vector to a first microcode flow in the microcode storage, the entry point vector comprising a first indicator specifying a number of clock cycles associated with the first microcode flow, initiate the microcode storage, wherein the microcode storage inserts microinstructions of the first microcode flow into an instruction queue, count clock cycles after initiating the microcode storage, and decode a second instruction without first receiving a return from the microcode storage, wherein the second instruction is decoded at a particular clock cycle based on the number of clock cycles associated with the first microcode flow.
    Type: Grant
    Filed: July 3, 2013
    Date of Patent: June 14, 2016
    Assignee: Intel Corporation
    Inventors: Jonathan D. Combs, Jonathan Y. Tong
  • Patent number: 9361104
    Abstract: In a data processing system having execution circuitry, a method includes providing a cross-check instruction and a reference instruction to the execution circuitry, where the reference instruction has an operand. The method also includes executing the reference instruction to obtain a first result. Residual information is derived from execution of the reference instruction, and the method also includes executing the cross-check instruction using the residual information to obtain a second result. The second result obtained from execution of the cross-check instruction is compared to the operand of the reference instruction to determine whether an error occurred during execution of the reference instruction or the cross-check instruction.
    Type: Grant
    Filed: August 13, 2010
    Date of Patent: June 7, 2016
    Assignee: FREESCALE SEMICONDUCTOR, INC.
    Inventors: Gary R. Morrison, William C. Moyer
  • Patent number: 9317286
    Abstract: A processor including instruction support for implementing the Camellia block cipher algorithm may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include a cryptographic unit that may receive instructions for execution. The instructions include one or more Camellia instructions defined within the ISA. In addition, the Camellia instructions may be executable by the cryptographic unit to implement portions of a Camellia cipher that is compliant with Internet Engineering Task Force (IETF) Request For Comments (RFC) 3713. In response to receiving a Camellia F( )-operation instruction defined within the ISA, the cryptographic unit may perform an F( ) operation, as defined by the Camellia cipher, upon a data input operand and a subkey operand, in which the data input operand and subkey operand may be specified by the Camellia F( )-operation instruction.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: April 19, 2016
    Assignee: Oracle America, Inc.
    Inventors: Christopher H. Olson, Gregory F. Grohoski, Lawrence A. Spracklen
  • Patent number: 9286172
    Abstract: Embodiments of systems, apparatuses, and methods for utilizing a faulty cache line in a cache are described. In some embodiments, a graphics processing unit is allowed to access a faulty cache line in the cache. A cache access request to access a faulty cache line from a central processing unit core is remapped to access a fault-free cache line.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: March 15, 2016
    Assignee: Intel Corporation
    Inventors: Tanausu Ramirez, Javier Carretero Casado, Enric Herrero, Matteo Monchiero, Xavier Vera
  • Patent number: 9232010
    Abstract: A method, apparatus and computer program product are provided for context aware intelligent message handling. In the context of a method, the method includes receiving a message, that includes a message identifier, determining if the message is dependent on an unprocessed message, causing a message status, including a message identifier, to be stored in a memory, causing the message to be stored in the memory in an instance in which the message is dependent on an unprocessed messages, receiving a message process completion indication from a downstream system and updating the message status based on the message process completion indication; determining if the message stored in the memory continues to be dependent on an unprocessed message based on the updated message status, and causing the transmission of the message to the downstream system in an instance in which the message is no longer dependent on an unprocessed message.
    Type: Grant
    Filed: August 4, 2014
    Date of Patent: January 5, 2016
    Assignee: HERE Global B.V.
    Inventor: Kottorage Buddika Gajapala
  • Patent number: 9207922
    Abstract: Provided is a compiling method and apparatus for scheduling a block in a pipeline. The compiling method for scheduling a block in a pipeline may include profiling, using a processor, an access count of a block in a control flow of a program code, determining that the block is an important block, in response to an edge count of an edge entering the block being greater than or equal to a predetermined value, the edge count being included in the access count of the block, and scheduling the important block based on the access count to prevent a register writeback conflict.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: December 8, 2015
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Taisong Jin
  • Patent number: 9201656
    Abstract: The data processing apparatus (and method) has processing circuitry for performing data processing operations in response to data processing instructions, the data processing instructions referencing logical registers. A set of physical registers are provided for storing data values for access by the processing circuitry when performing the data processing operations. Register renaming storage stores a one-to-one mapping between the logical registers and the physical registers, with the register renaming storage being accessed by the processing circuitry when performing the data processing operations in order to map the referenced logical registers to corresponding physical registers. Update circuitry is arranged to identify the physical registers corresponding to those multiple logical registers in the register renaming storage. Altered one-to-one mapping between multiple logical registers and identified physical registers is employed when performing the current data processing operation.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: December 1, 2015
    Assignee: ARM Limited
    Inventors: Jean-Baptiste Brelot, Cédric Denis Robert Airaud
  • Patent number: 9170962
    Abstract: A method, system and processing device for retiring data entries held within a store queue (STQ). The STQ of a processor cache is modified to receive and process several types of data entries including: non-synchronized (non-sync), thread of execution synchronized (thread-sync), and all thread of execution synchronized (all-thread-sync). The task of storing data entries, from the STQ out to memory or an input/output device, is modified to increase the effectiveness of the cache. The modified STQ allows non-sync, thread-sync, and all-thread-sync instructions to coexist in the STQ regardless of the thread of execution. Stored data entries, or stores are deterministically selected for retirement, according to the data entry type.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: October 27, 2015
    Assignee: International Business Machines Corporation
    Inventor: Eric F. Robinson
  • Patent number: 9164769
    Abstract: A reconfigurable array is provided. The reconfigurable array includes a Very Long Instruction Word (VLIW) mode and a Coarse-Grained Array (CGA) mode. When the VLIW mode is converted to the CGA mode, instead of sharing a central register file between the VLIW mode and the CGA mode, live data to be used in the CGA mode is copied from the central register file to local register files.
    Type: Grant
    Filed: December 8, 2010
    Date of Patent: October 20, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Won-Sub Kim, Tai-Song Jin, Dong-Hoon Yoo, Bernhard Egger, Jin-Seok Lee
  • Patent number: 9164690
    Abstract: A system, method, and computer program product are provided for copying data between memory locations. In use, a memory copy instruction is implemented. Additionally, data is copied from a first memory location to a second memory location, utilizing the memory copy instruction.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: October 20, 2015
    Assignee: NVIDIA Corporation
    Inventors: Brucek Kurdo Khailany, Sean Jeffrey Treichler
  • Patent number: 9122487
    Abstract: A system and method for balancing instruction loads between multiple execution units are disclosed. One or more execution units may be represented by a slot configured to accept instructions on behalf of the execution unit(s). A decode unit may assign instructions to a particular slot for subsequent scheduling for execution. Slot assignments may be made based on an instruction's type and/or on a history of previous slot assignments. A cumulative slot assignment history may be maintained in a bias counter, the value of which reflects the bias of previous slot assignments. Slot assignments may be determined based on the value of the bias counter, in order to balance the instruction load across all slots, and all execution units. The bias counter may reflect slot assignments made only within a desired historical window. A separate data structure may store data reflecting the actual slot assignments made during the desired historical window.
    Type: Grant
    Filed: June 23, 2009
    Date of Patent: September 1, 2015
    Assignee: Oracle America, Inc.
    Inventors: Robert T. Golla, Gregory F. Grohoski
  • Patent number: 9104991
    Abstract: A system assesses one or more applications for retirement. The system includes a processing device configured for receiving attribute data corresponding to one or more of a plurality of applications. The processing device is further configured for determining one or more of the plurality of applications to assess for retirement, translating at least some of the received attribute data into two or more translated values based at least in part on one or more predetermined values, and summing two or more of the translated values, thereby resulting in one or more combined values. The processing device is further configured for calculating one or more cumulative values based at least in part on the one or more combined values and converting the one or more cumulative values, thereby resulting in one or more probability values each indicating the probability of retirement of one of the one or more applications.
    Type: Grant
    Filed: July 30, 2010
    Date of Patent: August 11, 2015
    Assignee: Bank of America Corporation
    Inventors: Erin Kristin Collins, Marianna E. Chandler, Patty M. Curtner, Sherrill Jean Massingham, Joni DeVoe McKeen, Darryl Alan Sansbury, Siroos Shahnizadeh, Anthony Simoes, Rajaraman Viswanathan
  • Patent number: 9087000
    Abstract: According to an embodiment of the invention, a method for operating a data processing machine is described in which data about a state of the machine is written to a location in storage. The location is one that is accessible to software that may be written for the machine. The state data as written is encoded. This state data may be recovered from the storage according to a decoding process. Other embodiments are also described and claimed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 21, 2015
    Assignee: Intel Corporation
    Inventors: Scott H. Robinson, Gustavo P. Espinosa, Steven M. Bennett
  • Patent number: 9086887
    Abstract: The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The control stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.
    Type: Grant
    Filed: September 9, 2011
    Date of Patent: July 21, 2015
    Assignee: TEXAS INSTRUMENT INCORPORATED
    Inventors: Marko Krüger, Steven Bartling, Markus Kösler
  • Patent number: 9081581
    Abstract: An out-of-order processor 4 groups program instructions together to control their commitment to complete processing. If an instruction within a group has a source operand dependent upon a plurality of destination operands of other instructions then this is identified as a size mismatch hazard. When the program instruction having the size mismatch hazard reaches a commit point within the processor, then it is flushed together with any speculatively executed succeeding program instructions. Furthermore, the group of program instructions containing the program instruction containing the program instruction having the size mismatch is divided into a plurality of groups of program instructions each containing a single program instruction which are then replayed through the processing mechanisms.
    Type: Grant
    Filed: November 16, 2010
    Date of Patent: July 14, 2015
    Assignee: ARM Limited
    Inventors: James Nolan Hardage, Conrado Blasco Allue, Glen Andrew Harris
  • Patent number: 9003225
    Abstract: A processor includes a store queue that stores information representing store instructions. In response to retirement of a store instruction, the processor invalidates the corresponding entry in the store queue, thereby indicating that the entry is available to store a subsequent store instruction. The store address is not removed from the queue until the subsequent store instruction is stored. Accordingly, the store address is available for comparison to a dependent load address.
    Type: Grant
    Filed: October 17, 2012
    Date of Patent: April 7, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew A. Rafacz, Matthew M. Crum, Michael E. Tuuk
  • Publication number: 20150089202
    Abstract: A system, method, and computer program product are provided for implementing a multi-cycle register file bypass mechanism. The method includes the steps of receiving a set of control bits, combining the set of control bits with a set of valid bits associated with previously issued instructions, and enabling a bypass path for each thread based on the set of control bits and the set of valid bits. Each valid bit in the set of valid bits indicates whether execution of an instruction of the previously issued instructions was enabled for a thread in a thread block.
    Type: Application
    Filed: September 26, 2013
    Publication date: March 26, 2015
    Applicant: NVIDIA Corporation
    Inventors: Xiaogang Qiu, Ian Chi Yan Kwong, Ming Yiu Siu, Jack H. Choquette, Michael Alan Fetterman
  • Patent number: 8959314
    Abstract: A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
    Type: Grant
    Filed: July 15, 2013
    Date of Patent: February 17, 2015
    Assignee: Intel Corporation
    Inventors: Salvador Palanca, Stephen Fischer, Subramaniam Maiyuran, Shekoufeh Qawami
  • Publication number: 20150039862
    Abstract: A technique for operating a processor includes storing a first result to a writeback buffer, in response to a first execution unit of the processor attempting to write the first result of a first completed instruction to a register file of the processor at a same processor time as a second execution unit of the processor is attempting to write a second result of a second completed instruction to the register file. The writeback buffer is positioned in a dataflow between the first execution unit and the register file. A buffer full indicator logic is used to detect that the writeback buffer is unavailable. A buffer unavailable signal is transmitted, from the buffer full indicator logic, in response to detecting the writeback buffer is unavailable. In response to receiving the buffer unavailable signal, a buffer retrieving logic writes the first result from the writeback buffer to the register file.
    Type: Application
    Filed: July 31, 2014
    Publication date: February 5, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: HARRY BAROWSKI, TIM NIGGERMEIER
  • Patent number: 8943273
    Abstract: Aspects of the disclosure provide methods for cache efficiency. A method for cache efficiency can include storing data in a buffer entry in association with a cache array in response to a first store instruction that hits the cache array before the first store instruction is committed. Further, when a dependent load instruction is subsequent to the first store instruction, the method can include providing the data from the buffer entry in response to the first dependent load instruction. When a second store instruction overlaps an address of the first store instruction, the method can include coalescing data of the second store instruction in the buffer entry before the second store instruction is committed. When the second store instruction is followed by a second dependent load instruction, the method can include providing the coalesced data from the buffer entry in response to the second dependent load instruction.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: January 27, 2015
    Assignee: Marvell International Ltd.
    Inventors: Sujat Jamil, R. Frank O'Bleness, Russell Robideau, Tom Hameenanttila, Joseph Delgross, David Miner
  • Publication number: 20150012730
    Abstract: A processor and instruction graduation unit for a processor. In one embodiment, a processor or instruction graduation unit according to the present invention includes a linked-list-based multi-threaded graduation buffer and a graduation controller. The graduation buffer stores identification values generated by an instruction decode and dispatch unit of the processor as part of one or more linked-list data structures. Each linked-list data structure formed is associated with a particular program thread running on the processor. The number of linked-list data structures formed is variable and related to the number of program threads running on the processor. The graduation controller includes linked-list head identification registers and linked-list tail identification registers that facilitate reading and writing identifications values to linked-list data structures associated with particular program threads.
    Type: Application
    Filed: September 23, 2014
    Publication date: January 8, 2015
    Inventor: Kjeld Svendsen
  • Patent number: 8930680
    Abstract: A method, system and process for retiring data entries held within a store queue (STQ). The STQ of a processor cache is modified to receive and process multiple synchronized groups (sync-groups). Sync groups comprise thread of execution synchronized (thread-sync) entries, all thread of execution synchronized (all-thread-sync) entries, and regular store entries (non-thread-sync and non-all-thread-sync). The task of storing data entries, from the STQ out to memory or an input/output device, is modified to increase the effectiveness of the cache. Sync-groups are created for each thread and tracked within the STQ via a synchronized identification (SID). An entry is eligible for retirement when the entry is within a currently retiring sync-group as identified by the SID.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventor: Eric F. Robinson
  • Patent number: 8918626
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: December 23, 2014
    Assignee: Oracle International Corporation
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Patent number: 8874880
    Abstract: Instructions are tracked in a processor. A completion unit in the processor receives an instruction group to add to a table to form a received instruction group. In response to receiving the received instruction group, the completion unit determines whether an entry is present that contains a previously stored instruction group in a first location and has space for storing the received instruction group. In response to the entry being present, the completion unit stores the received instruction group in a second location in the entry to form a stored instruction group.
    Type: Grant
    Filed: August 26, 2013
    Date of Patent: October 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Christopher M. Abernathy, Hung Q. Le, Dung Q. Nguyen, Benjamin W. Stolt
  • Patent number: 8874879
    Abstract: A vector processing circuit includes a vector register file including a plurality of array elements, a command issuance control circuit, and a plurality of pipeline arithmetic units. Each pipeline arithmetic unit performs arithmetic processing of data stored in the array elements indicated as a source by one command in parts through a plurality of cycles and stores the result in the array elements indicated as a destination by the one command through a plurality of cycles. When data word length of a preceding command is longer than that of a subsequent command, the command issuance control circuit changes data sizes of the array elements in accordance with data word length of the command and determines whether there is register interference between the array element to be processed at a non-head cycle of the preceding command, and the array element to be processed at a head cycle of the subsequent command.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: October 28, 2014
    Assignee: Fujitsu Limited
    Inventors: Yi Ge, Yoshimasa Takebe, Hiromasa Takahashi
  • Patent number: 8769247
    Abstract: Methods and apparatuses are provided for increased efficiency in a processor via early instruction completion. An apparatus is provided for increased efficiency in a processor via early instruction completion. The apparatus comprises an execution unit for processing instructions and determining whether a later issued instruction is ready for completion or an earlier issued instruction is ready for completion and a retire unit for retiring the later issued instruction when the later instruction is ready for completion or to retire the earlier instruction when later instruction is not ready for completion and the earlier issued instruction has a known good completion status. A method is provided for increased efficiency in a processor via early instruction completion. The method comprises completing an earlier issued instruction having a known good completion status ahead of a later issued instruction when the later issued instruction is not ready for completion.
    Type: Grant
    Filed: April 15, 2011
    Date of Patent: July 1, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael D Estlick, Kevin Hurd, Jay Fleischman
  • Patent number: 8683178
    Abstract: The described embodiments provide a processor that executes vector instructions. In the described embodiments, the processor initializes an architectural fault-status register (FSR) and a shadow copy of the architectural FSR by setting each of N bit positions in the architectural FSR and the shadow copy of the architectural FSR to a first predetermined value. The processor then executes a first first-faulting or non-faulting (FF/NF) vector instruction. While executing the first vector instruction, the processor also executes one or more subsequent FF/NF instructions. In these embodiments, when executing the first vector instruction and the subsequent vector instructions, the processor updates one or more bit positions in the shadow copy of the architectural FSR to a second predetermined value upon encountering a fault condition.
    Type: Grant
    Filed: April 20, 2011
    Date of Patent: March 25, 2014
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 8677101
    Abstract: A processor system executes multiple applet programs within a software application program in an information handling system. The information handling system includes operating system software that manages processor system hardware and software in a multi-tasking environment. In particular, the operating system software manages partitioning of a register file in the processor system to achieve a cooperative relationship among multiple applet programs within respective partitions of the register file. In one embodiment, the operating system software manages unique applet ID's to modify register file partition sizes and locations during applet program instruction text execution. In one embodiment, applet ID masking hardware provides sharing of register file space among multiple copies of applet program code.
    Type: Grant
    Filed: June 7, 2007
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Brian Flachs, Harm Peter Hofstee, Brad William Michael
  • Publication number: 20140075159
    Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new context-switched threads. Context switching is used to hide the latency of both memory-access operations (i.e., loads and stores) and arithmetic/logical operations. When an operation executing in a thread incurs a latency having the potential to delay the instruction pipeline, the latency is hidden by performing a context switch to a different thread. When the result of the operation becomes available, a context switch back to that thread is performed to allow the thread to continue.
    Type: Application
    Filed: July 12, 2011
    Publication date: March 13, 2014
    Applicant: International Business Machines Corporation
    Inventors: Matteo Frigo, Ahmed Gheith, Volker Strumpen
  • Patent number: 8621183
    Abstract: A system and method are disclosed wherein a processor of a plurality of processors coupled to shared memory, is configured to initiate execution of a section of code according to a first transactional mode of the processor. The processor is configured to execute a plurality of protected memory access operations to the shared memory within the section of code as a single atomic transaction with respect to the plurality of processors. The processor is further configured to initiate, within the section of code, execution of a subsection of the section of code according to a second transactional mode of the processor, wherein the first and second transactional modes are each associated with respective recovery actions that the processor is configured to perform in response to detecting an abort condition.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: December 31, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael P. Hohmuth, David S. Christie, Stephan Diestelhorst
  • Patent number: 8583901
    Abstract: Embodiments of a processor architecture utilizing multi-bank implementation of physical register mapping table are provided. A register renaming system to correlate architectural registers to physical registers includes a physical register mapping table and a renaming logic. The physical register mapping table has a plurality of entries each indicative of a state of a respective physical register. The mapping table has a plurality of non-overlapping sections each of which having respective entries of the mapping table. The renaming logic is coupled to search a number of the sections of the mapping table in parallel to identify entries that indicate the respective physical registers have a first state. The renaming logic selectively correlates each of a plurality of architectural registers to a respective physical register identified as being in the first state. Methods of utilizing the multi-bank implementation of physical register mapping table are also provided.
    Type: Grant
    Filed: February 4, 2010
    Date of Patent: November 12, 2013
    Assignee: STMicroelectronics (Beijing) R&D Co. Ltd.
    Inventors: Peng Fei Zhu, Hong-Xia Sun, Yong Qiang Wu
  • Patent number: 8521982
    Abstract: A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry in allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, CIU may issue stored load requests in a specific priority order.
    Type: Grant
    Filed: April 15, 2009
    Date of Patent: August 27, 2013
    Assignee: International Business Machines Corporation
    Inventors: Robert A. Cargnoni, Guy L. Guthrie, Thomas L. Jeremiah, Stephen J. Powell, William J. Starke, Jeffrey A. Steucheli
  • Patent number: 8516228
    Abstract: A computer processing system is provided. The computer processing system includes a first datastore that stores a subset of information associated with an instruction. A first stage of a processor pipeline writes the subset of information to the first datastore based on an execution of an operation associated with the instruction. A second stage of the pipeline initiates reprocessing of the operation associated with the instruction based on the subset of information stored in the first datastore.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: August 20, 2013
    Assignee: International Business Machines Corporation
    Inventors: Khary J. Alexander, Michael Billeci, Fadi Y. Busaba, Bruce C. Giamei
  • Publication number: 20130173887
    Abstract: In a method of simulating a processor system by running code that simulates the system on a host processor, code is translated at run time to a form required by the host processor. All instructions are mapped to a native instruction set of the host using two or more different code dictionaries: the translated instructions are mapped to multiple and different dictionaries dependent on the execution privilege level or mode of the simulated processor. If an instruction is encountered during runtime that changes the mode of the processor the code dictionary is switched to use the dictionary associated with the new mode. The different modes require different instruction mappings to the native instruction set of the host using different models that more accurately represent the behaviour of the system code and hardware in the system being simulated.
    Type: Application
    Filed: February 19, 2013
    Publication date: July 4, 2013
    Applicant: Imperas Software Ltd.
    Inventor: Imperas Software Ltd.
  • Patent number: 8464033
    Abstract: Methods and systems thereof for exception handling are described. An event to be handled is identified during execution of a code sequence. A bit is set to indicate that handling of the event is to be deferred. An exception corresponding to the event is generated if the bit is set.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: June 11, 2013
    Inventors: Guillermo J. Rozas, Alexander Klaiber
  • Patent number: 8433884
    Abstract: A multiprocessor executes a plurality of threads without decreasing execution efficiency. The multiprocessor includes a first processor allocating a different register file to each of a predetermined number of threads to be executed from among plural threads, and executing the predetermined number of threads in parallel; and a second processor performing processing according to a processing request made by the first processor. The first processor has areas allocated to the plurality of threads in one-to-one correspondence, makes the processing request to the second processor according to an instruction included in one of the predetermined number of threads, upon receiving a request for writing a value resulting from the processing from the second processor, judges whether the one thread is being executed, and when judging negatively, performs control such that the obtained value is written into one of the areas allocated to the one thread.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: April 30, 2013
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Morishita
  • Patent number: 8429386
    Abstract: Various techniques for dynamically allocating instruction tags and using those tags are disclosed. These techniques may apply to processors supporting out-of-order execution and to architectures that supports multiple threads. A group of instructions may be assigned a tag value from a pool of available tag values. A tag value may be usable to determine the program order of a group of instructions relative to other instructions in a thread. After the group of instructions has been (or is about to be) committed, the tag value may be freed so that it can be re-used on a second group of instructions. Tag values are dynamically allocated between threads; accordingly, a particular tag value or range of tag values is not dedicated to a particular thread.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: April 23, 2013
    Assignee: Oracle America, Inc.
    Inventors: Paul J. Jordan, Robert T. Golla, Jama I. Barreh
  • Patent number: 8386753
    Abstract: A mechanism is provided for thread completion arbitration. The mechanism comprises executing more than two threads of instructions simultaneously in the processor, selecting a first thread from a first subset of threads, in the more than two threads, for completion of execution within the processor, and selecting a second thread from a second subset of threads, in the more than two threads, for completion of execution within the processor. The mechanism further comprises completing execution of the first and second threads by committing results of the execution of the first and second threads to a storage device associated with the processor. At least one of the first subset of threads or the second subset of threads comprise two or more threads from the more than two threads. The first subset of threads and second subset of threads have different threads from one another.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: February 26, 2013
    Assignee: International Business Machines Corporation
    Inventors: Susan E. Eisen, Dung Q. Nguyen, Balaram Sinharoy, Benjamin W. Stolt
  • Patent number: 8370576
    Abstract: An embodiment of the present invention includes a circuit for tracking memory operations with trace-based execution. Each trace includes a sequence of operations that includes zero or more of the memory operations. At least some of the active memory operations access the memory in an execution order that is different from the program order. The circuit includes a first memory that caches data accessed by the memory operations. This memory is partitioned into N banks. Checkpoint entries, which are stored in a second memory also partitioned into N banks, are associated with each trace. Each entry refers to a checkpoint location in the first memory. A sub-circuit receives rollback requests and responds by overwriting checkpoint locations. Each of the N memory units consisting of a bank in the first memory and the corresponding bank in the second memory may be rolled back independently and concurrently with other memory units.
    Type: Grant
    Filed: February 13, 2008
    Date of Patent: February 5, 2013
    Assignee: Oracle America, Inc.
    Inventors: John Gregory Favor, Paul G. Chan, Graham Ricketson Murphy, Joseph Byron Rowlands
  • Patent number: 8356185
    Abstract: A processor may include a hardware instruction fetch unit configured to issue instructions for execution, and a hardware functional unit configured to receive instructions for execution, where the instructions include cryptographic instruction(s) and non-cryptographic instruction(s). The functional unit may include a cryptographic execution pipeline configured to execute the cryptographic instructions with a corresponding cryptographic execution latency, and a non-cryptographic execution pipeline configured to execute the non-cryptographic instructions with a corresponding non-cryptographic execution latency that is longer than the cryptographic execution latency.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: January 15, 2013
    Assignee: Oracle America, Inc.
    Inventors: Christopher H. Olson, Gregory F. Grohoski, Robert T. Golla
  • Patent number: 8284772
    Abstract: A method is provided for scheduling a network packet processor. A textual language specification is input of the processing of network packets by the network packet processor. The textual language specification includes memory read actions and modification actions. Each memory read action reads a stored value from a memory of the network packet processor. Each modification action modifies a field of the network packets. An availability is determined for each field read from the network packets for the memory read and modification actions. An availability is determined for each stored value read from the memory for the memory read actions. A look-ahead interval is determined from the availabilities. A respective storage class is determined for the fields for the memory read and modification actions. The respective storage class is one of a bus, a register, and a register with bypass.
    Type: Grant
    Filed: May 3, 2007
    Date of Patent: October 9, 2012
    Assignee: XILINX, Inc.
    Inventors: Philip B. James-Roxby, Eric R. Keller
  • Patent number: 8271766
    Abstract: An information processing device including registers (105) for holding data and an operation device (102) for executing arithmetic and logic operations on input/output data held in the register. The information processing device can issue an inter-register copy instruction for instructing data held in one register to be copied to another register. The information processing device further includes a copy information holding device (113) for reserving for execution of a data copy operation by the inter-register copy instruction from a control unit (108) so as to execute the actual copy operation simultaneously with the succeeding instruction to hide the execution time of the copy operation. Thus, in the inter-register copy instruction execution phase, a reservation for a data copy operation is stored in the copy information holding device so that the execution phase is completed without performing the actual data copy operation.
    Type: Grant
    Filed: May 18, 2006
    Date of Patent: September 18, 2012
    Assignee: NEC Corporation
    Inventor: Noritaka Hoshi
  • Patent number: 8266411
    Abstract: Instead of having a processor with an instruction set architecture (ISA) that includes fixed architected operands, an improved processor supports additional characteristic bits for computing instructions (e.g., a multiply-add, load/store instructions). Such additional bits for the certain instructions influence the processing of these instructions by the processor. Also, a new instruction is introduced for further usage of the proposed method. Typically these additional characteristic bits as well as the instruction can be automatically generated by compilers to provide relatively well-suited instruction sequences for the processor.
    Type: Grant
    Filed: February 5, 2009
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Tobias Gemmeke, Markus Kaltenbach, Nicolas Maeding
  • Patent number: 8214602
    Abstract: In one embodiment, a processor comprises a data cache and a load/store unit (LSU). The LSU comprises a queue and a control unit, and each entry in the queue is assigned to a different load that has accessed the data cache but has not retired. The control unit is configured to update the data cache hit status of each load represented in the queue as a content of the data cache changes. The control unit is configured to detect a snoop hit on a first load in a first entry of the queue responsive to: the snoop index matching a load index stored in the first entry, the data cache hit status of the first load indicating hit, the data cache detecting a snoop hit for the snoop operation, and a load way stored in the first entry matching a first way of the data cache in which the snoop operation is a hit.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: July 3, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashutosh S. Dhodapkar, Michael G. Butler
  • Patent number: 8209519
    Abstract: A pipeline processor has a first stage to read data from a general purpose register unit, a second stage to execute instruction, and a third stage to write back the data into the general purpose register unit. A pipeline register retains data obtained by executing the second stage. The pipeline register stores a data validity flag. A WRITE suspension unit suspends execution of writing the retained data into a general purpose register of the general purpose register unit until the retained data is rewritten by a subsequent instruction, even if the data validity flag indicates “valid.” A data invalidation unit cancels the execution of writing the data retained in the pipeline register into the general purpose register into which the data is to be written by a preceding instruction and invalidates the retained data, when data is written into the general purpose register by the subsequent instruction.
    Type: Grant
    Filed: July 21, 2011
    Date of Patent: June 26, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Jun Tanabe
  • Patent number: 8209518
    Abstract: A processing bypass directory system and method are disclosed. In one embodiment, a bypass directory tracking process includes setting bits in a bypass directory when a corresponding architectural register is written. The bits are selectively cleared in the bypass directory each cycle. The configuration of the bits is utilized to determine which stage of a bypass path processing information is at.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: June 26, 2012
    Inventors: Alexander Klaiber, Guillermo Rozas
  • Publication number: 20120159217
    Abstract: A method and apparatus are described for reducing power consumption in a processor. A micro-operation is selected for execution, and a destination physical register tag of the selected micro-operation is compared to a plurality of source physical register tags of micro-operations dependent upon the selected micro-operation. If there is a match between the destination physical register tag and one of the source physical register tags, a corresponding physical register file (PRF) read operation is disabled. The comparison may be performed by a wakeup content-addressable memory (CAM) of a scheduler. The wakeup CAM may send a read control signal to the PRF to disable the read operation. Disabling the corresponding PRF read operation may include shutting off power in the PRF and related logic.
    Type: Application
    Filed: December 16, 2010
    Publication date: June 21, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Ganesh Venkataramanan, Emil Talpes
  • Patent number: 8200945
    Abstract: A microprocessor includes a branch unit, a load/store unit (LSU), an arithmetic logic unit (ALU), and a vector unit to execute a vector instruction. The vector unit includes a vector register file having a primary vector register and a secondary vector register. The processor preferably further includes a first data bus and a second data bus wherein the first and second data busses couple the vector unit to the data memory. The vector unit includes a first input multiplexer enabling data on the first data bus to be provided to the primary register file or the secondary register file and a second input multiplexer, independent of the first input multiplexer enabling data on the second data bus to be provided to the second data bus. The first and second data busses may comprise first and second portions of a data memory bus.
    Type: Grant
    Filed: November 7, 2003
    Date of Patent: June 12, 2012
    Assignee: International Business Machines Corporation
    Inventors: Siddhartha Chatterjee, Kenneth Dockser, Fred Gehrung Gustayson, Manish Gupta
  • Patent number: 8179540
    Abstract: An image forming apparatus is provided that holds counter information obtained by integrating a consumption of a consumable that depends on usage of service provided by the image forming apparatus. A log corresponding to the usage of the service is set in job log information with a synchronization flag set off. The log in the job log information, for which the synchronization flag is set off, is set on. The counter information and the job log information are output after the synchronization flag for the log having the synchronization flag set off has been set on.
    Type: Grant
    Filed: October 29, 2008
    Date of Patent: May 15, 2012
    Assignee: Canon Kabushiki Kaisha
    Inventors: Junichi Hiruma, Nobuyuki Tonegawa
  • Patent number: 8145874
    Abstract: In an embodiment, a method is disclosed that includes, comparing, during a write back stage at an execution unit, a write identifier associated with a result to be written to a register file from execution of a first instruction to a read identifier associated with a second instruction at an execution pipeline within an interleaved multi-threaded (IMT) processor having multiple execution units. When the write identifier matches the read identifier, the method further includes storing the result at a local memory of the execution unit for use by the execution unit in the subsequent read stage.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: March 27, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Suresh Venkumahanti, Lucian Codrescu, Lin Wang