Commitment Control Or Register Bypass Patents (Class 712/218)
-
Patent number: 10277955
Abstract: A signal receiver chip may be configured to receive a satellite signal, and when the satellite signal is partially-processed off-chip, to bypass at least a portion of processing functions applied in the signal receiver chip during processing of satellite signals. The bypassed processing functions may comprise or correspond to signal band conversions. The satellite signal chip may generate an output signal, corresponding to the satellite signal, with the output signal being configured for communication to a peer device (e.g., satellite STB). The output signal may be generated and/or configured so as to enable distributing content carried in the output signal to a plurality of client devices in a local network serviced by the peer device. The signal receiver chip may combine a plurality of portions, corresponding to a plurality of satellite signals, into the output signal.
Type: Grant
Filed: October 13, 2015
Date of Patent: April 30, 2019
Assignee: MAXLINEAR, INC.
Inventors: Raja Pullela, Glenn Chang, Curtis Ling
-
Patent number: 10241797
Abstract: A method for reducing a number of operations replayed in a processor includes decoding an operation to determine a memory address and a command in the operation. If data is not in a way predictor based on the memory address, a suppress wakeup signal is sent to an operation scheduler, and the operation scheduler suppresses waking up other operations that are dependent on the data.
Type: Grant
Filed: July 17, 2012
Date of Patent: March 26, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Ganesh Venkataramanan, Mike Butler, Krishnan V. Ramani
-
Patent number: 10209995
Abstract: A processor core supporting out-of-order execution (OOE) includes load-hit-store (LHS) hazard prediction at the instruction execution phase, reducing load instruction rejections and queue flushes at the dispatch phase. The instruction dispatch unit (IDU) detects likely LHS hazards by generating entries for pending stores in an LHS detection table. The entries in the table contain an address field (generally the immediate field) of the store instruction and the register number of the store. The IDU compares the address field and register number for each load with entries in the table to determine if a likely LHS hazard exists, and if an LHS hazard is detected, the load is dispatched to the issue queue of the load-store unit (LSU) with a tag corresponding to the matching store instruction, causing the LSU to dispatch the load only after the corresponding store has been dispatched for execution.
Type: Grant
Filed: October 24, 2014
Date of Patent: February 19, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sundeep Chadha, Richard James Eickemeyer, John Barry Griswell, Jr., Dung Quoc Nguyen
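As a rough software illustration of the table lookup described in the abstract of 10209995, the sketch below records pending stores and checks each load against them. The names (`LhsTable`, `base_reg`, `imm`, `tag`) are hypothetical, treating the store's register number as a base register is an assumption, and so is returning the youngest matching store; the patent describes hardware, not this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreEntry:
    tag: int       # hypothetical identifier of the pending store
    base_reg: int  # register number recorded for the store (assumed: base register)
    imm: int       # address (immediate) field of the store


class LhsTable:
    """Toy model of a load-hit-store detection table."""

    def __init__(self):
        self.entries = []

    def record_store(self, tag, base_reg, imm):
        self.entries.append(StoreEntry(tag, base_reg, imm))

    def match_load(self, base_reg, imm):
        """Return the tag of the youngest matching store, or None if no hazard is likely."""
        for entry in reversed(self.entries):
            if entry.base_reg == base_reg and entry.imm == imm:
                return entry.tag
        return None


table = LhsTable()
table.record_store(tag=7, base_reg=3, imm=16)
hit = table.match_load(base_reg=3, imm=16)   # same register and immediate: likely hazard
miss = table.match_load(base_reg=3, imm=24)  # different immediate: no match
```

In this model a load that gets a tag back would be held until the tagged store has been dispatched; a load that gets `None` proceeds normally.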
-
Patent number: 10198298
Abstract: The technology disclosed improves existing streaming processing systems by allowing the ability to both scale up and scale down resources within an infrastructure of a stream processing system. In particular, the technology disclosed relates to a dispatch system for a stream processing system that adapts its behavior according to a computational capacity of the system based on a run-time evaluation. The technical solution includes, during run-time execution of a pipeline, comparing a count of available physical threads against a set number of logically parallel threads. When a count of available physical threads equals or exceeds the number of logically parallel threads, the solution includes concurrently processing the batches at the physical threads. Further, when there are fewer available physical threads than the number of logically parallel threads, the solution includes multiplexing the batches sequentially over the available physical threads.
Type: Grant
Filed: December 31, 2015
Date of Patent: February 5, 2019
Assignee: salesforce.com, inc.
Inventors: Elden Gregory Bishop, Jeffrey Chao
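The thread-count comparison in the abstract of 10198298 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented dispatcher: the function name `dispatch`, its parameters, and the use of a Python thread pool to stand in for physical threads are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(batches, process, available_threads, logical_threads):
    """Choose a dispatch mode by comparing thread counts, then process the batches.

    Assumes logical_threads >= 1.
    """
    if available_threads >= logical_threads:
        # Enough physical threads: every logical thread gets its own worker.
        mode, workers = "concurrent", logical_threads
    else:
        # Too few physical threads: batches queue up and share the workers that exist.
        mode, workers = "multiplexed", max(1, available_threads)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, batches))  # map() preserves batch order
    return mode, results


def double(x):
    return x * 2


enough = dispatch([1, 2, 3, 4], double, available_threads=4, logical_threads=4)
scarce = dispatch([1, 2, 3, 4], double, available_threads=2, logical_threads=4)
```

Both calls produce the same results; only the degree of parallelism differs, which is the point of adapting at run time.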
-
Patent number: 10152396
Abstract: A method, apparatus, and system for a time-based checkpoint target is provided for standby databases. Change records received from a primary database are applied for a standby database, creating dirty buffer queues. As the change records are applied, a mapping is maintained, which maps timestamps to logical times of change records that were most recently applied at the timestamp for the standby database. On a periodic dirty buffer queue processing interval, the mapping is used to determine a target logical time that is mapped to a target timestamp that is prior to a present timestamp by at least a checkpoint delay. The dirty buffer queues are then processed up to the target logical time, creating an incremental checkpoint. On a periodic header update interval, file headers reflecting a consistent logical time for the checkpoint are also updated. The intervals and the checkpoint delay are adjustable by user or application.
Type: Grant
Filed: May 5, 2014
Date of Patent: December 11, 2018
Assignee: Oracle International Corporation
Inventors: Jonghyun Lee, Yunrui Li, Mahesh Baburao Girkar, Amrish Srivastava
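To make the timestamp-to-logical-time lookup in the abstract of 10152396 concrete, here is a small sketch. It assumes timestamps are recorded in ascending order and uses hypothetical names (`CheckpointMap`, `record`, `target`); it models only the lookup, not the dirty buffer queues or file headers.

```python
import bisect


class CheckpointMap:
    """Toy mapping from wall-clock timestamps to the logical time last applied."""

    def __init__(self):
        self.timestamps = []     # ascending timestamps (assumed monotonic)
        self.logical_times = []  # logical time most recently applied at each timestamp

    def record(self, timestamp, logical_time):
        self.timestamps.append(timestamp)
        self.logical_times.append(logical_time)

    def target(self, now, delay):
        """Logical time of the newest entry that is at least `delay` older than `now`."""
        idx = bisect.bisect_right(self.timestamps, now - delay)
        return self.logical_times[idx - 1] if idx else None


mapping = CheckpointMap()
for ts, lt in [(100, 5000), (110, 5200), (120, 5450)]:
    mapping.record(ts, lt)

target = mapping.target(now=125, delay=10)     # newest entry at or before t=115
too_early = mapping.target(now=105, delay=10)  # nothing is old enough yet
```

The dirty buffer queues would then be processed up to `target`, so the checkpoint always trails the present by at least the configured delay.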
-
Patent number: 10133582
Abstract: A processor includes a first logic to execute an instruction stream out-of-order, the instruction stream divided into a plurality of strands, the instruction stream and each strand ordered by program order (PO). The processor also includes a second logic to determine an oldest undispatched instruction in the instruction stream and store an associated PO value of the oldest undispatched instruction as an executed instruction pointer. The instruction stream includes dispatched and undispatched instructions. The processor also includes a third logic to determine a most recently retired instruction in the instruction stream and store an associated PO value of the most recently retired instruction as a retirement pointer, a fourth logic to select a range of instructions between the retirement pointer and the executed instruction pointer, and a fifth logic to identify the range of instructions as eligible for retirement.
Type: Grant
Filed: December 23, 2013
Date of Patent: November 20, 2018
Assignee: Intel Corporation
Inventors: Nikolay Kosarev, Sergey Y. Shishlov, Jayesh Iyer, Alexander V. Butuzov, Boris A. Babayan, Andrey Kluchnikov
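The two pointers in the abstract of 10133582 bound a window of program-order (PO) values. The sketch below computes that window for a flat list of instructions; the function name, the tuple layout, and treating the bounds as exclusive are assumptions made for illustration, and strands are not modeled.

```python
def eligible_for_retirement(instructions, retirement_ptr):
    """Return PO values strictly between the retirement and executed instruction pointers.

    instructions: iterable of (po, dispatched) pairs.
    retirement_ptr: PO of the most recently retired instruction.
    """
    undispatched = [po for po, dispatched in instructions if not dispatched]
    # Executed instruction pointer: PO of the oldest undispatched instruction.
    executed_ptr = min(undispatched) if undispatched else float("inf")
    return sorted(po for po, _ in instructions if retirement_ptr < po < executed_ptr)


stream = [(1, True), (2, True), (3, True), (4, False), (5, True)]
window = eligible_for_retirement(stream, retirement_ptr=1)
```

Instruction 5 has been dispatched but sits behind the undispatched instruction 4, so it falls outside the window.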
-
Patent number: 10127074
Abstract: Various embodiments include methods and apparatus structured to provide synchronization of a transaction identification between a host and a memory module using a parity check. A transaction identification can be generated at both the host and the memory module independently using incremental counters of these apparatus. Synchronization of the transaction identifications generated by the host and by a controller of the memory module can be implemented using a parity bit sequences pattern of a combination of the generated transaction identification plus the corresponding transaction command and data address. Use of transaction commands modified with respect to transaction identifications can be used in initialization of the synchronization, in message passing, and in error detection and response to errors. Additional apparatus, systems, and methods can be implemented in a variety of applications.
Type: Grant
Filed: January 27, 2017
Date of Patent: November 13, 2018
Assignee: Futurewei Technologies, Inc.
Inventors: Xiaobing Lee, Feng Yang, Shaojie Chen
-
Patent number: 10108420
Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory, where the specified load instruction requires more than a first number of clock cycles to retrieve the operand. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.
Type: Grant
Filed: November 24, 2015
Date of Patent: October 23, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
-
Patent number: 10108429
Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.
Type: Grant
Filed: December 14, 2014
Date of Patent: October 23, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
-
Patent number: 10108426
Abstract: Embodiments include issuing dynamic issue masks for processor hang prevention. Aspects include storing an instruction in an issue queue for execution by an execution unit, the instruction including a default issue mask. Aspects further include determining whether the instruction in the issue queue is likely to be rescinded by the execution unit. Based on determining that the instruction is not likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with the default issue mask. Based on determining that the instruction is likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with a likely to be rescinded issue mask.
Type: Grant
Filed: September 1, 2015
Date of Patent: October 23, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gregory W. Alexander, Steven R. Carlough, Lee E. Eisen, David A. Schroter
-
Patent number: 10108425
Abstract: A computing device reorders an iteratively executed sequence of instructions such that constituent instructions that require longer execution time than other constituent instructions are grouped together. The computing device inserts, within the iteratively executed sequence of instructions, one or more additional instructions to enable parallel execution of two or more instances of the sequence of instructions such that the constituent instructions that require longer execution time will be executed concurrently with the other constituent instructions.
Type: Grant
Filed: October 25, 2016
Date of Patent: October 23, 2018
Assignee: Superpowered Inc.
Inventors: Gabor Szanto, Alexander Patrick Vlaskovits
-
Patent number: 10108421
Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.
Type: Grant
Filed: November 24, 2015
Date of Patent: October 23, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
-
Patent number: 10108428
Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory, where the specified load instruction requires more than a first number of clock cycles to retrieve the operand. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.
Type: Grant
Filed: December 14, 2014
Date of Patent: October 23, 2018
Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD
Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
-
Patent number: 10102158
Abstract: Methods and apparatus relating to the transfer of data for processing and/or the transfer of the resulting processed data are described. Some features relate to a processing system which performs data transfers under control of a Dynamic Sequence Controller (DSC). In various embodiments a sequence of operational codes is used to control data transfer with the status of data source and destination locations taken into consideration. Modification of the op code sequence used to control the dynamic sequence controller and thus the transfer of data can be performed asynchronously to control of processing units which can be controlled via a command and control bus used to control the function of operators which process the data provided via the data bus.
Type: Grant
Filed: December 31, 2014
Date of Patent: October 16, 2018
Assignee: Accusoft Corporation
Inventor: Robert M Nally
-
Patent number: 10102142
Abstract: A method for detecting an instruction ordering violation in a CPU. The method includes receiving a reordered stream of instructions and detecting whether an ordering violation has occurred by using virtual addresses. The method further includes transferring results of the reordered stream of instructions from a load store buffer into a cache and detecting whether an ordering violation has occurred by using physical addresses. Subsequently, a recovery is initiated upon detection of an ordering violation.
Type: Grant
Filed: December 26, 2012
Date of Patent: October 16, 2018
Assignee: Nvidia Corporation
Inventors: Guillermo J. Rozas, Bharath Krishnan, James Van Zoeren
-
Patent number: 10102002
Abstract: Embodiments include issuing dynamic issue masks for processor hang prevention. Aspects include storing an instruction in an issue queue for execution by an execution unit, the instruction including a default issue mask. Aspects further include determining whether the instruction in the issue queue is likely to be rescinded by the execution unit. Based on determining that the instruction is not likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with the default issue mask. Based on determining that the instruction is likely to be rescinded by the execution unit, aspects include issuing the instruction to the execution unit with a likely to be rescinded issue mask.
Type: Grant
Filed: September 30, 2014
Date of Patent: October 16, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Gregory W. Alexander, Steven R. Carlough, Lee E. Eisen, David A. Schroter
-
Patent number: 10095647
Abstract: An accelerated processor structure on a programmable integrated circuit device includes a processor and a plurality of configurable digital signal processors (DSPs). Each configurable DSP includes a circuit block, which in turn includes a plurality of multipliers. The accelerated processor structure further includes a first bus to transfer data from the processor to the configurable DSPs, and a second bus to transfer data from the configurable DSPs to the processor.
Type: Grant
Filed: May 29, 2015
Date of Patent: October 9, 2018
Assignee: Altera Corporation
Inventors: David Shippy, Martin Langhammer, Jeffrey Eastlack
-
Patent number: 10083035
Abstract: A streaming engine employed in a digital data processor specifies fixed first and second read only data streams. Corresponding stream address generators produce addresses of data elements of the two streams. Corresponding stream head registers store data elements next to be supplied to functional units for use as operands. The two streams share two memory ports. A toggling preference of stream to port ensures fair allocation. The arbiters permit one stream to borrow the other's interface when the other interface is idle. Thus one stream may issue two memory requests, one from each memory port, if the other stream is idle. This spreads the bandwidth demand for each stream across both interfaces, ensuring neither interface becomes a bottleneck.
Type: Grant
Filed: December 20, 2016
Date of Patent: September 25, 2018
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Joseph Zbiciak, Timothy Anderson
-
Patent number: 10025554
Abstract: An apparatus and a corresponding method for processing a sequence of received data items are disclosed. The processing is performed by multiple processing elements. A reorder buffer comprising multiple slots is used to maintain the order of the received data items, wherein a processing element reserves a next available slot in the reorder buffer before beginning processing the next data item of the sequence of received data items. On completion of the processing a buffer change indicator value is read by the processing element when seeking to insert the processed data item into the reserved slot. If the buffer change indicator changes during the course of the insertion process, this serves as an indication to the processing element that another processing element is modifying the content of the reorder buffer in parallel. A check may be repeated for at least one subsequent already-processed data item, since this latter data item may have become ready to be retired from the reorder buffer.
Type: Grant
Filed: September 19, 2016
Date of Patent: July 17, 2018
Assignee: ARM Limited
Inventor: Eric Ola Harald Liljedahl
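The reserve-then-insert discipline in the abstract of 10025554 can be illustrated with a single-threaded toy model. It deliberately omits the buffer change indicator and any concurrency between processing elements, so it shows only how reserved slots preserve arrival order; the class and method names are hypothetical.

```python
_PENDING = object()  # marks a reserved slot whose item is still being processed


class ReorderBuffer:
    """Single-threaded toy reorder buffer: reserve a slot, insert later, retire in order."""

    def __init__(self):
        self.slots = {}     # slot number -> processed item or _PENDING
        self.next_slot = 0  # next slot to hand out
        self.head = 0       # oldest slot not yet retired

    def reserve(self):
        slot = self.next_slot
        self.slots[slot] = _PENDING
        self.next_slot += 1
        return slot

    def insert(self, slot, item):
        """Store a processed item, then retire every ready item at the head."""
        self.slots[slot] = item
        retired = []
        while self.head in self.slots and self.slots[self.head] is not _PENDING:
            retired.append(self.slots.pop(self.head))
            self.head += 1
        return retired


rob = ReorderBuffer()
s0, s1, s2 = rob.reserve(), rob.reserve(), rob.reserve()
first = rob.insert(s1, "b")   # finished out of order: nothing can retire yet
second = rob.insert(s0, "a")  # head is now ready: "a" and "b" retire together
third = rob.insert(s2, "c")
```

The loop in `insert` corresponds loosely to the repeated check the abstract mentions: completing one item can make later, already-processed items ready to retire.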
-
Patent number: 9952901
Abstract: Described herein are technologies related to enforcing thread dependency using a hybrid scoreboard. Encoded video information that includes a plurality of threads is received, a first set and a second set of threads from the plurality of threads are determined, the first and second sets of threads are assigned to hardware and software, respectively, and thread dependency in the first and second sets of threads is enforced.
Type: Grant
Filed: December 9, 2014
Date of Patent: April 24, 2018
Assignee: Intel Corporation
Inventors: Haihua Wu, Julia A. Gould, Li-An Tang
-
Patent number: 9934039
Abstract: Methods of predicting stack pointer values of variables stored in a stack are described. When an instruction is seen which stores a variable in the stack in a position offset from the stack pointer, an entry is added to a data structure which identifies the physical register which currently stores the stack pointer, the physical register which stores the value of the variable and the offset value. Subsequently when an instruction to load a variable from the stack from a position which is identified by reference to the stack pointer is seen, the data structure is searched to see if there is a corresponding entry which includes the same offset and the same physical register storing the stack pointer as the load instruction. If a corresponding entry is found the architectural register in the load instruction is mapped to the physical register storing the value of the variable from the entry.
Type: Grant
Filed: January 16, 2015
Date of Patent: April 3, 2018
Assignee: MIPS Tech Limited
Inventor: Hugh Jackson
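The store-then-load matching in the abstract of 9934039 maps naturally onto a keyed lookup. The sketch below is an illustration under assumptions: the names (`StackStoreTable`, `on_store`, `on_load`) are hypothetical, registers are plain integers and strings, and entry eviction or invalidation is not modeled.

```python
class StackStoreTable:
    """Toy table keyed by (stack-pointer physical register, offset)."""

    def __init__(self):
        self.entries = {}  # (sp_phys, offset) -> physical register holding the stored value

    def on_store(self, sp_phys, offset, value_phys):
        # A store to [sp + offset] records where the stored value currently lives.
        self.entries[(sp_phys, offset)] = value_phys

    def on_load(self, sp_phys, offset, dest_arch, rename_map):
        """Map dest_arch to the stored value's physical register on a match.

        Returns True on a match, False otherwise (the load then executes normally).
        """
        value_phys = self.entries.get((sp_phys, offset))
        if value_phys is None:
            return False
        rename_map[dest_arch] = value_phys
        return True


table = StackStoreTable()
rename_map = {}
table.on_store(sp_phys=40, offset=8, value_phys=17)
hit = table.on_load(sp_phys=40, offset=8, dest_arch="r5", rename_map=rename_map)
miss = table.on_load(sp_phys=41, offset=8, dest_arch="r6", rename_map=rename_map)
```

Keying on the physical register that holds the stack pointer means that a changed stack pointer, which would be renamed to a different physical register, no longer matches older entries.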
-
Patent number: 9928067
Abstract: Systems and methods are provided in example embodiments for performing binary translation. A binary translation system converts, by a translator module, source instructions to target instructions. The binary translation system identifies a condition code block in the source instructions, where the condition code block includes a plurality of condition bits. In response to identifying the condition code block, the binary translation system provides an optimizer module to convert the condition code block. Then, the binary translation system performs a pre-execution on the condition code block to resolve the plurality of condition bits in the condition code block.
Type: Grant
Filed: September 21, 2012
Date of Patent: March 27, 2018
Assignee: Intel Corporation
Inventors: Xueliang Zhong, Jianhui Li, Jian Ping Jane Chen, Gang Wang, Yi Qian, Huifeng Gu
-
Patent number: 9875105
Abstract: Embodiments related to re-dispatching an instruction selected for re-execution from a buffer upon a microprocessor re-entering a particular execution location after runahead are provided. In one example, a microprocessor is provided. The example microprocessor includes fetch logic, one or more execution mechanisms for executing a retrieved instruction provided by the fetch logic, and scheduler logic for scheduling the retrieved instruction for execution. The example scheduler logic includes a buffer for storing the retrieved instruction and one or more additional instructions, the scheduler logic being configured, upon the microprocessor re-entering at a particular execution location after runahead, to re-dispatch, from the buffer, an instruction that has been previously dispatched to one of the execution mechanisms.
Type: Grant
Filed: May 3, 2012
Date of Patent: January 23, 2018
Assignee: NVIDIA CORPORATION
Inventors: Guillermo J. Rozas, Paul Serris, Brad Hoyt, Sridharan Ramakrishnan, Hens Vanderschoot, Ross Segelken, Darrell Boggs, Magnus Ekman
-
Patent number: 9851975
Abstract: A processor and instruction graduation unit for a processor. In one embodiment, a processor or instruction graduation unit according to the present invention includes a linked-list-based multi-threaded graduation buffer and a graduation controller. The graduation buffer stores identification values generated by an instruction decode and dispatch unit of the processor as part of one or more linked-list data structures. Each linked-list data structure formed is associated with a particular program thread running on the processor. The number of linked-list data structures formed is variable and related to the number of program threads running on the processor. The graduation controller includes linked-list head identification registers and linked-list tail identification registers that facilitate reading and writing identification values to linked-list data structures associated with particular program threads.
Type: Grant
Filed: September 23, 2014
Date of Patent: December 26, 2017
Assignee: ARM Finance Overseas Limited
Inventor: Kjeld Svendsen
-
Patent number: 9830197
Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.
Type: Grant
Filed: August 16, 2016
Date of Patent: November 28, 2017
Assignee: NVIDIA Corporation
Inventors: Brian Fahs, Ming Y Siu, Brett W. Coon, John R. Nickolls, Lars Nyland
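The difference between the scan and reduction results in the abstract of 9830197 is easiest to see on concrete numbers. The sketch below is a sequential model, not a threaded implementation: it assumes addition as the operator, an inclusive scan, and that the list order is the order in which threads reach the barrier. The function name `barrier_aggregate` is hypothetical.

```python
from itertools import accumulate


def barrier_aggregate(values):
    """Model what each thread would receive from a barrier aggregation.

    values: per-thread contributions, listed in arrival order.
    Returns (scan_results, reduction_results), one element per thread.
    """
    # Scan: each thread sees the running total at the moment it arrives.
    scans = list(accumulate(values))
    # Reduction: every thread sees the same total, available only after all arrive.
    total = scans[-1] if scans else 0
    return scans, [total] * len(values)


scans, reductions = barrier_aggregate([3, 1, 4])
```

The scan value can be handed to a thread as it executes the instruction, while the reduction value cannot exist until the last thread has contributed, which matches the timing the abstract describes.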
-
Patent number: 9823929
Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
Type: Grant
Filed: March 15, 2013
Date of Patent: November 21, 2017
Assignee: QUALCOMM Incorporated
Inventors: Daren Eugene Streett, Brian Michael Stempel, Thomas Philip Speier, Rodney Wayne Smith, Michael Scott McIlvaine, Kenneth Alan Dockser, James Norris Dieffenderfer
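The queue search in the abstract of 9823929 reduces to one comparison per dependent entry. The following sketch assumes a simple list for the queue and hypothetical names (`QueueEntry`, `must_flush`); it decides only whether a flush is needed and does not model the flush or restart itself.

```python
from dataclasses import dataclass


@dataclass
class QueueEntry:
    instruction: str
    depends_on_write: bool  # True if the instruction depends on the write instruction
    field_value: int        # register field value when the instruction was processed


def must_flush(queue, written_value):
    """Return True if a dependent entry was processed under a different field value."""
    return any(
        entry.depends_on_write and entry.field_value != written_value
        for entry in queue
    )


queue = [QueueEntry("op_a", True, 0), QueueEntry("op_b", False, 0)]
flush_same = must_flush(queue, written_value=0)     # speculation was right: no flush
flush_changed = must_flush(queue, written_value=1)  # op_a saw a stale value: flush
```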
-
Patent number: 9755620
Abstract: A device for detecting and correcting timing error and a method for designing typical-case timing using the same are disclosed. The device includes two datapath units connected with first and second multiplexers and two transition detectors. Each datapath unit receives and calculates an input signal to generate a speculation value and a correct value. Then, the speculation value and the correct value are transmitted to the first and second multiplexers and the transition detectors determine whether transition of the outputted speculation value is unstable. If yes, the datapath unit outputting the speculation value is stalled for a period of time for correction, whereby the second multiplexer outputs the correct value. If no, the datapath unit outputs the speculation value, then the present invention uses the undertaken timing as a setting specification to complete a circuit design. The present invention can improve system efficiency and power of the whole circuit.
Type: Grant
Filed: July 18, 2016
Date of Patent: September 5, 2017
Assignee: NATIONAL CHUNG CHENG UNIVERSITY
Inventors: Tay-Jyi Lin, Jinn-Shyan Wang, Hong-Chih Lin, Ting-Yu Shyu
-
Patent number: 9754104
Abstract: The invention relates to a virtual machine. The virtual machine is set to recognize, in addition to a set of conventional bytecodes, at least one secure bytecode functionally equivalent to one of the conventional bytecodes. It is set to process secure bytecodes with increased security, while it is set to process conventional bytecodes with increased speed. The invention also relates to a computing device comprising such a virtual machine, to a procedure for generating bytecode executable by such a virtual machine, and to an applet development tool comprising such procedure.
Type: Grant
Filed: December 9, 2009
Date of Patent: September 5, 2017
Assignee: GEMALTO SA
Inventors: Olivier Joffray, Milan Krizenecky
-
Patent number: 9747217
Abstract: An approach is provided in which a computing system captures content included in a history buffer entry that corresponds to a flush ITAG. The computing system, in turn, uses an execution unit to transmit the content over a results bus to multiple registers and restore at least one of the registers accordingly.
Type: Grant
Filed: June 1, 2015
Date of Patent: August 29, 2017
Assignee: International Business Machines Corporation
Inventors: Salma Ayub, Sundeep Chadha, Michael J. Genden, Cliff Kucharski, Dung Q. Nguyen, David R. Terry
-
Patent number: 9740620
Abstract: An approach is provided in which a computing system captures content included in a history buffer entry that corresponds to a flush ITAG. The computing system, in turn, uses an execution unit to transmit the content over a results bus to multiple registers and restore at least one of the registers accordingly.
Type: Grant
Filed: May 7, 2015
Date of Patent: August 22, 2017
Assignee: International Business Machines Corporation
Inventors: Salma Ayub, Sundeep Chadha, Michael J. Genden, Cliff Kucharski, Dung Q. Nguyen, David R. Terry
-
Patent number: 9715389
Abstract: A method includes suppressing execution of at least one dependent instruction of a load instruction by a processor using stored dependency information responsive to an invalid status of the load instruction. A processor includes an execution unit to execute instructions and a scheduler. The scheduler is to select for execution in the execution unit a load instruction having at least one dependent instruction and suppress execution of the at least one dependent instruction using stored dependency information responsive to an invalid status of the load instruction.
Type: Grant
Filed: June 25, 2013
Date of Patent: July 25, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Francesco Spadini, Michael Achenbach
-
Patent number: 9703567
Abstract: In an embodiment, the present invention includes a processor having an execution logic to execute instructions and a control transfer termination (CTT) logic coupled to the execution logic. This logic is to cause a CTT fault to be raised if a target instruction of a control transfer instruction is not a CTT instruction. Other embodiments are described and claimed.
Type: Grant
Filed: November 30, 2012
Date of Patent: July 11, 2017
Assignee: Intel Corporation
Inventors: Vedvyas Shanbhogue, Jason W. Brandt, Uday R. Savagaonkar, Ravi L. Sahita
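The rule in the abstract of 9703567 is that a control transfer must land on a CTT instruction. A toy interpreter check makes that explicit; the program representation, the `"CTT"` mnemonic, and the names `CttFault` and `transfer` are all hypothetical and are not taken from any real instruction set.

```python
class CttFault(Exception):
    """Raised when a control transfer lands on a non-CTT instruction."""


CTT = "CTT"  # hypothetical mnemonic for the control transfer termination instruction


def transfer(program, target):
    """Return the target index if it holds a CTT instruction, else raise CttFault."""
    if program[target] != CTT:
        raise CttFault(f"target {target} is {program[target]!r}, not {CTT}")
    return target


program = ["call 3", "add", "ret", CTT, "mul"]
landed = transfer(program, 3)  # index 3 holds the CTT marker: allowed

try:
    transfer(program, 4)       # index 4 holds "mul": not a valid landing point
    faulted = False
except CttFault:
    faulted = True
```

Restricting landing points this way limits where an indirect call, jump, or return can resume execution, since only marked locations are accepted.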
-
Patent number: 9703565
Abstract: Embodiments provide methods, apparatus, systems, and computer readable media associated with predicting predicates and branch targets during execution of programs using combined branch target and predicate predictions. The predictions may be made using one or more prediction control flow graphs which represent predicates in instruction blocks and branches between blocks in a program. The prediction control flow graphs may be structured as trees such that each node in the graphs is associated with a predicate instruction, and each leaf associated with a branch target which jumps to another block. During execution of a block, a prediction generator may take a control point history and generate a prediction. Following the path suggested by the prediction through the tree, both predicate values and branch targets may be predicted. Other embodiments may be described and claimed.
Type: Grant
Filed: March 25, 2015
Date of Patent: July 11, 2017
Assignee: The Board of Regents of the University of Texas System
Inventors: Douglas C. Burger, Stephen W. Keckler
-
Patent number: 9684511
Abstract: In an embodiment, the present invention includes a processor having a decode unit, an execution unit, and a retirement unit. The decode unit is to decode control transfer instructions and the execution unit is to execute control transfer instructions. The retirement unit is to retire a first control transfer instruction, and to raise a fault if a next instruction to be retired after the first control transfer instruction is not a second control transfer instruction and a target instruction of the first control transfer instruction is in code using the control transfer instructions.
Type: Grant
Filed: September 27, 2013
Date of Patent: June 20, 2017
Assignee: Intel Corporation
Inventors: Vedvyas Shanbhogue, Jason Brandt, Uday Savagaonkar, Ravi Sahita
-
Patent number: 9672044Abstract: A processor may efficiently implement register renaming and checkpoint repair even in instruction set architectures with large numbers of wide (bit-width) registers by (i) renaming all destination operand register targets, (ii) implementing free list and architectural-to-physical mapping table as a combined array storage with unitary (or common) read, write and checkpoint pointer indexing and (iii) storing checkpoints as snapshots of the mapping table, rather than of actual register contents. In this way, uniformity (and timing simplicity) of the decode pipeline may be accentuated and architectural-to-physical mappings (or allocable mappings) may be efficiently shuttled between free-list, reorder buffer and mapping table stores in correspondence with instruction dispatch and completion as well as checkpoint creation, retirement and restoration.Type: GrantFiled: August 1, 2012Date of Patent: June 6, 2017Assignee: NXP USA, INC.Inventor: Thang M. Tran
-
Patent number: 9652198Abstract: Systems and methods are disclosed for managing data entry buffers in a data storage device. A memory of the data storage device includes one or more data input ports. The device further includes a controller configured to receive a data entry over one of the data input ports and store the data entry in a first data structure (e.g., a FIFO data structure). The data entry is stored in the first data structure among other data entries received over various data input ports. The controller stores a data entry corresponding to the data entry stored in the first data structure in a second data structure. Entries in the second data structure include a valid bit field and one or more condition fields. The controller indicates, using a valid bit field of the second data structure data entry, that the corresponding data entry stored in the first data structure is valid.Type: GrantFiled: July 29, 2015Date of Patent: May 16, 2017Assignee: Western Digital Technologies, Inc.Inventor: Jianxun Gao
-
Patent number: 9652244Abstract: A processing bypass directory system and method are disclosed. In one embodiment, a bypass directory tracking process includes setting bits in a bypass directory when a corresponding architectural register is written. The bits are selectively cleared in the bypass directory each cycle. The configuration of the bits is utilized to determine which stage of a bypass path processing information is at.Type: GrantFiled: June 25, 2012Date of Patent: May 16, 2017Assignee: Intellectual Ventures Holding 81 LLCInventors: Alexander Klaiber, Guillermo Rozas
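One way to picture the bit tracking in this abstract is a small per-register bit field that shrinks every cycle. This is a sketch under stated assumptions: the thermometer-style encoding (all bits set on a write, one bit cleared per cycle) and the `BypassDirectory` class with its `write`, `tick` and `stage` methods are illustrative choices, not details taken from the patent.

```python
class BypassDirectory:
    """Toy bypass directory: one bit field per architectural register."""

    def __init__(self, num_regs, num_stages):
        self.num_stages = num_stages
        self.bits = [0] * num_regs

    def write(self, reg):
        # Assumed encoding: set every stage bit when the register is written.
        self.bits[reg] = (1 << self.num_stages) - 1

    def tick(self):
        # Each cycle, clear the highest set bit of every entry.
        self.bits = [b >> 1 for b in self.bits]

    def stage(self, reg):
        """Return the bypass stage holding the value (0 = newest),
        or None once the value has left the bypass path."""
        remaining = bin(self.bits[reg]).count("1")
        return None if remaining == 0 else self.num_stages - remaining
```

With three stages, a written register reports stage 0, then 1, then 2 on successive ticks, and `None` after the third tick.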
-
Patent number: 9645827Abstract: An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.Type: GrantFiled: December 14, 2014Date of Patent: May 9, 2017Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.Inventors: Gerard M. Col, Colin Eddy, G. Glenn Henry
-
Patent number: 9639479Abstract: A method for managing a parallel cache hierarchy in a processing unit. The method includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.Type: GrantFiled: September 22, 2010Date of Patent: May 2, 2017Assignee: NVIDIA CorporationInventors: John R. Nickolls, Brett W. Coon, Michael C. Shebanow
-
Patent number: 9582280Abstract: The description covers a system and method for operating a micro-processing system having a runahead mode of operation. In one implementation, the method includes providing, for a first portion of code, a runahead correlate. When the first portion of code is encountered by the micro-processing system, a determination is made as to whether the system is operating in the runahead mode. If so, the system branches to the runahead correlate, which is specifically configured to identify and resolve latency events likely to occur when the first portion of code is encountered outside of runahead. Branching out of the first portion of code may also be performed based on a determination that a register is poisoned.Type: GrantFiled: July 18, 2013Date of Patent: February 28, 2017Assignee: NVIDIA CORPORATIONInventors: Rohit Kumar, Guillermo Rozas, Magnus Ekman, Lawrence Spracklen
-
Patent number: 9569214Abstract: In one embodiment, in an execution pipeline having a plurality of execution subunits, a method of using a bypass network to directly forward data from a producing execution subunit to a consuming execution subunit is provided. The method includes producing output data with the producing execution subunit, consuming input data with the consuming execution subunit, for one or more intervening operations whose input is the output data from the producing execution subunit and whose output is the input data to the consuming execution subunit, evaluating those one or more intervening operations to determine whether their execution would compose an identity function, and if the one or more intervening operations would compose such an identity function, controlling the bypass network to forward the producing execution subunit's output data directly to the consuming execution subunit.Type: GrantFiled: December 27, 2012Date of Patent: February 14, 2017Assignee: NVIDIA CORPORATIONInventors: Gokul Govindu, Parag Gupta, Scott Pitkethly, Guillermo J. Rozas
-
Patent number: 9483273Abstract: A method includes suppressing execution of an operation portion of a load-operation instruction in a processor responsive to an invalid status of a load portion of the load-operation instruction. A processor includes an instruction pipeline including an execution unit operable to execute instructions and a scheduler unit. The scheduler unit includes a scheduler queue and is operable to store a load-operation instruction in the scheduler queue. The load-operation instruction includes a load portion and an operation portion. The scheduler unit schedules the load portion for execution in the execution unit, marks the operation portion in the scheduler queue as eligible for execution responsive to scheduling the load portion, receives an indication of an invalid status of the load portion, and suppresses execution of the operation portion responsive to the indication of the invalid status.Type: GrantFiled: July 16, 2013Date of Patent: November 1, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Francesco Spadini, Michael Achenbach, Emil Talpes, Ganesh Venkataramanan
-
Patent number: 9454370Abstract: A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes, the processor is in a nonconstrained transaction execution mode, and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected, and based on the inspected data, transaction execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs, such as the value of the second operand becomes a prespecified value or a time interval is exceeded.Type: GrantFiled: March 14, 2014Date of Patent: September 27, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
-
Patent number: 9436470Abstract: A technique for restoring a register renaming map is described. In one example, a restore table having a number of storage locations saves a copy of the register renaming map whenever a flow-risk instruction is passed to a re-order buffer. When all storage locations are full, further instructions still pass to the re-order buffer, but a copy of the map is not saved. A storage location subsequently becomes available when its associated flow-risk instruction is executed. A register renaming map state for an unrecorded flow-risk instruction passed to the re-order buffer whilst the storage locations were full is generated and stored in the available location. This is generated using the restore table entry for a previous flow-risk instruction and re-order buffer values for intervening instructions between the previous and unrecorded flow-risk instructions. The restore table can be used to restore the map if an unexpected change in instruction flow occurs.Type: GrantFiled: August 3, 2015Date of Patent: September 6, 2016Assignee: Imagination Technologies LimitedInventor: Hugh Jackson
-
Patent number: 9436476Abstract: A method for sorting elements in hardware structures is disclosed. The method comprises selecting a plurality of elements to order from an unordered input queue (UIQ) within a predetermined range in response to finding a match between at least one most significant bit of the predetermined range and corresponding bits of a respective identifier associated with each of the plurality of elements. The method further comprises presenting each of the plurality of elements to a respective multiplexer. Further the method comprises generating a select signal for an enabled multiplexer in response to finding a match between at least one least significant bit of a respective identifier associated with each of the plurality of elements and a port number of the ordered queue. Finally, the method comprises forwarding a packet associated with a selected element identifier to a matching port number of the ordered queue from the enabled multiplexer.Type: GrantFiled: October 11, 2013Date of Patent: September 6, 2016Assignee: SOFT MACHINES INC.Inventors: Mohammad A. Abdallah, Mandeep Singh
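The two-level bit matching in this abstract can be sketched in software: the upper identifier bits select the elements that fall within the current range, and the lower bits pick the output port. The function name `order_window`, its parameters, and the assumption that the range is aligned to a power-of-two number of ports are illustrative, not taken from the patent.

```python
def order_window(unordered, window_base, port_bits):
    """Place elements of one aligned window into ordered ports.

    unordered   -- list of (identifier, packet) pairs in arbitrary order
    window_base -- any identifier inside the window being ordered
    port_bits   -- number of low identifier bits used as the port number
    """
    num_ports = 1 << port_bits
    ports = [None] * num_ports
    for ident, packet in unordered:
        # Most significant bits must match the window being ordered.
        if ident >> port_bits == window_base >> port_bits:
            # Least significant bits select the destination port.
            ports[ident & (num_ports - 1)] = packet
    return ports
```

For example, with `port_bits=2` and a window covering identifiers 4 to 7, an element with identifier 6 lands on port 2 regardless of its position in the unordered input, and an element with identifier 13 is left for a later window.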
-
Patent number: 9430275Abstract: Transactional memory implementations may be extended to support transaction communicators and/or transaction condition variables for which transaction isolation is relaxed, and through which concurrent transactions can communicate and be synchronized with each other. Transactional accesses to these objects may not be isolated unless called within communicator-isolating transactions. A waiter transaction may invoke a wait method of a transaction condition variable, be added to a wait list for the variable, and be suspended pending notification of a notification event from a notify method of the variable. A notifier transaction may invoke a notify method of the variable, which may remove the waiter from the wait list, schedule the waiter transaction for resumed execution, and notify the waiter of the notification event. A waiter transaction may commit only if the corresponding notifier transaction commits. If the waiter transaction aborts, the notification may be forwarded to another waiter.Type: GrantFiled: June 27, 2011Date of Patent: August 30, 2016Assignee: Oracle International CorporationInventors: Virendra J. Marathe, Victor M. Luchangco
-
Patent number: 9424035Abstract: A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes, the processor is in a nonconstrained transaction execution mode, and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected, and based on the inspected data, transaction execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs, such as the value of the second operand becomes a prespecified value or a time interval is exceeded.Type: GrantFiled: November 26, 2014Date of Patent: August 23, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
-
Patent number: 9418043Abstract: A method is disclosed of utilizing a plurality of Arithmetic Logic Units (ALUs) of an array processor. It is determined that a first quantity of the ALUs are scheduled to execute a function during a given processing cycle, with each ALU being scheduled to use a respective one of a plurality of selected input vectors as an input. It is also determined that a second quantity of the ALUs are not scheduled for use during the given processing cycle. A plurality of predicted future input vectors that differ from the plurality of selected input vectors are determined. The second quantity of ALUs are scheduled to execute the function during the given processing cycle using respective ones of the plurality of predicted future input vectors as inputs. After completion of the processing cycle, function outputs received from the first and second quantity of ALUs are cached.Type: GrantFiled: March 7, 2014Date of Patent: August 16, 2016Assignee: Sony CorporationInventors: Jim Rasmusson, Håkan Jonsson, Jonas Gustavsson, Anders Isberg
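The scheduling idea in this abstract, filling idle ALUs with predicted inputs and caching every result, can be approximated in a few lines. This is a rough sketch: the `run_cycle` function, its parameters, the use of tuples as input vectors and a plain dictionary as the cache are assumptions made for illustration, and each "ALU" is simply a function call.

```python
def run_cycle(func, selected, predicted, num_alus, cache):
    """Simulate one processing cycle on an array of num_alus ALUs.

    selected  -- input vectors (tuples) that must be evaluated this cycle
    predicted -- candidate future input vectors, in priority order
    cache     -- dict mapping input vector -> function output
    Returns the outputs for the selected vectors, in order.
    """
    idle = num_alus - len(selected)
    if idle < 0:
        raise ValueError("more selected inputs than ALUs")
    # Idle ALUs take predicted vectors that differ from the selected ones.
    extra = [v for v in predicted if v not in selected][:idle]
    for vec in selected + extra:
        cache[vec] = func(vec)  # outputs from both groups are cached
    return [cache[v] for v in selected]
```

A later cycle could then consult `cache` before scheduling a vector, so a correct prediction spares an ALU in that cycle.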
-
Patent number: 9411588Abstract: A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes, the processor is in a nonconstrained transaction execution mode, and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected, and based on the inspected data, transaction execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs, such as the value of the second operand becomes a prespecified value or a time interval is exceeded.Type: GrantFiled: March 14, 2014Date of Patent: August 9, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
-
Patent number: 9400651Abstract: In an embodiment, a processor includes an issue circuit configured to issue instruction operations for execution. The issue circuit may be configured to monitor the source operands of the instruction operations, and to issue instruction operations for which the source operands (including predicate operands, as appropriate) are resolved. Additionally, the issue circuit may be configured to detect a null predicate that indicates that none of the vector elements will be modified by a corresponding instruction operation. The issue circuit may be configured to issue the corresponding instruction operation with the null predicate even if other source operands are not yet resolved.Type: GrantFiled: September 24, 2013Date of Patent: July 26, 2016Assignee: Apple Inc.Inventor: Jeffry E. Gonion