Speculative Instruction Execution, e.g., Conditional Execution, Procedural Dependencies, Instruction Invalidation (EPO) Patents (Class 712/E9.05)
  • Patent number: 11960892
    Abstract: In one embodiment, a system includes a memory and a processor core. The processor core includes functional units and an instruction decode unit configured to determine whether an execute packet of instructions received by the processor core includes a first instruction that is designated for execution by a first functional unit of the functional units and a second instruction that is a condition code extension instruction that includes a plurality of sets of condition code bits, wherein each set of condition code bits corresponds to a different one of the functional units, and wherein the sets of condition code bits include a first set of condition code bits that corresponds to the first functional unit. When the execute packet includes the first and second instructions, the first functional unit is configured to execute the first instruction conditionally based upon the first set of condition code bits in the second instruction.
    Type: Grant
    Filed: July 22, 2022
    Date of Patent: April 16, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Duc Quang Bui, Joseph Raymond Michael Zbiciak
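    Illustrative sketch (not from the patent): a minimal C++ model of gating each functional unit's instruction on its own condition field from a condition code extension word. The unit count, field width, and condition encoding are assumptions.
      #include <array>
      #include <cstdint>
      #include <optional>

      constexpr int kNumUnits = 8;                       // assumed number of functional units

      struct CcExtension {
          std::array<uint8_t, kNumUnits> cond;           // one condition field per unit
      };

      // Hypothetical encoding: bit 3 inverts the test, bits 2..0 select a predicate register.
      bool condition_true(uint8_t field, const std::array<bool, 8>& preds) {
          bool p = preds[field & 0x7];
          return (field & 0x8) ? !p : p;
      }

      // Issue the instruction assigned to `unit_id`; when the execute packet carries a
      // condition code extension, the unit's own condition field decides whether it runs.
      template <typename ExecFn>
      void issue(int unit_id, const std::optional<CcExtension>& ccex,
                 const std::array<bool, 8>& preds, ExecFn exec) {
          if (!ccex || condition_true(ccex->cond[unit_id], preds)) {
              exec();                                    // condition met (or unconditional)
          }                                              // otherwise the instruction is squashed
      }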
  • Patent number: 11947455
    Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method includes a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache, causing eviction of the first data from the cache line. The processor compares, based on the tagging, the first data and the second data, and suppresses modification of the cache line based on the comparison of the first data and the second data.
    Type: Grant
    Filed: April 17, 2023
    Date of Patent: April 2, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Paul J. Moyer
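    Illustrative sketch (not from the patent): one way to suppress a redundant cache-line modification by tagging the line with a checksum of its contents and comparing against the incoming data. The FNV-1a tag and the structure layout are assumptions.
      #include <cstdint>
      #include <vector>

      // Tag a line's contents with a simple checksum (FNV-1a here, purely illustrative).
      uint64_t line_tag(const std::vector<uint8_t>& bytes) {
          uint64_t h = 1469598103934665603ull;
          for (uint8_t b : bytes) { h ^= b; h *= 1099511628211ull; }
          return h;
      }

      struct CacheLine {
          std::vector<uint8_t> data;
          uint64_t tag = 0;           // tag derived from the currently stored data
          bool dirty = false;
      };

      // Store incoming data, but suppress the modification (and the coherence/dirty
      // traffic it would trigger) when the contents are unchanged.
      void store(CacheLine& line, const std::vector<uint8_t>& incoming) {
          uint64_t new_tag = line_tag(incoming);
          if (new_tag == line.tag && incoming == line.data) {
              return;                 // identical data: leave the line untouched
          }
          line.data = incoming;
          line.tag = new_tag;
          line.dirty = true;          // only a real change is visible to the directory
      }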
  • Patent number: 11928470
    Abstract: Introduced herein is a program counter advancing technique that uses NOP padding without its limitations. During a build process, the introduced technique removes EOG markers for instruction groups that are immediately followed by NOP instructions that are in turn immediately followed by an instruction group beginning at a start of a cache line. As such, during an execution process, when the processing unit detects an absence of an EOG marker in the requested instruction group, it knows that a group of NOP instructions is about to follow and skips over them by directly advancing the program counter to a start of a subsequent cache line where the next instruction group starts. In addition to the presence of an EOG marker, the introduced technique also takes into account whether the requested instruction group is a straddling group when advancing the program counter to a start of the subsequent cache line.
    Type: Grant
    Filed: April 20, 2022
    Date of Patent: March 12, 2024
    Assignee: VeriSilicon Holdings Co., Ltd.
    Inventor: Tracy T. Nguyen
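    Illustrative sketch (not from the patent): advancing the program counter past NOP padding when the fetched group carries no EOG marker. The 64-byte cache line size is an assumption, and the straddling-group case mentioned in the abstract is not modeled.
      #include <cstdint>

      constexpr uint64_t kCacheLineBytes = 64;           // assumed cache line size

      // If the just-fetched instruction group has no end-of-group (EOG) marker, the
      // remainder of its cache line is NOP padding, so jump the program counter
      // straight to the next cache line boundary instead of stepping through the NOPs.
      uint64_t next_pc(uint64_t pc, uint64_t group_bytes, bool has_eog_marker) {
          uint64_t after_group = pc + group_bytes;
          if (has_eog_marker) {
              return after_group;                        // normal sequential advance
          }
          return (after_group + kCacheLineBytes - 1) / kCacheLineBytes * kCacheLineBytes;
      }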
  • Patent number: 11915006
    Abstract: A method, system and device for pipeline processing of instructions and a computer storage medium. The method comprises: acquiring a target instruction set (S101); acquiring a target prediction result, wherein the target prediction result is a result obtained by predicting a jump mode of the target instruction set (S102); performing pipeline processing on the target instruction set according to the target prediction result (S103); determining if a pipeline flushing request is received (S104); and if so, correspondingly saving the target instruction set and a corresponding pipeline processing result, so as to perform pipeline processing on the target instruction set again on the basis of the pipeline processing result (S105).
    Type: Grant
    Filed: November 28, 2019
    Date of Patent: February 27, 2024
    Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Yulong Zhou, Tongqiang Liu, Xiaofeng Zou
  • Patent number: 11847060
    Abstract: Described is a data cache with prediction hints for a cache hit. The data cache includes a plurality of cache lines, where a cache line includes a data field, a tag field, and a prediction hint field. The prediction hint field is configured to store a prediction hint which directs alternate behavior for a cache hit against the cache line. The prediction hint field is integrated with the tag field or is integrated with a way predictor field.
    Type: Grant
    Filed: March 1, 2023
    Date of Patent: December 19, 2023
    Assignee: SiFive, Inc.
    Inventors: John Ingalls, Josh Smith
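    Illustrative sketch (not from the patent): a possible cache line layout with a prediction hint field alongside the tag. Field widths and hint meanings are assumptions.
      #include <cstdint>

      // Example interpretations of the hint; the actual alternate behaviors are not
      // specified here and these names are made up for illustration.
      enum HintAction : uint8_t {
          kHintNone = 0,
          kHintDontAllocateOnMiss,
          kHintEvictEarlyStreaming,
          kHintPrefetchNextLine,
      };

      // One data-cache line: data field, tag field, and a prediction hint field that
      // directs alternate behavior when a hit occurs against this line.
      struct PredictedCacheLine {
          uint64_t tag      : 40;    // address tag
          uint64_t hint     : 4;     // prediction hint (integrated with the tag field)
          uint64_t way_pred : 4;     // or integrated with a way-predictor field instead
          uint64_t valid    : 1;
          uint8_t  data[64];         // cached data block
      };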
  • Patent number: 11797403
    Abstract: Maintaining a synchronous replication relationship between two or more storage systems, including: receiving, by at least one of a plurality of storage systems across which a dataset will be synchronously replicated, timing information for at least one of the plurality of storage systems; and establishing, based on the timing information, a synchronous replication lease describing a period of time during which the synchronous replication relationship is valid, wherein a request to modify the dataset may only be acknowledged after a copy of the dataset has been modified on each of the storage systems.
    Type: Grant
    Filed: September 12, 2022
    Date of Patent: October 24, 2023
    Assignee: PURE STORAGE, INC.
    Inventors: David Grunwald, Steven Hodgson, Ronald Karr, Kunal Trivedi, Christopher Golden, Thomas Gill, Connor Brooks, Zoheb Shivani
  • Patent number: 11797309
    Abstract: An apparatus and method for tracking speculative execution flow and detecting potential vulnerabilities.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: October 24, 2023
    Assignee: Intel Corporation
    Inventors: Carlos Rozas, Francis McKeen, Pasquale Cocchini, Meltem Ozsoy, Matthew Fernandez
  • Patent number: 11782845
    Abstract: An apparatus comprises memory management circuitry to perform a translation table walk for a target address of a memory access request and to signal a fault in response to the translation table walk identifying a fault condition for the target address; prefetch circuitry to generate a prefetch request to request prefetching of information associated with a prefetch target address to a cache; and faulting address prediction circuitry to predict whether the memory management circuitry would identify the fault condition for the prefetch target address if the translation table walk was performed by the memory management circuitry for the prefetch target address. In response to a prediction that the fault condition would be identified for the prefetch target address, the prefetch circuitry suppresses the prefetch request and the memory management circuitry prevents the translation table walk from being performed for the prefetch target address of the prefetch request.
    Type: Grant
    Filed: December 2, 2021
    Date of Patent: October 10, 2023
    Assignee: Arm Limited
    Inventors: Alexander Cole Shulyak, Joseph Michael Pusdesris, Abhishek Raja, Karthik Sundaram, Anoop Ramachandra Iyer, Michael Brian Schinzler, James David Dundas, Yasuo Ishii
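    Illustrative sketch (not from the patent): a toy faulting-address predictor that remembers page-granular addresses whose table walks faulted, so prefetches to those addresses can be dropped before any walk starts. Page granularity and the set-based predictor are assumptions.
      #include <cstdint>
      #include <unordered_set>

      class FaultPredictor {
      public:
          void record_fault(uint64_t addr) { faulted_pages_.insert(page(addr)); }
          bool predicts_fault(uint64_t addr) const {
              return faulted_pages_.count(page(addr)) != 0;
          }
      private:
          static uint64_t page(uint64_t addr) { return addr >> 12; }   // assumed 4 KiB pages
          std::unordered_set<uint64_t> faulted_pages_;
      };

      // Prefetch path: if a fault is predicted for the prefetch target address, suppress
      // the request entirely so no translation table walk is performed for it.
      void maybe_prefetch(uint64_t target, const FaultPredictor& fp,
                          void (*issue_prefetch)(uint64_t)) {
          if (fp.predicts_fault(target)) {
              return;                      // suppressed: no prefetch, no table walk
          }
          issue_prefetch(target);
      }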
  • Patent number: 11748284
    Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers with each buffer associated with a particular traffic type and a source of the traffic. Arbiters maintain a respective urgency counter for keeping track of a period of time traffic of a particular type is blocked by upstream arbiters. When the block is removed, the traffic of the particular type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirement, low latency tolerance and active status cause adjustments in selection priority of stored requests.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: September 5, 2023
    Assignee: Apple Inc.
    Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
  • Patent number: 11740909
    Abstract: A system including a computer storage and a processor is described. The computer storage is configured to identify a stored data as protected. The processor is configured to perform speculative execution. To perform the speculative execution, the processor is configured to determine, in response to the speculative execution of an instruction to read the stored data, whether the stored data is identified as protected. In response to a determination that the stored data attempted to be read during the speculative execution is protected, the processor is configured to disallow during the speculative execution immediate successful completion of the instruction to read the stored data.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: August 29, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Hao Wang, Harish Dattatraya Dixit, Shobhit O. Kanaujia
  • Patent number: 11734194
    Abstract: A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.
    Type: Grant
    Filed: April 4, 2022
    Date of Patent: August 22, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Mujibur Rahman
  • Patent number: 11734788
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: August 22, 2023
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 11704301
    Abstract: Provided is a method for performing a file system consistency check. The method comprises calculating, by a first thread that does not have access to an inode table, file block addresses for one or more files to be checked by the thread. The method further comprises collecting validity information for the one or more files. The method further comprises reading information relating to the one or more files from the inode table. The reading is performed in response to the thread being given access to the inode table after the calculating operation. The method further comprises validating the information by comparing the information from the inode table to the validity information.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: July 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Huzefa Pancha, Abhishek Jain, Sasikanth Eda, Karthik Iyer
  • Patent number: 11698794
    Abstract: A method to provide flexible access to internal data of a regulated system, the method comprising receiving, by a data access component of the regulated system, a loadable configuration file defining a set of triggering events and a set of memory, determining the occurrence of a single triggering event, accessing at least a subset of memory that contains the internal data of the avionics system to retrieve data associated with the one or more memory of the set of memory, and outputting the retrieved data to a receiving component.
    Type: Grant
    Filed: September 2, 2020
    Date of Patent: July 11, 2023
    Assignee: GE Aviation Systems LLC
    Inventors: Joachim Karl Ulf Hochwarth, Victor Mario Leal Herrera, Antonio Lugo Trejo, Joshua Michael Krbez
  • Patent number: 11693666
    Abstract: A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction (where the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied), processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: July 4, 2023
    Assignee: Arm Limited
    Inventors: Joseph Michael Pusdesris, Nicholas Andrew Plante, Yasuo Ishii, Chris Abernathy
  • Patent number: 11650818
    Abstract: A processor includes an execution unit and a processing logic operatively coupled to the execution unit, the processing logic to: enter a first execution state and transition to a second execution state responsive to executing a control transfer instruction. Responsive to executing a target instruction of the control transfer instruction, the processing logic further transitions to the first execution state responsive to the target instruction being a control transfer termination instruction of a mode identical to a mode of the processing logic following the execution of the control transfer instruction; and raises an execution exception responsive to the target instruction being a control transfer termination instruction of a mode different than the mode of the processing logic following the execution of the control transfer instruction.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: May 16, 2023
    Assignee: Intel Corporation
    Inventors: Vedvyas Shanbhogue, Jason W. Brandt, Ravi L. Sahita, Xiaoning Li
  • Patent number: 11645073
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: May 9, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
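    Illustrative sketch (not from the patent): a small address-range filter consulted before the ordering-violation structure; when the filter has no indication for the target address, the expensive lookup is bypassed. Table size, range granularity, and indexing are assumptions.
      #include <array>
      #include <cstddef>
      #include <cstdint>

      class OrderingFilter {
      public:
          // Mark that an in-flight instruction touching this address range has
          // already produced a physical address.
          void mark(uint64_t phys_addr) { bits_[index(phys_addr)] = true; }
          bool may_conflict(uint64_t addr) const { return bits_[index(addr)]; }
      private:
          static constexpr size_t kEntries = 1024;                    // assumed table size
          static size_t index(uint64_t a) { return (a >> 6) & (kEntries - 1); }
          std::array<bool, kEntries> bits_{};
      };

      // On an ordering-check-triggering transaction, consult the filter first and only
      // search the ordering-violation memory structure when the filter indicates a
      // possible in-flight dependent instruction.
      bool needs_violation_lookup(const OrderingFilter& f, uint64_t target_addr) {
          return f.may_conflict(target_addr);   // false => bypass the lookup entirely
      }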
  • Patent number: 11630772
    Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method includes a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache, causing eviction of the first data from the cache line. The processor compares, based on the tagging, the first data and the second data, and suppresses modification of the cache line based on the comparison of the first data and the second data.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: April 18, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Paul J. Moyer
  • Patent number: 11614940
    Abstract: A method to compare first and second source data in a processor in response to a vector maximum with indexing instruction includes specifying first and second source registers containing first and second source data, a destination register storing compared data, and a predicate register. Each of the registers includes a plurality of lanes. The method includes executing the instruction by, for each lane in the first and second source register, comparing a value in the lane of the first source register to a value in the corresponding lane of the second source register to identify a maximum value, storing the maximum value in a corresponding lane of the destination register, asserting a corresponding lane of the predicate register if the maximum value is from the first source register, and de-asserting the corresponding lane of the predicate register if the maximum value is from the second source register.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: March 28, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Duc Bui, Peter Richard Dent, Timothy D. Anderson
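    Illustrative sketch (not from the patent): the per-lane behavior of a vector maximum with indexing, with a predicate lane recording which source supplied the maximum. The lane count, element type, and tie-breaking toward the first source are assumptions.
      #include <array>
      #include <cstdint>

      constexpr int kLanes = 8;   // assumed number of lanes

      void vmax_with_index(const std::array<int32_t, kLanes>& src1,
                           const std::array<int32_t, kLanes>& src2,
                           std::array<int32_t, kLanes>& dst,
                           std::array<bool, kLanes>& pred) {
          for (int lane = 0; lane < kLanes; ++lane) {
              if (src1[lane] >= src2[lane]) {
                  dst[lane]  = src1[lane];
                  pred[lane] = true;        // maximum came from the first source register
              } else {
                  dst[lane]  = src2[lane];
                  pred[lane] = false;       // maximum came from the second source register
              }
          }
      }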
  • Patent number: 11520581
    Abstract: A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memory includes memory banks configured to store data used by each of the processor units to perform the arithmetic operations. The processor units and the vector memory are tightly coupled within an area of the vector processing unit such that data communications are exchanged at a high bandwidth based on the placement of respective processor units relative to one another, and based on the placement of the vector memory relative to each processor unit.
    Type: Grant
    Filed: May 24, 2021
    Date of Patent: December 6, 2022
    Assignee: Google LLC
    Inventors: William Lacy, Gregory Michael Thorson, Christopher Aaron Clark, Norman Paul Jouppi, Thomas Norrie, Andrew Everett Phelps
  • Patent number: 11520591
    Abstract: Processing data in an information handling system is disclosed that includes: in response to an event that triggers a flushing operation, calculating a finish ratio, wherein the finish ratio is the number of finished operations relative to a number of at least one of the group consisting of in-flight instructions, instructions pending in a processor pipeline, instructions issued to an issue queue, and instructions being processed in a processor execution unit; comparing the calculated finish ratio to a threshold; and, if the finish ratio is greater than the threshold, not performing the flushing operation. Also disclosed is moving the flush point.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: December 6, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ehsan Fatehi, Richard J. Eickemeyer, John B. Griswell, Jr.
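    Illustrative sketch (not from the patent): the finish-ratio test in its simplest form. The 0.75 threshold and the choice of in-flight instructions as the denominator are assumptions.
      // Skip the flushing operation when most of the in-flight work has already finished.
      bool should_flush(unsigned finished_ops, unsigned in_flight_ops,
                        double threshold = 0.75) {
          if (in_flight_ops == 0) {
              return false;                              // nothing to flush
          }
          double finish_ratio = static_cast<double>(finished_ops) /
                                static_cast<double>(in_flight_ops);
          return finish_ratio <= threshold;              // above threshold: do not flush
      }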
  • Patent number: 11494191
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: November 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
  • Patent number: 11481348
    Abstract: A first operation identifier is assigned to a current operation directed to a memory component, the first operation identifier having a first entry in a first data structure that associates the first operation identifier with a first buffer identifier. It is determined whether the current operation collides with a prior operation assigned a second operation identifier, the second operation identifier having a second entry in the first data structure that associates the second operation identifier with a second buffer identifier. A latest flag is updated to indicate that the first entry is a latest operation directed to an address (1) in response to determining that the current operation collides with the prior operation and that the current and prior operations are read operations, or (2) in response to determining that the current operation does not collide with a prior operation.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: October 25, 2022
    Assignee: MICRON TECHNOLOGY, INC.
    Inventors: Lyle E. Adams, Mark Ish, Pushpa Seetamraju, Karl D. Schuh, Dan Tupy
  • Patent number: 11474991
    Abstract: In a distributed database, a transaction is to be committed at a first coordinator server and one or more participant servers. The first coordinator server is configured to receive a notification that each participant server of the transaction is prepared at a respective prepared timestamp, the respective prepared timestamp being chosen within a time range for which the respective participant server obtained at least one lock. The first coordinator server computes the commit timestamp for the transaction equal to or greater than each of the prepared timestamps, and restricts the commit timestamp such that a second coordinator server sharing at least one of the participant servers for one or more other transactions at a shared shard cannot select the same commit timestamp for any of the other transactions. The transaction is committed at the commit timestamp.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: October 18, 2022
    Assignee: Google LLC
    Inventors: Sebastian Kanthak, Brian Frank Cooper
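    Illustrative sketch (not from the patent): choosing a commit timestamp at or above every prepared timestamp while guaranteeing that two coordinators sharing a shard can never pick the same value. Encoding the coordinator id into the timestamp's residue class is an illustrative device, not the patented restriction.
      #include <algorithm>
      #include <cstdint>
      #include <vector>

      // prepared_ts must be non-empty; coordinator_id must be < num_coordinators.
      uint64_t choose_commit_ts(const std::vector<uint64_t>& prepared_ts,
                                uint64_t coordinator_id, uint64_t num_coordinators) {
          // The commit timestamp may not be earlier than any prepared timestamp.
          uint64_t floor_ts = *std::max_element(prepared_ts.begin(), prepared_ts.end());
          // Advance to the smallest timestamp >= floor_ts whose residue modulo the
          // coordinator count equals this coordinator's id, so no other coordinator
          // sharing the shard can select the same commit timestamp.
          uint64_t delta = (coordinator_id % num_coordinators + num_coordinators -
                            floor_ts % num_coordinators) % num_coordinators;
          return floor_ts + delta;
      }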
  • Patent number: 11443044
    Abstract: A computer-implemented method for advancing speculative execution in microarchitectures is disclosed. A non-limiting example of the computer-implemented method includes receiving, by a processor, a test scenario including a first load instruction from a first memory location flagged with a delay notification and a speculative memory access instruction from a second memory location following the first load instruction. The method executes, by the processor, the first load instruction from the first memory location and delays a return of data from the first memory location for a number of processor cycles. The method executes, by the processor, the speculative memory access instruction from the second memory location during the delay in returning the data from the first memory location.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: September 13, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Olaf Knute Hendrickson, Michael P Mullen, Matthew Michael Garcia Pardini
  • Patent number: 11403227
    Abstract: This disclosure relates to a data storage method and apparatus, and a server. The method includes receiving, by a first server, a write instruction sent by a second server, storing target data in a cache of a controller, detecting a read instruction for the target data, and storing the target data in a storage medium of a non-volatile memory based on the read instruction. In other words, when the second server needs to write the target data to the first server, the target data is not only written to the cache of the first server, but also written to the storage medium of the first server. This can ensure that the data in the cache is written to the storage medium promptly.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: August 2, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Chenji Gong, Chao Zhou, Junjie Chen
  • Patent number: 10452417
    Abstract: Methods, apparatus, and articles of manufacture to virtualize performance counters are disclosed. An example method includes dividing performance events to be counted into a plurality of classes; assigning a first virtual performance counter of a virtual machine to a first performance event type in a first one of the classes; assigning a second virtual performance counter of the virtual machine to a second performance event type in a second one of the classes different from the first class; incrementing the first virtual performance counter in response to a first occurrence of the first performance event type during direct execution of guest instructions by the virtual machine; and not incrementing the first virtual performance counter in response to a second occurrence of the first performance event type during execution of emulated instructions by a hypervisor on behalf of the virtual machine.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: October 22, 2019
    Assignee: VMWARE, INC.
    Inventors: Benjamin Charles Serebrin, Daniel Michael Hecht
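    Illustrative sketch (not from the patent): a virtual performance counter whose event class determines whether it counts only during direct execution of guest instructions or also during hypervisor emulation. Class names and the two-class split are assumptions.
      #include <cstdint>

      enum class EventClass { kCountAlways, kCountDirectExecOnly };

      struct VirtualCounter {
          uint64_t value = 0;
          EventClass cls = EventClass::kCountDirectExecOnly;
      };

      // Called for each occurrence of the counter's assigned performance event type.
      void on_event(VirtualCounter& c, bool guest_in_direct_execution) {
          if (c.cls == EventClass::kCountAlways || guest_in_direct_execution) {
              ++c.value;   // attributed to the virtual machine
          }
          // Otherwise the event occurred while the hypervisor was emulating
          // instructions on the guest's behalf, so the counter is not incremented.
      }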
  • Patent number: 9361111
    Abstract: First processing circuitry processes at least part of a stream of program instructions. The first processing circuitry has registers for storing data and register renaming circuitry for mapping architectural register specifiers to physical register specifiers. A renaming data store stores renaming entries for identifying a register mapping between the architectural and physical register specifiers. At least some renaming entries have a count value indicating a number of speculation points occurring between generation of a previous count value and generation of the count value. The speculation points may, for example, be branch operations or load/store operations.
    Type: Grant
    Filed: January 9, 2013
    Date of Patent: June 7, 2016
    Assignee: ARM Limited
    Inventors: Luca Scalabrino, Melanie Emanuelle Lucie Teyssier, Cedric Denis Robert Airaud, Guillaume Schon
  • Patent number: 9032191
    Abstract: A hypervisor and one or more guest operating systems resident in a data processing system and hosted by the hypervisor are configured to selectively enable or disable branch prediction logic through separate hypervisor-mode and guest-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic for different operating scenarios.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: May 12, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 9026769
    Abstract: A processor for processing loop instructions can include an instruction reorder structure and a loop processing controller. The instruction reorder structure is configured to store decoded instructions according to program order and issue the decoded instructions for execution out of program order. The loop processing controller is configured to detect a loop in the decoded instructions stored in the instruction reorder structure and cause the instruction reorder structure to reissue the decoded instructions that form the loop for re-execution.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: May 5, 2015
    Assignee: Marvell International Ltd.
    Inventors: Sujat Jamil, R. Frank O'Bleness, Joseph Delgross, Tom Hameenanttila
  • Patent number: 8990817
    Abstract: Various systems and methods for automated error recovery in workflows. For example, one method involves receiving an operation indication. The operation indication indicates an operation that is to be performed using a multi-tier application system that includes first and second applications. The first and second applications are implemented using different tiers of the multi-tier application system. The method involves accessing dependency information that indicates first data dependencies between the first and the second applications. The method further involves determining an outcome of execution of the operation, where the determining is based on the dependency information but does not include executing the operation.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: March 24, 2015
    Assignee: Symantec Corporation
    Inventors: Debasish Garai, Sumeet S. Kembhavi
  • Patent number: 8990819
    Abstract: A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, modifies a local allocation token of the oldest aborted thread. The modification prompts the oldest aborted thread to retry a work unit associated with its absolute thread number. The oldest aborted thread subsequently initiates the retry of a successor thread by updating the successor thread's local allocation token. A corresponding apparatus and computer program product are also disclosed.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Martin Ohmacht, Raul E. Silvera, Mark G. Stoodley, Kai-Ting A. Wang
  • Patent number: 8959319
    Abstract: Embodiments of the present invention provide systems, methods, and computer program products for improving divergent conditional branches in code being executed by a processor. For example, in an embodiment, a method comprises detecting a conditional statement of a program being simultaneously executed by a plurality of threads, determining which threads evaluate a condition of the conditional statement as true and which threads evaluate the condition as false, pushing an identifier associated with the larger set of the threads onto a stack, executing code associated with a smaller set of the threads, and executing code associated with the larger set of the threads.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: February 17, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Norman Rubin, Brian D. Emberling, Michael Mantor
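    Illustrative sketch (not from the patent): handling a divergent conditional by pushing the larger thread set onto a stack and executing the smaller set first. The 32-thread mask width and the PendingPath record are assumptions.
      #include <cstdint>
      #include <stack>

      struct PendingPath {
          uint32_t mask;        // threads that will take this path later
          uint32_t resume_pc;   // where their code begins
      };

      int popcount32(uint32_t x) {
          int n = 0;
          while (x) { x &= x - 1; ++n; }
          return n;
      }

      // On a divergent branch: the identifier (mask) of the larger thread set is pushed,
      // the smaller set runs first, and the larger set is popped and run afterwards.
      void diverge(uint32_t active_mask, uint32_t taken_mask,
                   uint32_t taken_pc, uint32_t not_taken_pc,
                   std::stack<PendingPath>& divergence_stack,
                   uint32_t& exec_mask, uint32_t& pc) {
          uint32_t taken     = active_mask & taken_mask;
          uint32_t not_taken = active_mask & ~taken_mask;
          if (popcount32(taken) <= popcount32(not_taken)) {
              divergence_stack.push({not_taken, not_taken_pc});   // larger set waits
              exec_mask = taken;
              pc        = taken_pc;                               // smaller set runs now
          } else {
              divergence_stack.push({taken, taken_pc});
              exec_mask = not_taken;
              pc        = not_taken_pc;
          }
      }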
  • Patent number: 8918625
    Abstract: A processor that executes instructions out of program order is described. In some implementations, a processor detects whether a second memory operation is dependent on a first memory operation prior to memory address calculation. If the processor detects that the second memory operation is not dependent on the first memory operation, the processor is configured to allow the second memory operation to be scheduled. If the processor detects that the second memory operation is dependent on the first memory operation, the processor is configured to prevent the second memory operation from being scheduled until the first memory operation has been scheduled to reduce the likelihood of having to reexecute the second memory operation.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: December 23, 2014
    Assignee: Marvell International Ltd.
    Inventors: R. Frank O'Bleness, Sujat Jamil, Tom Hameenanttila
  • Patent number: 8918626
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: December 23, 2014
    Assignee: Oracle International Corporation
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Patent number: 8862861
    Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit may be used to indicate that a misprediction has occurred, and CTIs younger than the CTI that was mispredicted are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: October 14, 2014
    Assignee: Oracle International Corporation
    Inventors: Christopher H. Olson, Manish K. Shah
  • Patent number: 8683129
    Abstract: The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy.
    Type: Grant
    Filed: October 21, 2010
    Date of Patent: March 25, 2014
    Assignee: Oracle International Corporation
    Inventors: Tarik Ono, Mark R. Greenstreet
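    Illustrative sketch (not from the patent): overlapping a likely cache miss by sending a speculative request to the next memory level while the first-level lookup is still in flight. The miss predictor and the request callbacks are assumptions.
      #include <cstdint>
      #include <unordered_set>

      // Toy miss predictor: remembers line addresses that missed recently.
      struct MissPredictor {
          std::unordered_set<uint64_t> recent_miss_lines;
          bool likely_miss(uint64_t addr) const {
              return recent_miss_lines.count(addr >> 6) != 0;   // assumed 64-byte lines
          }
      };

      // While determining whether the line is in the cache, also decide whether the
      // reference is likely to miss; if so, simultaneously send a speculative request
      // to the lower level of the memory hierarchy.
      void access(uint64_t addr, const MissPredictor& mp,
                  void (*start_l1_lookup)(uint64_t),
                  void (*send_speculative_l2_request)(uint64_t)) {
          start_l1_lookup(addr);
          if (mp.likely_miss(addr)) {
              send_speculative_l2_request(addr);   // overlaps the would-be miss latency
          }
          // If the first-level lookup hits after all, the speculative response is dropped.
      }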
  • Patent number: 8656102
    Abstract: A method for preloading into a hierarchy of memories, bitstreams representing the configuration information for a reconfigurable processing system including several processing units. The method includes an off-execution step of determining tasks that can be executed on a processing unit subsequently to the execution of a given task. The method also includes, during execution of the given task, computing a priority for each of the tasks that can be executed. The priority depends on information relating to the current execution of the given task. The method also includes, during execution of the given task, sorting the tasks that can be executed in the order of their priorities. The method also includes, during execution of the given task, preloading into the memory, bitstreams representing the information of the configurations for the execution of the tasks that can be executed, while favoring the tasks whose priority is the highest.
    Type: Grant
    Filed: February 6, 2009
    Date of Patent: February 18, 2014
    Assignee: Commissariat a l'Energie Atomique et aux Energies Alternatives
    Inventors: Stéphane Guyetant, Stéphane Chevobbe
  • Patent number: 8041926
    Abstract: Executing a block of code is disclosed. Executing includes receiving an indication that the block of code is to be executed using a synchronization mechanism and speculatively executing the block of code on a virtual machine. The block of code may include application code. The block of code does not necessarily indicate that the block of code should be speculatively executed.
    Type: Grant
    Filed: October 6, 2010
    Date of Patent: October 18, 2011
    Assignee: Azul Systems, Inc.
    Inventors: Gil Tene, Michael A. Wolf
  • Patent number: 8028132
    Abstract: The present invention relates to mechanisms for handling and detecting collisions between threads (5, 6, 7) that execute computer program instructions out of program order. According to an embodiment of the present invention each of a plurality of threads (5, 6, 7) are associated with a respective data structure (9, 10, 11) comprising a number of bits (12) that correspond to memory elements (m0, m1, m2, mn) of a shared memory (4). When a thread accesses a memory element in the shared memory, it sets a bit in its associated data structure, which bit corresponds to the accessed memory element. This indicates that the memory element has been accessed by the thread. Collision detection may be carried out after the thread has finished executing by means of comparing the data structure of the thread with the data structures of other threads on which the thread may depend.
    Type: Grant
    Filed: December 12, 2001
    Date of Patent: September 27, 2011
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Anders Widell, Per Holmberg, Marcus Dahlström
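    Illustrative sketch (not from the patent): per-thread access maps with one bit per shared-memory element, compared after a thread finishes to detect collisions. The memory size and the bitset representation are assumptions.
      #include <bitset>
      #include <cstddef>

      constexpr size_t kMemoryElements = 1024;   // assumed number of shared-memory elements

      struct ThreadAccessMap {
          std::bitset<kMemoryElements> touched;
          void record_access(size_t element) { touched.set(element); }   // set on each access
      };

      // After a thread has finished executing out of program order, it collides with a
      // thread it may depend on exactly when their access maps share a set bit.
      bool collides(const ThreadAccessMap& a, const ThreadAccessMap& b) {
          return (a.touched & b.touched).any();
      }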
  • Patent number: 7840785
    Abstract: Executing a block of code is disclosed. Executing includes receiving an indication that the block of code is to be executed using a synchronization mechanism and speculatively executing the block of code on a virtual machine. The block of code may include application code. The block of code does not necessarily indicate that the block of code should be speculatively executed.
    Type: Grant
    Filed: September 14, 2005
    Date of Patent: November 23, 2010
    Assignee: Azul Systems, Inc.
    Inventors: Gil Tene, Michael A. Wolf