Speculative Instruction Execution, E.g., Conditional Execution, Procedural Dependencies, Instruction Invalidation (epo) Patents (Class 712/E9.05)
  • Patent number: 12164923
    Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing, based on an index derived from a predicate of the instruction, elements in a vector of the instruction, where the elements in the vector are contiguously mapped, then, after the mapped elements are processed, decompressing the processed mapped elements, where the processed mapped elements are reverse mapped based on the index.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Elliott David Binder, Onur Kayiran, Masab Ahmad
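    Illustrative sketch (C): the compress/process/decompress flow described in the abstract above can be modeled in software roughly as follows. The element type, vector width, index representation, and per-element operation are assumptions for the example, not details taken from the patent.
      /* Minimal software sketch of the compress/process/decompress flow
       * described in the abstract above (US 12,164,923). The index is simply
       * the list of active lane positions derived from the predicate; element
       * type, vector width, and the per-element operation are assumptions. */
      #include <stdio.h>

      #define VLEN 8

      static void process_element(float *x) { *x = *x * 2.0f; }  /* stand-in op */

      static void execute_predicated(float vec[VLEN], const int pred[VLEN]) {
          int index[VLEN];      /* mapping: packed slot -> original lane */
          float packed[VLEN];
          int n = 0;

          /* Compress: contiguously map only the lanes whose predicate bit is set. */
          for (int lane = 0; lane < VLEN; lane++) {
              if (pred[lane]) {
                  index[n] = lane;
                  packed[n] = vec[lane];
                  n++;
              }
          }

          /* Process the packed (contiguous) elements. */
          for (int i = 0; i < n; i++)
              process_element(&packed[i]);

          /* Decompress: reverse-map results to their original lanes via the index. */
          for (int i = 0; i < n; i++)
              vec[index[i]] = packed[i];
      }

      int main(void) {
          float vec[VLEN] = {1, 2, 3, 4, 5, 6, 7, 8};
          int pred[VLEN]  = {1, 0, 1, 0, 0, 1, 1, 0};
          execute_predicated(vec, pred);
          for (int i = 0; i < VLEN; i++)
              printf("%.1f ", vec[i]);
          printf("\n");  /* lanes 0, 2, 5, 6 doubled; the rest unchanged */
          return 0;
      }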
  • Patent number: 12147349
    Abstract: Disclosed is a processor for performing speculative, out-of-order execution. The processor may include a core and an L1 cache memory. The core may include a speculative track buffer (STB) that stores speculative track information to track a speculative instruction when it is recorded in a reorder buffer (ROB), and a load queue (LQ) that, when the speculative success or failure of a first speculative instruction is decided, transmits a commit doorbell signal or a restore doorbell signal for the first speculative block to which that instruction belongs to the L1 cache memory, based on the first speculative track information of that instruction. The L1 cache memory may include a write buffer.
    Type: Grant
    Filed: December 15, 2022
    Date of Patent: November 19, 2024
    Assignee: Korea University Research and Business Foundation
    Inventors: Taeweon Suh, Gunjae Koo, Jongmin Lee, Junyeon Lee
  • Patent number: 12135681
    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.
    Type: Grant
    Filed: July 20, 2022
    Date of Patent: November 5, 2024
    Assignee: Apple Inc.
    Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Boris S. Alvarez-Heredia, Ran A. Chachick
  • Patent number: 12131073
    Abstract: A set of submission queues associated with a host system is identified. A first set of internal queues and a second set of internal queues is generated based on the set of submission queues. Responsive to fetching a first memory access command pending in a submission queue of the set of submission queues, a first internal queue of the first set of internal queues is populated. Responsive to processing the first memory access command from the first internal queue of the first set of internal queues, a second internal queue of the second set of internal queues is populated. Responsive to completion of the first memory access command from the second internal queue of the second set of internal queues, an indication of the completion of the first memory access command is returned to the host system.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: October 29, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Muthazhagan Balasubramani, Woei Chen Peh
  • Patent number: 12112396
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: August 21, 2023
    Date of Patent: October 8, 2024
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
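    Illustrative sketch (C): one way to read the assembly step in the abstract above is that valid work items are packed first so that invalid items end up sharing whole (skippable) cycles across the lanes. The lane count, task size, and data layout below are assumptions for the example, not the patented arrangement.
      /* Sketch of assembling work items so invalid ones are temporally aligned
       * across processing lanes (US 12,112,396). Valid items fill the grid
       * cycle by cycle; invalid items cluster in trailing cycles that can be
       * skipped. Lane count and task size are illustrative assumptions. */
      #include <stdio.h>

      #define LANES  4
      #define CYCLES 4   /* max work items per task = LANES * CYCLES */

      /* item ids >= 0 are valid; -1 marks an invalid work item */
      static void assemble_task(const int items[LANES * CYCLES],
                                int grid[CYCLES][LANES]) {
          int slot = 0;
          /* Place valid items first, filling lane-by-lane within each cycle. */
          for (int i = 0; i < LANES * CYCLES; i++) {
              if (items[i] >= 0) {
                  grid[slot / LANES][slot % LANES] = items[i];
                  slot++;
              }
          }
          /* Remaining slots are invalid and now share whole cycles. */
          for (; slot < LANES * CYCLES; slot++)
              grid[slot / LANES][slot % LANES] = -1;
      }

      int main(void) {
          int items[LANES * CYCLES] = {0, -1, 1, -1, 2, 3, -1, -1,
                                       4, -1, 5, 6, -1, 7, -1, -1};
          int grid[CYCLES][LANES];
          assemble_task(items, grid);
          for (int c = 0; c < CYCLES; c++) {
              int all_invalid = 1;
              for (int l = 0; l < LANES; l++)
                  if (grid[c][l] >= 0) all_invalid = 0;
              printf("cycle %d:%s", c, all_invalid ? " (skippable)" : "");
              for (int l = 0; l < LANES; l++)
                  printf(" %2d", grid[c][l]);
              printf("\n");
          }
          return 0;
      }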
  • Patent number: 12093261
    Abstract: Techniques related to cache storage formats are disclosed. In some embodiments, a set of values is stored in a cache as a set of first representations and a set of second representations. For example, the set of first representations may be a set of hardware-level representations, and the set of second representations may be a set of non-hardware-level representations. Responsive to receiving a query to be executed over the set of values, a determination is made as to whether or not it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations. If the determination indicates that it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations, the query is executed over the set of first representations.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: September 17, 2024
    Assignee: Oracle International Corporation
    Inventors: Aurosish Mishra, Shasank K. Chavan, Vinita Subramanian, Ekrem S. C. Soylemez, Adam Kociubes, Eugene Karichkin, Garret F. Swart
  • Patent number: 12086653
    Abstract: A processor is described. The processor includes model specific register space that is visible to software above a BIOS level. The model specific register space is to specify a granularity of a processing entity of a lock-step group. The processor also includes logic circuitry to support dynamic entry/exit of the lock-step group's processing entities to/from lock-step mode including: i) termination of lock-step execution by the processing entities before the program code to be executed in lock-step is fully executed; and, ii) as part of the exit from the lock-step mode, restoration of a state of a shadow processing entity of the processing entities as the state existed before the shadow processing entity entered the lock-step mode and began lock-step execution of the program code.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: September 10, 2024
    Assignee: Intel Corporation
    Inventors: Vedvyas Shanbhogue, Jeff A. Huxel, Jeffrey G. Wiedemeier, James D. Allen, Arvind Raman, Krishnakumar Ganapathy
  • Patent number: 12086603
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: October 27, 2022
    Date of Patent: September 10, 2024
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Patent number: 12086600
    Abstract: Embodiments of the present disclosure include techniques for branch prediction. A branch predictor may be included in a front end of a processor. The branch predictor may store branch targets in a branch target buffer. The branch target buffer includes shared bits, which may be combined with branch target bits to specify branch target destination addresses. Shared bits may result in more efficient memory usage in the processor, for example.
    Type: Grant
    Filed: December 5, 2022
    Date of Patent: September 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Somasundaram Arunachalam, Daren Eugene Streett, Richard William Doing
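    Illustrative sketch (C): the shared-bits idea in the abstract above can be pictured as a branch target buffer entry that keeps only the low target bits plus a selector for a shared set of upper bits. The field widths and table shape are assumptions for the example.
      /* Hedged sketch of shared branch-target bits (US 12,086,600): a BTB entry
       * stores the low bits of a target, and shared upper bits selected by the
       * entry are concatenated with them to form the full destination address.
       * Field widths and the number of shared values are assumptions. */
      #include <stdint.h>
      #include <stdio.h>

      #define TARGET_BITS 16u                        /* low bits kept per entry  */
      #define TARGET_MASK ((1u << TARGET_BITS) - 1u)

      struct btb_entry {
          uint16_t target_low;   /* low target bits stored per branch          */
          uint8_t  shared_sel;   /* selects which shared upper-bit value to use */
      };

      static uint32_t shared_upper[4];               /* shared bits, few entries */

      static uint32_t btb_target(const struct btb_entry *e) {
          /* Concatenate the selected shared upper bits with the per-entry low bits. */
          return (shared_upper[e->shared_sel] << TARGET_BITS) | e->target_low;
      }

      int main(void) {
          shared_upper[0] = 0x0040;                  /* e.g. code region 0x0040xxxx */
          struct btb_entry e = { .target_low = 0x12A0 & TARGET_MASK, .shared_sel = 0 };
          printf("predicted target = 0x%08x\n", (unsigned)btb_target(&e));  /* 0x004012a0 */
          return 0;
      }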
  • Patent number: 12086042
    Abstract: A tracing circuit is integrated in a semiconductor device along with a microprocessor including an m-bit program counter, and externally outputs a tracing clock along with an n-bit tracing data (where 2 ≤ n ≤ m). The tracing circuit, when the program counter remains unchanged, synchronously with the tracing clock sets the tracing data to a first output value; when the program counter is incremented, synchronously with the tracing clock sets the tracing data to a second output value; and when the program counter is loaded, synchronously with the tracing clock sets the tracing data to a third output value, and then suspends the state machine in the microprocessor and split-outputs, as the tracing data, the branch destination address or interrupt destination address loaded in the program counter.
    Type: Grant
    Filed: October 16, 2020
    Date of Patent: September 10, 2024
    Assignee: Rohm Co., Ltd.
    Inventor: Takahiro Nishiyama
  • Patent number: 12039305
    Abstract: A method for compilation, an electronic device, and a readable storage medium are provided. The method includes analyzing source program data to determine a target irregular branch, generating an updated data flow graph according to the target irregular branch, and mapping the updated data flow graph to target hardware to complete the compilation.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: July 16, 2024
    Assignees: Beijing Superstring Academy of Memory Technology, Tsinghua University
    Inventors: Baofen Yuan, Shouyi Yin, Shaojun Wei
  • Patent number: 12008352
    Abstract: A loop within computer code is transformed to minimize loop iterations. Using statistical information relating to the loop, a determination is made whether a loop that has an early-exit indication is to be transformed to minimize its iterations. Based on determining that the loop is to be transformed, the loop is transformed.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: June 11, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Wai Hung Tsang, Ettore Tiotto, Bardia Mahjour
  • Patent number: 11989134
    Abstract: An apparatus comprising translation circuitry to perform a translation operation to generate a translated second memory address within a second memory address space as a translation of a first memory address within a first memory address space, in which the translation circuitry is configured to generate the translated second memory address in dependence upon translation information stored at one or more translation information addresses; permission circuitry to perform an operation to detect permission information to indicate, for a given second memory address, whether memory access is permitted to the given second memory address; and access circuitry to allow access to data stored at the given second memory address when the permission information indicates that memory access is permitted to the given second memory address.
    Type: Grant
    Filed: March 8, 2021
    Date of Patent: May 21, 2024
    Assignee: Arm Limited
    Inventors: Yuval Elad, Jason Parker, Richard Roy Grisenthwaite, Simon John Craske, Alexander Donald Charles Chadwick
  • Patent number: 11960892
    Abstract: In one embodiment, a system includes a memory and a processor core. The processor core includes functional units and an instruction decode unit configured to determine whether an execute packet of instructions received by the processing core includes a first instruction that is designated for execution by a first functional unit of the functional units and a second instruction that is a condition code extension instruction that includes a plurality of sets of condition code bits, wherein each set of condition code bits corresponds to a different one of the functional units, and wherein the sets of condition code bits include a first set of condition code bits that corresponds to the first functional unit. When the execute packet includes the first and second instructions, the first functional unit is configured to execute the first instruction conditionally based upon the first set of condition code bits in the second instruction.
    Type: Grant
    Filed: July 22, 2022
    Date of Patent: April 16, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Duc Quang Bui, Joseph Raymond Michael Zbiciak
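    Illustrative sketch (C): the condition code extension instruction described in the abstract above can be modeled as an extension word carrying one condition field per functional unit, with each unit predicating its own instruction on its field. The field width, unit count, and condition encoding below are assumptions, not the patented encoding.
      /* Sketch of per-unit condition-code fields in an extension instruction
       * (US 11,960,892). Each functional unit extracts its own field and
       * decides whether to execute or squash its instruction. The 4-bit field
       * width and the predicate-register encoding are assumptions. */
      #include <stdint.h>
      #include <stdio.h>

      #define NUM_UNITS     8
      #define CC_FIELD_BITS 4

      /* Extract the condition-code field that corresponds to one functional unit. */
      static unsigned cc_field(uint32_t cc_ext_word, unsigned unit) {
          return (cc_ext_word >> (unit * CC_FIELD_BITS)) & ((1u << CC_FIELD_BITS) - 1u);
      }

      /* Assumed encoding: 0 = always execute; otherwise bit 0 selects the
       * required predicate value and bits 3:1 select a predicate register. */
      static int condition_holds(unsigned field, const uint8_t pred_regs[8]) {
          if (field == 0) return 1;
          unsigned reg  = (field >> 1) & 0x7;
          unsigned want = field & 0x1;
          return pred_regs[reg] == want;
      }

      int main(void) {
          uint8_t pred_regs[8] = {0, 1, 0, 0, 0, 0, 0, 0};
          uint32_t cc_ext = 0;
          cc_ext |= ((1u << 1) | 1u) << (2 * CC_FIELD_BITS);  /* unit 2: run if p1 == 1 */
          cc_ext |= 1u << (5 * CC_FIELD_BITS);                /* unit 5: run if p0 == 1 */

          for (unsigned unit = 0; unit < NUM_UNITS; unit++) {
              unsigned f = cc_field(cc_ext, unit);
              printf("unit %u: %s (cc field 0x%x)\n", unit,
                     condition_holds(f, pred_regs) ? "execute" : "squash", f);
          }
          return 0;
      }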
  • Patent number: 11947455
    Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method include a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache, causing eviction of the first data from the cache line. Based on the tagging, the processor compares the first data and the second data and suppresses modification of the cache line based on the comparison.
    Type: Grant
    Filed: April 17, 2023
    Date of Patent: April 2, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Paul J. Moyer
  • Patent number: 11928470
    Abstract: Introduced herein is a program counter advancing technique that uses NOP padding without its limitations. During a build process, the introduced technique removes EOG markers for instruction groups that are immediately followed by the NOP instructions that are immediately followed by an instruction group beginning at a start of a cache line. As such, during an execution process, when the processing unit detects an absence of an EOG marker in the requested instruction group, it knows that a group of NOP instructions is about to follow and skips over them by directly advancing the program counter to a start of a subsequent cache line where the next instruction group starts. In addition to the presence of an EOG marker, the introduced technique also takes into account whether the requested instruction group is a straddling group when advancing the program counter to a start of the subsequent cache line.
    Type: Grant
    Filed: April 20, 2022
    Date of Patent: March 12, 2024
    Assignee: VeriSilicon Holdings Co., Ltd.
    Inventor: Tracy T. Nguyen
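    Illustrative sketch (C): the program counter advance in the abstract above amounts to snapping the PC to the next cache-line boundary whenever the fetched group lacks an end-of-group marker. Cache-line size, group representation, and the straddling check are assumptions for the example.
      /* Sketch of the NOP-skipping PC advance (US 11,928,470): no EOG marker
       * means only NOP padding remains in the line, so jump to the start of
       * the next cache line, where the next group begins. */
      #include <stdint.h>
      #include <stdio.h>

      #define CACHE_LINE_BYTES 64u

      struct instr_group {
          uint32_t size_bytes;  /* bytes occupied by the group's instructions */
          int      has_eog;     /* end-of-group marker present?               */
          int      straddles;   /* group crosses a cache-line boundary?       */
      };

      static uint64_t next_pc(uint64_t pc, const struct instr_group *g) {
          uint64_t end = pc + g->size_bytes;
          if (g->has_eog || g->straddles)
              return end;                              /* normal sequential advance */
          /* No EOG marker: skip the NOP padding by snapping to the next line. */
          return (end + CACHE_LINE_BYTES - 1) & ~(uint64_t)(CACHE_LINE_BYTES - 1);
      }

      int main(void) {
          struct instr_group padded = { .size_bytes = 24, .has_eog = 0, .straddles = 0 };
          struct instr_group normal = { .size_bytes = 24, .has_eog = 1, .straddles = 0 };
          printf("padded group at 0x1000 -> next pc 0x%llx\n",
                 (unsigned long long)next_pc(0x1000, &padded));   /* 0x1040 */
          printf("normal group at 0x1000 -> next pc 0x%llx\n",
                 (unsigned long long)next_pc(0x1000, &normal));   /* 0x1018 */
          return 0;
      }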
  • Patent number: 11915006
    Abstract: A method, system and device for pipeline processing of instructions and a computer storage medium. The method comprises: acquiring a target instruction set (S101); acquiring a target prediction result, wherein the target prediction result is a result obtained by predicting a jump mode of the target instruction set (S102); performing pipeline processing on the target instruction set according to the target prediction result (S103); determining if a pipeline flushing request is received (S104); and if so, correspondingly saving the target instruction set and a corresponding pipeline processing result, so as to perform pipeline processing on the target instruction set again on the basis of the pipeline processing result (S105).
    Type: Grant
    Filed: November 28, 2019
    Date of Patent: February 27, 2024
    Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Yulong Zhou, Tongqiang Liu, Xiaofeng Zou
  • Patent number: 11847060
    Abstract: Described is a data cache with prediction hints for a cache hit. The data cache includes a plurality of cache lines, where a cache line includes a data field, a tag field, and a prediction hint field. The prediction hint field is configured to store a prediction hint which directs alternate behavior for a cache hit against the cache line. The prediction hint field is integrated with the tag field or is integrated with a way predictor field.
    Type: Grant
    Filed: March 1, 2023
    Date of Patent: December 19, 2023
    Assignee: SiFive, Inc.
    Inventors: John Ingalls, Josh Smith
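    Illustrative sketch (C): a cache line carrying the prediction hint field described in the abstract above might look like the structure below, with the hint directing alternate behavior on a hit. The specific hint meanings, field sizes, and hit handling are assumptions for illustration only.
      /* Rough model of a data-cache line with a prediction-hint field
       * (US 11,847,060). Hint values and hit behavior are assumptions. */
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      enum pred_hint {
          HINT_NONE = 0,        /* default behavior on a hit                  */
          HINT_STREAMING,       /* e.g. do not promote in the replacement LRU */
          HINT_LAST_USE         /* e.g. mark for early eviction after the hit */
      };

      struct cache_line {
          uint64_t tag;          /* tag field                                */
          uint8_t  valid;
          uint8_t  hint;         /* prediction hint, integrated with the tag */
          uint8_t  data[64];     /* data field                               */
      };

      /* On a hit, the hint directs alternate behavior instead of the default path. */
      static void on_hit(struct cache_line *line) {
          switch ((enum pred_hint)line->hint) {
          case HINT_STREAMING: printf("hit: skip LRU promotion\n"); break;
          case HINT_LAST_USE:  printf("hit: schedule early eviction\n"); break;
          default:             printf("hit: default behavior\n"); break;
          }
      }

      int main(void) {
          struct cache_line line;
          memset(&line, 0, sizeof line);
          line.tag = 0xABCD; line.valid = 1; line.hint = HINT_STREAMING;
          on_hit(&line);
          return 0;
      }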
  • Patent number: 11797403
    Abstract: Maintaining a synchronous replication relationship between two or more storage systems, including: receiving, by at least one of a plurality of storage systems across which a dataset will be synchronously replicated, timing information for at least one of the plurality of storage systems; and establishing, based on the timing information, a synchronous replication lease describing a period of time during which the synchronous replication relationship is valid, wherein a request to modify the dataset may only be acknowledged after a copy of the dataset has been modified on each of the storage systems.
    Type: Grant
    Filed: September 12, 2022
    Date of Patent: October 24, 2023
    Assignee: PURE STORAGE, INC.
    Inventors: David Grunwald, Steven Hodgson, Ronald Karr, Kunal Trivedi, Christopher Golden, Thomas Gill, Connor Brooks, Zoheb Shivani
  • Patent number: 11797309
    Abstract: An apparatus and method for tracking speculative execution flow and detecting potential vulnerabilities.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: October 24, 2023
    Assignee: Intel Corporation
    Inventors: Carlos Rozas, Francis McKeen, Pasquale Cocchini, Meltem Ozsoy, Matthew Fernandez
  • Patent number: 11782845
    Abstract: An apparatus comprises memory management circuitry to perform a translation table walk for a target address of a memory access request and to signal a fault in response to the translation table walk identifying a fault condition for the target address, prefetch circuitry to generate a prefetch request to request prefetching of information associated with a prefetch target address to a cache; and faulting address prediction circuitry to predict whether the memory management circuitry would identify the fault condition for the prefetch target address if the translation table walk was performed by the memory management circuitry for the prefetch target address. In response to a prediction that the fault condition would be identified for the prefetch target address, the prefetch circuitry suppresses the prefetch request and the memory management circuitry prevents the translation table walk being performed for the prefetch target address of the prefetch request.
    Type: Grant
    Filed: December 2, 2021
    Date of Patent: October 10, 2023
    Assignee: Arm Limited
    Inventors: Alexander Cole Shulyak, Joseph Michael Pusdesris, Abhishek Raja, Karthik Sundaram, Anoop Ramachandra Iyer, Michael Brian Schinzler, James David Dundas, Yasuo Ishii
  • Patent number: 11748284
    Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers with each buffer associated with a particular traffic type and a source of the traffic. Arbiters maintain a respective urgency counter for keeping track of a period of time traffic of a particular type is blocked by upstream arbiters. When the block is removed, the traffic of the particular type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirement, low latency tolerance and active status cause adjustments in selection priority of stored requests.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: September 5, 2023
    Assignee: Apple Inc.
    Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
  • Patent number: 11740909
    Abstract: A system including a computer storage and a processor is described. The computer storage is configured to identify a stored data as protected. The processor is configured to perform speculative execution. To perform the speculative execution, the processor is configured to determine, in response to the speculative execution of an instruction to read the stored data, whether the stored data is identified as protected. In response to a determination that the stored data attempted to be read during the speculative execution is protected, the processor is configured to disallow during the speculative execution immediate successful completion of the instruction to read the stored data.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: August 29, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Hao Wang, Harish Dattatraya Dixit, Shobhit O. Kanaujia
  • Patent number: 11734788
    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: August 22, 2023
    Assignee: Imagination Technologies Limited
    Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
  • Patent number: 11734194
    Abstract: A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.
    Type: Grant
    Filed: April 4, 2022
    Date of Patent: August 22, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Mujibur Rahman
  • Patent number: 11704301
    Abstract: Provided is a method for performing a file system consistency check. The method comprises calculating, by a first thread that does not have access to an inode table, file block addresses for one or more files to be checked by the thread. The method further comprises collecting validity information for the one or more files. The method further comprises reading information relating to the one or more files from the inode table. The reading is performed in response to the thread being given access to the inode table after the calculating operation. The method further comprises validating the information by comparing the information from the inode table to the validity information.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: July 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Huzefa Pancha, Abhishek Jain, Sasikanth Eda, Karthik Iyer
  • Patent number: 11698794
    Abstract: A method to provide flexible access to an internal data of a regulated system, the method comprising receiving, by a data access component of the regulated system, a loadable configuration file defining a set of triggering events and a set of memory, determining the occurrence of a single triggering event, accessing at least a subset of memory that contains the internal data of the avionics system to retrieve data associated with the one or more memory of the set of memory, and outputting the retrieved data to a receiving component.
    Type: Grant
    Filed: September 2, 2020
    Date of Patent: July 11, 2023
    Assignee: GE Aviation Systems LLC
    Inventors: Joachim Karl Ulf Hochwarth, Victor Mario Leal Herrera, Antonio Lugo Trejo, Joshua Michael Krbez
  • Patent number: 11693666
    Abstract: A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: July 4, 2023
    Assignee: Arm Limited
    Inventors: Joseph Michael Pusdesris, Nicholas Andrew Plante, Yasuo Ishii, Chris Abernathy
  • Patent number: 11650818
    Abstract: A processor includes an execution unit and a processing logic operatively coupled to the execution unit, the processing logic to: enter a first execution state and transition to a second execution state responsive to executing a control transfer instruction. Responsive to executing a target instruction of the control transfer instruction, the processing logic further transitions to the first execution state responsive to the target instruction being a control transfer termination instruction of a mode identical to a mode of the processing logic following the execution of the control transfer instruction; and raises an execution exception responsive to the target instruction being a control transfer termination instruction of a mode different than the mode of the processing logic following the execution of the control transfer instruction.
    Type: Grant
    Filed: August 17, 2021
    Date of Patent: May 16, 2023
    Assignee: Intel Corporation
    Inventors: Vedvyas Shanbhogue, Jason W. Brandt, Ravi L. Sahita, Xiaoning Li
  • Patent number: 11645073
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: May 9, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
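    Illustrative sketch (C): the filtering described in the abstract above can be pictured as a small table indexed by address range; if no dependent instruction has recorded a physical address in that range, the expensive lookup in the ordering-violation structure is bypassed. Table size, range granularity, and the index function are assumptions.
      /* Hedged sketch of address-based filtering for load/store speculation
       * (US 11,645,073). The filter table only records "a dependent address
       * was seen in this range"; a clear entry lets the check be bypassed. */
      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define FILTER_ENTRIES 256u
      #define RANGE_SHIFT    6u        /* one entry per 64-byte address range */

      static bool filter_table[FILTER_ENTRIES];

      static unsigned filter_index(uint64_t addr) {
          return (unsigned)((addr >> RANGE_SHIFT) % FILTER_ENTRIES);
      }

      /* Called when a dependent instruction generates a physical address. */
      static void filter_record(uint64_t addr) {
          filter_table[filter_index(addr)] = true;
      }

      /* Called for an ordering-check triggering transaction (e.g. an older store). */
      static void ordering_check(uint64_t target_addr) {
          if (!filter_table[filter_index(target_addr)]) {
              printf("0x%llx: no dependent address seen, bypass violation lookup\n",
                     (unsigned long long)target_addr);
              return;
          }
          printf("0x%llx: possible in-flight dependent, search violation structure\n",
                 (unsigned long long)target_addr);
      }

      int main(void) {
          filter_record(0x2000);     /* a younger load produced this address */
          ordering_check(0x2000);    /* must consult the ordering structure  */
          ordering_check(0x9000);    /* bypassed                             */
          return 0;
      }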
  • Patent number: 11630772
    Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method include a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache, causing eviction of the first data from the cache line. Based on the tagging, the processor compares the first data and the second data and suppresses modification of the cache line based on the comparison.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: April 18, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Paul J. Moyer
  • Patent number: 11614940
    Abstract: A method to compare first and second source data in a processor in response to a vector maximum with indexing instruction includes specifying first and second source registers containing first and second source data, a destination register storing compared data, and a predicate register. Each of the registers includes a plurality of lanes. The method includes executing the instruction by, for each lane in the first and second source register, comparing a value in the lane of the first source register to a value in the corresponding lane of the second source register to identify a maximum value, storing the maximum value in a corresponding lane of the destination register, asserting a corresponding lane of the predicate register if the maximum value is from the first source register, and de-asserting the corresponding lane of the predicate register if the maximum value is from the second source register.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: March 28, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Duc Bui, Peter Richard Dent, Timothy D. Anderson
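    Illustrative sketch (C): the lane-by-lane behavior of the vector maximum with indexing instruction described in the abstract above is easy to model in software. Lane count, element type, the byte-per-lane predicate, and the tie-breaking choice (a tie asserts the predicate) are assumptions for the example.
      /* Software model of vector maximum with indexing (US 11,614,940):
       * each destination lane receives the larger source value, and the
       * predicate lane records which source register it came from. */
      #include <stdint.h>
      #include <stdio.h>

      #define LANES 8

      static void vmax_with_index(const int32_t src1[LANES], const int32_t src2[LANES],
                                  int32_t dst[LANES], uint8_t pred[LANES]) {
          for (int lane = 0; lane < LANES; lane++) {
              if (src1[lane] >= src2[lane]) {
                  dst[lane]  = src1[lane];
                  pred[lane] = 1;            /* maximum came from the first source  */
              } else {
                  dst[lane]  = src2[lane];
                  pred[lane] = 0;            /* maximum came from the second source */
              }
          }
      }

      int main(void) {
          int32_t a[LANES] = {5, -2, 9, 4, 0, 7, 3, 8};
          int32_t b[LANES] = {1,  6, 2, 4, 3, 7, 4, 5};
          int32_t dst[LANES];
          uint8_t pred[LANES];
          vmax_with_index(a, b, dst, pred);
          for (int i = 0; i < LANES; i++)
              printf("lane %d: max=%d from %s\n", i, dst[i], pred[i] ? "src1" : "src2");
          return 0;
      }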
  • Patent number: 11520591
    Abstract: Processing data in an information handling system is disclosed that includes: in response to an event that triggers a flushing operation, calculate a finish ratio, wherein the finish ratio is a ratio of a number of finished operations to a number of at least one of the group consisting of in-flight instructions, instructions pending in a processor pipeline, instructions issued to an issue queue, and instructions being processed in a processor execution unit; compare the calculated finish ratio to a threshold; and if the finish ratio is greater than the threshold, then do not perform the flushing operation. Also disclosed is moving the flush point.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: December 6, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ehsan Fatehi, Richard J. Eickemeyer, John B. Griswell, Jr.
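    Illustrative sketch (C): the decision in the abstract above reduces to computing a finish ratio on a flush-triggering event and skipping the flush when it exceeds a threshold. The threshold value and the use of the in-flight instruction count as the denominator are assumptions for the example.
      /* Small model of the finish-ratio flush check (US 11,520,591). */
      #include <stdbool.h>
      #include <stdio.h>

      static bool should_flush(unsigned finished_ops, unsigned in_flight,
                               double threshold) {
          if (in_flight == 0)
              return false;                      /* nothing to flush */
          double finish_ratio = (double)finished_ops / (double)in_flight;
          return finish_ratio <= threshold;      /* above threshold: skip the flush */
      }

      int main(void) {
          /* e.g. 45 of 60 in-flight instructions already finished, threshold 0.5 */
          printf("flush? %s\n", should_flush(45, 60, 0.5) ? "yes" : "no");  /* no  */
          printf("flush? %s\n", should_flush(10, 60, 0.5) ? "yes" : "no");  /* yes */
          return 0;
      }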
  • Patent number: 11520581
    Abstract: A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memory includes memory banks configured to store data used by each of the processor units to perform the arithmetic operations. The processor units and the vector memory are tightly coupled within an area of the vector processing unit such that data communications are exchanged at a high bandwidth based on the placement of respective processor units relative to one another, and based on the placement of the vector memory relative to each processor unit.
    Type: Grant
    Filed: May 24, 2021
    Date of Patent: December 6, 2022
    Assignee: Google LLC
    Inventors: William Lacy, Gregory Michael Thorson, Christopher Aaron Clark, Norman Paul Jouppi, Thomas Norrie, Andrew Everett Phelps
  • Patent number: 11494191
    Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: November 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
  • Patent number: 11481348
    Abstract: A first operation identifier is assigned to a current operation directed to a memory component, the first operation identifier having a first entry in a first data structure that associates the first operation identifier with a first buffer identifier. It is determined whether the current operation collides with a prior operation assigned a second operation identifier, the second operation identifier having a second entry in the first data structure that associates the second operation identifier with a second buffer identifier. A latest flag is updated to indicate that the first entry is a latest operation directed to an address (1) in response to determining that the current operation collides with the prior operation and that the current and prior operations are read operations, or (2) in response to determining that the current operation does not collide with a prior operation.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: October 25, 2022
    Assignee: MICRON TECHNOLOGY, INC.
    Inventors: Lyle E. Adams, Mark Ish, Pushpa Seetamraju, Karl D. Schuh, Dan Tupy
  • Patent number: 11474991
    Abstract: In a distributed database, a transaction is to be committed at a first coordinator server and one or more participant servers. The first coordinator server is configured to receive a notification that each participant server of the transaction is prepared at a respective prepared timestamp, the respective prepared timestamp being chosen within a time range for which the respective participant server obtained at least one lock. The first coordinator server computes the commit timestamp for the transaction equal to or greater than each of the prepared timestamps, and restricts the commit timestamp such that a second coordinator server sharing at least one of the participant servers for one or more other transactions at a shared shard cannot select the same commit timestamp for any of the other transactions. The transaction is committed at the commit timestamp.
    Type: Grant
    Filed: March 13, 2018
    Date of Patent: October 18, 2022
    Assignee: Google LLC
    Inventors: Sebastian Kanthak, Brian Frank Cooper
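    Illustrative sketch (C): one way to picture the commit-timestamp rule in the abstract above is to take the maximum of the prepared timestamps and then restrict the choice so that two coordinators sharing a shard can never pick the same value. Representing coordinators by a small integer id and reserving the low timestamp bits for that id is an illustrative assumption, not the mechanism claimed in the patent.
      /* Hedged sketch of commit-timestamp selection (US 11,474,991):
       * commit_ts >= every prepared timestamp, and the low bits are forced to
       * the coordinator id so shared-shard coordinators stay disjoint. */
      #include <stdint.h>
      #include <stdio.h>

      #define COORD_BITS 4u   /* low bits reserved to keep coordinators disjoint */

      static uint64_t choose_commit_ts(const uint64_t prepared_ts[], int n_participants,
                                       unsigned coordinator_id) {
          uint64_t ts = 0;
          for (int i = 0; i < n_participants; i++)
              if (prepared_ts[i] > ts)
                  ts = prepared_ts[i];           /* >= every prepared timestamp */

          /* Round up to a multiple of 2^COORD_BITS, then stamp in the
           * coordinator id so no other coordinator can pick the same value. */
          uint64_t step = 1u << COORD_BITS;
          uint64_t base = (ts + step - 1) & ~(uint64_t)(step - 1);
          return base | (coordinator_id & (step - 1));
      }

      int main(void) {
          uint64_t prepared[] = {1005, 998, 1012};
          printf("coordinator 3 commits at %llu\n",
                 (unsigned long long)choose_commit_ts(prepared, 3, 3));  /* 1027 */
          printf("coordinator 7 commits at %llu\n",
                 (unsigned long long)choose_commit_ts(prepared, 3, 7));  /* 1031 */
          return 0;
      }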
  • Patent number: 11443044
    Abstract: A computer-implemented method for advancing speculative execution in microarchitectures is disclosed. A non-limiting example of the computer-implemented method includes receiving, by a processor, a test scenario including a first load instruction from a first memory location flagged with a delay notification and a speculative memory access instruction from a second memory location following the first load instruction. The method executes, by the processor, the first load instruction from the first memory location and delays a return of data from the first memory location for a number of processor cycles. The method executes, by the processor, the speculative memory access instruction from the second memory location during the delay in returning the data from the first memory location.
    Type: Grant
    Filed: September 23, 2019
    Date of Patent: September 13, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Olaf Knute Hendrickson, Michael P Mullen, Matthew Michael Garcia Pardini
  • Patent number: 11403227
    Abstract: This disclosure relates to a data storage method and apparatus, and a server. The method includes receiving, by a first server, a write instruction sent by a second server, storing target data in a cache of a controller, detecting a read instruction for the target data, and storing the target data in a storage medium of a non-volatile memory based on the read instruction. In other words, when the second server needs to write the target data to the first server, the target data is not only written to the cache of the first server, but also written to the storage medium of the first server. This can ensure that the data in the cache is written to the storage medium promptly.
    Type: Grant
    Filed: March 12, 2021
    Date of Patent: August 2, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Chenji Gong, Chao Zhou, Junjie Chen
  • Patent number: 10452417
    Abstract: Methods, apparatus, and articles of manufacture to virtualize performance counters are disclosed. An example method includes dividing performance events to be counted into a plurality of classes; assigning a first virtual performance counter of a virtual machine to a first performance event type in a first one of the classes; assigning a second virtual performance counter of the virtual machine to a second performance event type in a second one of the classes different from the first class; incrementing the first virtual performance counter in response to a first occurrence of the first performance event type during direct execution of guest instructions by the virtual machine; and not incrementing the first virtual performance counter in response to a second occurrence of the first performance event type during execution of emulated instructions by a hypervisor on behalf of the virtual machine.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: October 22, 2019
    Assignee: VMWARE, INC.
    Inventors: Benjamin Charles Serebrin, Daniel Michael Hecht
  • Patent number: 9361111
    Abstract: First processing circuitry processes at least part of a stream of program instructions. The first processing circuitry has registers for storing data and register renaming circuitry for mapping architectural register specifiers to physical register specifiers. A renaming data store stores renaming entries for identifying a register mapping between the architectural and physical register specifiers. At least some renaming entries have a count value indicating a number of speculation points occurring between generation of a previous count value and generation of the count value. The speculation points may, for example, be branch operations or load/store operations.
    Type: Grant
    Filed: January 9, 2013
    Date of Patent: June 7, 2016
    Assignee: ARM Limited
    Inventors: Luca Scalabrino, Melanie Emanuelle Lucie Teyssier, Cedric Denis Robert Airaud, Guillaume Schon
  • Patent number: 9032191
    Abstract: A hypervisor and one or more guest operating systems resident in a data processing system and hosted by the hypervisor are configured to selectively enable or disable branch prediction logic through separate hypervisor-mode and guest-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic for different operating scenarios.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: May 12, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 9026769
    Abstract: A processor for processing loop instructions can include an instruction reorder structure and a loop processing controller. The instruction reorder structure is configured to store decoded instructions according to program order and issue the decoded instructions for execution out of program order. The loop processing controller is configured to detect a loop in the decoded instructions stored in the instruction reorder structure and cause the instruction reorder structure to reissue the decoded instructions that form the loop for re-execution.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: May 5, 2015
    Assignee: Marvell International Ltd.
    Inventors: Sujat Jamil, R. Frank O'Bleness, Joseph Delgross, Tom Hameenanttila
  • Patent number: 8990819
    Abstract: A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, modifies a local allocation token of the oldest aborted thread. The modification prompts the oldest aborted thread to retry a work unit associated with its absolute thread number. The oldest aborted thread subsequently initiates the retry of a successor thread by updating the successor thread's local allocation token. A corresponding apparatus and computer program product are also disclosed.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Martin Ohmacht, Raul E. Silvera, Mark G. Stoodley, Kai-Ting A. Wang
  • Patent number: 8990817
    Abstract: Various systems and methods for automated error recovery in workflows. For example, one method involves receiving an operation indication. The operation indication indicates an operation that is to be performed using a multi-tier application system that includes first and second applications. The first and second applications are implemented using different tiers of the multi-tier application system. The method involves accessing dependency information that indicates first data dependencies between the first and the second applications. The method further involves determining outcome of execution of the operation, where the determining is based on the dependency information but does not include executing the operation.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: March 24, 2015
    Assignee: Symantec Corporation
    Inventors: Debasish Garai, Sumeet S. Kembhavi
  • Patent number: 8959319
    Abstract: Embodiments of the present invention provide systems, methods, and computer program products for improving divergent conditional branches in code being executed by a processor. For example, in an embodiment, a method comprises detecting a conditional statement of a program being simultaneously executed by a plurality of threads, determining which threads evaluate a condition of the conditional statement as true and which threads evaluate the condition as false, pushing an identifier associated with the larger set of the threads onto a stack, executing code associated with a smaller set of the threads, and executing code associated with the larger set of the threads.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: February 17, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Norman Rubin, Brian D. Emberling, Michael Mantor
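    Illustrative sketch (C): the divergence handling in the abstract above can be modeled with per-thread condition evaluation, a reconvergence stack that receives the larger set, and execution of the smaller set first. Thread count, the bit-mask representation, and the "execute" stand-ins are assumptions for the example.
      /* Sketch of divergent-branch handling (US 8,959,319): push the larger
       * thread set onto a stack, run the smaller set, then pop and run the
       * larger set. */
      #include <stdint.h>
      #include <stdio.h>

      #define THREADS 8

      static int popcount32(uint32_t x) { int c = 0; while (x) { c += x & 1u; x >>= 1; } return c; }

      static void run(const char *label, uint32_t mask) {
          printf("%s with mask 0x%02x (%d threads)\n", label, (unsigned)mask, popcount32(mask));
      }

      static void branch_divergent(const int data[THREADS]) {
          uint32_t taken = 0, not_taken = 0;
          for (int t = 0; t < THREADS; t++)          /* evaluate the condition per thread */
              if (data[t] > 0) taken |= 1u << t; else not_taken |= 1u << t;

          uint32_t larger  = popcount32(taken) >= popcount32(not_taken) ? taken : not_taken;
          uint32_t smaller = (larger == taken) ? not_taken : taken;

          uint32_t stack[4]; int sp = 0;
          stack[sp++] = larger;                      /* push identifier of the larger set */
          run("smaller set first", smaller);         /* execute code for the smaller set  */
          run("larger set next", stack[--sp]);       /* pop and execute the larger set    */
      }

      int main(void) {
          int data[THREADS] = {3, -1, 5, 7, -2, 4, 6, 1};
          branch_divergent(data);
          return 0;
      }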
  • Patent number: 8918625
    Abstract: A processor that executes instructions out of program order is described. In some implementations, a processor detects whether a second memory operation is dependent on a first memory operation prior to memory address calculation. If the processor detects that the second memory operation is not dependent on the first memory operation, the processor is configured to allow the second memory operation to be scheduled. If the processor detects that the second memory operation is dependent on the first memory operation, the processor is configured to prevent the second memory operation from being scheduled until the first memory operation has been scheduled to reduce the likelihood of having to reexecute the second memory operation.
    Type: Grant
    Filed: November 15, 2011
    Date of Patent: December 23, 2014
    Assignee: Marvell International Ltd.
    Inventors: R. Frank O'Bleness, Sujat Jamil, Tom Hameenanttila
  • Patent number: 8918626
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: December 23, 2014
    Assignee: Oracle International Corporation
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Patent number: 8862861
    Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit is used to indicate that a misprediction has occurred, and CTIs younger than the mispredicted CTI are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: October 14, 2014
    Assignee: Oracle International Corporation
    Inventors: Christopher H. Olson, Manish K. Shah
  • Patent number: 8683129
    Abstract: The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy.
    Type: Grant
    Filed: October 21, 2010
    Date of Patent: March 25, 2014
    Assignee: Oracle International Corporation
    Inventors: Tarik Ono, Mark R. Greenstreet
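    Illustrative sketch (C): the speculative request in the abstract above is issued while the normal cache lookup is still resolving, whenever a predictor deems the reference likely to miss. The predictor below (a per-set saturating counter) and the memory levels are assumptions for the example.
      /* Sketch of speculative cache requests on predicted misses (US 8,683,129). */
      #include <stdbool.h>
      #include <stdint.h>
      #include <stdio.h>

      #define SETS 64u

      static uint8_t miss_counter[SETS];      /* 2-bit saturating miss predictor */

      static unsigned set_index(uint64_t addr) { return (unsigned)((addr >> 6) % SETS); }

      static bool likely_to_miss(uint64_t addr) {
          return miss_counter[set_index(addr)] >= 2;
      }

      static void train(uint64_t addr, bool missed) {
          uint8_t *c = &miss_counter[set_index(addr)];
          if (missed) { if (*c < 3) (*c)++; }
          else        { if (*c > 0) (*c)--; }
      }

      static void access_cache(uint64_t addr) {
          if (likely_to_miss(addr))
              printf("0x%llx: speculative request sent to next level in parallel\n",
                     (unsigned long long)addr);
          /* ... the normal tag lookup proceeds here; on a real miss the
           * speculative request has already shortened the miss latency ... */
          bool missed = true;                  /* assume a miss for the example */
          train(addr, missed);
      }

      int main(void) {
          for (int i = 0; i < 3; i++)          /* repeated misses train the predictor */
              access_cache(0x4000);
          return 0;
      }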