Speculative Instruction Execution, E.g., Conditional Execution, Procedural Dependencies, Instruction Invalidation (epo) Patents (Class 712/E9.05)
-
Patent number: 12164923Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing, based on an index derived from a predicate of the instruction, elements in a vector of the instruction, where the elements in the vector are contiguously mapped, then, after the mapped elements are processed, decompressing the processed mapped elements, where the processed mapped elements are reverse mapped based on the index.Type: GrantFiled: June 29, 2022Date of Patent: December 10, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Elliott David Binder, Onur Kayiran, Masab Ahmad
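As an illustration only, the compress/process/decompress flow in this abstract can be sketched in software. The function names below are hypothetical, and the choice to leave inactive lanes at their original values is an assumption; the abstract does not say what happens to elements whose predicate bit is clear.

```python
def compress(vector, predicate):
    """Pack the active elements contiguously and return them with the index."""
    # The index records which original lanes are active (derived from the predicate).
    index = [lane for lane, active in enumerate(predicate) if active]
    packed = [vector[lane] for lane in index]
    return packed, index


def decompress(processed, index, original):
    """Reverse-map processed elements back to their original lanes."""
    result = list(original)  # assumption: inactive lanes keep their old values
    for value, lane in zip(processed, index):
        result[lane] = value
    return result


def execute_predicated(op, vector, predicate):
    """Apply op only to active elements, operating on the packed form."""
    packed, index = compress(vector, predicate)
    processed = [op(x) for x in packed]
    return decompress(processed, index, vector)


doubled = execute_predicated(lambda x: x * 2, [1, 2, 3, 4], [True, False, True, False])
print(doubled)
```

Packing the active elements first means the processing step touches only useful data, which is presumably the motivation for the compress step.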
-
Patent number: 12147349Abstract: Disclosed is a processor for performing a speculative execution for an out-of-order execution. The processor may include: a core; and an L1 cache memory, and the core may include a speculative track buffer (STB) storing speculative track information in order to track the speculative instruction when a speculative instruction is recorded in a reorder buffer (ROB), and a load queue (LQ) transmitting a commit doorbell signal or a restore doorbell signal for a first speculative block to which a first speculative instruction belongs to an L1 cache memory based on first speculative track information of the first speculative instruction when a speculative success or a speculative failure of the first speculative instruction included in the speculative instruction is decided, and the L1 cache memory may include a write buffer.Type: GrantFiled: December 15, 2022Date of Patent: November 19, 2024Assignee: Korea University Research and Business FoundationInventors: Taeweon Suh, Gunjae Koo, Jongmin Lee, Junyeon Lee
-
Patent number: 12135681Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.Type: GrantFiled: July 20, 2022Date of Patent: November 5, 2024Assignee: Apple Inc.Inventors: Aditya Kesiraju, Andrew J. Beaumont-Smith, Boris S. Alvarez-Heredia, Ran A. Chachick
-
Patent number: 12131073Abstract: A set of submission queues associated with a host system is identified. A first set of internal queues and a second set of internal queues is generated based on the set of submission queues. Responsive to fetching a first memory access command pending in a submission queue of the set of submission queues, a first internal queue of the first set of internal queues is populated. Responsive to processing the first memory access command from the first internal queue of the first set of internal queues, a second internal queue of the second set of internal queues is populated. Responsive to completion of the first memory access command from the second internal queue of the second set of internal queues, an indication of the completion of the first memory access command is returned to the host system.Type: GrantFiled: November 29, 2023Date of Patent: October 29, 2024Assignee: Micron Technology, Inc.Inventors: Muthazhagan Balasubramani, Woei Chen Peh
-
Patent number: 12112396Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.Type: GrantFiled: August 21, 2023Date of Patent: October 8, 2024Assignee: Imagination Technologies LimitedInventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
-
Patent number: 12093261Abstract: Techniques related to cache storage formats are disclosed. In some embodiments, a set of values is stored in a cache as a set of first representations and a set of second representations. For example, the set of first representations may be a set of hardware-level representations, and the set of second representations may be a set of non-hardware-level representations. Responsive to receiving a query to be executed over the set of values, a determination is made as to whether or not it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations. If the determination indicates that it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations, the query is executed over the set of first representations.Type: GrantFiled: April 2, 2018Date of Patent: September 17, 2024Assignee: Oracle International CorporationInventors: Aurosish Mishra, Shasank K. Chavan, Vinita Subramanian, Ekrem S. C. Soylemez, Adam Kociubes, Eugene Karichkin, Garret F. Swart
-
Patent number: 12086653Abstract: A processor is described. The processor includes model specific register space that is visible to software above a BIOS level. The model specific register space is to specify a granularity of a processing entity of a lock-step group. The processor also includes logic circuitry to support dynamic entry/exit of the lock-step group's processing entities to/from lock-step mode including: i) termination of lock-step execution by the processing entities before the program code to be executed in lock-step is fully executed; and, ii) as part of the exit from the lock-step mode, restoration of a state of a shadow processing entity of the processing entities as the state existed before the shadow processing entity entered the lock-step mode and began lock-step execution of the program code.Type: GrantFiled: December 24, 2020Date of Patent: September 10, 2024Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Jeff A. Huxel, Jeffrey G. Wiedemeier, James D. Allen, Arvind Raman, Krishnakumar Ganapathy
-
Patent number: 12086603Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.Type: GrantFiled: October 27, 2022Date of Patent: September 10, 2024Assignee: Intel CorporationInventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
-
Patent number: 12086600Abstract: Embodiments of the present disclosure include techniques for branch prediction. A branch predictor may be included in a front end of a processor. The branch predictor may store branch targets in a branch target buffer. The branch target buffer includes shared bits, which may be combined with branch target bits to specify branch target destination addresses. Shared bits may result in more efficient memory usage in the processor, for example.Type: GrantFiled: December 5, 2022Date of Patent: September 10, 2024Assignee: Microsoft Technology Licensing, LLCInventors: Somasundaram Arunachalam, Daren Eugene Streett, Richard William Doing
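A minimal sketch of how shared bits might be combined with per-entry branch target bits, assuming the shared bits are the upper part of the destination address and the per-entry bits are the lower 12 bits. The class name, the 12-bit split, and the table layout are all assumptions made for illustration; the abstract does not specify them.

```python
TARGET_BITS = 12  # assumed width of the per-entry (low-order) target bits


class BranchTargetBuffer:
    """Toy model: entries store low-order bits plus an index into shared upper bits."""

    def __init__(self):
        self.shared = []   # table of shared upper address bits
        self.entries = {}  # branch PC -> (index into shared table, low-order target bits)

    def insert(self, pc, target):
        upper = target >> TARGET_BITS
        lower = target & ((1 << TARGET_BITS) - 1)
        if upper not in self.shared:
            self.shared.append(upper)
        self.entries[pc] = (self.shared.index(upper), lower)

    def predict(self, pc):
        entry = self.entries.get(pc)
        if entry is None:
            return None  # no prediction for this branch
        shared_index, lower = entry
        # Combine shared bits with branch target bits to form the full address.
        return (self.shared[shared_index] << TARGET_BITS) | lower


btb = BranchTargetBuffer()
btb.insert(0x1000, 0x40001234)
btb.insert(0x1008, 0x40001ABC)
print(hex(btb.predict(0x1000)), len(btb.shared))
```

In this toy model, two branches whose targets share the same upper bits consume only one shared-table slot, which is one way shared bits could reduce storage.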
-
Patent number: 12086042Abstract: A tracing circuit is integrated in a semiconductor device along with a microprocessor including an m-bit program counter, and externally outputs a tracing clock along with an n-bit tracing data (where 2≤n≤m). The tracing circuit, when the program counter remains unchanged, synchronously with the tracing clock sets the tracing data to a first output value; when the program counter is incremented, synchronously with the tracing clock sets the tracing data to a second output value; and when the program counter is loaded, synchronously with the tracing clock sets the tracing data to a third output value, and then suspends the state machine in the microprocessor and split-outputs, as the tracing data, the branch destination address or interrupt destination address loaded in the program counter.Type: GrantFiled: October 16, 2020Date of Patent: September 10, 2024Assignee: Rohm Co., Ltd.Inventor: Takahiro Nishiyama
-
Patent number: 12039305Abstract: A method for a compilation, an electronic device and a readable storage medium are provided. The method for a compilation includes analyzing source program data to determine a target irregular branch, generating an update data flow graph according to the target irregular branch, and mapping the update data flow graph to a target hardware to complete the compilation.Type: GrantFiled: March 7, 2022Date of Patent: July 16, 2024Assignees: Beijing Superstring Academy of Memory Technology, Tsinghua UniversityInventors: Baofen Yuan, Shouyi Yin, Shaojun Wei
-
Patent number: 12008352Abstract: A loop within computer code is transformed to minimize loop iterations. A determination is made using statistical information relating to the loop whether the loop that has an early exit indication is to be transformed to minimize iterations of the loop. Based on determining that the loop is to be transformed, the loop is transformed.Type: GrantFiled: November 24, 2021Date of Patent: June 11, 2024Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Wai Hung Tsang, Ettore Tiotto, Bardia Mahjour
-
Patent number: 11989134Abstract: An apparatus comprising translation circuitry to perform a translation operation to generate a translated second memory address within a second memory address space as a translation of a first memory address within a first memory address space, in which the translation circuitry is configured to generate the translated second memory address in dependence upon translation information stored at one or more translation information addresses; permission circuitry to perform an operation to detect permission information to indicate, for a given second memory address, whether memory access is permitted to the given second memory address; and access circuitry to allow access to data stored at the given second memory address when the permission information indicates that memory access is permitted to the given second memory address.Type: GrantFiled: March 8, 2021Date of Patent: May 21, 2024Assignee: Arm LimitedInventors: Yuval Elad, Jason Parker, Richard Roy Grisenthwaite, Simon John Craske, Alexander Donald Charles Chadwick
-
Patent number: 11960892Abstract: In one embodiment, a system includes a memory and a processor core. The processor core includes functional units and an instruction decode unit configured to determine whether an execute packet of instructions received by the processing core includes a first instruction that is designated for execution by a first functional unit of the functional units and a second instruction that is a condition code extension instruction that includes a plurality of sets of condition code bits, wherein each set of condition code bits corresponds to a different one of the functional units, and wherein the sets of condition code bits include a first set of condition code bits that corresponds to the first functional unit. When the execute packet includes the first and second instructions, the first functional unit is configured to execute the first instruction conditionally based upon the first set of condition code bits in the second instruction.Type: GrantFiled: July 22, 2022Date of Patent: April 16, 2024Assignee: Texas Instruments IncorporatedInventors: Timothy David Anderson, Duc Quang Bui, Joseph Raymond Michael Zbiciak
-
Patent number: 11947455Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method includes a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache causing eviction of the first data from the cache line. The processor compares based on the tagging the first data and the second data and suppresses modification of the cache line based on the comparing of the first data and the second data.Type: GrantFiled: April 17, 2023Date of Patent: April 2, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Paul J. Moyer
-
Patent number: 11928470Abstract: Introduced herein is a program counter advancing technique that uses NOP padding without its limitations. During a build process, the introduced technique removes EOG markers for instruction groups that are immediately followed by the NOP instructions that are immediately followed by an instruction group beginning at a start of a cache line. As such, during an execution process, when the processing unit detects an absence of an EOG marker in the requested instruction group, it knows that a group of NOP instructions are about to follow and skips over them by directly advancing the program counter to a start of a subsequent cache line where the next instruction group starts. In addition to the presence of an EOG marker, the introduced technique also takes into account whether the requested instruction group is a straddling group when advancing the program counter to a start of the subsequent cache line.Type: GrantFiled: April 20, 2022Date of Patent: March 12, 2024Assignee: VeriSilicon Holdings Co., Ltd.Inventor: Tracy T. Nguyen
-
Patent number: 11915006Abstract: A method, system and device for pipeline processing of instructions and a computer storage medium. The method comprises: acquiring a target instruction set (S101); acquiring a target prediction result, wherein the target prediction result is a result obtained by predicting a jump mode of the target instruction set (S102); performing pipeline processing on the target instruction set according to the target prediction result (S103); determining if a pipeline flushing request is received (S104); and if so, correspondingly saving the target instruction set and a corresponding pipeline processing result, so as to perform pipeline processing on the target instruction set again on the basis of the pipeline processing result (S105).Type: GrantFiled: November 28, 2019Date of Patent: February 27, 2024Assignee: INSPUR SUZHOU INTELLIGENT TECHNOLOGY CO., LTD.Inventors: Yulong Zhou, Tongqiang Liu, Xiaofeng Zou
-
Patent number: 11847060Abstract: Described is a data cache with prediction hints for a cache hit. The data cache includes a plurality of cache lines, where a cache line includes a data field, a tag field, and a prediction hint field. The prediction hint field is configured to store a prediction hint which directs alternate behavior for a cache hit against the cache line. The prediction hint field is integrated with the tag field or is integrated with a way predictor field.Type: GrantFiled: March 1, 2023Date of Patent: December 19, 2023Assignee: SiFive, Inc.Inventors: John Ingalls, Josh Smith
-
Patent number: 11797403Abstract: Maintaining a synchronous replication relationship between two or more storage systems, including: receiving, by at least one of a plurality of storage systems across which a dataset will be synchronously replicated, timing information for at least one of the plurality of storage systems; and establishing, based on the timing information, a synchronous replication lease describing a period of time during which the synchronous replication relationship is valid, wherein a request to modify the dataset may only be acknowledged after a copy of the dataset has been modified on each of the storage systems.Type: GrantFiled: September 12, 2022Date of Patent: October 24, 2023Assignee: PURE STORAGE, INC.Inventors: David Grunwald, Steven Hodgson, Ronald Karr, Kunal Trivedi, Christopher Golden, Thomas Gill, Connor Brooks, Zoheb Shivani
-
Patent number: 11797309Abstract: An apparatus and method for tracking speculative execution flow and detecting potential vulnerabilities.Type: GrantFiled: December 27, 2019Date of Patent: October 24, 2023Assignee: Intel CorporationInventors: Carlos Rozas, Francis McKeen, Pasquale Cocchini, Meltem Ozsoy, Matthew Fernandez
-
Patent number: 11782845Abstract: An apparatus comprises memory management circuitry to perform a translation table walk for a target address of a memory access request and to signal a fault in response to the translation table walk identifying a fault condition for the target address, prefetch circuitry to generate a prefetch request to request prefetching of information associated with a prefetch target address to a cache; and faulting address prediction circuitry to predict whether the memory management circuitry would identify the fault condition for the prefetch target address if the translation table walk was performed by the memory management circuitry for the prefetch target address. In response to a prediction that the fault condition would be identified for the prefetch target address, the prefetch circuitry suppresses the prefetch request and the memory management circuitry prevents the translation table walk being performed for the prefetch target address of the prefetch request.Type: GrantFiled: December 2, 2021Date of Patent: October 10, 2023Assignee: Arm LimitedInventors: Alexander Cole Shulyak, Joseph Michael Pusdesris, Abhishek Raja, Karthik Sundaram, Anoop Ramachandra Iyer, Michael Brian Schinzler, James David Dundas, Yasuo Ishii
-
Patent number: 11748284Abstract: A system and method for efficiently arbitrating traffic on a bus. A computing system includes a fabric for routing traffic among one or more agents and one or more endpoints. The fabric includes multiple arbiters in an arbitration hierarchy. Arbiters store traffic in buffers with each buffer associated with a particular traffic type and a source of the traffic. Arbiters maintain a respective urgency counter for keeping track of a period of time traffic of a particular type is blocked by upstream arbiters. When the block is removed, the traffic of the particular type has priority for selection based on the urgency counter. When arbiters receive feedback from downstream arbiters or sources, the arbiters adjust selection priority accordingly. For example, changes in bandwidth requirement, low latency tolerance and active status cause adjustments in selection priority of stored requests.Type: GrantFiled: July 14, 2021Date of Patent: September 5, 2023Assignee: Apple Inc.Inventors: Nachiappan Chidambaram Nachiappan, Jaideep Dastidar, Yiu Chun Tse, Ripudaman Singh, Shawn Munetoshi Fukami, Benjamin K. Dodge, Vinodh R. Cuppu
-
Patent number: 11740909Abstract: A system including a computer storage and a processor is described. The computer storage is configured to identify a stored data as protected. The processor is configured to perform speculative execution. To perform the speculative execution, the processor is configured to determine, in response to the speculative execution of an instruction to read the stored data, whether the stored data is identified as protected. In response to a determination that the stored data attempted to be read during the speculative execution is protected, the processor is configured to disallow during the speculative execution immediate successful completion of the instruction to read the stored data.Type: GrantFiled: November 9, 2021Date of Patent: August 29, 2023Assignee: Meta Platforms, Inc.Inventors: Hao Wang, Harish Dattatraya Dixit, Shobhit O. Kanaujia
-
Patent number: 11734788Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.Type: GrantFiled: October 29, 2021Date of Patent: August 22, 2023Assignee: Imagination Technologies LimitedInventors: John Howson, Jonathan Redshaw, Yoong Chert Foo
-
Patent number: 11734194Abstract: A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.Type: GrantFiled: April 4, 2022Date of Patent: August 22, 2023Assignee: Texas Instruments IncorporatedInventors: Timothy David Anderson, Mujibur Rahman
-
Patent number: 11704301Abstract: Provided is a method for performing a file system consistency check. The method comprises calculating, by a first thread that does not have access to an inode table, file block addresses for one or more files to be checked by the thread. The method further comprises collecting validity information for the one or more files. The method further comprises reading information relating to the one or more files from the inode table. The reading is performed in response to the thread being given access to the inode table after the calculating operation. The method further comprises validating the information by comparing the information from the inode table to the validity information.Type: GrantFiled: September 11, 2020Date of Patent: July 18, 2023Assignee: International Business Machines CorporationInventors: Huzefa Pancha, Abhishek Jain, Sasikanth Eda, Karthik Iyer
-
Patent number: 11698794Abstract: A method to provide flexible access to an internal data of a regulated system, the method comprising receiving, by a data access component of the regulated system, a loadable configuration file defining a set of triggering events and a set of memory, determining the occurrence of a single triggering event, accessing at least a subset of memory that contain the internal data of the avionics system to retrieve data associated with the one or more memory of the set of memory, and outputting the retrieved data to a receiving component.Type: GrantFiled: September 2, 2020Date of Patent: July 11, 2023Assignee: GE Aviation Systems LLCInventors: Joachim Karl Ulf Hochwarth, Victor Mario Leal Herrera, Antonio Lugo Trejo, Joshua Michael Krbez
-
Patent number: 11693666Abstract: A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration.Type: GrantFiled: October 20, 2021Date of Patent: July 4, 2023Assignee: Arm LimitedInventors: Joseph Michael Pusdesris, Nicholas Andrew Plante, Yasuo Ishii, Chris Abernathy
-
Patent number: 11650818Abstract: A processor includes an execution unit and a processing logic operatively coupled to the execution unit, the processing logic to: enter a first execution state and transition to a second execution state responsive to executing a control transfer instruction. Responsive to executing a target instruction of the control transfer instruction, the processing logic further transitions to the first execution state responsive to the target instruction being a control transfer termination instruction of a mode identical to a mode of the processing logic following the execution of the control transfer instruction; and raises an execution exception responsive to the target instruction being a control transfer termination instruction of a mode different than the mode of the processing logic following the execution of the control transfer instruction.Type: GrantFiled: August 17, 2021Date of Patent: May 16, 2023Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Jason W. Brandt, Ravi L. Sahita, Xiaoning Li
-
Patent number: 11645073Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously been generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously been generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.Type: GrantFiled: April 23, 2021Date of Patent: May 9, 2023Assignee: ADVANCED MICRO DEVICES, INC.Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
-
Patent number: 11630772Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method includes a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs a cache controller to store a first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store a second data in the cache line in the cache causing eviction of the first data from the cache line. The processor compares based on the tagging the first data and the second data and suppresses modification of the cache line based on the comparing of the first data and the second data.Type: GrantFiled: September 29, 2021Date of Patent: April 18, 2023Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Paul J. Moyer
-
Patent number: 11614940Abstract: A method to compare first and second source data in a processor in response to a vector maximum with indexing instruction includes specifying first and second source registers containing first and second source data, a destination register storing compared data, and a predicate register. Each of the registers includes a plurality of lanes. The method includes executing the instruction by, for each lane in the first and second source register, comparing a value in the lane of the first source register to a value in the corresponding lane of the second source register to identify a maximum value, storing the maximum value in a corresponding lane of the destination register, asserting a corresponding lane of the predicate register if the maximum value is from the first source register, and de-asserting the corresponding lane of the predicate register if the maximum value is from the second source register.Type: GrantFiled: March 29, 2021Date of Patent: March 28, 2023Assignee: Texas Instruments IncorporatedInventors: Duc Bui, Peter Richard Dent, Timothy D. Anderson
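The per-lane behavior described in this abstract can be modeled roughly as follows. The function name is hypothetical, and treating a tie as "from the first source register" is an assumption, since the abstract does not state how equal values are handled.

```python
def vmax_with_index(src1, src2):
    """Per-lane maximum plus a predicate marking lanes won by the first source."""
    if len(src1) != len(src2):
        raise ValueError("source registers must have the same number of lanes")
    dest, predicate = [], []
    for a, b in zip(src1, src2):
        from_first = a >= b  # assumption: ties count as coming from the first source
        dest.append(a if from_first else b)
        predicate.append(from_first)  # asserted when the maximum is from src1
    return dest, predicate


dest, predicate = vmax_with_index([1, 5, 3], [4, 2, 3])
print(dest, predicate)
```

The predicate output records which source supplied each maximum, so a later predicated instruction could, for example, select matching index values lane by lane.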
-
Patent number: 11520591Abstract: Processing data in an information handling system is disclosed that includes: in response to an event that triggers a flushing operation, calculating a finish ratio, wherein the finish ratio is a ratio of a number of finished operations to a number of at least one of the group consisting of in-flight instructions, instructions pending in a processor pipeline, instructions issued to an issue queue, and instructions being processed in a processor execution unit; comparing the calculated finish ratio to a threshold; and if the finish ratio is greater than the threshold, not performing the flushing operation. Also disclosed is moving the flush point.Type: GrantFiled: March 27, 2020Date of Patent: December 6, 2022Assignee: International Business Machines CorporationInventors: Ehsan Fatehi, Richard J. Eickemeyer, John B. Griswell, Jr.
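The flush decision in this abstract reduces to a ratio test, sketched below under stated assumptions. The function name and the example threshold of 0.8 are hypothetical, and the handling of an empty pipeline (flush proceeds) is an assumption not covered by the abstract.

```python
def should_flush(finished, in_flight, threshold):
    """Return True if the flush should proceed, False if it should be skipped."""
    if in_flight == 0:
        return True  # assumption: nothing in flight, so nothing is lost by flushing
    finish_ratio = finished / in_flight
    # Per the abstract: if the finish ratio exceeds the threshold, do not flush.
    return finish_ratio <= threshold


print(should_flush(finished=9, in_flight=10, threshold=0.8))
print(should_flush(finished=2, in_flight=10, threshold=0.8))
```

The intuition is that when most in-flight work has already finished, discarding it would waste more effort than the flush saves.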
-
Patent number: 11520581Abstract: A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memory includes memory banks configured to store data used by each of the processor units to perform the arithmetic operations. The processor units and the vector memory are tightly coupled within an area of the vector processing unit such that data communications are exchanged at a high bandwidth based on the placement of respective processor units relative to one another, and based on the placement of the vector memory relative to each processor unit.Type: GrantFiled: May 24, 2021Date of Patent: December 6, 2022Assignee: Google LLCInventors: William Lacy, Gregory Michael Thorson, Christopher Aaron Clark, Norman Paul Jouppi, Thomas Norrie, Andrew Everett Phelps
-
Patent number: 11494191Abstract: Processors and methods related to tracking exact convergence to guide the recovery process in response to a mispredicted branch are provided. An example processor includes a pipeline having a frontend and a backend. The processor further includes a state table for maintaining information related to at least a subset of branches corresponding to instructions being processed by the processor. The processor further includes state logic configured to access the state table and track locations of any exact convergence points associated with branches corresponding to the instructions being processed by the processor. The state logic is further configured to identify a first recovery method for recovering from a misprediction associated with a branch if a location of an exact convergence point associated with the branch is determined to be in the frontend of the pipeline, else identify a second recovery method for recovering from the misprediction associated with the branch.Type: GrantFiled: May 18, 2021Date of Patent: November 8, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Vignyan Reddy Kothinti Naresh, Shivam Priyadarshi
-
Patent number: 11481348Abstract: A first operation identifier is assigned to a current operation directed to a memory component, the first operation identifier having a first entry in a first data structure that associates the first operation identifier with a first buffer identifier. It is determined whether the current operation collides with a prior operation assigned a second operation identifier, the second operation identifier having a second entry in the first data structure that associates the second operation identifier with a second buffer identifier. A latest flag is updated to indicate that the first entry is a latest operation directed to an address (1) in response to determining that the current operation collides with the prior operation and that the current and prior operations are read operations, or (2) in response to determining that the current operation does not collide with a prior operation.Type: GrantFiled: January 28, 2021Date of Patent: October 25, 2022Assignee: MICRON TECHNOLOGY, INC.Inventors: Lyle E. Adams, Mark Ish, Pushpa Seetamraju, Karl D. Schuh, Dan Tupy
-
Patent number: 11474991Abstract: In a distributed database, a transaction is to be committed at a first coordinator server and one or more participant servers 1210. The first coordinator server is configured to receive a notification that each participant server of the transaction is prepared at a respective prepared timestamp, the respective prepared timestamp being chosen within a time range for which the respective participant server obtained at least one lock 1220. The first coordinator server computes the commit timestamp for the transaction equal to or greater than each of the prepared timestamps 1230, and restricts the commit timestamp such that a second coordinator server sharing at least one of the participant servers for one or more other transactions at a shared shard cannot select the same commit timestamp for any of the other transactions 1240. The transaction is committed at the commit timestamp 1250.Type: GrantFiled: March 13, 2018Date of Patent: October 18, 2022Assignee: Google LLCInventors: Sebastian Kanthak, Brian Frank Cooper
-
Patent number: 11443044Abstract: A computer-implemented method for advancing speculative execution in microarchitectures is disclosed. A non-limiting example of the computer-implemented method includes receiving, by a processor, a test scenario including a first load instruction from a first memory location flagged with a delay notification and a speculative memory access instruction from a second memory following the first load instruction. The method executes, by the processor, the first load instruction from the first memory location and delays a return of data from the first memory location for a number of processor cycles. The method executes, by the processor, the speculative storage access instruction from the second memory location during the delay in returning the data from the first memory location.Type: GrantFiled: September 23, 2019Date of Patent: September 13, 2022Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Olaf Knute Hendrickson, Michael P Mullen, Matthew Michael Garcia Pardini
-
Patent number: 11403227Abstract: This disclosure relates to a data storage method and apparatus, and a server. The method includes receiving, by a first server, a write instruction sent by a second server, storing target data in a cache of a controller, detecting a read instruction for the target data, and storing the target data in a storage medium of a non-volatile memory based on the read instruction. In other words, when the second server needs to write the target data to the first server, the target data is not only written to the cache of the first server, but also written to the storage medium of the first server. This can ensure that the data in the cache is written to the storage medium promptly.Type: GrantFiled: March 12, 2021Date of Patent: August 2, 2022Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Chenji Gong, Chao Zhou, Junjie Chen
-
Patent number: 10452417Abstract: Methods, apparatus, and articles of manufacture to virtualize performance counters are disclosed. An example method includes dividing performance events to be counted into a plurality of classes; assigning a first virtual performance counter of a virtual machine to a first performance event type in a first one of the classes; assigning a second virtual performance counter of the virtual machine to a second performance event type in a second one of the classes different from the first class; incrementing the first virtual performance counter in response to a first occurrence of the first performance event type during direct execution of guest instructions by the virtual machine; and not incrementing the first virtual performance counter in response to a second occurrence of the first performance event type during execution of emulated instructions by a hypervisor on behalf of the virtual machine.Type: GrantFiled: May 26, 2015Date of Patent: October 22, 2019Assignee: VMWARE, INC.Inventors: Benjamin Charles Serebrin, Daniel Michael Hecht
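The counting rule in the abstract above can be illustrated with a minimal Python sketch. It only models the stated behavior: a virtual counter bound to one event type increments when that event occurs during direct execution of guest instructions, and does not increment when the event occurs while the hypervisor emulates instructions on the guest's behalf. The class name, method name, and event label are hypothetical and do not come from the patent.

```python
class VirtualPerfCounter:
    """Hypothetical model of one virtual performance counter."""

    def __init__(self, event_type):
        self.event_type = event_type  # the single event type this counter tracks
        self.value = 0

    def observe(self, event_type, emulated_by_hypervisor):
        """Record one event occurrence.

        Counts only matching events seen during direct guest execution;
        events raised while the hypervisor emulates are ignored.
        """
        if event_type == self.event_type and not emulated_by_hypervisor:
            self.value += 1


counter = VirtualPerfCounter("instructions_retired")  # illustrative event label
counter.observe("instructions_retired", emulated_by_hypervisor=False)  # counted
counter.observe("instructions_retired", emulated_by_hypervisor=True)   # ignored
counter.observe("cache_miss", emulated_by_hypervisor=False)            # other event type
```

The abstract does not say whether counters for other event classes use the same rule, so the sketch models only the first counter's behavior.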
-
Patent number: 9361111Abstract: First processing circuitry processes at least part of a stream of program instructions. The first processing circuitry has registers for storing data and register renaming circuitry for mapping architectural register specifiers to physical register specifiers. A renaming data store stores renaming entries for identifying a register mapping between the architectural and physical register specifiers. At least some renaming entries have a count value indicating a number of speculation points occurring between generation of a previous count value and generation of the count value. The speculation points may for example be branch operations or load/store operations.Type: GrantFiled: January 9, 2013Date of Patent: June 7, 2016Assignee: ARM LimitedInventors: Luca Scalabrino, Melanie Emanuelle Lucie Teyssier, Cedric Denis Robert Airaud, Guillaume Schon
-
Patent number: 9032191Abstract: A hypervisor and one or more guest operating systems resident in a data processing system and hosted by the hypervisor are configured to selectively enable or disable branch prediction logic through separate hypervisor-mode and guest-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic for different operating scenarios.Type: GrantFiled: January 23, 2012Date of Patent: May 12, 2015Assignee: International Business Machines CorporationInventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 9026769Abstract: A processor for processing loop instructions can include an instruction reorder structure and a loop processing controller. The instruction reorder structure is configured to store decoded instructions according to program order and issue the decoded instructions for execution out of program order. The loop processing controller is configured to detect a loop in the decoded instructions stored in the instruction reorder structure and cause the instruction reorder structure to reissue the decoded instructions that form the loop for re-execution.Type: GrantFiled: January 24, 2012Date of Patent: May 5, 2015Assignee: Marvell International Ltd.Inventors: Sujat Jamil, R. Frank O'Bleness, Joseph Delgross, Tom Hameenanttila
-
Patent number: 8990819Abstract: A method for rolling back speculative threads in symmetric-multiprocessing (SMP) environments is disclosed. In one embodiment, such a method includes detecting an aborted thread at runtime and determining whether the aborted thread is an oldest aborted thread. In the event the aborted thread is the oldest aborted thread, the method sets a high-priority request for allocation to an absolute thread number associated with the oldest aborted thread. The method further detects that the high-priority request is set and, in response, modifies a local allocation token of the oldest aborted thread. The modification prompts the oldest aborted thread to retry a work unit associated with its absolute thread number. The oldest aborted thread subsequently initiates the retry of a successor thread by updating the successor thread's local allocation token. A corresponding apparatus and computer program product are also disclosed.Type: GrantFiled: December 28, 2012Date of Patent: March 24, 2015Assignee: International Business Machines CorporationInventors: Martin Ohmacht, Raul E. Silvera, Mark G. Stoodley, Kai-Ting A. Wang
-
Patent number: 8990817Abstract: Various systems and methods for automated error recovery in workflows. For example, one method involves receiving an operation indication. The operation indication indicates an operation that is to be performed using a multi-tier application system that includes first and second applications. The first and second applications are implemented using different tiers of the multi-tier application system. The method involves accessing dependency information that indicates first data dependencies between the first and the second applications. The method further involves determining outcome of execution of the operation, where the determining is based on the dependency information but does not include executing the operation.Type: GrantFiled: September 6, 2012Date of Patent: March 24, 2015Assignee: Symantec CorporationInventors: Debasish Garai, Sumeet S. Kembhavi
-
Patent number: 8959319Abstract: Embodiments of the present invention provide systems, methods, and computer program products for improving divergent conditional branches in code being executed by a processor. For example, in an embodiment, a method comprises detecting a conditional statement of a program being simultaneously executed by a plurality of threads, determining which threads evaluate a condition of the conditional statement as true and which threads evaluate the condition as false, pushing an identifier associated with the larger set of the threads onto a stack, executing code associated with a smaller set of the threads, and executing code associated with the larger set of the threads.Type: GrantFiled: December 2, 2011Date of Patent: February 17, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Mark Leather, Norman Rubin, Brian D. Emberling, Michael Mantor
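The ordering described in the abstract above (defer the larger group of threads on a stack, run the smaller group first) can be sketched in Python as follows. This is a software simulation under stated assumptions, not the patented hardware mechanism: the function name and parameters are hypothetical, the stack here holds the deferred thread group itself rather than an identifier for it, and ties are broken in favor of running the "true" path first.

```python
def run_divergent_branch(thread_ids, condition, then_body, else_body):
    """Simulate one divergent conditional across a group of threads.

    Returns a trace of (thread_id, result) pairs in execution order.
    """
    # Determine which threads evaluate the condition as true and which as false.
    taken = [t for t in thread_ids if condition(t)]
    not_taken = [t for t in thread_ids if not condition(t)]

    # Order the two paths by thread count; sorted() is stable, so on a tie
    # the "true" path is treated as the smaller one (an assumption).
    smaller, larger = sorted(
        [(taken, then_body), (not_taken, else_body)],
        key=lambda path: len(path[0]),
    )

    # Defer the larger set by pushing it onto a stack.
    stack = [larger]
    trace = []

    # Execute the code for the smaller set first.
    threads, body = smaller
    trace.extend((t, body(t)) for t in threads)

    # Pop the deferred larger set and execute its code.
    threads, body = stack.pop()
    trace.extend((t, body(t)) for t in threads)
    return trace


trace = run_divergent_branch(
    range(8),
    condition=lambda t: t % 4 == 0,  # true for threads 0 and 4 only
    then_body=lambda t: "then",
    else_body=lambda t: "else",
)
```

In this usage, threads 0 and 4 form the smaller set and run the `then` path first; the remaining six threads run the `else` path afterwards.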
-
Patent number: 8918625Abstract: A processor that executes instructions out of program order is described. In some implementations, a processor detects whether a second memory operation is dependent on a first memory operation prior to memory address calculation. If the processor detects that the second memory operation is not dependent on the first memory operation, the processor is configured to allow the second memory operation to be scheduled. If the processor detects that the second memory operation is dependent on the first memory operation, the processor is configured to prevent the second memory operation from being scheduled until the first memory operation has been scheduled to reduce the likelihood of having to reexecute the second memory operation.Type: GrantFiled: November 15, 2011Date of Patent: December 23, 2014Assignee: Marvell International Ltd.Inventors: R. Frank O'Bleness, Sujat Jamil, Tom Hameenanttila
-
Patent number: 8918626Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.Type: GrantFiled: November 10, 2011Date of Patent: December 23, 2014Assignee: Oracle International CorporationInventors: Yuan C. Chou, Eric W. Mahurin
-
Patent number: 8862861Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit is used to indicate that a misprediction has occurred, and younger CTIs than the CTI that was mispredicted are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.Type: GrantFiled: September 8, 2011Date of Patent: October 14, 2014Assignee: Oracle International CorporationInventors: Christopher H. Olson, Manish K. Shah
-
Patent number: 8683129Abstract: The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy.Type: GrantFiled: October 21, 2010Date of Patent: March 25, 2014Assignee: Oracle International CorporationInventors: Tarik Ono, Mark R. Greenstreet
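A rough latency model in Python shows why the speculative request in the abstract above can shorten miss delays: when a miss is predicted, the request to the lower level is issued at the same time as the cache lookup, so the two delays overlap instead of adding up. The cycle counts, the function name, and the `predict_miss` callable are all assumptions made for illustration; the abstract does not specify how the prediction is made.

```python
LOOKUP_CYCLES = 3   # assumed time to determine hit or miss in the cache
LOWER_CYCLES = 20   # assumed time for the lower level to return a line


def access_latency(line, cache_lines, predict_miss):
    """Return the modeled latency, in cycles, of one memory reference.

    cache_lines:  set of line addresses currently held in the cache
    predict_miss: hypothetical predictor; returns True if the reference
                  is considered likely to miss
    """
    # The speculative request, if any, is sent while the lookup is in flight.
    speculated = predict_miss(line)
    hit = line in cache_lines

    if hit:
        # Data comes from the cache; any speculative response is unused.
        return LOOKUP_CYCLES
    if speculated:
        # Lookup and lower-level request ran in parallel.
        return max(LOOKUP_CYCLES, LOWER_CYCLES)
    # The lower-level request starts only after the miss is known.
    return LOOKUP_CYCLES + LOWER_CYCLES


cache = {0x40}
hit_latency = access_latency(0x40, cache, lambda line: False)
miss_with_speculation = access_latency(0x80, cache, lambda line: True)
miss_without_speculation = access_latency(0x80, cache, lambda line: False)
```

The model ignores the bandwidth cost of speculative requests that turn out to be unnecessary, which a real design would need to weigh against the latency saved.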