Commitment Control Or Register Bypass Patents (Class 712/218)
-
Patent number: 12067396Abstract: Techniques related to executing instructions by a processor comprising receiving a first instruction for execution, determining a first latency value based on an expected amount of time needed for the first instruction to be executed, storing the first latency value in a writeback queue, beginning execution of the first instruction on the instruction execution pipeline, adjusting the latency value based on an amount of time passed since beginning execution of the first instruction, outputting a first result of the first instruction based on the latency value, receiving a second instruction, determining that the second instruction is a variable latency instruction, storing a ready value indicating that a second result of the second instruction is not ready in the writeback queue, beginning execution of the second instruction on the instruction execution pipeline, updating the ready value to indicate that the second result is ready, and outputting the second result.Type: GrantFiled: December 21, 2021Date of Patent: August 20, 2024Assignee: Texas Instruments IncorporatedInventor: Timothy D. Anderson
-
Patent number: 11977891Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that generates a relative ordering of memory access instruction in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes decoding an instruction block encoding a plurality of memory access instructions and generating data indicating a relative order for executing the memory access instructions in the instruction block and scheduling operation of a portion of the instruction block based at least in part on the relative order data. In some examples, a store vector data register can store the generated relative ordering data for use in subsequent instances of the instruction block.Type: GrantFiled: February 1, 2016Date of Patent: May 7, 2024Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith
-
Patent number: 11960404Abstract: Systems, apparatuses, and methods for efficiently processing memory requests are disclosed. A computing system includes at least one processing unit coupled to a memory. Circuitry in the processing unit determines a memory request becomes a long-latency request based on detecting a translation lookaside buffer (TLB) miss, a branch misprediction, a memory dependence misprediction, or a precise exception has occurred. The circuitry marks the memory request as a long-latency request such as storing an indication of a long-latency request in an instruction tag of the memory request. The circuitry uses weighted criteria for scheduling out-of-order issue and servicing of memory requests. However, the indication of a long-latency request is not combined with other criteria in a weighted sum. Rather, the indication of the long-latency request is a separate value. The circuitry prioritizes memory requests marked as long-latency requests over memory requests not marked as long-latency requests.Type: GrantFiled: September 23, 2020Date of Patent: April 16, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, John Kalamatianos
-
Patent number: 11836089Abstract: A cache memory includes a first cache area corresponding to even addresses, and a second cache area corresponding to odd addresses, wherein each of the first and second cache areas includes a plurality of cache sets, and each cache set includes a data set field suitable for storing data corresponding to an address among the even and odd addresses, and a pair field suitable for storing information on a location where data corresponding to an adjacent address which is adjacent to an address corresponding to the stored data is stored.Type: GrantFiled: August 4, 2022Date of Patent: December 5, 2023Assignee: SK hynix Inc.Inventor: Seung-Gyu Jeong
-
Patent number: 11720284Abstract: Methods, systems, and devices for low latency storage based on data size are described. A memory system may include logic, a processor, a first memory, and a second memory. The logic may be configured to receive commands, or data, or both, from a host system. The first memory and the second memory may be coupled with the processor. The processor may be configured to store, or to cause the storage of, data for commands associated with data that are smaller than a threshold in the first memory and to store data for commands associated with data that are larger than the threshold in the second memory. A first communication path between the logic and the first memory may be associated with a faster transfer speed than a second communication path between the logic and the second memory.Type: GrantFiled: April 29, 2021Date of Patent: August 8, 2023Assignee: Micron Technology, Inc.Inventors: Federica Cresci, Nicola Del Gatto, Massimiliano Patriarca, Maddalena Calzolari, Michela Spagnolo, Massimiliano Turconi
-
Patent number: 11709680Abstract: A system and method of processing instructions may comprise an application processing domain (APD) and a metadata processing domain (MTD). The APD may comprise an application processor executing instructions and providing related information to the MTD. The MTD may comprise a tag processing unit (TPU) having a cache of policy-based rules enforced by the MTD. The TPU may determine, based on policies being enforced and metadata tags and operands associated with the instructions, that the instructions are allowed to execute (i.e., are valid). The TPU may write, if the instructions are valid, the metadata tags to a queue. The queue may (i) receive operation output information from the application processing domain, (ii) receive, from the TPU, the metadata tags, (iii) output, responsive to receiving the metadata tags, resulting information indicative of the operation output information and the metadata tags; and (iv) permit the resulting information to be written to memory.Type: GrantFiled: September 14, 2021Date of Patent: July 25, 2023Assignee: The Charles Stark Draper Laboratory, Inc.Inventors: Steve E. Milburn, Eli Boling, Andre' DeHon, Andrew B. Sutherland, Gregory T. Sullivan
-
Patent number: 11531622Abstract: Systems and methods are disclosed including a processing device operatively coupled to a first and a second memory device. The processing device can receive a set of data access requests, from a host system, in a first order and execute the set of data access requests in a second order. The processing device can further identify a late data access request of the set of data access requests and determine whether a data structure in a local memory associated with the processing device includes a previous outstanding data access request corresponding to an address associated with the late data access request. Responsive to determining that the data structure includes an indication of a previous outstanding data access request corresponding to the address associated with the late data access request, identifying a type of data dependency associated with the previous outstanding data access request and performing one or more operations associated with the type of data dependency.Type: GrantFiled: August 26, 2020Date of Patent: December 20, 2022Assignee: MICRON TECHNOLOGY, INC.Inventors: Horia C. Simionescu, Chung Kuang Chin, Paul Stonelake, Narasimhulu Dharanikumar Kotte
-
Patent number: 11500641Abstract: Methods, devices and media for efficient data dependency management for in-order issue processors are described. In various embodiments described herein, methods, devices and media are disclosed that provide techniques for managing RAW data dependencies between instructions in a constrained hardware environment. The described techniques include initial wait station allocation of write instructions, followed by wait station allocation conflict resolution methods that use a greedy algorithm to optimize a cost function based on the estimated latency of a single instruction. Efficient compilation and reduced execution time may be achieved in some embodiments. Methods and devices for compiling source code are described, as well as devices for executing the compiled machine code and media for storing compiled machine code.Type: GrantFiled: October 7, 2020Date of Patent: November 15, 2022Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Hazem A. Abdelhafez, Ning Xie, Ahmed Mohammed ElShafiey Mohammed Eltantawy
-
Patent number: 11461237Abstract: An information handling system and method for translating virtual addresses to real addresses including a processor for processing data; memory devices for storing the data; a Page Walk Cache (PWC) for storing page directory entries; and a memory controller configured to control accesses to the memory devices. The processor in an embodiment is configured to send to the memory controller a page directory base and a plurality of memory offsets; and receive from the memory controller and store in the PWC at least one of the page directory entries. The memory controller is configured to: combine a first level page directory entry with a second level memory offset; read from memory a second page directory entry using the first level page directory entry and the second level memory offset; and send to the processor at least one of the page directory entries and a page table entry (PTE).Type: GrantFiled: December 3, 2019Date of Patent: October 4, 2022Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Mohit Karve, Brian W. Thompto
-
Patent number: 11403110Abstract: A method includes receiving an execute packet that includes a first instruction and a second instruction and executing the first instruction and the second instruction using a pipeline. Executing the first and second instructions includes storing a result of the first instruction in a holding register; determining whether an event that interrupts execution of the execute packet occurs prior to completion of the executing of the second instruction; and based on the event not occurring, committing the result of the first instruction after completion of the executing of the second instruction.Type: GrantFiled: October 23, 2020Date of Patent: August 2, 2022Assignee: Texas Instruments IncorporatedInventors: Kai Chirca, Timothy D. Anderson, Paul Daniel Gauvreau
-
Patent number: 11392495Abstract: Systems and methods are provided for accurately simulating memory operations of a multi-compute-engine system, such as a multi-core system. Simulation speed can be increased by consolidation location and state information associated with data stored in one or more caches of a simulated cache hierarchy. This consolidation of information can be reflected in a single cache line map or flat cache. Accordingly, searches for data (and copies of the data) in each and every cache of the simulated cache hierarchy can be performed fast and with greater efficiency than conventional simulation systems that operate using sequential, cache-by-cache searching, while maintaining data coherency.Type: GrantFiled: February 8, 2019Date of Patent: July 19, 2022Assignee: Hewlett Packard Enterprise Development LPInventors: Ryan D. Menhusen, Todd Austin Carrington
-
Patent number: 11379242Abstract: An integrated circuit may include elastic datapaths or pipelines, through which software threads or iterations of loops, may be executed. Throttling circuitry may be coupled along an elastic pipeline in the integrated circuit. The throttling circuitry may include dependency detection circuitry that dynamically detect memory dependency issues that may arise during runtime. To mitigate these dependency issues, the throttling circuitry may assert stall signals to upstream stages in the pipeline. Additionally, the throttling circuitry may control the pipeline to resolve a store operation prior to a corresponding load operation in order to avoid store/load conflicts. In an embodiment, the throttling circuitry may include a validator circuit, a rewind block, a revert block, and a flush block. The throttling circuitry may pass speculative iterations through the rewind block, and later validate the speculative iterations using the validator block.Type: GrantFiled: June 29, 2017Date of Patent: July 5, 2022Assignee: Intel CorporationInventors: Andrei Mihai Hagiescu Miriste, Byron Sinclair, Joseph Garvey
-
Patent number: 11366665Abstract: Microcode combination of complex instructions is shown. A microprocessor includes an instruction queue, an instruction decoder, and a microcode controller. The instruction decoder is coupled to the instruction queue. The microcode controller is coupled to the instruction decoder and has a memory. The memory stores a combined microcode for M complex instructions arranged in a specific order, where M is an integer greater than 1. When the M complex instructions in the specific order have popped out of the first to M-th entries of the instruction queue, the instruction decoder operates the microcode controller to read the memory for the combined microcode with microcode reading trapping happened just once.Type: GrantFiled: August 4, 2020Date of Patent: June 21, 2022Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.Inventor: Yingbing Guan
-
Patent number: 11340960Abstract: Systems, methods, and apparatuses relating to circuitry to implement lockstep of processor cores are described. In one embodiment, a hardware processor comprises a first processor core comprising a first control flow signature register and a first execution circuit, a second processor core comprising a second control flow signature register and a second execution circuit, and at least one signature circuit to perform a first state history compression operation on a first instruction that executes on the first execution circuit of the first processor core to produce a first result, store the first result in the first control flow signature register, perform a second state history compression operation on a second instruction that executes on the second execution circuit of the second processor core to produce a second result, and store the second result in the second control flow signature register.Type: GrantFiled: March 27, 2020Date of Patent: May 24, 2022Assignee: Intel CorporationInventors: Umberto Santoni, Philip Abraham
-
Patent number: 11315684Abstract: A method for generating an alimentary instruction set identifying a list of supplements, comprising receiving information related to a biological extraction and physiological state of a user and generating a diagnostic output based upon the information related to the biological extraction and physiological state of the user. The generating comprises identifying a condition of the user as a function of the information related to the biological extraction and physiological state of the user and a first training set. Further, the generating includes identifying a supplement related to the identified condition of the user as a function of the identified condition of the user and a second training set. Further, the method includes generating, by an alimentary instruction set generator operating on a computing device, a supplement plan as a function of the diagnostic output, said supplement plan including the supplement related to the identified condition of the user.Type: GrantFiled: January 31, 2020Date of Patent: April 26, 2022Assignee: KPN INNOVATIONS, LLC.Inventor: Kenneth Neumann
-
Patent number: 11301255Abstract: Methods, apparatuses, devices, and storage media for performing a processing task are provided. A portion of portions of the processing task can include a group of operations that are to be performed at a processing unit of processing units. The group of operations can include operations of a first type and operations of a second type. In the method, a first queue for performing the operations of the first type and a second queue for performing the operations of the second type can be built, respectively. Based on a definition of the processing task, a dependency relationship between a group of operations to be performed at the processing unit and a group of operations to be performed at other processing units in the plurality of processing units can be obtained. Operations in the first queue and operations in the second queue can be performed respectively based on the dependency relationship.Type: GrantFiled: December 30, 2019Date of Patent: April 12, 2022Assignee: Kunlunxin Technology (Beijing) Company LimitedInventors: Qingshu Chen, Zhibiao Zhao, Hefei Zhu, Xiaozhang Gong, Yong Wang, Jian Ouyang
-
Patent number: 11288076Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operations, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.Type: GrantFiled: September 12, 2020Date of Patent: March 29, 2022Assignee: Flex Logix Technologies, Inc.Inventor: Cheng C. Wang
-
Using a plurality of conversion tables to implement an instruction set agnostic runtime architecture
Patent number: 11281481Abstract: A system for an agnostic runtime architecture is disclosed. The system includes a system emulation/virtualization converter, an application code converter, and a system converter wherein the system emulation/virtualization converter and the application code converter implement a system emulation process, and wherein the system converter implements a system conversion process for executing code from a guest image. The system converter further comprises a guest fetch logic component for accessing a plurality of guest instructions, a guest fetch buffer coupled to the guest fetch logic component and a branch prediction component for assembling the plurality of guest instructions into a guest instruction block, and a plurality of conversion tables including a first level conversion table and a second level conversion table coupled to the guest fetch buffer for translating the guest instruction block into a corresponding native conversion block.Type: GrantFiled: July 22, 2015Date of Patent: March 22, 2022Assignee: INTEL CORPORATIONInventor: Mohammad Abdallah -
Patent number: 11194584Abstract: Retiring instructions out-of-order includes: receiving processor instructions comprising two or more and fewer than all processor instructions generated based on a program, where the processor instructions include a first instruction and a second instruction such that the first instruction precedes the second instruction in a program order of the program; receiving a start instruction that immediately precedes the processor instructions and indicates that the processor instructions are to be retired out-of-order; receiving a stop instruction immediately that succeeds the processor instructions and indicates a stop to out-of-order instruction retirement; and, in response to completing execution of the second instruction before completing execution of the first instruction, retiring the second instruction before retiring the first instruction.Type: GrantFiled: April 30, 2020Date of Patent: December 7, 2021Assignee: Marvell Asia Pte, Ltd.Inventor: Shubhendu Sekhar Mukherjee
-
Patent number: 11188340Abstract: Techniques for parallel execution of instructions in an instruction set are described. The techniques include determining a plurality of instruction streams and paths for a branch in an instruction set and executing the determined paths in parallel such that a mis-predicted path does not cause significant mis-prediction penalties.Type: GrantFiled: December 20, 2018Date of Patent: November 30, 2021Assignee: International Business Machines CorporationInventors: Brian W. Thompto, Hung Q. Le, Dung Q. Nguyen
-
Patent number: 11175923Abstract: A computer-implemented method for marking load and store instruction overlap in a processor pipeline is described. The method includes detecting a load instruction following a store instruction in an instruction stream. The load instruction and the store instruction include instruction text. The instruction text includes operand address information. The method includes comparing operand address information of the store instruction with operand address information of the load instruction to determine whether there is a memory image overlap in an issue queue between the operand address information of the store instruction and the load instruction. The method also includes delaying the load instruction in the processor pipeline in response to determining that there is a memory image overlap.Type: GrantFiled: February 13, 2017Date of Patent: November 16, 2021Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Gregory W. Alexander, James J. Bonanno, Adam B. Collura, Bruce C. Giamei, Christian Jacobi, Jang-Soo Lee, Edward T. Malley, Lawrence J. Powell, Jr., Anthony Saporito
-
Patent number: 11163571Abstract: Technology for fusing an add-immediate instruction with a load-immediate instruction (or store-immediate instruction) in a microprocessor. This can result in quicker address generation while performing a load and store operation.Type: GrantFiled: July 29, 2020Date of Patent: November 2, 2021Assignee: International Business Machines CorporationInventors: Brian D. Barrick, Sundeep Chadha, Sheldon Bernard Levenstein, Phillip G. Williams, Niels Fricke, Dung Q. Nguyen, Brian W. Thompto, Christian Gerhard Zoellin
-
Patent number: 11163654Abstract: Techniques for system recovery using a failover processor are disclosed. A first processor, with a first instruction set, is configured to execute operations of a first type; and a second processor, with a second instruction set different from the first instruction set, is configured to execute operations of a second type. A determination is made that the second processor has failed to execute at least one operation of the second type within a particular period of time. Responsive to determining that the second processor has failed to execute at least one operation of the second type within the particular period of time, the first processor is configured to execute both the operations of the first type and the operations of the second type.Type: GrantFiled: September 23, 2019Date of Patent: November 2, 2021Assignee: Oracle International CorporationInventors: Christopher West, James Baer
-
Patent number: 11137980Abstract: A data storage system implements techniques for efficient retrieval of data stored thereon, using time of upload or another monotonically increasing variable as a key or identifier for the data to be stored and/or retrieved. Data is sorted according to, e.g., upload time, and the data is addressed with respect to time of upload and byte offset within the archive.Type: GrantFiled: September 27, 2016Date of Patent: October 5, 2021Assignee: Amazon Technologies, Inc.Inventors: Rishabh Animesh, Adam Frederick Brock, Umar Farooq, James Caleb Kirschner
-
Patent number: 11132199Abstract: A processor that includes a register file, a latency shifter, a decode unit and a plurality of functional units is introduced. The register file includes a write port. The latency shifter includes a plurality of shifter entries and shifts out a shifter entry among the shifter entries every clock cycle. Each of the shifter entries is associated with a clock cycle and each of shifter entries includes a writeback value that indicates whether the write port of the register file is available for a writeback operation in the associated clock cycles. The decode unit is configured to decode an instruction and issue the instruction according to the writeback value of the latency shifter. The functional units are coupled to the decode unit and the register file and are configured to execute the instruction issued by the decode unit and perform writeback operation to the write port of the register file.Type: GrantFiled: March 31, 2020Date of Patent: September 28, 2021Assignee: ANDES TECHNOLOGY CORPORATIONInventor: Thang Minh Tran
-
Patent number: 11106469Abstract: Methods and systems for implementing an instruction selection mechanism with class-dependent age-array are described. In an example, a system can include a processor that may sequence instructions. The system can further include a memory operatively coupled to the processor. The system can further include an array allocated on the memory. The array can be operable to store instruction age designations associated with a plurality of instructions sequenced by the processor. The array can be further operable to store the instruction age designations based on instruction classes. The processor can be operable to fetch an instruction from the memory. The processor can be operable to dispatch the instruction to a queue. The processor can be operable to store the instruction age designations associated with the instruction, in the array, based on an instruction class of the instruction.Type: GrantFiled: August 14, 2019Date of Patent: August 31, 2021Assignee: International Business Machines CorporationInventor: Joel A. Silberman
-
Patent number: 11068273Abstract: Swapping and restoring context-specific branch predictor states on context switches in a processor. A branch prediction circuit in an instruction processing circuit of a processor includes a private branch prediction memory configured to store branch prediction states for a context of a process being executed. The branch prediction states are accessed by the branch prediction circuit to predict outcomes of its branch instructions of the process. In certain aspects, when a context switch occurs in the processor, branch prediction states stored in a private branch prediction memory and associated with the current, to-be-swapped-out context, are swapped out of the private branch prediction memory to the shared branch prediction memory. Branch prediction states in the shared branch prediction memory previously stored (i.e., swapped out) and associated with to-be-swapped-in context for execution are restored in the private branch prediction memory to be used for branch prediction.Type: GrantFiled: September 3, 2019Date of Patent: July 20, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
-
Patent number: 11061807Abstract: A method for tracing software code executing on a core of a processor is described. The method includes generating a set of packets for a trace packet stream based on a main cycle counter, which maintains a count of cycles elapsing in the core since a packet was emitted into the trace packet stream, and a commit cycle counter, which maintains a cycle count in the core since the last commit operation, wherein the generating comprises (1) storing a value of the main cycle counter in the commit cycle counter in response to detecting a commit operation and (2) storing a value of the commit cycle counter in the main cycle counter in response to detecting an abort in the core; and emitting the set of packets from the processor into the trace packet stream for tracing execution of the software code.Type: GrantFiled: December 28, 2018Date of Patent: July 13, 2021Assignee: Intel CorporationInventors: Beeman Strong, Matthew C. Merten, Jason Agron
-
Patent number: 11055096Abstract: An in-order processor has a mapping storage element to store current register mapping information identifying, for each of two or more architectural register specifiers, which physical register specifies valid data for that architectural register specifier. At least one checkpoint storage element stores checkpoint register mapping corresponding to a checkpoint of previous architectural state. This enables checkpoints to be saved and restored simply by transferring mapping information between the mapping and checkpoint storage elements, rather than transferring the actual state data.Type: GrantFiled: January 5, 2018Date of Patent: July 6, 2021Assignee: Arm LimitedInventors: Neil Burgess, Lee Evan Eisen
-
Patent number: 11055100Abstract: Embodiments of the present disclosure relate to a method for processing information, and a processor. The processor includes an arithmetic and logic unit, a bypass unit, a queue unit, a multiplexer, and a register file. The bypass unit includes a data processing subunit; the data processing subunit is configured to acquire at least one valid processing result outputted by the arithmetic and logic unit, determine a processing result from the at least one valid processing result, output the determined processing result to the multiplexer, and output processing results except for the determined processing result of among the at least one valid processing result to the queue unit; and the multiplexer is configured to sequentially output more than one valid processing results to the register file.Type: GrantFiled: July 3, 2019Date of Patent: July 6, 2021Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.Inventor: Jian Ouyang
-
Patent number: 11036510Abstract: A merging predicated instruction controls a processing pipeline to perform a processing operation to determine a processing result based on at least one source operand, and to perform a merging operation to merge the processing result with a previous value of a destination register under control of a predicate value identifying, for each of a plurality of portions of the destination register, whether that portion is to be set to a corresponding portion of the processing result or a corresponding portion of the previous value. The merging predicated instruction is permitted to be issued to the pipeline with a timing which results in the previous value of the destination register still being unavailable when the merging predicated instruction is at a given pipeline stage at which the processing result is determined. This can help to improve performance of subsequent instructions which are independent of the merging predicated instruction.Type: GrantFiled: October 11, 2018Date of Patent: June 15, 2021Assignee: Arm LimitedInventors: Karel Hubertus Gerardus Walters, Chiloda Ashan Senarath Pathirane
-
Patent number: 11023232Abstract: In an embodiment, the present invention includes a processor having an execution logic to execute instructions and a control transfer termination (CTT) logic coupled to the execution logic. This logic is to cause a CTT fault to be raised if a target instruction of a control transfer instruction is not a CTT instruction. Other embodiments are described and claimed.Type: GrantFiled: March 13, 2019Date of Patent: June 1, 2021Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Jason W. Brandt, Uday Savagaonkar, Ravi L. Sahita
-
Patent number: 11016779Abstract: Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit and a plurality of address generator units and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. A first address generator unit may be configured to perform an arithmetic operation dependent upon a first field of the plurality of fields. A second address generator unit may be configured to generate at least one address of a plurality of addresses, wherein each address is dependent upon a respective field of the plurality of fields. A parallel assembly language may be used to control the plurality of address generator units and the plurality of pipelined datapaths.Type: GrantFiled: August 13, 2019Date of Patent: May 25, 2021Assignee: Coherent Logix, IncorporatedInventors: Michael B. Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, Kenneth R. Faulkner, Keith M. Bindloss, Sumeer Arya, John Mark Beardslee, David A. Gibson
-
Patent number: 10929144Abstract: A computer system, processor, and method for processing information is disclosed that includes determining whether an instruction is a designated instruction, determining whether an instruction following the designated instruction is a subsequent store instruction, speculatively releasing the subsequent store instruction while the designated instruction is pending and before the subsequent store instruction is complete. Preferably, in response to determining that an instruction is the designated instruction, initiating or advancing a speculative tail pointer in an instruction completion table (ICT) to look through the instructions in the ICT following the designated instruction.Type: GrantFiled: February 6, 2019Date of Patent: February 23, 2021Assignee: International Business Machines CorporationInventors: Kenneth L. Ward, Hung Q. Le, Dung Q. Nguyen, Bryan Lloyd
-
Patent number: 10922129Abstract: An operation processing device includes a first register unit including first registers configured to hold data to be used for an operation in an operation unit; a first selection unit that selects data held by a first register indicated by a read address signal; a second selection unit that selects, based on a bypass selection signal, data from a data group including the data selected by the first selection unit and data indicative of a result of the operation; a second register unit that outputs the data selected by the second selection unit to the operation unit; a timing adjustment unit that outputs the read address signal to the first selection unit; and a bypass control unit that stops an operation of the timing adjustment unit when generating the bypass selection signal indicative of a selection of data other than the data selected by the first selection unit.Type: GrantFiled: June 21, 2018Date of Patent: February 16, 2021Assignee: FUJITSU LIMITEDInventors: Seiji Hirao, Sota Sakashita
-
Patent number: 10878131Abstract: A hardware secure element is described. The hardware secure element includes a microprocessor and a memory, such as a non-volatile memory. The memory stores a plurality of software routines executable by the microprocessor. Each software routine starts at a respective memory start address. The hardware secure element also includes a receiver circuit and a hardware message handler module. The receiver circuit is configured to receive command data that includes a command. The hardware message handler module is configured to determine a software routine to be executed by the microprocessor as a function of the command, and also configured to provide address data to the microprocessor that indicates the software routine to be executed.Type: GrantFiled: April 27, 2018Date of Patent: December 29, 2020Assignees: STMICROELECTRONICS S.R.L., STMICROELECTRONICS APPLICATION GMBHInventors: Roberto Colombo, Nicolas Bernard Grossier, Giovanni Disirio, Lorenzo Re Fiorentin
-
Patent number: 10802756Abstract: Systems and methods are disclosed for command status polling at a flash queue of a non-volatile memory device. The flash queue may be configured to perform polling on the status of flash operations without direct oversight from the data storage controller or firmware. In certain embodiments, a flash queue circuit may be configured to receive, from a data storage controller of a nonvolatile solid state memory (NVSSM) data storage device, one or more commands to access a flash memory of the NVSSM data storage device, each command of the one or more commands including one or more instructions. The flash queue circuit may execute the one or more commands to access the flash memory, evaluate a status response from the flash memory at the flash queue circuit, and re-execute a sequence of instructions of the one or more commands based on the status response.Type: GrantFiled: July 12, 2018Date of Patent: October 13, 2020Assignee: Seagate Technology LLCInventors: Jeffrey John Pream, Jeremy Blair Goolsby
-
Patent number: 10776123Abstract: Systems, apparatuses, and methods for performing efficient processor pipeline flush recovery are disclosed. A processor core includes a retire queue for storing information of outstanding instructions. When the retire queue logic detects that a pipeline flush condition occurs, the logic creates one or more groups of entries in the retire queue. The logic begins the groups with an entry storing information for a youngest outstanding instruction, and creates other groups in a contiguous manner after creating this first group. The logic marks with a first indication a given group when the given group includes one or more instructions of a given type. The logic marks with a second indication the given group when the given group does not include an instruction of the given type. The logic sends to flush recovery logic information of one or more entries in only groups marked with the first indication.Type: GrantFiled: December 3, 2018Date of Patent: September 15, 2020Assignee: Advanced Micro Devices, Inc.Inventors: Erik D. Swanson, Michael Estlick, Sneha V. Desai
-
Patent number: 10747542Abstract: A processor circuit and an operation method thereof are provided. The processor circuit includes a first alias queue module, a second alias queue module, and a pattern detection module. The pattern detection module is coupled to the first alias queue module and the second alias queue module. When a next sequential instruction pointer value of a store data instruction of the first alias queue module is matched, and a next sequential instruction pointer value of a store address instruction of the second alias queue module is matched, the pattern detection module determines that the load instruction depends on the store data instruction or the store address instruction according to a pattern value corresponding to the store data instruction.Type: GrantFiled: August 22, 2018Date of Patent: August 18, 2020Assignee: Shanghai Zhaoxin Semiconductor Co., Ltd.Inventor: Xiaolong Fei
-
Patent number: 10725786Abstract: Method and apparatus for a completion mechanism for a microprocessor are provided by marking entries in a section of an Instruction Completion Table (ICT) as ready to complete using corresponding Ready to Complete (RTC) status bits; determining a tail pointer indicating a start of the entries in the ICT that are ready for completion in a current clock cycle; performing a counting leading ones on an RTC vector that organizes the RTC status bits according to a program order for completing the entries to determine a count leading ones pointer that indicates an end of the entries in the ICT that are ready for completion in the current clock cycle; completing instructions included in the entries between the tail pointer and the count leading ones pointer in one clock cycle; and updating the tail pointer to a value of the count leading ones pointer for a subsequent clock cycle.Type: GrantFiled: August 23, 2018Date of Patent: July 28, 2020Assignee: International Business Machines CorporationInventors: Kenneth L. Ward, Susan E. Eisen, Dung Q. Nguyen, Glenn O. Kincaid, Joe Lee, Deepak K. Singh
-
Patent number: 10706494Abstract: A method for processing data in a graphics processing unit including receiving an indication that all threads of a warp in a graphics processing unit (GPU) are to execute a same branch in a first set of instructions, storing one or more predicate bits in a memory as a single set of predicate bits, wherein the single set of predicate bits applies to all of the threads in the warp, and executing a portion of the first set of instructions in accordance with the single set of predicate bits. Executing the first set of instructions may include executing the first set of instruction in accordance with the single set of predicate bits using a single instruction, multiple data (SIMD) processing core and/or executing the first set of instruction in accordance with the single set of predicate bits using a scalar processing unit.Type: GrantFiled: August 14, 2018Date of Patent: July 7, 2020Assignee: QUALCOMM IncorporatedInventors: Andrew Evan Gruber, Pramod Vasant Argade, Jing Wu
-
Patent number: 10691461Abstract: Data processing circuitry comprises fetch circuitry to fetch blocks, containing instructions for execution, defined by a fetch queue; and prediction circuitry to predict one or more next blocks to be fetched and to add the predicted next blocks to the fetch queue; the prediction circuitry comprising: branch prediction circuitry to detect a predicted branch destination for a branch instruction in a current block, the predicted branch destination representing either a branch target for a branch predicted to be taken or a next instruction after the branch instruction, for a branch predicted not to be taken; and sequence prediction circuitry to detect sequence data, associated with the predicted branch destination, identifying a next block following the predicted branch destination in the program flow order having a next instance of a branch instruction, to add to the fetch queue the identified next block and any intervening blocks between the current block and the identified next block, and to initiate branch prType: GrantFiled: December 22, 2017Date of Patent: June 23, 2020Assignee: ARM LimitedInventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Vincenzo Consales, Eddy Lapeyre
-
Patent number: 10691462Abstract: A processor and instruction graduation unit for a processor. In one embodiment, a processor or instruction graduation unit according to the present invention includes a linked-list-based multi-threaded graduation buffer and a graduation controller. The graduation buffer stores identification values generated by an instruction decode and dispatch unit of the processor as part of one or more linked-list data structures. Each linked-list data structure formed is associated with a particular program thread running on the processor. The number of linked-list data structures formed is variable and related to the number of program threads running on the processor. The graduation controller includes linked-list head identification registers and linked-list tail identification registers that facilitate reading and writing identifications values to linked-list data structures associated with particular program threads.Type: GrantFiled: December 14, 2017Date of Patent: June 23, 2020Assignee: ARM Finance Overseas LimitedInventor: Kjeld Svendsen
-
Patent number: 10613859Abstract: An execution pipeline architecture of a microprocessor employs a third-pass functional unit, for example, third-level of arithmetic logic unit (ALU) or third short-latency execution unit to execute instructions with reduced complexity and area cost of out-of-order execution. The third-pass functional unit allows instructions with long latency execution to be moved into a retire queue. The retire queue further includes the third functional unit (e.g., ALU), a reservation station and a graduate buffer. Data dependencies of dependent instructions in the retire queue is handled independently from the main pipeline.Type: GrantFiled: August 18, 2016Date of Patent: April 7, 2020Assignee: Synopsys, Inc.Inventor: Thang Tran
-
Patent number: 10614019Abstract: In general, embodiments of the technology relate to a method and system for performing fast ordered writes in a storage appliance that includes multiple separate storage modules. More specifically, embodiments of the technology enable multicasting of data to multiple storage modules in a storage appliance, where the order in which the write requests are processed is the same across all storage modules in the storage appliance.Type: GrantFiled: April 28, 2017Date of Patent: April 7, 2020Assignee: EMC IP Holding Company LLCInventors: Michael Nishimoto, Samir Rajadnya
-
Patent number: 10553316Abstract: A system for generating an alimentary instruction set based on vibrant constitutional guidance using artificial intelligence includes a diagnostic engine operating on at least a server and configured to receive at least a biological extraction from a user and generate a diagnostic output, based on the at least a biological extraction. The system includes a plan generation module operating on the at least a server, the plan generation module designed and configured to generate, based on the diagnostic output, a comprehensive instruction set associated with the user. The system is further configured to generate an alimentary instruction set associated with the user based on the comprehensive instruction set, wherein the alimentary instruction set is configured to interact with a plurality of processes and services based on components of the alimentary instruction set.Type: GrantFiled: April 4, 2019Date of Patent: February 4, 2020Assignee: KPN Innovations, LLCInventor: Kenneth Neumann
-
Patent number: 10445101Abstract: In a processing pipeline, hazards involving conditional instructions may be ignored when the conditional instruction would fail its test condition and there are no earlier instructions than the conditional instruction remaining which could potentially update the condition status information used to evaluate the test condition.Type: GrantFiled: August 1, 2016Date of Patent: October 15, 2019Assignee: ARM LimitedInventor: Spyros Lyberis
-
Patent number: 10430197Abstract: According to one general aspect, an apparatus may include a register circuit and an instruction scheduler circuit. The register circuit may include a plurality of physical registers that are partitioned into at least a common portion that is associated with a predefined plurality of instructions, and a shared portion, and a plurality of write ports, wherein each portion is associated with at least one respective write port. The instruction scheduler circuit configured to determine an instruction, and rename an architectural register associated with the instruction to a physical register. Wherein the portion including the physical register is selected based, at least in part, upon a characteristic of the current instruction.Type: GrantFiled: July 21, 2017Date of Patent: October 1, 2019Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventor: Ankit Ghiya
-
Patent number: 10430208Abstract: A method and system for using multiple versions of a software component, includes storing, in memory, a first function table that points to executable code in the memory for functions from a first version of the software component, and storing, in the memory, a second function table that points to executable code in the memory for functions from a second version of the software component, referencing the first function table, when running a first application thread, to execute the functions from the first version of the software component; and referencing the second function table, when running a second application thread that is active concurrently with the first application thread, to execute the functions from the second version of the software component.Type: GrantFiled: May 2, 2017Date of Patent: October 1, 2019Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Kai-Ting Amy Wang, Peng Wu, Brice Dobry, Haichuan Wang
-
Patent number: 10338926Abstract: A computer implemented method for processing machine instructions by a physical processor, includes receiving a machine instruction, stored in a memory, to execute, the machine instruction including an identification of at least one first operation to execute and a conditional prefix representing a condition to verify to execute the at least one first operation; evaluating, using a management module, the prefix, and executing, using a processing unit, the at least one first operation identified in the machine instruction, according to whether the condition is verified or not.Type: GrantFiled: May 19, 2015Date of Patent: July 2, 2019Assignee: BULL SASInventor: Ghassan Chehaibar