Loop Execution Patents (Class 712/241)
-
Patent number: 12175244
Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.
Type: Grant
Filed: November 13, 2023
Date of Patent: December 24, 2024
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
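
The controller behaviour described in this abstract can be sketched as a small software model. This is only an illustration, not the patented hardware: the class and method names, the 0/1 encodings of the indicator values, the increment-by-one "advance", and the FIFO depth are all assumptions.

```python
from collections import deque


class NestedLoopController:
    """Toy model of the second register, comparator and predicate FIFO."""

    OUTER, INNER = 1, 0  # assumed encodings of the two indicator values

    def __init__(self, inner_initial, inner_threshold, fifo_depth=4):
        self.inner_initial = inner_initial
        self.inner_threshold = inner_threshold
        self.inner = inner_initial  # models the "second register"
        # Models the predicate FIFO; the initial contents are an assumption.
        self.fifo = deque([self.INNER] * fifo_depth, maxlen=fifo_depth)

    def tick(self):
        """Advance the inner value and push one indicator into the FIFO."""
        self.inner += 1  # assumed: "advanced" means incremented by one
        at_threshold = self.inner == self.inner_threshold
        self.fifo.append(self.OUTER if at_threshold else self.INNER)
        if at_threshold:
            self.inner = self.inner_initial  # reset on reaching the threshold
        return self.fifo[-1]


controller = NestedLoopController(inner_initial=0, inner_threshold=3)
indicators = [controller.tick() for _ in range(6)]
```

With a threshold of 3, every third tick pushes the outer loop indicator and the rest push the inner loop indicator.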
-
Patent number: 12099823
Abstract: A computer-implemented method, system and computer program product for reducing register pressure. Loops of a computer program with a number of live variables that exceeds a threshold number, such as the number of available registers with capacity to store data, are identified. Such identified loops may be said to be subject to high register pressure. Upon identifying such loops in the computer program, chains within each identified loop are identified, where each chain includes load and store instructions from the same induction address and where the variable offsets of the load and store instructions are loop invariants. The address expressions for the load and store instructions in the identified chains may then be modified or changed to reuse common variable offsets using an analysis and transformation process. By reusing common variable offsets, there are fewer variable offsets that need to be stored in the registers, thereby mitigating register pressure.
Type: Grant
Filed: January 16, 2023
Date of Patent: September 24, 2024
Assignee: International Business Machines Corporation
Inventors: Zheng Chen, Ke Wen Lin, Si Yuan Zhang
-
Patent number: 12093811
Abstract: A fractal computing device according to an embodiment of the present application may be included in an integrated circuit device. The integrated circuit device includes a universal interconnect interface and other processing devices. The calculating device interacts with other processing devices to jointly complete a user specified calculation operation. The integrated circuit device may also include a storage device. The storage device is respectively connected with the calculating device and other processing devices and is used for data storage of the computing device and other processing devices.
Type: Grant
Filed: December 23, 2021
Date of Patent: September 17, 2024
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Shaoli Liu, Guang Jiang, Yongwei Zhao, Jun Liang
-
Patent number: 12079470
Abstract: Disclosed embodiments relate to one or more techniques to control access by a requestor of a computing system to a shared memory resource. In one embodiment, a technique includes determining a number (N) of pending requests to be sent to the memory by the requestor, determining a number (M) of requests that the requestor is limited to sending based on an amount of buffering resources available, and comparing M to N. When N is both greater than zero and less than or equal to M, the requestor sends the N pending requests to the memory. When N is both greater than zero and greater than M, M is compared to a hysteresis value (R) and, when M is less than R, the requestor sends R of the N pending requests to the memory.
Type: Grant
Filed: July 19, 2021
Date of Patent: September 3, 2024
Assignee: Texas Instruments Incorporated
Inventor: Matthew Pierson
-
Patent number: 11989554
Abstract: Techniques are disclosed for reducing or eliminating loop overhead caused by function calls in processors that form part of a pipeline architecture. The processors in the pipeline process data blocks in an iterative fashion, with each processor in the pipeline completing one of several iterations associated with a processing loop for a commonly-executed function. The described techniques leverage the use of message passing for pipelined processors to enable an upstream processor to signal to a downstream processor when processing has been completed, and thus a data block is ready for further processing in accordance with the next loop processing iteration. The described techniques facilitate a zero loop overhead architecture, enable continuous data block processing, and allow the processing pipeline to function indefinitely within the main body of the processing loop associated with the commonly-executed function where efficiency is greatest.
Type: Grant
Filed: December 23, 2020
Date of Patent: May 21, 2024
Assignee: Intel Corporation
Inventors: Kameran Azadet, Jeroen Leijten, Joseph Williams
-
Patent number: 11983533
Abstract: There is provided a data processing apparatus comprising history storage circuitry that stores sets of behaviours of helper instructions for a control flow instruction. Pointer storage circuitry stores pointers, each associated with one of the sets. The behaviours in the one of the sets are indexed according to one of the pointers associated with that one of the sets. Increment circuitry increments at least some of the pointers in response to an increment event and prediction circuitry determines a predicted behaviour of the control flow instruction using one of the sets of behaviours.
Type: Grant
Filed: June 28, 2022
Date of Patent: May 14, 2024
Assignee: Arm Limited
Inventors: Joseph Michael Pusdesris, Alexander Cole Shulyak, Yasuo Ishii, Houdhaifa Bouzguarrou
-
Patent number: 11972236
Abstract: A method for compiling and executing a nested loop includes initializing a nested loop controller with an outer loop count value and an inner loop count value. The nested loop controller includes a predicate FIFO. The method also includes coalescing the nested loop and, during execution of the coalesced nested loop, causing the nested loop controller to populate the predicate FIFO and executing a get predicate instruction having an offset value, where the get predicate returns a value from the predicate FIFO specified by the offset value. The method further includes predicating an outer loop instruction on the returned value from the predicate FIFO.
Type: Grant
Filed: September 12, 2022
Date of Patent: April 30, 2024
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
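
Loop coalescing with a predicated outer-loop instruction can be illustrated in software. The sketch below is an assumption-laden analogy rather than the claimed method: the function name and the row-sum workload are hypothetical, and the predicate FIFO and offset-based get predicate instruction are reduced to a single boolean computed per iteration.

```python
def row_sums_coalesced(rows, cols, data):
    """Sum each row of a rows x cols matrix stored flat, using one loop.

    The nested (row, column) loops are coalesced into a single loop of
    rows * cols iterations. The work that belonged to the outer loop
    (emitting a row sum) is guarded by a predicate that is true only on
    the last inner iteration of each row.
    """
    sums = []
    accumulator = 0
    inner = 0
    for k in range(rows * cols):
        accumulator += data[k]          # inner-loop instruction, always runs
        inner += 1
        outer_predicate = inner == cols
        if outer_predicate:             # outer-loop instruction, predicated
            sums.append(accumulator)
            accumulator = 0
            inner = 0
    return sums


result = row_sums_coalesced(2, 3, [1, 2, 3, 4, 5, 6])
```

Here `result` holds one sum per row of the 2 x 3 input.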
-
Patent number: 11954496
Abstract: In various examples, systems and methods for reducing written requirements in a system on chip (SoC) are described herein. For instance, a total number of iterations may be determined for processing data, such as data representing an array. In some circumstances, a set of iterations may include a first number of iterations that is less than a second number of iterations. As such, and during execution of the set of iterations, a predicate flag corresponding to an excess iteration of the set of iterations may be generated, where the excess iteration corresponds to an iteration that is part of a number of excess iterations that is associated with a difference between the first number of iterations and the second number of iterations. Based on the predicate flag, one or more first values corresponding to the iteration may be prevented from being written to memory.
Type: Grant
Filed: August 2, 2021
Date of Patent: April 9, 2024
Assignee: NVIDIA Corporation
Inventors: Ching-Yu Hung, Ravi P Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani
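
The idea of suppressing stores for excess iterations can be sketched as follows. This is a hypothetical software analogy: the function name, the fixed-width chunking, and the scaling workload are assumptions chosen for illustration and are not taken from the patent.

```python
def scale_in_chunks(src, width, factor):
    """Scale src in fixed-width chunks, suppressing stores past the end.

    The loop always runs whole chunks, so the last chunk may contain
    excess iterations. A per-iteration predicate flag marks those
    iterations, and their results are never written to the output.
    """
    out = [None] * len(src)
    suppressed = 0
    num_chunks = -(-len(src) // width)  # ceiling division
    for chunk in range(num_chunks):
        for lane in range(width):
            i = chunk * width + lane
            store_enabled = i < len(src)  # False for an excess iteration
            if store_enabled:
                out[i] = src[i] * factor
            else:
                suppressed += 1           # the write is skipped
    return out, suppressed


scaled, skipped = scale_in_chunks([1, 2, 3, 4, 5], width=4, factor=2)
```

Five elements processed four at a time need two chunks, so three of the eight lane iterations are excess and produce no write.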
-
Patent number: 11928468
Abstract: Various embodiments of a system and associated method for generating a valid mapping for a computational loop on a CGRA are disclosed herein. In particular, the method includes generating randomized schedules within particular constraints to explore greater mapping spaces than previous approaches. Further, the system and related method employ a feasibility test to test validity of each schedule such that mappings are only generated from valid schedules.
Type: Grant
Filed: November 23, 2021
Date of Patent: March 12, 2024
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Mahesh Balasubramanian, Aviral Shrivastava
-
Patent number: 11853757
Abstract: Systems, apparatuses and methods may provide for technology that identifies that an iterative loop includes a first code portion that executes in response to a condition being satisfied, generates a first vector mask that is to represent one or more instances of the condition being satisfied for one or more values of a first vector of values, and one or more instances of the condition being unsatisfied for the first vector of values, where the first vector of values is to correspond to one or more first iterations of the iterative loop, and conducts a vectorization process of the iterative loop based on the first vector mask.
Type: Grant
Filed: March 6, 2020
Date of Patent: December 26, 2023
Assignee: Intel Corporation
Inventors: Ilya Burylov, Mikhail Plotnikov, Hideki Ido, Ruslan Arutyunyan
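
Mask-based vectorization of a conditional loop body can be sketched with plain Python lists standing in for vector registers. The function names and the example condition are hypothetical, and real vector hardware would apply the mask in a single instruction rather than a comprehension.

```python
def masked_vector_step(values, condition, body):
    """Apply body only to lanes where condition holds; return result and mask."""
    mask = [condition(v) for v in values]           # one bit per lane
    result = [body(v) if m else v for v, m in zip(values, mask)]
    return result, mask


def vectorized_loop(data, width, condition, body):
    """Process data in vectors of up to `width` lanes using a per-vector mask."""
    out = []
    for start in range(0, len(data), width):
        vector = data[start:start + width]          # iterations start..start+width-1
        result, _mask = masked_vector_step(vector, condition, body)
        out.extend(result)
    return out


# Scalar original: for each x, if x < 0 then x = -x.
absolute = vectorized_loop([1, -2, 3, -4, 5], 4, lambda v: v < 0, lambda v: -v)
```

Each vector covers several consecutive iterations of the original loop, and the mask records for which of them the condition was satisfied.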
-
Patent number: 11816485
Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.
Type: Grant
Filed: July 4, 2021
Date of Patent: November 14, 2023
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
-
Patent number: 11734003
Abstract: The present disclosure relates to a compiler for causing a computer to execute a process. The process includes generating a first program, wherein the first program includes a first code that determines whether a first area of a memory that a process inside a loop included in a second program refers to in a first execution time of the loop is in duplicate with a second area of the memory that the process refers to in a second execution time of the loop, a second code that executes the process in an order of the first and second execution times when it is determined that the first and the second areas are duplicate, and a third code that executes the process for the first execution time and the process for the second execution time in parallel when it is determined that the first and the second areas are not duplicate.
Type: Grant
Filed: December 16, 2021
Date of Patent: August 22, 2023
Assignee: FUJITSU LIMITED
Inventor: Yuta Mukai
-
Patent number: 11726807
Abstract: A hypervisor communicates with a guest operating system running in a virtual machine supported by the hypervisor using a hyper-callback whose functions are based on the particular guest operating system running the virtual machine and are triggered by one or more events in the guest operating system. The functions are modified to make sure they are safe to execute and to allow only limited access to the guest operating system. Additionally, the functions are converted to byte code corresponding to a simplified CPU and memory model and are safety checked by the hypervisor when registered with the hypervisor. The functions are executed by the hypervisor without any context switch between the hypervisor and guest operating system, and when executed, provide information about the particular guest operating system, allowing the hypervisor to improve operations such as page reclamation, virtual CPU scheduling, I/O operations, and tracing of the guest operating system.
Type: Grant
Filed: May 5, 2017
Date of Patent: August 15, 2023
Assignee: VMware, Inc.
Inventors: Nadav Amit, Michael Wei, Cheng Chun Tu
-
Patent number: 11714620
Abstract: Decoupling loop dependencies using first in, first out (FIFO) buffers or other types of buffers to enable pipelining of loops is disclosed. By using buffers along with tailored ordering of their writes and reads, loop dependencies can be decoupled. This allows the loop to be pipelined and can lead to improved performance.
Type: Grant
Filed: January 14, 2022
Date of Patent: August 1, 2023
Assignee: Triad National Security, LLC
Inventors: Andrew John Dubois, Stephen Wayne Poole, Laura Marie Morton Monroe, Robert W. Robey, Brett R. Neuman
-
Patent number: 11663007
Abstract: In response to decoding a zero-overhead loop control instruction of an instruction set architecture, processing circuitry sets at least one loop control parameter for controlling execution of one or more iterations of a program loop body of a zero-overhead loop. Based on the at least one loop control parameter, loop control circuitry controls execution of the one or more iterations of the program loop body of the zero-overhead loop, the program loop body excluding the zero-overhead loop control instruction. Branch prediction disabling circuitry detects whether the processing circuitry is executing the program loop body of the zero-overhead loop associated with the zero-overhead loop control instruction, and dependent on detecting that the processing circuitry is executing the program loop body of the zero-overhead loop, disables branch prediction circuitry. This reduces power consumption during a zero-overhead loop when the branch prediction circuitry is unlikely to provide a benefit.
Type: Grant
Filed: October 1, 2021
Date of Patent: May 30, 2023
Assignee: Arm Limited
Inventors: Thomas Christopher Grocutt, François Christopher Jacques Botman
-
Patent number: 11630672
Abstract: An electronic apparatus and a method for reducing the number of commands are provided. The electronic apparatus includes a central processor and a co-processor. The central processor generates a plurality of original register setting commands to set at least one bit of at least one register of the co-processor. The original register setting commands include a plurality of first original register setting commands, and a plurality of setting targets of the first original register setting commands have address continuity. The central processor merges the first original register setting commands to generate at least one merged register setting command. The central processor transmits the at least one merged register setting command to the co-processor.
Type: Grant
Filed: September 22, 2020
Date of Patent: April 18, 2023
Assignee: Glenfly Tech Co., Ltd.
Inventors: Jianming Lin, Xuan Zhao
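
Merging register writes whose targets are address-contiguous can be sketched as below. The command representation (an address paired with a value), the unit address stride, and the function name are assumptions made for the example; the abstract does not specify a command format.

```python
def merge_register_writes(commands):
    """Merge runs of writes to consecutive addresses into burst commands.

    Input:  a list of (address, value) pairs in issue order.
    Output: a list of (start_address, [values...]) pairs, where each run
            of consecutive addresses becomes one merged command.
    """
    merged = []
    for address, value in commands:
        if merged and merged[-1][0] + len(merged[-1][1]) == address:
            merged[-1][1].append(value)       # extends the current run
        else:
            merged.append((address, [value]))  # starts a new command
    return merged


original = [(0x10, 1), (0x11, 2), (0x12, 3), (0x20, 9)]
bursts = merge_register_writes(original)
```

Four original commands collapse to two: one burst of three values starting at 0x10 and a single write to 0x20, so fewer commands travel to the co-processor.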
-
Patent number: 11614941
Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.
Type: Grant
Filed: March 30, 2018
Date of Patent: March 28, 2023
Assignee: QUALCOMM Incorporated
Inventors: Amrit Panda, Francisco Perez, Karamvir Chatha
-
Patent number: 11567768
Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.
Type: Grant
Filed: February 15, 2019
Date of Patent: January 31, 2023
Assignee: Graphcore Limited
Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson
-
Patent number: 11531393
Abstract: In one instance, a process for predicting and using emotions of a user in a virtual reality environment includes applying a plurality of physiological sensors to a user. The process further includes receiving physiological sensor signals from the physiological sensors and preparing the physiological sensor signals for further processing by removing at least some of the noise and artifacts and doing data augmentation. The process also includes producing an emotion-predictive signal by utilizing an emotion database. The emotion database has been developed based on empirical data from physiological sensors with known emotional states. The method also includes delivering the emotion-predictive signal to a virtual-reality system or other computer-implemented system. Other methods and systems are presented.
Type: Grant
Filed: June 26, 2020
Date of Patent: December 20, 2022
Assignee: Sensoriai LLC
Inventors: Yaochung Weng, Puneeth Iyengar, Roya Norouzi Kandalan
-
Patent number: 11520580
Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to repeatedly execute a first instruction based on a first field of the first instruction indicating that the first instruction is to be iteratively executed.
Type: Grant
Filed: March 7, 2016
Date of Patent: December 6, 2022
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Horst Diewald, Johann Zipperer
-
Patent number: 11494300
Abstract: Methods and apparatus provide virtual to physical address translations and a hardware page table walker with region based page table prefetch operation that produces virtual memory region tracking information that includes at least: data representing a virtual base address of a virtual memory region and a physical address of a first page table entry (PTE) corresponding to a virtual page within the virtual memory region. The hardware page table walker, in response to the TLB miss indication, prefetches a physical address of a second page table entry, that provides a final physical address for the missed TLB entry, using the virtual memory region tracking information. In some implementations, the prefetching of the physical PTE address is done in parallel with earlier levels of a page walk operation.
Type: Grant
Filed: September 26, 2020
Date of Patent: November 8, 2022
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Gabriel H. Loh
-
Patent number: 11411759
Abstract: An integrated circuit chip has a set of communication units, each unit being configured to operate according to a protocol in which a data packet sent by one unit is receivable by one unit only, each unit being configured to send at least one packet having one of a plurality of tiers to at least one other unit and being configured to specify, for each tier, a subset of destination units to which packets of that tier are to be sent, wherein each unit is configured to: receive a packet having one of the plurality of tiers; determine the tier of the received packet; and sequentially send packets having a different tier to the tier of the received packet to each of the respective subset of destination units for the different tier.
Type: Grant
Filed: July 28, 2020
Date of Patent: August 9, 2022
Assignee: SIEMENS INDUSTRY SOFTWARE INC.
Inventor: Iain Robertson
-
Patent number: 11294690
Abstract: Single Program, Multiple Data (SPMD) parallel processing of SPMD instructions can be generated among processors assigned to a task in a plurality of threads. The SPMD parallel processing can be increased in speed by performing predicated looping with the SPMD instructions in an activated SPMD mode of operation over a non-SPMD mode. Execution of overhead instructions is removed from the SPMD instructions associated with a thread in order to only execute the loop body of a loop associated with a data element of a data set in an enhanced Zero Loop Overhead (ZOL) device.
Type: Grant
Filed: January 29, 2020
Date of Patent: April 5, 2022
Assignee: Infineon Technologies AG
Inventor: Prakash Balasubramanian
-
Patent number: 11269633
Abstract: A method is provided for executing instructions in a pipelined processor. The method includes receiving a plurality of instructions in the pipelined processor. A first instruction of the plurality of instructions has a first bit field for holding a value for indicating how many times execution of the first instruction is repeated. Also, the value is for indicating how many no operation (NOP) instructions follow a last iteration of the repeated first instruction. The number of repeated instructions plus the number of NOP instructions is equal to the number of pipeline stages in the pipelined processor. In another embodiment, a pipelined data processor is provided for executing the repeating instruction.
Type: Grant
Filed: June 7, 2021
Date of Patent: March 8, 2022
Assignee: NXP B.V.
Inventor: Kevin Bruce Traylor
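
The relationship stated in this abstract, repeats plus NOPs equals the number of pipeline stages, can be shown with a small expansion helper. The function name, the string encoding of instructions, and the range check are assumptions for illustration only.

```python
def expand_repeat(instruction, repeat_count, pipeline_stages):
    """Expand one repeat-encoded instruction into its issue sequence.

    The same field value fixes both the repeat count and the NOP count,
    because the two must add up to the number of pipeline stages.
    """
    if not 0 <= repeat_count <= pipeline_stages:
        raise ValueError("repeat_count must be between 0 and pipeline_stages")
    nop_count = pipeline_stages - repeat_count
    return [instruction] * repeat_count + ["nop"] * nop_count


sequence = expand_repeat("mac", repeat_count=3, pipeline_stages=5)
```

For a five-stage pipeline and a repeat count of three, the sequence is three copies of the instruction followed by two NOPs.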
-
Patent number: 11263011
Abstract: A device for controlling neural inference processor cores is provided, including a compound instruction set architecture. The device comprises an instruction memory, which comprises a plurality of instructions for controlling a neural inference processor core. Each of the plurality of instructions comprises a control operation. The device further comprises a program counter. The device further comprises at least one loop counter register. The device is adapted to execute the plurality of instructions. Executing the plurality of instructions comprises: reading an instruction from the instruction memory based on a value of the program counter; updating the at least one loop counter register according to the control operation of the instruction; and updating the program counter according to the control operation of the instruction and a value of the at least one loop counter register.
Type: Grant
Filed: November 28, 2018
Date of Patent: March 1, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Michael V. Debole, Steven K. Esser, Myron D. Flickner, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
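
The fetch, loop-counter update, and program-counter update cycle in this abstract can be mimicked by a tiny interpreter. The instruction encoding and the four control operations ("set", "loop", "next", "halt") are invented for this sketch and are not taken from the patent.

```python
def run(program, num_counters=1, max_steps=1000):
    """Execute (operation, control) pairs and return the operation trace.

    Each step reads the instruction at the program counter, updates a
    loop counter register if the control operation says so, then updates
    the program counter from the control operation and the counter value.
    """
    pc = 0
    counters = [0] * num_counters
    trace = []
    for _ in range(max_steps):
        operation, control = program[pc]
        trace.append(operation)
        kind = control[0]
        if kind == "halt":
            break
        if kind == "set":                      # ("set", counter, count)
            counters[control[1]] = control[2]
            pc += 1
        elif kind == "loop":                   # ("loop", counter, target)
            counters[control[1]] -= 1
            pc = control[2] if counters[control[1]] > 0 else pc + 1
        else:                                  # ("next",)
            pc += 1
    return trace


program = [
    ("init", ("set", 0, 3)),
    ("compute", ("next",)),
    ("send", ("loop", 0, 1)),
    ("done", ("halt",)),
]
trace = run(program)
```

With the counter set to 3, the "compute" and "send" pair runs three times before control falls through to "done".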
-
Patent number: 11200170
Abstract: A data processing system includes a processor, a memory, and a cache. The cache includes a cache array, cache control circuitry coupled to receive an access address corresponding to a read access request from the processor and configured to determine whether the received access address hits or misses in the cache array, pre-load control storage circuitry outside the cache array and configured to store a pre-load cache line address and a corresponding stride value, and pre-load control circuitry coupled to the cache control circuitry and the pre-load control storage circuitry. The pre-load control circuitry is configured to receive the access address corresponding to the read access request from the processor and selectively initiate a pre-load from the memory to the cache based on whether a cache line address portion of the access address matches the stored pre-load cache line address.
Type: Grant
Filed: December 4, 2019
Date of Patent: December 14, 2021
Assignee: NXP USA, Inc.
Inventors: Paul Kimelman, Brian Christopher Kahne
-
Patent number: 11163572
Abstract: Memory systems and memory control methods are described.
Type: Grant
Filed: February 4, 2014
Date of Patent: November 2, 2021
Assignee: Micron Technology, Inc.
Inventors: Umberto Siciliani, Tommaso Vali, Walter Di-Francesco, Violante Moschiano, Andrea Smaniotto
-
Patent number: 11138010
Abstract: Embodiments of the present invention include a computer system that manages execution of one or more programs with one or more loops, where each loop has a loop level. Embodiments that manage loops that can skip execution and the number of loops changing during execution are also disclosed. A loop level register (LLEV) stores the loop level for a currently executing loop. A Loop-Back Program Counter Register (LBPR) has a table of one or more Loop-Back Registers. Each Loop-Back Register stores the loop level for a LBPR respective loop and a loop back PC location for the LBPR respective loop. A Program Counter points back to the PC location for each iteration of the loop. A Loop Current Count Register table (LCCR) tracks a number of iterations remaining to be executed for the loop. A loop management process causes one of the CPUs to execute all the one or more instructions of an iteration of the currently executing program loop.
Type: Grant
Filed: October 1, 2020
Date of Patent: October 5, 2021
Assignee: International Business Machines Corporation
Inventors: Chia-Yu Chen, Jungwook Choi, Brian William Curran, Bruce Fleischer, Kailash Gopalakrishnan, Jinwook Oh, Sunil K Shukla, Vijayalakshmi Srinivasan
-
Patent number: 11132200
Abstract: In a data processing apparatus loop end prediction is carried out to predict whether a branch represented by a loop end instruction will be taken, branching to the start of the loop for a further iteration to be carried out, or will be not taken leading to the further instructions following the loop. A loop iteration counter at the fetch stage of the apparatus maintains a count on the basis of which the prediction is made. The loop iteration counter is decremented both by loop end instructions reaching the end of the pipeline for which no prediction was made and by later loop end instructions for which a prediction is made, once it has been established that a loop is being executed. This dual counting mechanism allows “shadow” loop end instructions, which were already in the pipeline by the time it was established that a loop is being executed, to be accounted for.
Type: Grant
Filed: September 28, 2020
Date of Patent: September 28, 2021
Assignee: Arm Limited
Inventors: Vijay Chavan, Kim Richard Schuttenberg, Rong Zhang
-
Patent number: 11119892
Abstract: The present disclosure provides a method, apparatus, device and computer-readable storage medium for guiding symbolic execution. According to embodiments of the present disclosure, it is possible to determine the specific code region of the program, and obtain the program loop output of the program corresponding to the specific code region of the program by using the program inverse analysis method, so that it is possible to obtain the program loop input of the program corresponding to the specific code region by using the program loop predictor according to the program loop output of the program. In this way, the obtained program loop input of the program corresponding to the specific code region may be used to guide the symbolic execution to filter out impossible execution paths and jump out of the program code and reach the specific code region, thereby improving the reliability of the symbolic execution.
Type: Grant
Filed: March 25, 2020
Date of Patent: September 14, 2021
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Qian Feng, Shengjian Guo, Peng Li, Minghua Wang, Yulong Zhang, Tao Wei
-
Patent number: 11093224
Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.
Type: Grant
Filed: April 24, 2019
Date of Patent: August 17, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
-
Patent number: 11055095
Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.
Type: Grant
Filed: May 24, 2019
Date of Patent: July 6, 2021
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
-
Patent number: 11042377
Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.
Type: Grant
Filed: December 27, 2018
Date of Patent: June 22, 2021
Assignee: Intel Corporation
Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall
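
One plausible reading of a multi-dimensional loop counter update is an odometer-style increment across several loop indices, which lets nested loops collapse into one. The sketch below works under that assumption; the function name, the innermost-last ordering, the carry rule, and the wrap flag are hypothetical and not drawn from the instruction's actual definition.

```python
def update_loop_counters(counters, limits, step=1):
    """Advance a vector of loop counters by `step` with carry.

    counters[-1] is the innermost loop index and limits[d] is the trip
    count of dimension d. Returns the new counters and a flag that is
    True when the outermost dimension wrapped (the loop nest is done).
    """
    result = list(counters)
    carry = step
    for dim in reversed(range(len(result))):
        total = result[dim] + carry
        result[dim] = total % limits[dim]
        carry = total // limits[dim]
        if carry == 0:
            break
    return result, carry > 0


# Collapsed form of: for i in range(2): for j in range(3): visit(i, j)
visited = []
counters, done = [0, 0], False
while not done:
    visited.append(tuple(counters))
    counters, done = update_loop_counters(counters, [2, 3])
```

The single while loop visits the same (i, j) pairs, in the same order, as the two nested for loops it replaces.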
-
Patent number: 11029953
Abstract: Disclosed embodiments relate to the usage of a branch prediction unit in service of performance sensitive microcode flows. In one example, a processor includes a branch prediction unit (BPU) and a pipeline including a fetch stage to fetch an instruction specifying an opcode, an operand, and a loop condition based on the operand, wherein the BPU is to generate a hint reflecting a predicted result of the loop condition, a decode stage to generate either a first or a second micro-operation flow as per the hint, the pipeline to begin executing the generated micro-operation flow; a read stage to read the operand and resolve the loop condition; and execution circuitry to continue the generated micro-operation flow if the prediction was correct, and, otherwise, to flush the pipeline, update the prediction, and switch from the generated micro-operation flow to the other of the first and second micro-operation flows.
Type: Grant
Filed: June 26, 2019
Date of Patent: June 8, 2021
Assignee: Intel Corporation
Inventors: Michael Mishaeli, Ido Ouziel, Jared Warner Stark, IV
-
Patent number: 11010169Abstract: A processor device includes a scheduler and a performance counter. The scheduler schedules commands of a first command set and commands of a second command set for a functional unit. A performance counter counts numbers of times where events of interest respectively occur while the functional unit processes first operations directed by the first command set and second operations directed by the second command set. The commands of the first command set are repeatedly scheduled such that the numbers of times for all the events of interest are counted with regard to the first operations. The commands of the second command set are scheduled after the numbers of times for all the events of interest are counted with regard to the first operations.Type: GrantFiled: September 17, 2018Date of Patent: May 18, 2021Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Youngsam Shin, Dong-Hoon Yoo, Young-Hwan Heo
-
Patent number: 10990404Abstract: An apparatus and method are provided for performing branch prediction. The apparatus has processing circuitry to execute instructions, and branch prediction circuitry for making branch outcome predictions in respect of branch instructions. The branch prediction circuitry includes loop minimum iteration prediction circuitry having one or more entries, where each entry is associated with a loop controlling branch instruction that controls repeated execution of a loop comprising a number of instructions. During a training phase for an entry, the loop minimum iteration prediction circuitry seeks to identify a minimum number of iterations of the loop. The loop minimum iteration prediction circuitry is then arranged, when the training phase has successfully identified a minimum number of iterations, to subsequently identify a branch outcome prediction for the associated loop controlling branch instruction for use during the minimum number of iterations.Type: GrantFiled: August 10, 2018Date of Patent: April 27, 2021Assignee: Arm LimitedInventors: Houdhaifa Bouzguarrou, Luc Orion, Guillaume Bolbenes, Eddy Lapeyre
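One way to picture a single entry of the loop minimum iteration predictor is as a small state machine that records the smallest trip count seen during training and then supplies a prediction only for that many iterations. The sketch below is a simplified assumption: the class name, the fixed number of training runs, and the use of `None` to mean "defer to another predictor" are not taken from the abstract.

```python
class LoopMinIterationEntry:
    """Hypothetical entry tracking the minimum iteration count of one loop."""

    def __init__(self, training_runs=4):
        self.training_runs = training_runs  # assumed training length
        self.runs_seen = 0
        self.min_iterations = None

    def train(self, iterations):
        """Record the iteration count observed for one complete run of the loop."""
        self.runs_seen += 1
        if self.min_iterations is None or iterations < self.min_iterations:
            self.min_iterations = iterations

    @property
    def trained(self):
        return self.runs_seen >= self.training_runs

    def predict(self, iteration_index):
        """Return True (stay in the loop) within the learned minimum, else None."""
        if self.trained and iteration_index < self.min_iterations:
            return True
        return None  # no confident prediction; another predictor would decide
```

After training on runs of 7, 5, and 9 iterations, such an entry would predict "stay in the loop" for iteration indices 0 through 4 and decline to predict from index 5 onward.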
-
Patent number: 10972280Abstract: Profile_ID files, containing proprietary hardware operating details of an originating user who originates a process recipe, are encrypted before dissemination of the process recipe to an end user. Blockchain technology is used to enable the end user to validate the encrypted process recipe and control uniform validated process across multiple chambers and locations.Type: GrantFiled: October 9, 2018Date of Patent: April 6, 2021Assignee: Applied Materials, Inc.Inventors: Adolph Miller Allen, Paul Kiely, Noufal Kappachali
-
Patent number: 10936463Abstract: An apparatus and method are provided for detecting regularity in a number of occurrences of an event observed during multiple instances of a counting period. The apparatus has regularity detection circuitry for seeking to detect such a regularity, and a storage providing a storage entry having a count value field to store a count value and a confidence indication field to indicate a confidence in the regularity. The regularity detection circuitry is arranged to consider the multiple instances of the counting period in pairs, for one instance in a given pair of the pairs the regularity detection circuitry incrementing the count value following each occurrence of the event, and for the other instance in the given pair the regularity detection circuitry decrementing the count value following each occurrence of the event.Type: GrantFiled: August 22, 2018Date of Patent: March 2, 2021Assignee: Arm LimitedInventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Eddy Lapeyre, Luc Orion
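The paired increment and decrement scheme in this abstract lends itself to a short sketch. The idea illustrated here is that if two consecutive counting periods see the same number of events, counting up in one and down in the other returns the count value to zero. What happens to the confidence indication at the end of a pair is an assumption (raise it on zero, clear it otherwise), as are all names below.

```python
class RegularityDetector:
    """Hypothetical model of one storage entry with a count and a confidence field."""

    def __init__(self, max_confidence=3):
        self.count = 0
        self.confidence = 0
        self.max_confidence = max_confidence
        self.incrementing = True  # the first period of each pair counts up

    def event(self):
        """Record one occurrence of the event in the current counting period."""
        self.count += 1 if self.incrementing else -1

    def end_period(self):
        """Close the current counting period; evaluate the pair after its second period."""
        if self.incrementing:
            self.incrementing = False
            return
        # Assumed policy: a zero count means both periods matched.
        if self.count == 0:
            self.confidence = min(self.confidence + 1, self.max_confidence)
        else:
            self.confidence = 0
        self.count = 0
        self.incrementing = True


def observe(detector, events_per_period):
    """Feed a sequence of per-period event counts into the detector."""
    for n in events_per_period:
        for _ in range(n):
            detector.event()
        detector.end_period()
```

A single count field is enough to compare two periods this way, which is presumably the appeal over storing one count per period.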
-
Patent number: 10929129Abstract: Apparatus and method for modifying addresses, data, or program code associated with offloaded instructions. One embodiment of a processor comprises: a plurality of cores; an interconnect coupling the plurality of cores; and offload circuitry to transfer work from a first core of the plurality of cores to a second core of the plurality of cores without operating system (OS) intervention, the work comprising a plurality of instructions; the second core comprising a translator to translate information associated with a first instruction of the plurality of instructions from a first format usable on the first core to a second format usable on the second core; fetch, decode, and execution circuitry of the second core to fetch, decode, and/or execute the first instruction using the second format.Type: GrantFiled: June 29, 2019Date of Patent: February 23, 2021Assignee: Intel CorporationInventor: ElMoustapha Ould-Ahmed-Vall
-
Patent number: 10915320Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.Type: GrantFiled: December 21, 2018Date of Patent: February 9, 2021Assignee: INTEL CORPORATIONInventors: Vineeth Mekkat, Xi Chen, Manjunath Shevgoor
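The shift folding described in this abstract can be illustrated with a toy decoder that absorbs shift-by-immediate instructions into a per-register running total instead of sending them to an execution unit. The tuple instruction format, the `shl` mnemonic, and the choice to attach the folded amount to later instructions that use the register are assumptions made for this sketch only.

```python
def decode(instructions):
    """Fold left-shift immediates at decode time.

    `instructions` is a list of (op, reg, imm) tuples. Shift instructions
    update the folded shift amount for their register and are not allocated.
    Other instructions are allocated together with the folded amount that
    currently applies to their register.
    """
    folded = {}      # register name -> accumulated shift amount in bits
    allocated = []   # instructions that would reach an execution unit
    for op, reg, imm in instructions:
        if op == "shl":
            folded[reg] = folded.get(reg, 0) + imm
        else:
            allocated.append((op, reg, imm, folded.get(reg, 0)))
    return allocated, folded
```

Two back-to-back shifts of 2 and 3 bits on the same register are tracked as a single folded amount of 5, and neither shift occupies an execution unit in this model.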
-
Patent number: 10846228Abstract: The present disclosure relates to managing an instruction cache based on temporal locality of cached instructions. One example method includes receiving a request for a first instruction included in a software application; storing the first instruction in a cache structure; receiving a request for a second instruction included in the software application; determining that a cache entry must be removed from the cache structure to create space to store the second instruction; determining that the first instruction should be removed from the cache structure based on temporal locality attributes associated with at least one of the first instruction or the second instruction, the temporal locality attributes representing a likelihood that additional requests will be received for an associated instruction while the instruction is stored in the cache structure; removing the first instruction from the cache structure; and storing the second instruction in the cache structure.Type: GrantFiled: January 18, 2019Date of Patent: November 24, 2020Assignee: Google LLCInventors: Benjamin C. Serebrin, Kim Hazelwood
-
Patent number: 10802808Abstract: A non-transitory computer-readable medium storing a compiler to cause a computer to perform processing for compiling sequence programs including a declaration of a global variable and generating an execution program to be executed by a PLC. When there is a change in a memory address in the PLC assigned to the global variable between before and after an edit of a declaration of the global variable, the compiler gives an execution code to synchronize a first value stored at a memory address assigned to an unedited global variable with a second value stored at a memory address assigned to an edited global variable to an execution program corresponding to the sequence program that references the edited global variable.Type: GrantFiled: May 11, 2018Date of Patent: October 13, 2020Assignee: MITSUBISHI ELECTRIC CORPORATIONInventor: Nobutoshi Watanabe
-
Patent number: 10789069Abstract: Dynamically selecting a version of an instruction to be executed. Based on processing, a version of an instruction to be executed is selected. The selecting chooses the version from a plurality of versions of instructions. The plurality of versions of instructions includes an architected version and another version different from the architected version. The version of the instruction selected for execution is executed.Type: GrantFiled: March 3, 2017Date of Patent: September 29, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Michael K. Gschwind
-
Patent number: 10769273Abstract: An electronic control unit includes: a memory saving a program that has a call/return to/from a function represented as a control flow together with the function itself and a check instruction inserted in a program code of the program for checking whether the program code is executable based on the control flow. The electronic control unit may also include an input unit receiving an input of use frequency information indicative of a use frequency of the function; a measurement unit measuring a load of the electronic control unit; an execution object determiner determining the check instruction to be executed based on the use frequency information and the load; and an arithmetic unit executing the check instruction determined by the execution object determiner at a time of execution of the program.Type: GrantFiled: June 25, 2018Date of Patent: September 8, 2020Assignee: DENSO CORPORATIONInventor: Motonori Ando
-
Patent number: 10664251Abstract: Utilizing problem insights based on the entire environment as inputs to drive a static compiler. A decision engine receives inputs associated with applications to be compiled. The decision engine also receives optimization constraints based on available resources. A decision learning model is applied to the inputs to predict compiler performance and the results are provided to the decision engine. The decision engine determines a profile that comprises an order of execution and an optimization level for use during compilation of the plurality of applications. The profile is then used to schedule compiling and optimization of the applications.Type: GrantFiled: October 5, 2018Date of Patent: May 26, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Christopher Barton, Al Chakra, Sumit Patel
-
Patent number: 10649777Abstract: Prefetching data by determining that a first set of instructions that is processed by a computer processor indicates that a second set of instructions includes multiple iteration groups, where each of the iteration groups includes one or more loop-unrolled instructions, monitoring the second set of instructions as the second set of instructions is processed by the computer processor after the first set of instructions is processed by the computer processor, mapping a corresponding one of the loop-unrolled instructions in each of the iteration groups of the second set of instructions to a stride-tracking record that is shared by the corresponding loop-unrolled instructions, and prefetching data into a cache memory of the computer processor based on the stride-tracking record.Type: GrantFiled: May 14, 2018Date of Patent: May 12, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Yossi Shapira, Eyal Naor, Gregory Miaskovsky, Yair Fried
-
Patent number: 10606597Abstract: A NONTRANSACTIONAL STORE instruction, executed in transactional execution mode, performs stores that are retained, even if a transaction associated with the instruction aborts. The stores include user-specified information that may facilitate debugging of an aborted transaction.Type: GrantFiled: March 3, 2013Date of Patent: March 31, 2020Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
-
Patent number: 10600495Abstract: In described examples of circuitry and methods for testing multiple memories, a controller generates a sequence of commands to be applied to one or more of the memories, where each given command includes expected data, and a command address. Local adapters are individually coupled with the controller and with an associated memory. Each local adapter translates the command to a memory type of the associated memory, maps the command address to a local address of the associated memory, and provides test results to the controller according to read data from the local address of the associated memory and the expected data of the command.Type: GrantFiled: February 8, 2018Date of Patent: March 24, 2020Assignee: TEXAS INSTRUMENTS INCORPORATEDInventors: Devanathan Varadarajan, Sumant Kale
-
Patent number: 10585651Abstract: A method and system for partial connection of iterations during loop unrolling during compilation of a program by a compiler. Unrolled loop iterations of a loop in the program are selectively connected during loop unrolling during the compilation, including redirecting, to the head of the loop, undesirable edges of a control flow from one iteration to a next iteration of the loop. Merges on a path of hot code are removed to increase a scope for optimization of the program. The head of the loop and a start of a replicated loop body of the loop are equivalent points of the control flow.Type: GrantFiled: June 21, 2018Date of Patent: March 10, 2020Assignee: International Business Machines CorporationInventors: Andrew J. Craik, Vijay Sundaresan
-
Patent number: 10534607Abstract: Methods, systems, and apparatus, including an apparatus for accessing a N-dimensional tensor, the apparatus including, for each dimension of the N-dimensional tensor, a partial address offset value element that stores a partial address offset value for the dimension based at least on an initial value for the dimension, a step value for the dimension, and a number of iterations of a loop for the dimension. The apparatus includes a hardware adder and a processor. The processor obtains an instruction to access a particular element of the N-dimensional tensor. The N-dimensional tensor has multiple elements arranged across each of the N dimensions, where N is an integer that is equal to or greater than one. The processor determines, using the partial address offset value elements and the hardware adder, an address of the particular element and outputs data indicating the determined address for accessing the particular element of the N-dimensional tensor.Type: GrantFiled: February 23, 2018Date of Patent: January 14, 2020Assignee: Google LLCInventors: Olivier Temam, Harshit Khaitan, Ravi Narayanaswami, Dong Hyuk Woo
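The per-dimension partial address offsets in this abstract can be modeled in software as follows. Each dimension keeps a running offset that starts at its initial value and grows by its step value on each iteration; the element address is the sum of the offsets, standing in for the hardware adder. The class name, the `base` parameter, and the wrap-and-carry rule in `advance` are assumptions for illustration.

```python
class TensorAddressUnit:
    """Hypothetical model of per-dimension partial address offset elements."""

    def __init__(self, dims, base=0):
        # dims: list of (initial, step, iterations) tuples, outermost dimension first
        self.dims = dims
        self.base = base  # assumed base address, not mentioned in the abstract
        self.offsets = [initial for initial, _, _ in dims]
        self.iters = [0] * len(dims)

    def address(self):
        """Sum the partial offsets, as the hardware adder would."""
        return self.base + sum(self.offsets)

    def advance(self):
        """Step to the next element; return False after the last one."""
        for d in range(len(self.dims) - 1, -1, -1):
            initial, step, count = self.dims[d]
            self.iters[d] += 1
            if self.iters[d] < count:
                self.offsets[d] += step
                return True
            # This dimension is exhausted: reset it and carry outward.
            self.iters[d] = 0
            self.offsets[d] = initial
        return False


def addresses(dims, base=0):
    """List the address of every element in traversal order."""
    unit = TensorAddressUnit(dims, base)
    out = [unit.address()]
    while unit.advance():
        out.append(unit.address())
    return out
```

For a 2 x 3 tensor stored row-major, `addresses([(0, 3, 2), (0, 1, 3)])` walks addresses 0 through 5. Only additions are needed per access, with no multiplication by the index.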