Loop Execution Patents (Class 712/241)

Storage device and data processing method for cache reading data thereof

Patent number: 12254216

Abstract: The present application discloses a storage device and a data processing method thereof. The method includes: receiving a data processing command; converting the data processing command into a plurality of groups of subcommands, wherein each group of subcommands includes at least one subcommand; caching each group of subcommands in a corresponding storage element; and submitting the subcommands in each storage element to a storage medium controller in sequence to execute the subcommands, wherein, when at least two subcommands exist in a storage element, a data address of a subsequent subcommand and a previous subcommand are submitted simultaneously. In this way, the data processing efficiency of the storage device may be improved.

Type: Grant

Filed: September 6, 2023

Date of Patent: March 18, 2025

Assignee: ZHONGSHAN LONGSYS ELECTRONICS CO., LTD.

Inventors: Jiang Tang, Enhua Deng, Zhixiong Li

Nested loop control

Patent number: 12175244

Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.
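As a rough software illustration of the comparator and predicate FIFO behaviour described in this abstract, the sketch below models only the second register and the FIFO. The class name, the 0/1 indicator encoding, the FIFO depth, and the choice of incrementing by one per tick are all assumptions made for illustration; the abstract does not specify them, and the first register is left out.

```python
from collections import deque


class NestedLoopController:
    """Toy software model of the second register, comparator, and predicate FIFO."""

    INNER = 0  # hypothetical encoding of the inner loop indicator value
    OUTER = 1  # hypothetical encoding of the outer loop indicator value

    def __init__(self, initial_second, second_threshold, fifo_depth=4):
        self.initial_second = initial_second
        self.second_threshold = second_threshold
        self.second = initial_second
        # Predicate FIFO preloaded with an assumed "third value" of INNER.
        self.fifo = deque([self.INNER] * fifo_depth, maxlen=fifo_depth)

    def tick(self):
        """Advance the second value once and push one indicator into the FIFO."""
        self.second += 1  # assumption: "advanced" means incremented by one
        if self.second == self.second_threshold:
            self.fifo.append(self.OUTER)
            self.second = self.initial_second  # reset on reaching the threshold
        else:
            self.fifo.append(self.INNER)
        return self.fifo[-1]


controller = NestedLoopController(initial_second=0, second_threshold=3)
indicators = [controller.tick() for _ in range(6)]
print(indicators)
```

With a threshold of three, every third tick pushes the outer loop indicator, which is the signal a coalesced loop would use to run its outer loop work.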

Type: Grant

Filed: November 13, 2023

Date of Patent: December 24, 2024

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis

Reducing register pressure

Patent number: 12099823

Abstract: A computer-implemented method, system and computer program product for reducing register pressure. Loops of a computer program with a number of live variables that exceeds a threshold number, such as the number of available registers with capacity to store data, are identified. Such identified loops may be said to be subject to high register pressure. Upon identifying such loops in the computer program, chains within each identified loop are identified, where each chain includes load and store instructions from the same induction address and where the variable offsets of the load and store instructions are loop invariants. The address expressions for the load and store instructions in the identified chains may then be modified or changed to reuse common variable offsets using an analysis and transformation process. By reusing common variable offsets, there are fewer variable offsets that need to be stored in the registers, thereby mitigating register pressure.

Type: Grant

Filed: January 16, 2023

Date of Patent: September 24, 2024

Assignee: International Business Machines Corporation

Inventors: Zheng Chen, Ke Wen Lin, Si Yuan Zhang

Fractal calculating device and method, integrated circuit and board card

Patent number: 12093811

Abstract: A fractal computing device according to an embodiment of the present application may be included in an integrated circuit device. The integrated circuit device includes a universal interconnect interface and other processing devices. The calculating device interacts with other processing devices to jointly complete a user specified calculation operation. The integrated circuit device may also include a storage device. The storage device is respectively connected with the calculating device and other processing devices and is used for data storage of the computing device and other processing devices.

Type: Grant

Filed: December 23, 2021

Date of Patent: September 17, 2024

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Shaoli Liu, Guang Jiang, Yongwei Zhao, Jun Liang

Streaming engine with fetch ahead hysteresis

Patent number: 12079470

Abstract: Disclosed embodiments relate to one or more techniques to control access by a requestor of a computing system to a shared memory resource. In one embodiment, a technique includes determining a number (N) of pending requests to be sent to the memory by the requestor, determining a number (M) of requests that the requestor is limited to sending based on an amount of buffering resources available, and comparing M to N. When N is both greater than zero and less than or equal to M, the requestor sends the N pending requests to the memory. When N is both greater than zero and greater than M, M is compared to a hysteresis value (R) and, when M is less than R, the requestor sends R of the N pending requests to the memory.

Type: Grant

Filed: July 19, 2021

Date of Patent: September 3, 2024

Assignee: Texas Instruments Incorporated

Inventor: Matthew Pierson

Processing pipeline with zero loop overhead

Patent number: 11989554

Abstract: Techniques are disclosed for reducing or eliminating loop overhead caused by function calls in processors that form part of a pipeline architecture. The processors in the pipeline process data blocks in an iterative fashion, with each processor in the pipeline completing one of several iterations associated with a processing loop for a commonly-executed function. The described techniques leverage the use of message passing for pipelined processors to enable an upstream processor to signal to a downstream processor when processing has been completed, and thus a data block is ready for further processing in accordance with the next loop processing iteration. The described techniques facilitate a zero loop overhead architecture, enable continuous data block processing, and allow the processing pipeline to function indefinitely within the main body of the processing loop associated with the commonly-executed function where efficiency is greatest.

Type: Grant

Filed: December 23, 2020

Date of Patent: May 21, 2024

Assignee: Intel Corporation

Inventors: Kameran Azadet, Jeroen Leijten, Joseph Williams

Control flow prediction using pointers

Patent number: 11983533

Abstract: There is provided a data processing apparatus comprising history storage circuitry that stores sets of behaviours of helper instructions for a control flow instruction. Pointer storage circuitry stores pointers, each associated with one of the sets. The behaviours in the one of the sets are indexed according to one of the pointers associated with that one of the sets. Increment circuitry increments at least some of the pointers in response to an increment event and prediction circuitry determines a predicted behaviour of the control flow instruction using one of the sets of behaviours.

Type: Grant

Filed: June 28, 2022

Date of Patent: May 14, 2024

Assignee: Arm Limited

Inventors: Joseph Michael Pusdesris, Alexander Cole Shulyak, Yasuo Ishii, Houdhaifa Bouzguarrou

Nested loop control

Patent number: 11972236

Abstract: A method for compiling and executing a nested loop includes initializing a nested loop controller with an outer loop count value and an inner loop count value. The nested loop controller includes a predicate FIFO. The method also includes coalescing the nested loop and, during execution of the coalesced nested loop, causing the nested loop controller to populate the predicate FIFO and executing a get predicate instruction having an offset value, where the get predicate returns a value from the predicate FIFO specified by the offset value. The method further includes predicating an outer loop instruction on the returned value from the predicate FIFO.
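To make the idea of coalescing concrete, here is a minimal sketch that flattens a two-level row-sum loop into a single loop and predicates the outer loop work on a value read back from a FIFO. The helper `get_predicate`, the FIFO depth, and the row-sum workload are hypothetical stand-ins chosen for illustration, not the instruction or hardware defined by the patent.

```python
from collections import deque


def get_predicate(fifo, offset):
    """Return the FIFO entry `offset` positions back from the newest one."""
    return fifo[-1 - offset]


def row_sums_coalesced(rows, outer_count, inner_count):
    """Compute per-row sums with one flat loop instead of two nested loops."""
    fifo = deque(maxlen=4)  # assumed depth; stands in for the predicate FIFO
    inner = 0
    acc = 0
    sums = []
    for k in range(outer_count * inner_count):
        i, j = divmod(k, inner_count)
        acc += rows[i][j]  # inner loop body, executed on every iteration

        # Controller side: record whether this iteration ends an inner pass.
        inner += 1
        at_boundary = inner == inner_count
        fifo.append(at_boundary)
        if at_boundary:
            inner = 0

        # Outer loop instruction, predicated on the value read from the FIFO.
        if get_predicate(fifo, 0):
            sums.append(acc)
            acc = 0
    return sums


print(row_sums_coalesced([[1, 2, 3], [4, 5, 6]], outer_count=2, inner_count=3))
```

The outer loop work (storing and clearing the accumulator) runs only on iterations where the predicate is set, so the flat loop produces the same result as the nested form.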

Type: Grant

Filed: September 12, 2022

Date of Patent: April 30, 2024

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L Davis

Reduced memory write requirements in a system on a chip using automatic store predication

Patent number: 11954496

Abstract: In various examples, systems and methods for reducing write requirements in a system on chip (SoC) are described herein. For instance, a total number of iterations may be determined for processing data, such as data representing an array. In some circumstances, a set of iterations may include a first number of iterations that is less than a second number of iterations. As such, and during execution of the set of iterations, a predicate flag corresponding to an excess iteration of the set of iterations may be generated, where the excess iteration corresponds to an iteration that is part of a number of excess iterations that is associated with a difference between the first number of iterations and the second number of iterations. Based on the predicate flag, one or more first values corresponding to the iteration may be prevented from being written to memory.
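One way to picture store predication is a loop whose trip count is rounded up to a fixed lane width, with a per-iteration predicate that suppresses stores for the excess iterations. The sketch below assumes that interpretation; the function name, the lane width of four, and the scaling workload are illustrative choices rather than details taken from the patent.

```python
def scale_with_store_predication(data, factor, lanes=4):
    """Scale `data`, iterating in full groups of `lanes` and masking excess stores."""
    needed = len(data)                    # first number of iterations
    padded = -(-needed // lanes) * lanes  # second number, rounded up to the lane width
    out = [None] * needed
    suppressed = 0
    for start in range(0, padded, lanes):
        for lane in range(lanes):
            idx = start + lane
            predicate = idx < needed      # False for excess iterations
            if predicate:
                out[idx] = data[idx] * factor
            else:
                suppressed += 1           # the store is skipped, nothing is written
    return out, suppressed


print(scale_with_store_predication([1, 2, 3, 4, 5], factor=2))
```

For five elements and four lanes, eight iterations run and the three excess ones write nothing.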

Type: Grant

Filed: August 2, 2021

Date of Patent: April 9, 2024

Assignee: NVIDIA Corporation

Inventors: Ching-Yu Hung, Ravi P Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani

Systems and methods for improved mapping of computational loops on reconfigurable architectures

Patent number: 11928468

Abstract: Various embodiments of a system and associated method for generating a valid mapping for a computational loop on a CGRA are disclosed herein. In particular, the method includes generating randomized schedules within particular constraints to explore greater mapping spaces than previous approaches. Further, the system and related method employs a feasibility test to test validity of each schedule such that mappings are only generated from valid schedules.

Type: Grant

Filed: November 23, 2021

Date of Patent: March 12, 2024

Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY

Inventors: Mahesh Balasubramanian, Aviral Shrivastava

Vectorization of loops based on vector masks and vector count distances

Patent number: 11853757

Abstract: Systems, apparatuses and methods may provide for technology that identifies that an iterative loop includes a first code portion that executes in response to a condition being satisfied, generates a first vector mask that is to represent one or more instances of the condition being satisfied for one or more values of a first vector of values, and one or more instances of the condition being unsatisfied for the first vector of values, where the first vector of values is to correspond to one or more first iterations of the iterative loop, and conducts a vectorization process of the iterative loop based on the first vector mask.
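The mask-based part of this abstract can be sketched in plain Python by treating a slice of the input as one vector. The loop being vectorized here (double every positive element) and the vector width are hypothetical; the sketch does not cover the vector count distances mentioned in the title.

```python
def masked_double_positive(a, b, width=4):
    """Mask-based form of: for i: if a[i] > 0: b[i] = 2 * a[i]."""
    out = list(b)
    for start in range(0, len(a), width):
        vec = a[start:start + width]     # values for `width` consecutive iterations
        mask = [x > 0 for x in vec]      # True where the condition is satisfied
        doubled = [2 * x for x in vec]   # conditional body computed for every lane
        for lane, (active, value) in enumerate(zip(mask, doubled)):
            if active:                   # masked store: inactive lanes keep b[i]
                out[start + lane] = value
    return out


print(masked_double_positive([1, -2, 3, 0, 5], [0, 0, 0, 0, 0]))
```

The body is evaluated for all lanes, and the mask alone decides which results are kept, which is what lets the conditional code run without a per-element branch.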

Type: Grant

Filed: March 6, 2020

Date of Patent: December 26, 2023

Assignee: Intel Corporation

Inventors: Ilya Burylov, Mikhail Plotnikov, Hideki Ido, Ruslan Arutyunyan

Nested loop control

Patent number: 11816485

Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.

Type: Grant

Filed: July 4, 2021

Date of Patent: November 14, 2023

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis

Non-transitory computer-readable recording medium, compilation method, and compiler device

Patent number: 11734003

Abstract: The present disclosure relates to a compiler for causing a computer to execute a process. The process includes generating a first program, wherein the first program includes a first code that determines whether a first area of a memory that a process inside a loop included in a second program refers to in a first execution time of the loop is in duplicate with a second area of the memory that the process refers to in a second execution time of the loop, a second code that executes the process in an order of the first and second execution times when it is determined that the first and the second areas are duplicate, and a third code that executes the process for the first execution time and the process for the second execution time in parallel when it is determined that the first and the second areas are not duplicate.
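The three generated code pieces described here (an overlap test, an ordered path, and a parallel path) can be sketched as follows. The area representation as `(start, length)` pairs, the helper names, and the use of a thread pool for the parallel path are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def areas_overlap(a_start, a_len, b_start, b_len):
    """First code: do the two referenced memory areas share any element?"""
    return a_start < b_start + b_len and b_start < a_start + a_len


def increment(buf, start, length):
    """Example loop body: add one to each element of the referenced area."""
    for i in range(start, start + length):
        buf[i] += 1


def run_two_executions(buf, first, second, body):
    """Run two executions of `body` in order or in parallel, depending on overlap."""
    if areas_overlap(*first, *second):
        # Second code: areas overlap, so keep the original execution order.
        body(buf, *first)
        body(buf, *second)
        return "serial"
    # Third code: disjoint areas, so the two executions may run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(body, buf, *first), pool.submit(body, buf, *second)]
        for future in futures:
            future.result()
    return "parallel"


buf = [0] * 8
print(run_two_executions(buf, (0, 4), (4, 4), increment), buf)
print(run_two_executions(buf, (0, 4), (2, 4), increment), buf)
```

The check happens at run time, so the same compiled loop can take the parallel path for some inputs and fall back to the ordered path for others.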

Type: Grant

Filed: December 16, 2021

Date of Patent: August 22, 2023

Assignee: FUJITSU LIMITED

Inventor: Yuta Mukai

Safe execution of virtual machine callbacks in a hypervisor

Patent number: 11726807

Abstract: A hypervisor communicates with a guest operating system running in a virtual machine supported by the hypervisor using a hyper-callback whose functions are based on the particular guest operating system running the virtual machine and are triggered by one or more events in the guest operating system. The functions are modified to make sure they are safe to execute and to allow only limited access to the guest operating system. Additionally, the functions are converted to byte code corresponding to a simplified CPU and memory model and are safety checked by the hypervisor when registered with the hypervisor. The functions are executed by the hypervisor without any context switch between the hypervisor and guest operating system, and when executed, provide information about the particular guest operating system, allowing the hypervisor to improve operations such as page reclamation, virtual CPU scheduling, I/O operations, and tracing of the guest operating system.

Type: Grant

Filed: May 5, 2017

Date of Patent: August 15, 2023

Assignee: VMware, Inc.

Inventors: Nadav Amit, Michael Wei, Cheng Chun Tu

Decoupling loop dependencies using buffers to enable pipelining of loops

Patent number: 11714620

Abstract: Decoupling loop dependencies using first in, first out (FIFO) buffers or other types of buffers to enable pipelining of loops is disclosed. By using buffers along with tailored ordering of their writes and reads, loop dependencies can be decoupled. This allows the loop to be pipelined and can lead to improved performance.

Type: Grant

Filed: January 14, 2022

Date of Patent: August 1, 2023

Assignee: Triad National Security, LLC

Inventors: Andrew John Dubois, Stephen Wayne Poole, Laura Marie Morton Monroe, Robert W. Robey, Brett R. Neuman

Control of branch prediction for zero-overhead loop

Patent number: 11663007

Abstract: In response to decoding a zero-overhead loop control instruction of an instruction set architecture, processing circuitry sets at least one loop control parameter for controlling execution of one or more iterations of a program loop body of a zero-overhead loop. Based on the at least one loop control parameter, loop control circuitry controls execution of the one or more iterations of the program loop body of the zero-overhead loop, the program loop body excluding the zero-overhead loop control instruction. Branch prediction disabling circuitry detects whether the processing circuitry is executing the program loop body of the zero-overhead loop associated with the zero-overhead loop control instruction, and dependent on detecting that the processing circuitry is executing the program loop body of the zero-overhead loop, disables branch prediction circuitry. This reduces power consumption during a zero-overhead loop when the branch prediction circuitry is unlikely to provide a benefit.

Type: Grant

Filed: October 1, 2021

Date of Patent: May 30, 2023

Assignee: Arm Limited

Inventors: Thomas Christopher Grocutt, François Christopher Jacques Botman

Reducing a number of commands transmitted to a co-processor by merging register-setting commands having address continuity

Patent number: 11630672

Abstract: An electronic apparatus and a method for reducing the number of commands are provided. The electronic apparatus includes a central processor and a co-processor. The central processor generates a plurality of original register setting commands to set at least one bit of at least one register of the co-processor. The original register setting commands include a plurality of first original register setting commands, and a plurality of setting targets of the first original register setting commands have address continuity. The central processor merges the first original register setting commands to generate at least one merged register setting command. The central processor transmits the at least one merged register setting command to the co-processor.
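A minimal sketch of the merging step, assuming each original command is an `(address, value)` pair and that "address continuity" means each address is exactly one greater than the previous one. The merged command format, a start address plus a list of values, is a hypothetical choice for illustration.

```python
def merge_register_commands(commands):
    """Merge runs of commands whose target addresses are consecutive."""
    merged = []
    for address, value in commands:
        if merged and merged[-1][0] + len(merged[-1][1]) == address:
            # Address continues the current run: extend the merged command.
            merged[-1][1].append(value)
        else:
            # Gap in the address sequence: start a new merged command.
            merged.append((address, [value]))
    return merged


original = [(0x10, 1), (0x11, 2), (0x12, 3), (0x20, 4)]
print(merge_register_commands(original))
```

In this example four original commands become two, so fewer commands would need to be transmitted to the co-processor.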

Type: Grant

Filed: September 22, 2020

Date of Patent: April 18, 2023

Assignee: Glenfly Tech Co., Ltd.

Inventors: Jianming Lin, Xuan Zhao

System and method for decoupling operations to accelerate processing of loop structures

Patent number: 11614941

Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.

Type: Grant

Filed: March 30, 2018

Date of Patent: March 28, 2023

Assignee: QUALCOMM Incorporated

Inventors: Amrit Panda, Francisco Perez, Karamvir Chatha

Repeat instruction for loading and/or executing code in a claimable repeat cache a specified number of times

Patent number: 11567768

Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.

Type: Grant

Filed: February 15, 2019

Date of Patent: January 31, 2023

Assignee: Graphcore Limited

Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson

Human-computer interface systems and methods

Patent number: 11531393

Abstract: In one instance, a process for predicting and using emotions of a user in a virtual reality environment includes applying a plurality of physiological sensors to a user. The process further includes receiving physiological sensor signals from the physiological sensors and preparing the physiological sensor signals for further processing by removing at least some of the noise and artifacts and doing data augmentation. The process also includes producing an emotion-predictive signal by utilizing an emotion database. The emotion database has been developed based on empirical data from physiological sensors with known emotional states. The method also includes delivering the emotion-predictive signal to a virtual-reality system or other computer-implemented system. Other methods and systems are presented.

Type: Grant

Filed: June 26, 2020

Date of Patent: December 20, 2022

Assignee: Sensoriai LLC

Inventors: Yaochung Weng, Puneeth Iyengar, Roya Norouzi Kandalan

Processor with instruction iteration

Patent number: 11520580

Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to repeatedly execute a first instruction based on a first field of the first instruction indicating that the first instruction is to be iteratively executed.

Type: Grant

Filed: March 7, 2016

Date of Patent: December 6, 2022

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Horst Diewald, Johann Zipperer

Page table walker with page table entry (PTE) physical address prediction

Patent number: 11494300

Abstract: Methods and apparatus provide virtual to physical address translations and a hardware page table walker with region based page table prefetch operation that produces virtual memory region tracking information that includes at least: data representing a virtual base address of a virtual memory region and a physical address of a first page table entry (PTE) corresponding to a virtual page within the virtual memory region. The hardware page table walker, in response to the TLB miss indication, prefetches a physical address of a second page table entry, that provides a final physical address for the missed TLB entry, using the virtual memory region tracking information. In some implementations, the prefetching of the physical PTE address is done in parallel with earlier levels of a page walk operations.

Type: Grant

Filed: September 26, 2020

Date of Patent: November 8, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventor: Gabriel H. Loh

Emulating broadcast in a network on chip

Patent number: 11411759

Abstract: An integrated circuit chip has a set of communication units, each unit being configured to operate according to a protocol in which a data packet sent by one unit is receivable by one unit only, each unit being configured to send at least one packet having one of a plurality of tiers to at least one other unit and being configured to specify, for each tier, a subset of destination units to which packets of that tier are to be sent, wherein each unit is configured to: receive a packet having one of the plurality of tiers; determine the tier of the received packet; and sequentially send packets having a different tier to the tier of the received packet to each of the respective subset of destination units for the different tier.

Type: Grant

Filed: July 28, 2020

Date of Patent: August 9, 2022

Assignee: SIEMENS INDUSTRY SOFTWARE INC.

Inventor: Iain Robertson

Predicated looping on multi-processors for single program multiple data (SPMD) programs

Patent number: 11294690

Abstract: Single Program, Multiple Data (SPMD) parallel processing of SPMD instructions can be generated among processors assigned to a task in a plurality of threads. The SPMD parallel processing can be increased in speed by performing predicated looping with the SPMD instructions in an activated SPMD mode of operation over a non-SPMD mode. Execution of overhead instructions is removed from the SPMD instructions associated with a thread in order to only execute the loop body of a loop associated with a data element of a data set in an enhanced Zero Loop Overhead (ZOL) device.

Type: Grant

Filed: January 29, 2020

Date of Patent: April 5, 2022

Assignee: Infineon Technologies AG

Inventor: Prakash Balasubramanian

System and method for executing a number of NOP instructions after a repeated instruction

Patent number: 11269633

Abstract: A method is provided for executing instructions in a pipelined processor. The method includes receiving a plurality of instructions in the pipelined processor. A first instruction of the plurality of instructions has a first bit field for holding a value for indicating how many times execution of the first instruction is repeated. Also, the value is for indicating how many no operation (NOP) instructions follow a last iteration of the repeated first instruction. The number of repeated instructions plus the number of NOP instructions is equal to the number of pipeline stages in the pipelined processor. In another embodiment, a pipelined data processor is provided for executing the repeating instruction.
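The stated relationship, repeats plus NOPs equals the number of pipeline stages, can be worked through with a small sketch. It assumes the bit-field value is the total number of times the instruction executes; the function name and the string representation of instructions are hypothetical.

```python
def expand_repeated_instruction(instruction, repeat_value, pipeline_stages):
    """Expand one repeating instruction into its repeats followed by NOPs."""
    if not 0 < repeat_value <= pipeline_stages:
        raise ValueError("repeat value must be between 1 and the number of stages")
    # The same field fixes the NOP count: repeats + NOPs == pipeline stages.
    nop_count = pipeline_stages - repeat_value
    return [instruction] * repeat_value + ["nop"] * nop_count


print(expand_repeated_instruction("mac", repeat_value=3, pipeline_stages=5))
```

With a five-stage pipeline and a repeat value of three, two NOPs follow the last iteration, so the expanded sequence always occupies exactly one slot per pipeline stage.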

Type: Grant

Filed: June 7, 2021

Date of Patent: March 8, 2022

Assignee: NXP B.V.

Inventor: Kevin Bruce Traylor

Compound instruction set architecture for a neural inference chip

Patent number: 11263011

Abstract: A device for controlling neural inference processor cores is provided, including a compound instruction set architecture. The device comprises an instruction memory, which comprises a plurality of instructions for controlling a neural inference processor core. Each of the plurality of instructions comprises a control operation. The device further comprises a program counter. The device further comprises at least one loop counter register. The device is adapted to execute the plurality of instructions. Executing the plurality of instructions comprises: reading an instruction from the instruction memory based on a value of the program counter; updating the at least one loop counter register according to the control operation of the instruction; and updating the program counter according to the control operation of the instruction and a value of the at least one loop counter register.
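The read, update-counter, update-PC cycle in this abstract can be sketched as a tiny interpreter. The instruction tuple layout and the control operation names (`set_loop`, `loop_back`, `next`, `halt`) are invented for illustration and are not the instruction set defined by the patent.

```python
def run(program, max_steps=1000):
    """Execute (name, control_op, argument) instructions and return the trace."""
    pc = 0
    loop_counter = 0
    trace = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        name, control, arg = program[pc]      # read instruction at the program counter
        trace.append(name)
        if control == "set_loop":
            loop_counter = arg                # update the loop counter register
            pc += 1
        elif control == "loop_back":
            loop_counter -= 1                 # counter update from the control operation
            pc = arg if loop_counter > 0 else pc + 1  # PC depends on the counter value
        elif control == "halt":
            break
        else:                                 # "next": fall through to the next instruction
            pc += 1
    return trace


program = [
    ("init", "set_loop", 3),
    ("mac", "next", None),
    ("acc", "loop_back", 1),
    ("done", "halt", None),
]
print(run(program))
```

Each instruction carries its own control operation, so the loop needs no separate branch or compare instructions: the two-instruction body runs three times before control falls through to `done`.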

Type: Grant

Filed: November 28, 2018

Date of Patent: March 1, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Michael V. Debole, Steven K. Esser, Myron D. Flickner, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba

Cache pre-loading in a data processing system

Patent number: 11200170

Abstract: A data processing system includes a processor, a memory, and a cache. The cache includes a cache array, cache control circuitry coupled to receive an access address corresponding to a read access request from the processor and configured to determine whether the received access address hits or misses in the cache array, pre-load control storage circuitry outside the cache array and configured to store a pre-load cache line address and a corresponding stride value, and pre-load control circuitry coupled to the cache control circuitry and the pre-load control storage circuitry. The pre-load control circuitry is configured to receive the access address corresponding to the read access request from the processor and selectively initiate a pre-load from the memory to the cache based on whether a cache line address portion of the access address matches the stored pre-load cache line address.

Type: Grant

Filed: December 4, 2019

Date of Patent: December 14, 2021

Assignee: NXP USA, Inc.

Inventors: Paul Kimelman, Brian Christopher Kahne

Memory systems and memory control methods

Patent number: 11163572

Abstract: Memory systems and memory control methods are described.

Type: Grant

Filed: February 4, 2014

Date of Patent: November 2, 2021

Assignee: Micron Technology, Inc.

Inventors: Umberto Siciliani, Tommaso Vali, Walter Di-Francesco, Violante Moschiano, Andrea Smaniotto

Loop management in multi-processor dataflow architecture

Patent number: 11138010

Abstract: Embodiments of the present invention include a computer system that manages execution of one or more programs with one or more loops, where each loop has a loop level. Embodiments that manage loops that can skip execution and the number of loops changing during execution are also disclosed. A loop level register (LLEV) stores the loop level for a currently executing loop. A Loop-Back Program Counter Register (LBPR) has a table of one or more Loop-Back Registers. Each Loop-Back Register stores the loop level for a LBPR respective loop and a loop back PC location for the LBPR respective loop. A Program Counter points back to the PC location for each iteration of the loop. A Loop Current Count Register table (LCCR) tracks a number of iterations remaining to be executed for the loop. A loop management process causes one of the CPUs to execute all the one or more instructions of an iteration of the currently executing program loop.

Type: Grant

Filed: October 1, 2020

Date of Patent: October 5, 2021

Assignee: International Business Machines Corporation

Inventors: Chia-Yu Chen, Jungwook Choi, Brian William Curran, Bruce Fleischer, Kailash Gopalakrishnan, Jinwook Oh, Sunil K Shukla, Vijayalakshmi Srinivasan

Loop end prediction using loop counter updated by inflight loop end instructions

Patent number: 11132200

Abstract: In a data processing apparatus loop end prediction is carried out to predict whether a branch represented by a loop end instruction will be taken, branching to the start of the loop for a further iteration to be carried out, or will be not taken leading to the further instructions following the loop. A loop iteration counter at the fetch stage of the apparatus maintains a count on the basis of which the prediction is made. The loop iteration counter is decremented both by loop end instructions reaching the end of the pipeline for which no prediction was made and by later loop end instructions for which a prediction is made, once it has been established that a loop is being executed. This dual counting mechanism allows “shadow” loop end instructions, which were already in the pipeline by the time it was established that a loop is being executed, to be accounted for.

Type: Grant

Filed: September 28, 2020

Date of Patent: September 28, 2021

Assignee: Arm Limited

Inventors: Vijay Chavan, Kim Richard Schuttenberg, Rong Zhang

Method, device and computer-readable storage medium for guiding symbolic execution

Patent number: 11119892

Abstract: The present disclosure provides a method, apparatus, device and computer-readable storage medium for guiding symbolic execution. According to embodiments of the present disclosure, it is possible to determine the specific code region of the program, and obtain the program loop output of the program corresponding to the specific code region of the program by using the program inverse analysis method, so that it is possible to obtain the program loop input of the program corresponding to the specific code region by using the program loop predictor according to the program loop output of the program. In this way, the obtained program loop input of the program corresponding to the specific code region may be used to guide the symbolic execution to filter out impossible execution paths and jump out of the program code and reach the specific code region, thereby improving the reliability of the symbolic execution.

Type: Grant

Filed: March 25, 2020

Date of Patent: September 14, 2021

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Qian Feng, Shengjian Guo, Peng Li, Minghua Wang, Yulong Zhang, Tao Wei

Compilation to reduce number of instructions for deep learning processor

Patent number: 11093224

Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.

Type: Grant

Filed: April 24, 2019

Date of Patent: August 17, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
Nested loop control

Patent number: 11055095

Abstract: A nested loop controller includes a first register having a first value initialized to an initial first value, a second register having a second value initialized to an initial second value, and a third register configured as a predicate FIFO, initialized to have a third value. The second value is advanced in response to a tick instruction during execution of a loop. In response to the second value reaching a second threshold, the second register is reset to the initial second value. The nested loop controller further includes a comparator coupled to the second register and to the predicate FIFO and configured to provide an outer loop indicator value as input to the predicate FIFO when the second value is equal to the second threshold, and provide an inner loop indicator value as input to the predicate FIFO when the second value is not equal to the second threshold.

Type: Grant

Filed: May 24, 2019

Date of Patent: July 6, 2021

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
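
Illustrative sketch (not part of the patent record): the abstract of patent 11055095 above describes a counter that advances on each tick, resets at a threshold, and feeds an inner or outer loop indicator into a predicate FIFO. The Python model below is one possible reading of that behavior. The class name, the indicator encodings (0 and 1), the step size, and the FIFO depth are all assumptions made for illustration, and the first register mentioned in the abstract is omitted because the abstract does not say how it is used.

```python
from collections import deque


class NestedLoopController:
    """Software model of the second register and predicate FIFO (hypothetical names)."""

    INNER = 0  # assumed encoding of the inner loop indicator
    OUTER = 1  # assumed encoding of the outer loop indicator

    def __init__(self, initial, threshold, step=1, fifo_depth=4):
        self.initial = initial
        self.threshold = threshold
        self.step = step
        self.value = initial
        # The FIFO starts filled with an assumed initial value (INNER).
        self.predicate_fifo = deque([self.INNER] * fifo_depth, maxlen=fifo_depth)

    def tick(self):
        """Advance the counter once, as a tick instruction would."""
        self.value += self.step
        if self.value == self.threshold:
            # Threshold reached: record an outer loop indicator and reset.
            self.predicate_fifo.append(self.OUTER)
            self.value = self.initial
        else:
            self.predicate_fifo.append(self.INNER)


controller = NestedLoopController(initial=0, threshold=3, fifo_depth=6)
for _ in range(6):
    controller.tick()
print(list(controller.predicate_fifo))
```

With a threshold of 3, every third tick pushes an outer loop indicator, so a consumer reading the FIFO can tell which iterations close an inner loop pass.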
Collapsing of multiple nested loops, methods, and instructions

Patent number: 11042377

Abstract: In an embodiment, the present invention is directed to a processor including a decode logic to receive a multi-dimensional loop counter update instruction and to decode the multi-dimensional loop counter update instruction into at least one decoded instruction, and an execution logic to execute the at least one decoded instruction to update at least one loop counter value of a first operand associated with the multi-dimensional loop counter update instruction by a first amount. Methods to collapse loops using such instructions are also disclosed. Other embodiments are described and claimed.

Type: Grant

Filed: December 27, 2018

Date of Patent: June 22, 2021

Assignee: Intel Corporation

Inventors: Mikhail Plotnikov, Andrey Naraikin, Elmoustapha Ould-Ahmed-Vall
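
Illustrative sketch (not part of the patent record): the abstract of patent 11042377 above refers to an instruction that updates a multi-dimensional loop counter so that nested loops can be collapsed. The Python functions below show the general idea of collapsing a loop nest into a single loop driven by one counter-vector update. The odometer-style carry rule, the function names, and the use of per-dimension bounds are assumptions for illustration, not the instruction semantics claimed in the patent.

```python
def update_loop_counters(counters, bounds, amount=1):
    """Advance the innermost counter by `amount`, carrying into outer counters.

    Counters are ordered outermost first. Returns (new_counters, done), where
    done is True once the outermost counter has wrapped around.
    """
    counters = list(counters)
    carry = amount
    for dim in range(len(counters) - 1, -1, -1):
        total = counters[dim] + carry
        counters[dim] = total % bounds[dim]
        carry = total // bounds[dim]
        if carry == 0:
            break
    return counters, carry > 0


def collapsed_iteration_space(bounds):
    """Visit every point of a loop nest using a single loop."""
    counters = [0] * len(bounds)
    visited = []
    done = False
    while not done:
        visited.append(tuple(counters))
        counters, done = update_loop_counters(counters, bounds)
    return visited


print(collapsed_iteration_space((2, 3)))
```

The single `while` loop visits the same index tuples, in the same order, as two nested `for` loops over `range(2)` and `range(3)` would.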
Branch prediction unit in service of short microcode flows

Patent number: 11029953

Abstract: Disclosed embodiments relate to the usage of a branch prediction unit in service of performance sensitive microcode flows. In one example, a processor includes a branch prediction unit (BPU) and a pipeline including a fetch stage to fetch an instruction specifying an opcode, an operand, and a loop condition based on the operand, wherein the BPU is to generate a hint reflecting a predicted result of the loop condition, a decode stage to generate either a first or a second micro-operation flow as per the hint, the pipeline to begin executing the generated micro-operation flow; a read stage to read the operand and resolve the loop condition; and execution circuitry to continue the generated micro-operation flow if the prediction was correct, and, otherwise, to flush the pipeline, update the prediction, and switch from the generated micro-operation flow to the other of the first and second micro-operation flows.

Type: Grant

Filed: June 26, 2019

Date of Patent: June 8, 2021

Assignee: Intel Corporation

Inventors: Michael Mishaeli, Ido Ouziel, Jared Warner Stark, IV
Processor device collecting performance information through command-set-based replay

Patent number: 11010169

Abstract: A processor device includes a scheduler and a performance counter. The scheduler schedules commands of a first command set and commands of a second command set for a functional unit. The performance counter counts the number of times each event of interest occurs while the functional unit processes first operations directed by the first command set and second operations directed by the second command set. The commands of the first command set are repeatedly scheduled such that the counts for all the events of interest are obtained with regard to the first operations. The commands of the second command set are scheduled after the counts for all the events of interest have been obtained with regard to the first operations.

Type: Grant

Filed: September 17, 2018

Date of Patent: May 18, 2021

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Youngsam Shin, Dong-Hoon Yoo, Young-Hwan Heo
Apparatus and method for performing branch prediction using loop minimum iteration prediction

Patent number: 10990404

Abstract: An apparatus and method are provided for performing branch prediction. The apparatus has processing circuitry to execute instructions, and branch prediction circuitry for making branch outcome predictions in respect of branch instructions. The branch prediction circuitry includes loop minimum iteration prediction circuitry having one or more entries, where each entry is associated with a loop controlling branch instruction that controls repeated execution of a loop comprising a number of instructions. During a training phase for an entry, the loop minimum iteration prediction circuitry seeks to identify a minimum number of iterations of the loop. The loop minimum iteration prediction circuitry is then arranged, when the training phase has successfully identified a minimum number of iterations, to subsequently identify a branch outcome prediction for the associated loop controlling branch instruction for use during the minimum number of iterations.

Type: Grant

Filed: August 10, 2018

Date of Patent: April 27, 2021

Assignee: Arm Limited

Inventors: Houdhaifa Bouzguarrou, Luc Orion, Guillaume Bolbenes, Eddy Lapeyre
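
Illustrative sketch (not part of the patent record): the abstract of patent 10990404 above describes an entry that is trained to find a minimum iteration count for a loop and then supplies predictions for that many iterations. The Python class below models one plausible version of that idea. The class and method names, the fixed number of training runs, and the choice to return None once the minimum is exceeded (deferring to some other predictor) are assumptions for illustration.

```python
class LoopMinIterationEntry:
    """One predictor entry for a single loop-controlling branch (hypothetical model)."""

    def __init__(self, training_runs=4):
        self.training_runs = training_runs  # assumed length of the training phase
        self.runs_seen = 0
        self.min_continues = None

    @property
    def trained(self):
        return self.runs_seen >= self.training_runs

    def train(self, continues):
        """Record how many times the branch continued the loop in one execution."""
        if self.min_continues is None or continues < self.min_continues:
            self.min_continues = continues
        self.runs_seen += 1

    def predict(self, continues_so_far):
        """Return True (loop continues) within the learned minimum, else None."""
        if self.trained and continues_so_far < self.min_continues:
            return True
        return None  # no prediction from this entry


entry = LoopMinIterationEntry(training_runs=4)
for observed in (7, 5, 9, 6):
    entry.train(observed)
print(entry.min_continues, entry.predict(4), entry.predict(5))
```

After training on runs of 7, 5, 9, and 6 continues, the entry predicts "continue" only for the first 5 evaluations of the branch in a new run.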
Blockchain for distributed authentication of hardware operating profile

Patent number: 10972280

Abstract: Profile_ID files, containing proprietary hardware operating details of an originating user who originates a process recipe, are encrypted before dissemination of the process recipe to an end user. Blockchain technology is used to enable the end user to validate the encrypted process recipe and control uniform validated process across multiple chambers and locations.

Type: Grant

Filed: October 9, 2018

Date of Patent: April 6, 2021

Assignee: Applied Materials, Inc.

Inventors: Adolph Miller Allen, Paul Kiely, Noufal Kappachali
Apparatus and method for detecting regularity in a number of occurrences of an event observed during multiple instances of a counting period

Patent number: 10936463

Abstract: An apparatus and method are provided for detecting regularity in a number of occurrences of an event observed during multiple instances of a counting period. The apparatus has regularity detection circuitry for seeking to detect such a regularity, and a storage providing a storage entry having a count value field to store a count value and a confidence indication field to indicate a confidence in the regularity. The regularity detection circuitry is arranged to consider the multiple instances of the counting period in pairs, for one instance in a given pair of the pairs the regularity detection circuitry incrementing the count value following each occurrence of the event, and for the other instance in the given pair the regularity detection circuitry decrementing the count value following each occurrence of the event.

Type: Grant

Filed: August 22, 2018

Date of Patent: March 2, 2021

Assignee: Arm Limited

Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Eddy Lapeyre, Luc Orion
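
Illustrative sketch (not part of the patent record): the abstract of patent 10936463 above describes taking counting periods in pairs, incrementing a count value for each event in one period and decrementing it for each event in the other. The Python function below models that scheme. The confidence update rule (increment on a zero result, reset otherwise), the threshold, and the function name are assumptions for illustration, since the abstract states only that a confidence indication is stored.

```python
def detect_regularity(events_per_period, confidence_threshold=2):
    """Return True if paired counting periods repeatedly cancel out.

    events_per_period lists how many times the event occurred in each
    successive counting period.
    """
    confidence = 0
    for i in range(0, len(events_per_period) - 1, 2):
        count = 0
        # First period of the pair: increment once per event occurrence.
        for _ in range(events_per_period[i]):
            count += 1
        # Second period of the pair: decrement once per event occurrence.
        for _ in range(events_per_period[i + 1]):
            count -= 1
        if count == 0:
            confidence += 1  # both periods saw the same number of events
        else:
            confidence = 0   # assumed policy: any mismatch resets confidence
    return confidence >= confidence_threshold


print(detect_regularity([5, 5, 5, 5]))
print(detect_regularity([5, 4, 5, 5]))
```

One appeal of this up/down arrangement is that a single count field suffices: a final value of zero shows that the two periods matched without storing either period's total.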
Apparatus and method for modifying addresses, data, or program code associated with offloaded instructions

Patent number: 10929129

Abstract: Apparatus and method for modifying addresses, data, or program code associated with offloaded instructions. One embodiment of a processor comprises: a plurality of cores; an interconnect coupling the plurality of cores; and offload circuitry to transfer work from a first core of the plurality of cores to a second core of the plurality of cores without operating system (OS) intervention, the work comprising a plurality of instructions; the second core comprising a translator to translate information associated with a first instruction of the plurality of instructions from a first format usable on the first core to a second format usable on the second core; fetch, decode, and execution circuitry of the second core to fetch, decode, and/or execute the first instruction using the second format.

Type: Grant

Filed: June 29, 2019

Date of Patent: February 23, 2021

Assignee: Intel Corporation

Inventor: ElMoustapha Ould-Ahmed-Vall
Shift-folding for efficient load coalescing in a binary translation based processor

Patent number: 10915320

Abstract: A processor includes an instruction fetch circuit to retrieve instructions from memory, and a decode unit circuit to decode retrieved instructions. The decode unit circuit identifies a shift instruction, accumulates a shift folded immediate value to track a number of bit positions shifted for a source register, and prevents the shift instruction from allocation to an execution unit of the processor.

Type: Grant

Filed: December 21, 2018

Date of Patent: February 9, 2021

Assignee: INTEL CORPORATION

Inventors: Vineeth Mekkat, Xi Chen, Manjunath Shevgoor
Instruction cache management based on temporal locality

Patent number: 10846228

Abstract: The present disclosure relates to managing an instruction cache based on temporal locality of cached instructions. One example method includes receiving a request for a first instruction included in a software application; storing the first instruction in a cache structure; receiving a request for a second instruction included in the software application; determining that a cache entry must be removed from the cache structure to create space to store the second instruction; determining that the first instruction should be removed from the cache structure based on temporal locality attributes associated with at least one of the first instruction or the second instruction, the temporal locality attributes representing a likelihood that additional requests will be received for an associated instruction while the instruction is stored in the cache structure; removing the first instruction from the cache structure; and storing the second instruction in the cache structure.

Type: Grant

Filed: January 18, 2019

Date of Patent: November 24, 2020

Assignee: Google LLC

Inventors: Benjamin C. Serebrin, Kim Hazelwood
Compiler and programming support device

Patent number: 10802808

Abstract: A non-transitory computer-readable medium stores a compiler that causes a computer to compile sequence programs including a declaration of a global variable and to generate an execution program to be executed by a PLC. When editing the declaration of the global variable changes the memory address in the PLC assigned to that variable, the compiler adds, to the execution program corresponding to each sequence program that references the edited global variable, execution code that synchronizes a first value stored at the memory address assigned to the unedited global variable with a second value stored at the memory address assigned to the edited global variable.

Type: Grant

Filed: May 11, 2018

Date of Patent: October 13, 2020

Assignee: MITSUBISHI ELECTRIC CORPORATION

Inventor: Nobutoshi Watanabe
Dynamically selecting version of instruction to be executed

Patent number: 10789069

Abstract: Dynamically selecting a version of an instruction to be executed. Based on processing, a version of an instruction to be executed is selected. The selecting chooses the version from a plurality of versions of instructions. The plurality of versions of instructions includes an architected version and another version different from the architected version. The version of the instruction selected for execution is executed.

Type: Grant

Filed: March 3, 2017

Date of Patent: September 29, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
Electronic control unit

Patent number: 10769273

Abstract: An electronic control unit includes: a memory saving a program that has a call/return to/from a function represented as a control flow together with the function itself and a check instruction inserted in a program code of the program for checking whether the program code is executable based on the control flow. The electronic control unit may also include an input unit receiving an input of use frequency information indicative of a use frequency of the function; a measurement unit measuring a load of the electronic control unit; an execution object determiner determining the check instruction to be executed based on the use frequency information and the load; and an arithmetic unit executing the check instruction determined by the execution object determiner at a time of execution of the program.

Type: Grant

Filed: June 25, 2018

Date of Patent: September 8, 2020

Assignee: DENSO CORPORATION

Inventor: Motonori Ando
Analytics driven compiler

Patent number: 10664251

Abstract: Utilizing problem insights based on the entire environment as inputs to drive a static compiler. A decision engine receives inputs associated with applications to be compiled. The decision engine also receives optimization constraints based on available resources. A decision learning model is applied to the inputs to predict compiler performance and the results are provided to the decision engine. The decision engine determines a profile that comprises an order of execution and an optimization level for use during compilation of the plurality of applications. The profile is then used to schedule compiling and optimization of the applications.

Type: Grant

Filed: October 5, 2018

Date of Patent: May 26, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher Barton, Al Chakra, Sumit Patel
Hardware-based data prefetching based on loop-unrolled instructions

Patent number: 10649777

Abstract: Prefetching data by determining that a first set of instructions that is processed by a computer processor indicates that a second set of instructions includes multiple iteration groups, where each of the iteration groups includes one or more loop-unrolled instructions, monitoring the second set of instructions as the second set of instructions is processed by the computer processor after the first set of instructions is processed by the computer processor, mapping a corresponding one of the loop-unrolled instructions in each of the iteration groups of the second set of instructions to a stride-tracking record that is shared by the corresponding loop-unrolled instructions, and prefetching data into a cache memory of the computer processor based on the stride-tracking record.

Type: Grant

Filed: May 14, 2018

Date of Patent: May 12, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yossi Shapira, Eyal Naor, Gregory Miaskovsky, Yair Fried
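
Illustrative sketch (not part of the patent record): the abstract of patent 10649777 above describes mapping corresponding loop-unrolled instructions in each iteration group to one shared stride-tracking record and prefetching from it. The Python model below shows why sharing a record can help: each unrolled copy of a load has its own program counter, but together they walk memory with one small stride. The class, the confirmation rule (two equal strides in a row), and the mapping table are assumptions for illustration.

```python
class StrideRecord:
    """Stride tracker shared by all unrolled copies of one load (hypothetical)."""

    def __init__(self):
        self.last_addr = None
        self.stride = None
        self.confirmed = False

    def observe(self, addr):
        if self.last_addr is not None:
            new_stride = addr - self.last_addr
            # Assumed rule: a stride is confirmed when seen twice in a row.
            self.confirmed = new_stride == self.stride
            self.stride = new_stride
        self.last_addr = addr

    def prefetch_address(self):
        """Next address to prefetch, or None if no stride is confirmed."""
        return self.last_addr + self.stride if self.confirmed else None


def simulate(accesses, pc_to_record):
    """Feed (pc, address) pairs through shared records; return prefetch targets."""
    records = {}
    prefetches = []
    for pc, addr in accesses:
        record = records.setdefault(pc_to_record[pc], StrideRecord())
        record.observe(addr)
        target = record.prefetch_address()
        if target is not None:
            prefetches.append(target)
    return prefetches


# Four unrolled copies of the same load, each at a different program counter.
mapping = {0x10: "load_a", 0x14: "load_a", 0x18: "load_a", 0x1C: "load_a"}
trace = [(0x10, 1000), (0x14, 1008), (0x18, 1016), (0x1C, 1024)]
print(simulate(trace, mapping))
```

With one record per program counter, no copy of the load would have revealed a stride within this single pass. The shared record confirms the 8-byte stride on the third access.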
Nontransactional store instruction

Patent number: 10606597

Abstract: A NONTRANSACTIONAL STORE instruction, executed in transactional execution mode, performs stores that are retained, even if a transaction associated with the instruction aborts. The stores include user-specified information that may facilitate debugging of an aborted transaction.

Type: Grant

Filed: March 3, 2013

Date of Patent: March 31, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
Parallel memory self-testing

Patent number: 10600495

Abstract: In described examples of circuitry and methods for testing multiple memories, a controller generates a sequence of commands to be applied to one or more of the memories, where each given command includes expected data, and a command address. Local adapters are individually coupled with the controller and with an associated memory. Each local adapter translates the command to a memory type of the associated memory, maps the command address to a local address of the associated memory, and provides test results to the controller according to read data from the local address of the associated memory and the expected data of the command.

Type: Grant

Filed: February 8, 2018

Date of Patent: March 24, 2020

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Devanathan Varadarajan, Sumant Kale
Partial connection of iterations during loop unrolling

Patent number: 10585651

Abstract: A method and system for partial connection of iterations during loop unrolling during compilation of a program by a compiler. Unrolled loop iterations of a loop in the program are selectively connected during loop unrolling during the compilation, including redirecting, to the head of the loop, undesirable edges of a control flow from one iteration to a next iteration of the loop. Merges on a path of hot code are removed to increase a scope for optimization of the program. The head of the loop and a start of a replicated loop body of the loop are equivalent points of the control flow.

Type: Grant

Filed: June 21, 2018

Date of Patent: March 10, 2020

Assignee: International Business Machines Corporation

Inventors: Andrew J. Craik, Vijay Sundaresan
