Patents Examined by Keith E Vicary
-
Patent number: 12204363
Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. In a representative embodiment, a system includes an interconnection network, a processor, a host interface, and a configurable circuit cluster. The configurable circuit cluster may include a plurality of configurable circuits arranged in an array; an asynchronous packet network and a synchronous network coupled to each configurable circuit of the array; and a memory interface circuit and a dispatch interface circuit coupled to the asynchronous packet network and to the interconnection network. Each configurable circuit includes instruction or configuration memories for selection of a current data path configuration, a master synchronous network input, and a data path configuration for a next configurable circuit.
Type: Grant
Filed: January 15, 2024
Date of Patent: January 21, 2025
Assignee: Micron Technology, Inc.
Inventor: Tony M. Brewer
-
Patent number: 12197917
Abstract: A computer-implemented method includes fetching a fetch-packet containing a first hyper-block from a first address of a memory. The fetch-packet contains a bitwise distance from an entry point of the first hyper-block to a predicted exit point. A first branch instruction of the first hyper-block is executed that corresponds to a first exit point. The first branch instruction includes an address corresponding to an entry point of a second hyper-block. Responsive to executing the first branch instruction, a bitwise distance from the entry point of the first hyper-block to the first exit point is stored. A program counter is moved from the first exit point of the first hyper-block to the entry point of the second hyper-block.
Type: Grant
Filed: June 27, 2022
Date of Patent: January 14, 2025
Assignee: Texas Instruments Incorporated
Inventors: Kai Chirca, Timothy D. Anderson, David E. Smith, Jr., Paul D. Gauvreau
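The exit-distance bookkeeping described in this abstract can be sketched as a toy model. The function name, the byte-granularity distance, and the history dictionary are assumptions for illustration, not TI's implementation:

```python
# Taking a branch at a hyper-block exit records the distance from the
# hyper-block's entry to that exit (for later exit prediction), then
# moves the program counter to the target hyper-block's entry point.

def take_branch(pc, entry_addr, target_entry, exit_history):
    """Record the entry-to-exit distance, then redirect the program counter."""
    exit_history[entry_addr] = pc - entry_addr  # distance from entry to this exit
    return target_entry                         # PC moves to the next hyper-block's entry

history = {}
new_pc = take_branch(pc=0x1040, entry_addr=0x1000, target_entry=0x2000,
                     exit_history=history)
print(hex(new_pc), history[0x1000])  # 0x2000 64
```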
-
Patent number: 12197915
Abstract: Parallel instruction demarcators and methods for parallel instruction demarcation are included, wherein an instruction syllable sequence comprising a plurality of instruction syllables is received and stored at an instruction buffer. One or more logic blocks arranged in a sequence determine a size of an instruction and at least one boundary at which the instruction is demarcated. Additionally, a controlling logic block determines a restart point from which the sequence of instruction syllables is examined and demarcated into individual instructions.
Type: Grant
Filed: November 15, 2021
Date of Patent: January 14, 2025
Inventor: Sitaram Yadavalli
-
Patent number: 12190116
Abstract: A processor includes a time counter and a time-resource matrix and provides a method for statically dispatching instructions if the resources are available based on data stored in the time-resource matrix, wherein execution times for the instructions use a time count from the time counter to specify when the instructions may be provided to an execution pipeline. The execution times are based on fixed latency times of instructions, with the exception of the load instruction, which is based on the data cache hit latency time. A data cache miss causes the load instruction and subsequent dependent instructions to be statically replayed at a later time using the same time count.
Type: Grant
Filed: April 5, 2022
Date of Patent: January 7, 2025
Assignee: Simplex Micro, Inc.
Inventor: Thang Minh Tran
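A minimal sketch of the time-resource-matrix check, assuming a small wrap-around matrix indexed by cycle; the names and structure are illustrative only, not the patent's actual design:

```python
# An instruction is dispatched only if its required resource is free at
# the cycle given by the current time count plus the instruction's fixed
# latency; otherwise the dispatch is refused (to be retried later).

def try_dispatch(matrix, time_count, latency, resource):
    """Reserve `resource` at cycle time_count + latency if it is free."""
    slot = (time_count + latency) % len(matrix)  # wrap-around time-resource matrix
    if resource in matrix[slot]:                 # already booked at that cycle
        return False
    matrix[slot].add(resource)
    return True

trm = [set() for _ in range(8)]                  # 8-cycle matrix window
assert try_dispatch(trm, time_count=2, latency=3, resource="alu0")
assert not try_dispatch(trm, time_count=2, latency=3, resource="alu0")
```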
-
Patent number: 12190117
Abstract: Techniques are provided for allocating registers for a processor. The techniques include identifying a first instruction of an instruction dispatch set that meets all register allocation suppression criteria of a first set of register allocation suppression criteria, suppressing register allocation for the first instruction, identifying a second instruction of the instruction dispatch set that does not meet all register allocation suppression criteria of a second set of register allocation suppression criteria, and allocating a register for the second instruction.
Type: Grant
Filed: November 26, 2019
Date of Patent: January 7, 2025
Assignee: Advanced Micro Devices, Inc.
Inventors: Neil N. Marketkar, Arun A. Nair
-
Patent number: 12182575
Abstract: A data processing apparatus comprises: a physical register array, prediction circuitry, register rename circuitry, and hardware execution circuitry. The physical register array comprises a plurality of sectors, each sector having one or more access properties that differ from those of the other sectors and comprising at least one physical register. The prediction circuitry predicts, for a given instruction, a sector identifier identifying one of the sectors of the physical register array to be used for a destination register of the given instruction. The prediction circuitry is configured to select the sector identifier in dependence on prediction information learnt from performance monitoring information indicative of performance achieved for a sequence of instructions when using different sector identifiers for the given instruction.
Type: Grant
Filed: December 12, 2022
Date of Patent: December 31, 2024
Assignee: Arm Limited
Inventor: Mbou Eyole
-
Patent number: 12164927
Abstract: Techniques are disclosed relating to instruction scheduling in the context of instruction cache misses. In some embodiments, first-stage scheduler circuitry is configured to assign threads to channels and second-stage scheduler circuitry is configured to assign an operation from a given channel to a given execution pipeline based on decode of an operation for that channel. In some embodiments, thread replacement circuitry is configured to, in response to an instruction cache miss for an operation of a first thread assigned to a first channel, deactivate the first thread from the first channel.
Type: Grant
Filed: November 10, 2022
Date of Patent: December 10, 2024
Assignee: Apple Inc.
Inventors: Justin Friesenhahn, Benjiman L. Goodman
-
Patent number: 12159140
Abstract: An electronic device receives a single instruction to apply a neural network operation to a set of M-bit elements stored in one or more input vector registers to initiate a sequence of computational operations related to a neural network. In response to the single instruction, the electronic device implements the neural network operation on the set of M-bit elements to generate a set of P-bit elements by obtaining the set of M-bit elements from the one or more input vector registers, quantizing each of the set of M-bit elements from M bits to P bits, and packing the set of P-bit elements into an output vector register. P is smaller than M. In some embodiments, the neural network operation is a quantization operation including at least a multiplication with a quantization factor and an addition with a zero point.
Type: Grant
Filed: April 28, 2022
Date of Patent: December 3, 2024
Assignee: QUALCOMM Incorporated
Inventors: Srijesh Sudarsanan, Deepak Mathew, Marc Hoffman, Sundar Rajan Balasubramanian, Mansi Jain, James Lee, Gerald Sweeney
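The quantize-and-pack operation lends itself to a short sketch. The scale/zero-point formulation follows common quantization practice and is an assumption, not necessarily Qualcomm's exact arithmetic:

```python
# Each element is scaled, offset by a zero point, clamped to the P-bit
# range, and packed into successive P-bit lanes of one output word.

def quantize_pack(elements, scale, zero_point, p_bits=8):
    """Quantize each element to p_bits and pack them into one integer."""
    out, lo, hi = 0, 0, (1 << p_bits) - 1
    for i, x in enumerate(elements):
        q = max(lo, min(hi, round(x * scale) + zero_point))  # clamp to P-bit range
        out |= q << (i * p_bits)                             # pack lane i
    return out

packed = quantize_pack([0.5, 1.0], scale=100, zero_point=0)
print(hex(packed))  # 0x6432  (100 in lane 1, 50 in lane 0)
```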
-
Patent number: 12153929
Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to determine, based on a first field of a first instruction, a number of additional instructions to execute in conjunction with the first instruction and prior to execution of the first instruction. The at least one of the execution units is further configured to determine, based on a second field of the first instruction, a subset of the additional instructions to execute atomically.
Type: Grant
Filed: November 17, 2021
Date of Patent: November 26, 2024
Assignee: Texas Instruments Incorporated
Inventors: Horst Diewald, Johann Zipperer
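A hedged sketch of the two-field decode; the field widths and bit positions are invented for illustration and are not the patent's encoding:

```python
# One field gives how many following instructions execute with this one;
# a second field is a bitmask selecting which of them execute atomically.

def decode_bundle(instr_word):
    count = instr_word & 0x7                # bits [2:0]: number of additional instructions (assumed)
    atomic_mask = (instr_word >> 3) & 0x7   # bits [5:3]: atomic-subset mask (assumed)
    atomic = [i for i in range(count) if atomic_mask & (1 << i)]
    return count, atomic

print(decode_bundle(0b101_011))  # (3, [0, 2]): 3 extra instructions, #0 and #2 atomic
```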
-
Patent number: 12153921
Abstract: An apparatus includes an array processor to process array data in response to a set of macro-instructions. A macro-instruction in the set of macro-instructions performs loop operations, array iteration operations, and/or arithmetic logic unit (ALU) operations.
Type: Grant
Filed: June 28, 2021
Date of Patent: November 26, 2024
Assignee: Silicon Laboratories Inc.
Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson, Daniel Thomas Riedler
-
Patent number: 12141583
Abstract: An apparatus has processing circuitry with execution units to perform operations, physical registers to store data, and forwarding circuitry to forward the data from the physical registers to the execution units. The forwarding circuitry provides an incomplete set of connections between the physical registers and the execution units such that, for each of at least some of the physical registers, the physical register is connected to only a subset of the execution units. The apparatus also has register renaming circuitry to map logical registers identified by the operations to respective physical registers and register reorganisation circuitry to monitor upcoming operations and to determine, based on the upcoming operations and the connections provided by the forwarding circuitry, whether to perform a register reorganisation procedure to change a mapping between the logical registers and the physical registers.
Type: Grant
Filed: September 13, 2022
Date of Patent: November 12, 2024
Assignee: Arm Limited
Inventors: Xiaoyang Shen, Zichao Xie
-
Patent number: 12136470
Abstract: A processing-in-memory (PIM) system includes a host and a PIM controller. The host is configured to generate a request for a memory access operation or a multiplication/accumulation (MAC) operation of a PIM device and also to generate a mode definition signal defining an operation mode of the PIM device. The PIM controller is configured to generate a command corresponding to the request to control the memory access operation or the MAC operation of the PIM device. When the operation mode of the PIM device is inconsistent with a mode set defined by the mode definition signal, the PIM controller controls the memory access operation or the MAC operation of the PIM device after changing the operation mode of the PIM device.
Type: Grant
Filed: January 7, 2021
Date of Patent: November 5, 2024
Assignee: SK hynix Inc.
Inventor: Choung Ki Song
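The mode-consistency check can be sketched as follows, with placeholder mode names and a dictionary standing in for the PIM device:

```python
# Before forwarding a memory-access or MAC command, compare the device's
# current operation mode with the mode the host's mode definition signal
# requires; if they disagree, switch the device's mode first.

def issue_command(device, required_mode, command):
    if device["mode"] != required_mode:  # inconsistent: change mode before issuing
        device["mode"] = required_mode
    device["log"].append(command)        # then control the requested operation

pim = {"mode": "memory", "log": []}
issue_command(pim, "mac", "MAC r0, r1")
assert pim["mode"] == "mac"
```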
-
Patent number: 12118358
Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for a selected dimension of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When the selected dimension in the stream of vectors exceeds the specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.
Type: Grant
Filed: January 25, 2022
Date of Patent: October 15, 2024
Assignee: Texas Instruments Incorporated
Inventors: Son Hung Tran, Shyam Jagannathan, Timothy David Anderson
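The null-padding behaviour for a dimension wider than the specified width can be sketched as below; variable names are illustrative, and only the fetched elements stand in for memory data:

```python
# Elements beyond the specified width of the selected dimension become
# null (zero) elements when the vector is formed; a fully-null vector
# needs no fetched data at all.

def form_vector(fetched, dim_size, specified_width, null=0):
    """Keep the first `specified_width` elements, null-pad out to dim_size."""
    return fetched[:specified_width] + [null] * (dim_size - specified_width)

print(form_vector([1, 2, 3, 4], dim_size=6, specified_width=4))  # [1, 2, 3, 4, 0, 0]
print(form_vector([], dim_size=4, specified_width=0))            # [0, 0, 0, 0] - no fetch
```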
-
Patent number: 12086597
Abstract: An apparatus includes an array processor to process at least one array. The apparatus further includes a memory coupled to the array processor. The at least one array is stored in memory with programmable per-dimension size and stride values.
Type: Grant
Filed: June 28, 2021
Date of Patent: September 10, 2024
Assignee: Silicon Laboratories Inc.
Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson
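Per-dimension size/stride addressing follows standard strided-array arithmetic; the sketch below illustrates the idea, not Silicon Labs' circuit:

```python
# The byte address of an element is the base address plus the dot product
# of its per-dimension indices with the per-dimension byte strides.

def element_addr(base, indices, strides):
    """base + sum(index * stride) over all dimensions."""
    return base + sum(i * s for i, s in zip(indices, strides))

# 3x4 array of 4-byte elements, row-major: row stride 16 bytes, column stride 4
print(hex(element_addr(0x1000, (2, 3), (16, 4))))  # 0x102c (4096 + 32 + 12)
```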
-
Patent number: 12086593
Abstract: An apparatus has processing circuitry, an instruction decoder, and capability registers, each capability register storing a capability comprising a pointer and constraint metadata for constraining valid use of the pointer or capability. In response to a capability-generating address calculating instruction specifying an offset value, a reference capability register is selected as one of a program counter capability register and a further capability register. A result capability is generated for which the pointer of the result capability indicates a window address identifying a selected window within an address space, the selected window being offset from a reference window by a number of windows determined based on the offset value of the capability-generating address calculating instruction. The reference window comprises the window comprising an address indicated by the pointer of the reference capability register.
Type: Grant
Filed: January 7, 2021
Date of Patent: September 10, 2024
Assignee: Arm Limited
Inventor: Lee Douglas Smith
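A hypothetical sketch of the windowed address calculation, assuming a fixed window size; a real capability also carries constraint metadata that this sketch omits:

```python
# The result pointer identifies a window offset from the reference window
# (the window containing the reference capability's pointer) by the
# instruction's offset value, in units of whole windows.

def window_address(ref_pointer, offset, window_size=4096):
    ref_window = ref_pointer // window_size    # window containing the reference pointer
    return (ref_window + offset) * window_size # base address of the selected window

print(hex(window_address(0x1234, offset=2)))  # 0x3000
```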
-
Patent number: 12079630
Abstract: An apparatus includes an array processor to process array data. The array data are arranged in a memory. The array data are specified with programmable per-dimension size and stride values.
Type: Grant
Filed: June 28, 2021
Date of Patent: September 3, 2024
Assignee: Silicon Laboratories Inc.
Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson, Sebastian Ahmed
-
Patent number: 12067400
Abstract: Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.
Type: Grant
Filed: November 5, 2020
Date of Patent: August 20, 2024
Assignee: Arm Limited
Inventor: Thomas Christopher Grocutt
-
Patent number: 12067398
Abstract: Techniques are disclosed relating to load value prediction. In some embodiments, a processor includes learning table circuitry that is shared for both address and value prediction. Loads may be trained for value prediction when they are eligible for both value and address prediction. Entries in the learning table may be promoted to an address prediction table or a load value prediction table for prediction, e.g., when they reach a threshold confidence level in the learning table. In some embodiments, the learning table stores a hash of a predicted load value and control circuitry uses a probing load to retrieve the actual predicted load value for the value prediction table.
Type: Grant
Filed: April 29, 2022
Date of Patent: August 20, 2024
Assignee: Apple Inc.
Inventors: Yuan C. Chou, Debasish Chandra, Mridul Agarwal, Haoyan Jia
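The hash-based training in the shared learning table might look roughly like this; the confidence threshold, hash width, and entry layout are assumptions for the sketch:

```python
# The learning table keeps only a compact hash of the last load value;
# confidence grows when the hash repeats across executions, and the
# entry becomes eligible for promotion at a threshold (at which point a
# probing load would fetch the full value for the prediction table).

def train(entry, loaded_value, threshold=3):
    """Bump confidence when the value hash repeats; reset otherwise."""
    h = hash(loaded_value) & 0xFFFF        # compact hash stored in the table
    if entry["hash"] == h:
        entry["conf"] += 1
    else:
        entry["hash"], entry["conf"] = h, 0
    return entry["conf"] >= threshold      # eligible for promotion?

e = {"hash": None, "conf": 0}
results = [train(e, 42) for _ in range(4)]
print(results)  # [False, False, False, True]
```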
-
Patent number: 12050918
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
Type: Grant
Filed: July 28, 2023
Date of Patent: July 30, 2024
Assignee: Apple Inc.
Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
-
Patent number: 12014183
Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
Type: Grant
Filed: September 21, 2022
Date of Patent: June 18, 2024
Assignee: Intel Corporation
Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
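The decompose/reconstruct round-trip can be sketched as follows; choosing the minimum address as the shared base is an assumption for the sketch, not Intel's stated method:

```python
# A set of 64-bit per-lane addresses is split into one shared base plus
# small per-lane offsets for transmission; the memory access circuitry
# adds them back together to reconstruct each lane's address.

def decompose(addresses):
    base = min(addresses)                       # shared base (assumed: minimum address)
    return base, [a - base for a in addresses]  # compact per-lane offsets

def reconstruct(base, offsets):
    return [base + o for o in offsets]

addrs = [0x10000040, 0x10000000, 0x10000080]
base, offs = decompose(addrs)
assert reconstruct(base, offs) == addrs         # lossless round-trip
```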