Patents Examined by Keith E Vicary
  • Patent number: 12141583
    Abstract: An apparatus has processing circuitry with execution units to perform operations, physical registers to store data, and forwarding circuitry to forward the data from the physical registers to the execution units. The forwarding circuitry provides an incomplete set of connections between the physical registers and the execution units such that, for each of at least some of the physical registers, the physical register is connected to only a subset of the execution units. The apparatus also has register renaming circuitry to map logical registers identified by the operations to respective physical registers and register reorganisation circuitry to monitor upcoming operations and to determine, based on the upcoming operations and the connections provided by the forwarding circuitry, whether to perform a register reorganisation procedure to change a mapping between the logical registers and the physical registers.
    Type: Grant
    Filed: September 13, 2022
    Date of Patent: November 12, 2024
    Assignee: Arm Limited
    Inventors: Xiaoyang Shen, Zichao Xie
  • Patent number: 12136470
    Abstract: A processing-in-memory (PIM) system includes a host and a PIM controller. The host is configured to generate a request for a memory access operation or a multiplication/accumulation (MAC) operation of a PIM device and also to generate a mode definition signal defining an operation mode of the PIM device. The PIM controller is configured to generate a command corresponding to the request to control the memory access operation or the MAC operation of the PIM device. When the operation mode of the PIM device is inconsistent with a mode set defined by the mode definition signal, the PIM controller controls the memory access operation or the MAC operation of the PIM device after changing the operation mode of the PIM device.
    Type: Grant
    Filed: January 7, 2021
    Date of Patent: November 5, 2024
    Assignee: SK hynix Inc.
    Inventor: Choung Ki Song
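The mode check in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented design: the class name `PimControllerSketch`, the mode names `"memory"` and `"mac"`, and the command log are all hypothetical. The abstract only states that the controller changes the device's operation mode when it is inconsistent with the mode definition signal, and then controls the requested operation.

```python
from dataclasses import dataclass, field


@dataclass
class PimControllerSketch:
    """Hypothetical model: tracks the PIM device mode and logs issued commands."""
    device_mode: str = "memory"          # assumed initial mode
    issued: list = field(default_factory=list)

    def handle(self, request: str, mode_signal: str) -> None:
        # If the device is not in the mode the host defined, switch first.
        if self.device_mode != mode_signal:
            self.issued.append(f"set_mode:{mode_signal}")
            self.device_mode = mode_signal
        # Then issue the command for the memory access or MAC request.
        self.issued.append(request)


ctrl = PimControllerSketch()
ctrl.handle("read", "memory")   # mode already matches, so no switch is issued
ctrl.handle("mac", "mac")       # mismatch, so a mode change is issued first
commands = ctrl.issued
```

In this sketch the mode change is just another logged command; a real controller would presumably wait for the device to acknowledge the new mode before proceeding.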
  • Patent number: 12118358
    Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for a selected dimension of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When the selected dimension in the stream of vectors exceeds the specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.
    Type: Grant
    Filed: January 25, 2022
    Date of Patent: October 15, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Son Hung Tran, Shyam Jagannathan, Timothy David Anderson
  • Patent number: 12086597
    Abstract: An apparatus includes an array processor to process at least one array. The apparatus further includes a memory coupled to the array processor. The at least one array is stored in memory with programmable per-dimension size and stride values.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: September 10, 2024
    Assignee: Silicon Laboratories Inc.
    Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson
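As an illustration of per-dimension size and stride values, the sketch below computes where an array element lands in memory. It shows the general strided-addressing idea only. The function name `element_offset` and the example layout are assumptions and are not taken from the patent.

```python
def element_offset(index, sizes, strides):
    """Return the memory offset of `index` given per-dimension sizes and strides."""
    if not (len(index) == len(sizes) == len(strides)):
        raise ValueError("index, sizes, and strides must have the same rank")
    offset = 0
    for i, size, stride in zip(index, sizes, strides):
        if not 0 <= i < size:
            raise IndexError(f"index {i} out of range for dimension of size {size}")
        offset += i * stride  # each dimension contributes index * stride
    return offset


# Assumed example: a 3x4 array stored row-major, with each row padded to 8 elements.
sizes = (3, 4)
strides = (8, 1)
last = element_offset((2, 3), sizes, strides)
```

Because the strides are programmable independently of the sizes, the same routine covers padded rows, transposed views (swap the strides), and sub-array windows.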
  • Patent number: 12086593
    Abstract: An apparatus has processing circuitry, an instruction decoder, and capability registers, each capability register to store a capability comprising a pointer and constraint metadata for constraining valid use of the pointer/capability. In response to a capability-generating address calculating instruction specifying an offset value, a reference capability register is selected as one of a program counter capability register and a further capability register. A result capability is generated for which the pointer of the result capability indicates a window address identifying a selected window within an address space, the selected window being offset from a reference window by a number of windows determined based on the offset value of the capability-generating address calculating instruction. The reference window comprises the window comprising an address indicated by the pointer of the reference capability register.
    Type: Grant
    Filed: January 7, 2021
    Date of Patent: September 10, 2024
    Assignee: Arm Limited
    Inventor: Lee Douglas Smith
  • Patent number: 12079630
    Abstract: An apparatus includes an array processor to process array data. The array data are arranged in a memory. The array data are specified with programmable per-dimension size and stride values.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: September 3, 2024
    Assignee: Silicon Laboratories Inc.
    Inventors: Matthew Brandon Gately, Eric Jonathan Deal, Mark Willard Johnson, Sebastian Ahmed
  • Patent number: 12067398
    Abstract: Techniques are disclosed relating to load value prediction. In some embodiments, a processor includes learning table circuitry that is shared for both address and value prediction. Loads may be trained for value prediction when they are eligible for both value and address prediction. Entries in the learning table may be promoted to an address prediction table or a load value prediction table for prediction, e.g., when they reach a threshold confidence level in the training table. In some embodiments, the learning table stores a hash of a predicted load value and control circuitry uses a probing load to retrieve the actual predicted load value for the value prediction table.
    Type: Grant
    Filed: April 29, 2022
    Date of Patent: August 20, 2024
    Assignee: Apple Inc.
    Inventors: Yuan C. Chou, Debasish Chandra, Mridul Agarwal, Haoyan Jia
  • Patent number: 12067400
    Abstract: Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: August 20, 2024
    Assignee: Arm Limited
    Inventor: Thomas Christopher Grocutt
  • Patent number: 12050918
    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
    Type: Grant
    Filed: July 28, 2023
    Date of Patent: July 30, 2024
    Assignee: Apple Inc.
    Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
  • Patent number: 12014183
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Grant
    Filed: September 21, 2022
    Date of Patent: June 18, 2024
    Assignee: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
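The decomposition described in the abstract above can be illustrated with a short sketch. The abstract does not say how the base address is chosen or how wide the offsets are, so the choices here (lowest lane address as the base, 32-bit offsets) and the names `decompose` and `reconstruct` are assumptions made purely for illustration.

```python
OFFSET_BITS = 32            # assumed offset width; the abstract does not state one
MASK64 = (1 << 64) - 1


def decompose(lane_addrs):
    """Split 64-bit per-lane addresses into one base and per-lane offsets."""
    base = min(lane_addrs)  # assumption: the lowest lane address serves as the base
    offsets = [addr - base for addr in lane_addrs]
    if any(off >= (1 << OFFSET_BITS) for off in offsets):
        raise ValueError("lane addresses span more than the assumed offset width")
    return base, offsets


def reconstruct(base, offsets):
    """Combine the base and offsets back into 64-bit per-lane addresses."""
    return [(base + off) & MASK64 for off in offsets]


lanes = [0x7FFF_0000_1000, 0x7FFF_0000_1040, 0x7FFF_0000_1008, 0x7FFF_0000_2000]
base, offsets = decompose(lanes)
restored = reconstruct(base, offsets)
```

Under these assumptions, four lanes need one 64-bit base plus four 32-bit offsets rather than four full 64-bit addresses, which is the kind of saving the base-plus-offset form makes possible.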
  • Patent number: 12008369
    Abstract: Techniques are disclosed that relate to executing fused instructions. A processor may include a decoder circuit and a load/store circuit. The decoder circuit may detect a load/store instruction to load a value from a memory and detect a non-load/store instruction that depends on the value to be loaded. The decoder circuit may fuse the load/store instruction and the non-load/store instruction such that one or more operations that the non-load/store instruction is defined to perform are to be executed within the load/store circuit. The load/store circuit may receive an indication of the fused load/store and non-load/store instructions and then execute one or more operations of the load/store instruction and the one or more operations of the non-load/store instruction using a circuit included in the load/store circuit.
    Type: Grant
    Filed: February 25, 2022
    Date of Patent: June 11, 2024
    Assignee: Apple Inc.
    Inventors: John D. Pape, Skanda K. Srinivasa, Francesco Spadini, Brian T. Mokrzycki
  • Patent number: 11995442
    Abstract: A processor includes a register file having a plurality of register file addresses, a processing unit configured to perform processing in accordance with a configuration defined by information stored in the register file, and an instruction sequencer. The instruction sequencer is configured to control the processing unit by retrieving a sequence of instructions from a memory, in which each instruction includes an opcode, and a subset of the instructions includes a data portion. For each instruction in the sequence of instructions, the instruction sequencer performs an action defined by the opcode. The action for the subset of the opcodes includes writing the data portion to a register file address defined by the opcode. The sequence of instructions includes variable length instructions.
    Type: Grant
    Filed: April 7, 2022
    Date of Patent: May 28, 2024
    Assignee: NXP B.V.
    Inventors: Paul Wielage, Mathias Martinus van Ansem, Jose de Jesus Pineda de Gyvez, Hamed Fatemi
  • Patent number: 11966742
    Abstract: Systems, methods, and apparatuses relating to instructions to reset software thread runtime property histories in a hardware processor are described. In one embodiment, a hardware processor includes a hardware guide scheduler comprising a plurality of software thread runtime property histories; a decoder to decode a single instruction into a decoded single instruction, the single instruction having a field that identifies a model-specific register; and an execution circuit to execute the decoded single instruction to check that an enable bit of the model-specific register is set, and when the enable bit is set, to reset the plurality of software thread runtime property histories of the hardware guide scheduler.
    Type: Grant
    Filed: May 3, 2023
    Date of Patent: April 23, 2024
    Assignee: Intel Corporation
    Inventors: Eliezer Weissmann, Mark Charney, Michael Mishaeli, Robert Valentine, Itai Ravid, Jason W. Brandt, Gilbert Neiger, Baruch Chaikin, Efraim Rotem
  • Patent number: 11954496
    Abstract: In various examples, systems and methods for reducing write requirements in a system on chip (SoC) are described herein. For instance, a total number of iterations may be determined for processing data, such as data representing an array. In some circumstances, a set of iterations may include a first number of iterations that is less than a second number of iterations. As such, and during execution of the set of iterations, a predicate flag corresponding to an excess iteration of the set of iterations may be generated, where the excess iteration corresponds to an iteration that is part of a number of excess iterations that is associated with a difference between the first number of iterations and the second number of iterations. Based on the predicate flag, one or more first values corresponding to the iteration may be prevented from being written to memory.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: April 9, 2024
    Assignee: NVIDIA Corporation
    Inventors: Ching-Yu Hung, Ravi P Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani
  • Patent number: 11941399
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read-only data stream. Once fetched, data elements in the data stream are disposed in lanes in a stream head register in the fixed order. Some lanes may be invalid, for example when the number of remaining data elements is less than the number of lanes in the stream head register. The streaming engine automatically produces a valid data word stored in a stream valid register indicating lanes holding valid data. The data in the stream valid register may be automatically stored in a predicate register or otherwise made available. This data can be used to control vector SIMD operations or may be combined with other predicate register data.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: March 26, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Joseph Zbiciak, Son H. Tran
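A rough software analogue of the lane-valid word described in the abstract above is sketched below. The function names, the zero fill for invalid lanes, and the bit ordering (bit i corresponds to lane i) are assumptions for illustration. The abstract only states that a valid data word indicates which lanes hold valid data and that it can serve as predicate data for vector SIMD operations.

```python
def stream_vectors(elements, lanes):
    """Yield (vector, valid_mask) pairs; bit i of valid_mask marks lane i as valid."""
    for start in range(0, len(elements), lanes):
        chunk = list(elements[start:start + lanes])
        valid = len(chunk)
        # Pad the invalid lanes with zeros (the fill value is an assumption).
        vector = chunk + [0] * (lanes - valid)
        valid_mask = (1 << valid) - 1
        yield vector, valid_mask


def masked_sum(vector, valid_mask):
    """Use the mask as a predicate so that invalid lanes do not contribute."""
    return sum(v for i, v in enumerate(vector) if (valid_mask >> i) & 1)


out = list(stream_vectors([1, 2, 3, 4, 5, 6], lanes=4))
total = sum(masked_sum(vec, mask) for vec, mask in out)
```

Here the final vector carries only two valid lanes, so its mask is `0b0011` and the predicated sum ignores the padded lanes.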
  • Patent number: 11941397
    Abstract: Techniques to take advantage of the single-instruction-multiple-data (SIMD) capabilities of a processor to process data blocks can include implementing an instruction to fuse the data blocks together. The fuse input instruction can have a first input vector, a second input vector, a select input, a first output vector, and a second output vector. The fuse input instruction selects a portion of the first input vector and a portion of the second input vector based on the select input, sign extends the selected portion of the first input vector and the selected portion of the second input vector, and shuffles data elements of the sign extended portion of the first input vector with data elements of the sign extended portion of the second input vector to generate the first and second output vectors.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: March 26, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Xiaodan Tan, Paul Gilbert Meyer
  • Patent number: 11928474
    Abstract: Selectively updating branch predictors for loops executed from loop buffers is disclosed herein. In some aspects, a branch predictor update circuit of a processor is configured to detect a loop comprising a plurality of loop instructions in an instruction stream, and to determine that the loop is stored within a loop buffer circuit of the processor. The branch predictor update circuit is further configured to determine a count of potential history register updates to the history register for the plurality of loop instructions, and to determine whether the count of potential history register updates exceeds a size of the history register. The branch predictor update circuit is also configured to, responsive to determining that the count of potential history register updates does not exceed the size of the history register, update a branch predictor of the branch predictor circuit based on the plurality of loop instructions.
    Type: Grant
    Filed: June 3, 2022
    Date of Patent: March 12, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Rami Mohammad Al Sheikh, Saransh Jain, Michael Scott McIlvaine, Daren Eugene Streett
  • Patent number: 11907713
    Abstract: Systems, methods, and apparatuses relating to a sign modification field for fused operations in a configurable spatial accelerator are described.
    Type: Grant
    Filed: December 28, 2019
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Kermin E. Chofleming, Chuanjun Zhang, Daniel Towner, Simon C. Steely, Jr., Benjamin Keen
  • Patent number: 11907726
    Abstract: Systems and methods for virtually partitioning an integrated circuit may include identifying dimensional attributes of a target input dataset and selecting a data partitioning scheme from a plurality of distinct data partitioning schemes for the target input dataset based on the dimensional attributes of the target dataset and architectural attributes of an integrated circuit. The methods described herein may also include disintegrating the target dataset into a plurality of distinct subsets of data based on the selected data partitioning scheme and identifying a virtual processing core partitioning scheme from a plurality of distinct processing core partitioning schemes for an architecture of the integrated circuit based on the disintegration of the target input dataset.
    Type: Grant
    Filed: October 17, 2022
    Date of Patent: February 20, 2024
    Assignee: quadric.io, Inc.
    Inventors: Nigel Drego, Aman Sikka, Mrinalini Ravichandran, Robert Daniel Firu, Veerbhan Kheterpal
  • Patent number: 11893393
    Abstract: A microprocessor system comprises a computational array and a hardware arbiter. The computational array includes a plurality of computation units. Each of the plurality of computation units operates on a corresponding value addressed from memory. The hardware arbiter is configured to control issuing of at least one memory request for one or more of the corresponding values addressed from the memory for the computation units. The hardware arbiter is also configured to schedule a control signal to be issued based on the issuing of the memory requests.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: February 6, 2024
    Assignee: Tesla, Inc.
    Inventors: Emil Talpes, Peter Joseph Bannon, Kevin Altair Hurd