Patents Examined by Shawn Doman
  • Patent number: 11704131
    Abstract: An instruction processing device and an instruction processing method are provided.
    Type: Grant
    Filed: August 14, 2020
    Date of Patent: July 18, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Chen Chen, Tao Jiang, Dongqi Liu
  • Patent number: 11698883
    Abstract: A method of recording tile identifiers in each of a plurality of tiles of a multitile processor is described. Tiles are arranged in columns, each column having a plurality of processing circuits, each processing circuit comprising one or more tiles, wherein a base processing circuit in each column is connected to a set of processing circuit identifier wires. A base value is generated on each of the set of processing circuit identifier wires for the base processing circuit in each column. At the base processing circuit, the base value on the set of processing circuit identifier wires is read and incremented by one. The incremented value is propagated to a next processing circuit in the column, and at the next processing circuit a unique identifier is recorded by concatenating an identifier of the column and the incremented value.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: July 11, 2023
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Jonathan Mangnall
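The identifier scheme in the abstract above can be modelled in a few lines of software. This is only a rough sketch: the patent describes hardware wires, and the function name `assign_ids`, the 4-bit width of the per-column value, the choice to let the base circuit record the unincremented base value, and the repetition of the increment step up the whole column are all assumptions made for illustration.

```python
def assign_ids(column_id, num_circuits, base_value=0, row_bits=4):
    """Model increment-and-propagate identifier assignment for one column.

    Hypothetical software model: each processing circuit sees an incoming
    value, records (column_id concatenated with that value), and passes
    the value plus one to the next circuit in the column.
    """
    if base_value + num_circuits > (1 << row_bits):
        raise ValueError("row_bits too small for this column")
    ids = []
    incoming = base_value
    for _ in range(num_circuits):
        # Concatenate: column identifier in the high bits, propagated value in the low bits.
        ids.append((column_id << row_bits) | incoming)
        # Increment by one and propagate to the next circuit.
        incoming += 1
    return ids


# Column 2 with three processing circuits and a base value of 0.
print(assign_ids(2, 3))
```

Because the column identifier occupies its own bit field, two circuits in different columns can never record the same identifier even though every column starts from the same base value.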
  • Patent number: 11693660
    Abstract: A streaming engine employed in a digital signal processor specifies a fixed data stream. Once started, the data stream is read-only and cannot be written. Once fetched, the data stream is stored in a first-in-first-out buffer for presentation to functional units in the fixed order. Data use by the functional unit is controlled using the input operand fields of the corresponding instruction. A read only operand coding supplies the data to an input of the functional unit. A read/advance operand coding supplies the data and also advances the stream to the next sequential data elements. The read only operand coding permits reuse of data without requiring a register of the register file for temporary storage.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: July 4, 2023
    Assignee: Texas Instruments Incorporated
    Inventor: Joseph Zbiciak
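The difference between the two operand codings in the abstract above can be illustrated with a small software model. This is a sketch under assumptions, not the patented hardware: the class name `StreamHead` and its methods `read` and `read_advance` are hypothetical names chosen for this example.

```python
from collections import deque


class StreamHead:
    """Hypothetical model of the head of a read-only, fixed-order data stream."""

    def __init__(self, elements):
        self._fifo = deque(elements)  # first-in-first-out buffer of fetched elements

    def read(self):
        # "Read only" coding: supply the head element, leave the stream where it is.
        return self._fifo[0]

    def read_advance(self):
        # "Read/advance" coding: supply the head element, then move to the next one.
        return self._fifo.popleft()


stream = StreamHead([3, 4, 5])
square = stream.read() * stream.read()  # the same element is used twice, no temporary needed
first = stream.read_advance()           # consumes the element that was just reused
print(square, first, stream.read())
```

In this model, reusing an element is a matter of choosing `read` instead of `read_advance`, which mirrors the abstract's point that data can be reused without parking it in a register.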
  • Patent number: 11675588
    Abstract: A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. A first tile in a first node can include a processor with a processor output and a first register network configured to receive information from the processor output and information from one or more of the multiple other tiles in the first node. In response to an output instruction and a delay instruction, the register network can provide an output signal to one of the multiple other tiles in the first node. Based on the output instruction, the output signal can include one or the other of the information from the processor output and the information from one or more of the multiple other tiles in the first node. A timing characteristic of the output signal can depend on the delay instruction.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: June 13, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Douglas Vanesko, Tony M. Brewer, Gongyu Wang
  • Patent number: 11669331
    Abstract: A first processor processes an instruction configured to perform a plurality of functions. The plurality of functions includes one or more functions to operate on one or more tensors. A determination is made of a function of the plurality of functions to be performed. The first processor provides to a second processor information related to the function. The second processor is to perform the function. The first processor and the second processor share memory providing memory coherence.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: June 6, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Laith M. AlBarakat, Jonathan D. Bradbury, Timothy Slegel, Cedric Lichtenau, Simon Weishaupt, Anthony Saporito
  • Patent number: 11663016
    Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operation, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: May 30, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventor: Cheng C. Wang
  • Patent number: 11663008
    Abstract: A memory device includes a memory having a memory bank, a processor in memory (PIM) circuit, and control logic. The PIM circuit includes instruction memory storing at least one instruction provided from a host. The PIM circuit is configured to process an operation using data provided by the host or data read from the memory bank and to store at least one instruction provided by the host. The control logic is configured to decode a command/address received from the host to generate a decoding result and to perform a control operation so that one of i) a memory operation on the memory bank is performed and ii) the PIM circuit performs a processing operation, based on the decoding result. A counting value of a program counter instructing a position of the instruction memory is controlled in response to the command/address instructing the processing operation be performed.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: May 30, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sukhan Lee, Shinhaeng Kang, Namsung Kim, Seongil O, Hak-Soo Yu
  • Patent number: 11663005
    Abstract: Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
    Type: Grant
    Filed: January 15, 2021
    Date of Patent: May 30, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Sanjay Tiwari
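For readers unfamiliar with the term, a population count is the number of set bits in a value. The sketch below shows only the arithmetic result for a vector of fixed-length elements; it does not model the in-memory sensing circuitry the abstract refers to. The function name `vector_popcount`, the default 8-bit element width, and the choice to return one count per element are assumptions.

```python
def vector_popcount(elements, width=8):
    """Return the number of set bits in each fixed-width element (illustrative only)."""
    mask = (1 << width) - 1  # keep only the low `width` bits of each element
    return [bin(e & mask).count("1") for e in elements]


counts = vector_popcount([0b1011, 0xFF, 0])
print(counts, sum(counts))
```

Summing the per-element counts gives a single count for the whole vector, if that is the quantity of interest.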
  • Patent number: 11635956
    Abstract: A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit configured to convert one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. The circuit converts decimal character sequence floating-point representations up to 28 decimal digits in length to IEEE 754 binary64, binary32, or binary16 floating-point format representations.
    Type: Grant
    Filed: December 18, 2021
    Date of Patent: April 25, 2023
    Inventor: Jerry D. Harthcock
  • Patent number: 11635957
    Abstract: A universal floating-point Instruction Set Architecture (ISA) compute engine implemented entirely in hardware. The ISA compute engine computes directly with human-readable decimal character sequence floating-point representation operands without first having to explicitly perform a conversion-to-binary-format process in software. A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit converts one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. Following computations by at least one hardware floating-point operator, a convertToDecimalCharacterFromBinary hardware conversion circuit converts the result back to a human-readable decimal character sequence floating-point representation.
    Type: Grant
    Filed: February 3, 2022
    Date of Patent: April 25, 2023
    Inventor: Jerry D. Harthcock
  • Patent number: 11635962
    Abstract: A memory device includes a memory having a memory bank, a processor in memory (PIM) circuit, and control logic. The PIM circuit includes instruction memory storing at least one instruction provided from a host. The PIM circuit is configured to process an operation using data provided by the host or data read from the memory bank and to store at least one instruction provided by the host. The control logic is configured to decode a command/address received from the host to generate a decoding result and to perform a control operation so that one of i) a memory operation on the memory bank is performed and ii) the PIM circuit performs a processing operation, based on the decoding result. A counting value of a program counter instructing a position of the instruction memory is controlled in response to the command/address instructing the processing operation be performed.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: April 25, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sukhan Lee, Shinhaeng Kang, Namsung Kim, Seongil O, Hak-Soo Yu
  • Patent number: 11625592
    Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency counts are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
    Type: Grant
    Filed: July 5, 2021
    Date of Patent: April 11, 2023
    Assignee: Femtosense, Inc.
    Inventors: Sam Brian Fok, Alexander Smith Neckar
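The count-based scheduling idea in the abstract above can be sketched as follows. This is a single-process simulation under assumptions, not the patented multicore design: the function name `run_order` and the two dictionaries (`dep_counts` for the compile-time counts, `dependents` for which threads wait on which) are hypothetical.

```python
def run_order(dep_counts, dependents):
    """Simulate dependency-count scheduling and return one valid execution order.

    dep_counts: thread name -> number of unfulfilled dependencies (fixed at compile time)
    dependents: thread name -> threads whose count drops when this thread finishes
    """
    counts = dict(dep_counts)  # copy so the caller's counts are left untouched
    ready = [t for t, c in counts.items() if c == 0]
    order = []
    while ready:
        thread = ready.pop(0)
        order.append(thread)  # "execute" the thread
        for waiting in dependents.get(thread, []):
            counts[waiting] -= 1  # one more dependency fulfilled
            if counts[waiting] == 0:
                ready.append(waiting)  # runnable with no central scheduler involved
    return order


print(run_order({"a": 0, "b": 1, "c": 2}, {"a": ["b", "c"], "b": ["c"]}))
```

No step in the loop consults a global ordering: a thread becomes runnable purely because its own count reached zero, which is the property that lets the decision be made locally on each core.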
  • Patent number: 11614941
    Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured for determining that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by simultaneously processing the overhead operations using the second processor separately from processing the compute operations based on the configuration to operate the computational network.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: March 28, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Amrit Panda, Francisco Perez, Karamvir Chatha
  • Patent number: 11609862
    Abstract: A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: March 21, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Mujibur Rahman
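The effect of the vector sort instruction described above can be sketched in software. The sketch assumes that the two portions are the lower and upper halves of the vector and that each order is either ascending or descending; the abstract does not state either detail, and the function name `vector_sort` and its parameters are hypothetical.

```python
def vector_sort(lanes, first_descending, second_descending):
    """Sort two portions of a vector independently, each in its own order.

    Assumption: the portions are the first and second halves of the lanes.
    """
    half = len(lanes) // 2
    first = sorted(lanes[:half], reverse=first_descending)
    second = sorted(lanes[half:], reverse=second_descending)
    return first + second  # the sorted vector that would be written to storage


# First half ascending, second half descending.
print(vector_sort([4, 1, 3, 2, 8, 5, 7, 6], False, True))
```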
  • Patent number: 11599770
    Abstract: A state machine engine having a program buffer. The program buffer is configured to receive configuration data via a bus interface for configuring a state machine lattice. The state machine engine also includes a repair map buffer configured to provide repair map data to an external device via the bus interface. The state machine lattice includes multiple programmable elements. Each programmable element includes multiple memory cells configured to analyze data and to output a result of the analysis.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: March 7, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Harold B Noyes, David R. Brown
  • Patent number: 11599359
    Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: March 7, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arun A. Nair, Ashok T. Venkatachar, Emil Talpes, Srikanth Arekapudi, Rajesh Kumar Arunachalam
  • Patent number: 11579889
    Abstract: A processing system 2 includes a processing pipeline 12, 14, 16, 18, 28 which includes fetch circuitry 12 for fetching instructions to be executed from a memory 6, 8. Buffer control circuitry 34 is responsive to a programmable trigger, such as explicit hint instructions delimiting an instruction burst, or predetermined configuration data specifying parameters of a burst together with a synchronising instruction, to trigger the buffer control circuitry to stall a stallable portion of the processing pipeline (e.g. issue circuitry 16), to accumulate within one or more buffers 30, 32 fetched instructions starting from a predetermined starting instruction, and, when those instructions have been accumulated, to restart the stallable portion of the pipeline.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: February 14, 2023
    Assignee: ARM LIMITED
    Inventors: Jatin Bhartia, Kauser Yakub Johar, Antony John Penton
  • Patent number: 11579884
    Abstract: Techniques for performing instruction fetch operations are provided. The techniques include determining instruction addresses for a primary branch prediction path; requesting that a level 0 translation lookaside buffer (“TLB”) cache address translations for the primary branch prediction path; determining either or both of alternate control flow path instruction addresses and lookahead control flow path instruction addresses; and requesting that either the level 0 TLB or an alternative level TLB cache address translations for either or both of the alternate control flow path instruction addresses and the lookahead control flow path instruction addresses.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: February 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashok Tirupathy Venkatachar, Steven R. Havlir, Robert B. Cohen
  • Patent number: 11561925
    Abstract: A method of processing partitions of a tensor in a target order includes receiving, by a reorder unit and from two or more producer units, a plurality of partitions of a tensor in a first order that is different from the target order, storing the plurality of partitions in the reorder unit, and providing, from the reorder unit, the plurality of partitions in the target order to one or more consumer units. In an example, the one or more consumer units process the plurality of partitions in the target order.
    Type: Grant
    Filed: September 16, 2021
    Date of Patent: January 24, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Nathan Francis Sheeley, Matheen Musaddiq, Scott Layson Burson, Sitanshu Gupta, Sumti Jairath, Pramod Nataraja, Ajit Punj
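The reordering behaviour described in the abstract above can be sketched as a small buffer that holds partitions until they are next in the target order. The function name `reorder`, the `(partition_id, data)` tuple format, and the choice to release partitions as soon as they become eligible (rather than after all have arrived) are assumptions for illustration; the final output order is the same either way.

```python
def reorder(arrivals, target_order):
    """Emit partitions in target_order regardless of the order they arrive in.

    arrivals: iterable of (partition_id, data) pairs as produced upstream
    target_order: list of partition ids in the order consumers expect
    """
    buffer = {}      # partitions stored while waiting for their turn
    output = []
    next_index = 0   # position in target_order of the next partition to release
    for partition_id, data in arrivals:
        buffer[partition_id] = data
        # Release every partition that is now next in line.
        while next_index < len(target_order) and target_order[next_index] in buffer:
            output.append(buffer.pop(target_order[next_index]))
            next_index += 1
    return output


print(reorder([(2, "c"), (0, "a"), (1, "b")], [0, 1, 2]))
```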
  • Patent number: 11561926
    Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: January 24, 2023
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Simon Christian Knowles