Processing Control For Data Transfer Patents (Class 712/225)
  • Patent number: 11914998
    Abstract: A processor circuit includes an instruction decode unit, an instruction detector, an address generator and a data buffer. The instruction decode unit is configured to decode a first load instruction included in a plurality of load instructions to generate a first decoding result. The instruction detector, coupled to the instruction decode unit, is configured to detect if the load instructions use a same register. The address generator, coupled to the instruction decode unit, is configured to generate a first address requested by the first load instruction according to the first decoding result. The data buffer is coupled to the instruction detector and the address generator. When the instruction detector detects that the load instructions use the same register, the data buffer is configured to store the first address generated from the address generator, and store data requested by the first load instruction according to the first address.
    Type: Grant
    Filed: April 27, 2022
    Date of Patent: February 27, 2024
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Chia-I Chen
  • Patent number: 11900156
    Abstract: A processor includes a compute fabric and a controller. The compute fabric includes an array of compute nodes and interconnects that configurably connect the compute nodes. The controller is configured to configure at least some of the compute nodes and interconnects in the compute fabric to execute specified code instructions, and to send to the compute fabric multiple threads that each executes the specified code instructions. A compute node among the compute nodes is configured to execute a code instruction for a first thread, and to transfer a result of the code instruction within the fabric, for use as an operand by a second thread, different from the first thread.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: February 13, 2024
    Assignee: SPEEDATA LTD.
    Inventors: Yoav Etsion, Dani Voitsechov
  • Patent number: 11902136
    Abstract: An example network device includes memory, a communication unit, and processing circuitry coupled to the memory and the communication unit. The processing circuitry is configured to receive first samples of flows from an interface of another network device sampled at a first sampling rate and determine a first parameter based on the first samples. The processing circuitry is configured to receive second samples of flows from the interface sampled at a second sampling rate, wherein the second sampling rate is different than the first sampling rate and determine a second parameter based on the second samples. The processing circuitry is configured to determine a third sampling rate based on the first parameter and the second parameter, control the communication unit to transmit a signal indicative of the third sampling rate to the another network device; and receive third samples of flows from the interface sampled at the third sampling rate.
    Type: Grant
    Filed: May 19, 2022
    Date of Patent: February 13, 2024
    Assignee: Juniper Networks, Inc.
    Inventors: Prasad Miriyala, Suresh Palguna Krishnan, SelvaKumar Sivaraj
  • Patent number: 11900113
    Abstract: The present disclosure relates to data flow processing methods and devices. One example method includes obtaining a dependency relationship and an execution sequence of operating a data flow by a plurality of processing units, generating synchronization logic based on the dependency relationship and the execution sequence, and inserting the synchronization logic into an operation pipeline of each of the plurality of processing units to generate executable code.
    Type: Grant
    Filed: April 12, 2021
    Date of Patent: February 13, 2024
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Lijuan Hai, Chen Cheng, Christopher Rodrigues, Peng Wu
  • Patent number: 11868287
    Abstract: The present disclosure describes a just-in-time (JIT) scheduling system and method for memory sub-systems. In one embodiment, a system receives a request to perform a memory operation using a hardware resource associated with a memory device. The system identifies a traffic class corresponding to the memory operation. The system determines a number of available quality of service (QoS) credits for the traffic class during a current scheduling time frame. The system determines a number of QoS credits associated with a type of the memory operation. Responsive to determining the number of QoS credits associated with the type of the memory operation is less than the number of available QoS credits, the system submits the memory operation to be processed at a memory device.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: January 9, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Johnny A Lam, Alex J. Wesenberg, Guanying Wu, Sanjay Subbarao, Chandra Guda
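    Sketch (illustrative only, not taken from the patent): the credit check described in the abstract can be modelled in a few lines of Python. The per-operation costs, the class name `CreditScheduler`, and the choice to deduct credits on submission are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical per-operation credit costs; the abstract gives no values.
OP_COST = {"read": 1, "write": 2, "erase": 4}


@dataclass
class CreditScheduler:
    # QoS credits available per traffic class in the current scheduling time frame.
    available: dict = field(default_factory=dict)

    def try_submit(self, traffic_class: str, op_type: str) -> bool:
        cost = OP_COST[op_type]
        budget = self.available.get(traffic_class, 0)
        # The abstract submits when the cost is less than the available credits.
        if cost < budget:
            # Assumption: submitting an operation consumes its credits.
            self.available[traffic_class] = budget - cost
            return True
        return False


sched = CreditScheduler(available={"host": 3})
first = sched.try_submit("host", "write")   # cost 2 < 3, so it is submitted
second = sched.try_submit("host", "read")   # cost 1 is not < 1, so it is held back
```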
  • Patent number: 11861367
    Abstract: A method and apparatus for controlling pre-fetching in a processor. A processor includes an execution pipeline and an instruction pre-fetch unit. The execution pipeline is configured to execute instructions. The instruction pre-fetch unit is coupled to the execution pipeline. The instruction pre-fetch unit includes instruction storage to store pre-fetched instructions, and pre-fetch control logic. The pre-fetch control logic is configured to fetch instructions from memory and store the fetched instructions in the instruction storage. The pre-fetch control logic is also configured to provide instructions stored in the instruction storage to the execution pipeline for execution. The pre-fetch control logic is further configured to set a maximum number of instruction words to be pre-fetched for execution subsequent to execution of an instruction currently being executed in the execution pipeline.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: January 2, 2024
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Christian Wiencke, Johann Zipperer
  • Patent number: 11818050
    Abstract: A traffic shaping circuit regulates packets transferred by a transmission resource into a network (e.g., a network on a chip) on behalf of a client. The packet transfers are selectively enabled or disabled based on a current budget value. The budget value is modified based on a packet-transfer cost in response to transferring a packet into the network. The rate of packet transfers into the network is monitored. A cost-adjustment signal is generated based on the rate of packet transfers. The packet-transfer cost is modified in response to the cost-adjustment signal for accounting for a subsequent-packet transfer into the network. The cost-adjustment signal may indicate an increase or decrease of the packet-transfer cost and/or a budget limit, both of which are read from a cost table comprising records ordered based on respective packet-transfer cost values. The packet-transfer cost and/or a budget limit are configurable.
    Type: Grant
    Filed: May 28, 2021
    Date of Patent: November 14, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Thomas Frederick Detwiler, Thomas Abner Basnight, Suraj Balasubramanian
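    Sketch (illustrative only, not taken from the patent): a software model of budget-based traffic shaping with an adjustable packet-transfer cost. The class name `TrafficShaper`, the `replenish` method, the enable threshold, and all numeric values are assumptions; the abstract does not specify them.

```python
class TrafficShaper:
    """Budget-based shaper; numbers and adjustment policy are assumptions."""

    def __init__(self, cost_table, budget_limit, index=0):
        self.cost_table = sorted(cost_table)  # records ordered by transfer cost
        self.index = index                    # currently selected cost record
        self.budget_limit = budget_limit
        self.budget = budget_limit

    @property
    def cost(self):
        return self.cost_table[self.index]

    def can_send(self):
        # Assumed rule: transfers are enabled while the budget covers the cost.
        return self.budget >= self.cost

    def send(self):
        if not self.can_send():
            return False
        self.budget -= self.cost  # charge the budget for this transfer
        return True

    def replenish(self, amount):
        # Hypothetical refill step, capped at the budget limit.
        self.budget = min(self.budget_limit, self.budget + amount)

    def adjust(self, increase):
        # Cost-adjustment signal: step to the next higher or lower cost record.
        step = 1 if increase else -1
        self.index = max(0, min(len(self.cost_table) - 1, self.index + step))


shaper = TrafficShaper(cost_table=[1, 2, 4], budget_limit=4)
sent = [shaper.send() for _ in range(5)]        # cost 1 against a budget of 4
shaper.replenish(4)
shaper.adjust(increase=True)                    # cost steps up from 1 to 2
sent_after = [shaper.send() for _ in range(3)]
```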
  • Patent number: 11816486
    Abstract: A hardware multithreaded processor including a register file, a thread controller, and aliasing circuitry. The thread controller is configured to assign each of multiple hardware processing threads to a corresponding one of multiple register block sets in which each register block set includes at least two of multiple register blocks and in which each register block includes at least two registers. The aliasing circuitry is programmable to redirect a reference provided by a first hardware processing thread to a register of a register block assigned to a second hardware processing thread. The reference may be a register number in an instruction issued by the first hardware processing thread. The register number is converted by the aliasing circuitry to a register file address locating a register of the register block assigned to the second hardware processing thread. The aliasing circuitry may include a programmable register for one or more threads.
    Type: Grant
    Filed: January 18, 2022
    Date of Patent: November 14, 2023
    Assignee: NXP B.V.
    Inventor: Michael Andrew Fischer
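    Sketch (illustrative only, not taken from the patent): one way to picture the conversion of a register number into a register file address, with an optional alias that redirects a block to another thread. The block sizes, the flat address layout, and the `alias` dictionary format are assumptions for this example.

```python
REGS_PER_BLOCK = 4      # assumption: the abstract only requires at least two
BLOCKS_PER_THREAD = 2   # assumption: the abstract only requires at least two


def regfile_address(thread, reg_num, alias=None):
    """Map (thread, register number) to a flat register file address.

    alias maps (thread, local block) -> (target thread, target block) and
    stands in for the programmable aliasing register (hypothetical format).
    """
    block, offset = divmod(reg_num, REGS_PER_BLOCK)
    owner, owner_block = (alias or {}).get((thread, block), (thread, block))
    return (owner * BLOCKS_PER_THREAD + owner_block) * REGS_PER_BLOCK + offset


plain = regfile_address(0, 5)                        # thread 0, block 1, offset 1
other = regfile_address(1, 5)                        # thread 1, block 1, offset 1
redirected = regfile_address(0, 5, {(0, 1): (1, 0)})  # thread 0 reads thread 1's block 0
```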
  • Patent number: 11809369
    Abstract: Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: November 7, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 11809368
    Abstract: Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.
    Type: Grant
    Filed: July 31, 2021
    Date of Patent: November 7, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 11809872
    Abstract: Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.
    Type: Grant
    Filed: July 25, 2021
    Date of Patent: November 7, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Tony M. Brewer
  • Patent number: 11803638
    Abstract: In order to mitigate side channel attacks that exploit speculative store-to-load forwarding, a store dependence predictor is used to prevent store-to-load forwarding if the load and store instructions do not have a matching translation context (TC). In one design, a store queue (SQ) stores the TC—a function of the privilege mode (PM), address space identifier (ASID), and/or virtual machine identifier (VMID)—of each store and conditions store-to-load forwarding on matching store and load TCs. In another design, a memory dependence predictor (MDP) disambiguates predictions of store-to-load forwarding based on the load instruction's TC. In each design, the MDP or SQ does not predict or allow store-to-load forwarding for loads whose addresses, but not their TCs, match an MDP entry.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: October 31, 2023
    Assignee: Ventana Micro Systems Inc.
    Inventor: John G. Favor
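    Sketch (illustrative only, not taken from the patent): the store queue design, in which forwarding is conditioned on matching translation contexts, can be modelled as below. The type names, the youngest-first search, and the decision to block forwarding on the first address match with a different TC are assumptions.

```python
from typing import NamedTuple, Optional


class TranslationContext(NamedTuple):
    pm: str     # privilege mode
    asid: int   # address space identifier
    vmid: int   # virtual machine identifier


class StoreEntry(NamedTuple):
    address: int
    data: int
    tc: TranslationContext


def forward(store_queue, load_address, load_tc) -> Optional[int]:
    """Return forwarded store data, or None when forwarding is not allowed."""
    for entry in reversed(store_queue):  # assumed order: youngest store last
        if entry.address == load_address:
            # An address match alone is not enough; the TCs must match too.
            return entry.data if entry.tc == load_tc else None
    return None


user = TranslationContext(pm="U", asid=1, vmid=0)
attacker = TranslationContext(pm="U", asid=2, vmid=0)
queue = [StoreEntry(address=0x40, data=99, tc=user)]

same_context = forward(queue, 0x40, user)
cross_context = forward(queue, 0x40, attacker)
```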
  • Patent number: 11789657
    Abstract: An intercept engine is installed on a computer and includes an intercept filter adapted to intercept selected commands transmitted between a file system and a storage device. The intercept engine also includes an intercept manager adapted to transmit to the intercept filter one or more primitives, wherein each primitive includes device information specifying a device, wherein a command directed to the specified device is to be intercepted, command type information specifying a type of command to be intercepted, and follow-up action information specifying an action to be performed after the command has been intercepted. A primitive may also include default action information specifying an action to be performed with respect to the command if a communication between the intercept filter and the intercept manager is interrupted. The intercept engine intercepts commands transmitted between the file system and the storage device in accordance with the one or more primitives.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: October 17, 2023
    Assignee: CIRRUS DATA SOLUTIONS INC.
    Inventors: Wai T. Lam, Sammy Tam, Li-Hsiang Cheng, Tomasz Jaworski
  • Patent number: 11775440
    Abstract: Indirect prefetch circuitry initiates a producer prefetch requesting return of producer data having a producer address and at least one consumer prefetch to request prefetching of consumer data having a consumer address derived from the producer data. A producer prefetch filter table stores producer filter entries indicative of previous producer addresses of previous producer prefetches. Initiation of a requested producer prefetch for producer data having a requested producer address is suppressed when a lookup of the producer prefetch filter table determines that the requested producer address hits against a producer filter entry of the table. The lookup of the producer prefetch filter table for the requested producer address depends on a subset of bits of the requested producer address including at least one bit which distinguishes different chunks of data within a same cache line.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: October 3, 2023
    Assignee: Arm Limited
    Inventors: Alexander Cole Shulyak, Balaji Vijayan, Karthik Sundaram, Yasuo Ishii, Joseph Michael Pusdesris
  • Patent number: 11775446
    Abstract: Methods, apparatus, systems and articles of manufacture to facilitate atomic compare and swap in cache for a coherent level 1 data cache system are disclosed. An example system includes a cache storage; a cache controller coupled to the cache storage wherein the cache controller is operable to: receive a memory operation that specifies a key, a memory address, and a first set of data; retrieve a second set of data corresponding to the memory address; compare the second set of data to the key; based on the second set of data corresponding to the key, cause the first set of data to be stored at the memory address; and based on the second set of data not corresponding to the key, complete the memory operation without causing the first set of data to be stored at the memory address.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: October 3, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Timothy David Anderson, Pete Michael Hippleheuser
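    Sketch (illustrative only, not taken from the patent): the compare-and-swap semantics in the abstract, written as a single-threaded Python function. The hardware performs these steps atomically in the cache controller; this model only shows the decision logic, and the function and variable names are assumptions.

```python
def compare_and_swap(memory: dict, address: int, key: int, new_data: int) -> bool:
    """Store new_data at address only if the current contents equal key."""
    current = memory.get(address)    # retrieve the second set of data
    if current == key:               # compare it to the key
        memory[address] = new_data   # match: store the first set of data
        return True
    return False                     # mismatch: complete without storing


mem = {0x10: 7}
swapped = compare_and_swap(mem, 0x10, key=7, new_data=9)    # contents match the key
stale = compare_and_swap(mem, 0x10, key=7, new_data=11)     # contents are now 9
```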
  • Patent number: 11755327
    Abstract: Delivering immediate values by using program counter (PC)-relative load instructions to fetch literal data in processor-based devices is disclosed. In this regard, a processing element (PE) of a processor-based device provides an execution pipeline circuit that comprises an instruction processing portion and a data access portion. Using a literal data access logic circuit, the PE detects a PC-relative load instruction within a fetch window that includes multiple fetched instructions. The PE determines that the PC-relative load instruction can be serviced using literal data that is available to the instruction processing portion of the execution pipeline circuit (e.g., located within the fetch window containing the PC-relative load instruction, or stored in a literal pool buffer). The PE then retrieves the literal data within the instruction processing portion of the execution pipeline circuit, and executes the PC-relative load instruction using the literal data.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: September 12, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Melinda Joyce Brown, Michael Scott Mcilvaine
  • Patent number: 11748101
    Abstract: In response to a single-copy-atomic load/store instruction for requesting an atomic transfer of a target block of data between the memory system and the registers, where the target block has a given size greater than a maximum data size supported for a single load/store micro-operation by a load/store data path, instruction decoding circuitry maps the single-copy-atomic load/store instruction to two or more mapped load/store micro-operations each for requesting transfer of a respective portion of the target block of data. In response to the mapped load/store micro-operations, load/store circuitry triggers issuing of a shared memory access request to the memory system to request the atomic transfer of the target block of data of said given size to or from the memory system, and triggers separate transfers of respective portions of the target block of data over the load/store data path.
    Type: Grant
    Filed: July 13, 2021
    Date of Patent: September 5, 2023
    Assignee: Arm Limited
    Inventors: Abhishek Raja, Albin Pierrick Tonnerre
  • Patent number: 11741020
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed to facilitate fully pipelined read-modify-write support in level 1 data cache using store queue and data forwarding. An example apparatus includes a first storage, a second storage, a store queue coupled to the first storage and the second storage, the store queue operable to receive a first memory operation specifying a first set of data, process the first memory operation for storing the first set of data in at least one of the first storage and the second storage, receive a second memory operation, and prior to storing the first set of data in the at least one of the first storage and the second storage, feedback the first set of data for use in the second memory operation.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: August 29, 2023
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Timothy David Anderson, Pete Michael Hippleheuser
  • Patent number: 11741010
    Abstract: A method of programming data to a storage device including a nonvolatile memory device includes receiving first to third barrier commands from a host, receiving first to third data corresponding to the first to third barrier commands from the host, merging the first and second barrier commands and programming the first and second data to the nonvolatile memory device sequentially based on an order of the first and second barrier commands, verifying program completion of both the first and second data, mapping in mapping information of the first and second data when the programming of the first and second data is completed, and mapping out the information of both the first and second data when the programming of at least one of the first and second data is not complete, and programming the third data to the nonvolatile memory device after the mapping in or the mapping out.
    Type: Grant
    Filed: January 2, 2019
    Date of Patent: August 29, 2023
    Inventor: JooYoung Hwang
  • Patent number: 11726791
    Abstract: Examples of the present disclosure provide apparatuses and methods related to generating and executing a control flow. An example apparatus can include a first device configured to generate control flow instructions, and a second device including an array of memory cells, an execution unit to execute the control flow instructions, and a controller configured to control an execution of the control flow instructions on data stored in the array.
    Type: Grant
    Filed: May 12, 2022
    Date of Patent: August 15, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Kyle B. Wheeler, Richard C. Murphy, Troy A. Manning, Dean A. Klein
  • Patent number: 11714644
    Abstract: A predicated vector load micro-operation specifies a load target address, a destination vector register for which active vector elements of the destination vector register are to be loaded with data associated with addresses identified based on the load target address, and a predicate operand indicative of whether each vector element of the destination vector register is active or inactive.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: August 1, 2023
    Assignee: Arm Limited
    Inventor: Abhishek Raja
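    Sketch (illustrative only, not taken from the patent): a predicated vector load in which only active elements touch memory. Contiguous element addressing, the 4-byte element size, and leaving inactive elements at their previous value are assumptions; the abstract does not state them.

```python
def predicated_vector_load(memory, base, predicate, dest, elem_bytes=4):
    """Load active elements from memory; leave inactive elements unchanged."""
    result = list(dest)
    for i, active in enumerate(predicate):
        if active:
            # Only active elements generate a memory access.
            result[i] = memory[base + i * elem_bytes]
    return result


# Addresses 0x104 and 0x10C are deliberately absent: inactive lanes never read them.
memory = {0x100: 11, 0x108: 33}
loaded = predicated_vector_load(memory, 0x100, [True, False, True, False], [0, 0, 0, 0])
```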
  • Patent number: 11704041
    Abstract: An integrated circuit for allowing a band of an external memory to be effectively used in processing a layer algorithm is disclosed. One aspect of the present disclosure relates to an integrated circuit including a first arithmetic part including a first arithmetic unit and a first memory, wherein the first arithmetic unit performs an operation and the first memory stores data for use in the first arithmetic unit and a first data transfer control unit that controls transfer of data between the first memory and a second memory of a second arithmetic part including a second arithmetic unit, wherein the second arithmetic part communicates with an external memory via the first arithmetic part.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: July 18, 2023
    Assignee: Preferred Networks, Inc.
    Inventors: Tatsuya Kato, Ken Namura
  • Patent number: 11693665
    Abstract: A data processing apparatus and method of operating such is disclosed. Issue circuitry buffers operations prior to execution until operands are available in a set of registers. A first and a second load operation are identified in the issue circuitry, when both are dependent on a common operand, and when the common operand is available in the set of registers. Load circuitry has a first address generation unit to generate a first address for the first load operation and a second address generation unit to generate a second address for the second load operation. An address comparison unit compares the first address and the second address. The load circuitry is arranged to cause a merged lookup to be performed in local temporary storage, when the address comparison unit determines that the first and the second address differ by less than a predetermined address range characteristic of the local temporary storage.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: July 4, 2023
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Michiel Willem Van Tol
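    Sketch (illustrative only, not taken from the patent): the address comparison that decides between a merged lookup and two separate lookups. The 64-byte range and the tuple-based return format are assumptions; the abstract only says the range is characteristic of the local temporary storage.

```python
ADDRESS_RANGE = 64  # assumption: e.g. a cache line size in bytes


def plan_lookups(first_address, second_address, address_range=ADDRESS_RANGE):
    """Return the lookups to perform for two loads that share a common operand."""
    if abs(first_address - second_address) < address_range:
        # Close enough: one merged lookup serves both loads.
        return [("merged", min(first_address, second_address))]
    return [("single", first_address), ("single", second_address)]


near = plan_lookups(0x1000, 0x1008)
far = plan_lookups(0x1000, 0x2000)
```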
  • Patent number: 11663014
    Abstract: A data processing apparatus is provided that comprises fetch circuitry to fetch an instruction stream comprising a plurality of instructions, including a status updating instruction, from storage circuitry. Status storage circuitry stores a status value. Execution circuitry executes the instructions, wherein at least some of the instructions are executed in an order other than in the instruction stream. For the status updating instruction, the execution circuitry is adapted to update the status value based on execution of the status updating instruction. Flush circuitry flushes, when the status storage circuitry is updated, following instructions that appear after the status updating instruction in the instruction stream.
    Type: Grant
    Filed: August 26, 2019
    Date of Patent: May 30, 2023
    Assignee: ARM LIMITED
    Inventors: Abhishek Raja, Rakesh Shaji Lal, Michael Filippo, Glen Andrew Harris, Vasu Kudaravalli, Huzefa Moiz Sanjeliwala, Jason Setter
  • Patent number: 11636544
    Abstract: Orders received by an electronic trading system are processed in batches based on the instrument to which an order relates. An incoming order is assigned to a queue of a queue set that makes up the batch according to a random process. Where orders are received from related trading parties, they are assigned to the same queue set according to their time of receipt. The batch has a random duration within defined minimum and maximum durations and at the end of the batch, the orders held in the queues are transferred to a matching thread of the trading system sequentially with one order being removed from each queue and a number of passes of the queues completed until orders have been removed.
    Type: Grant
    Filed: February 28, 2022
    Date of Patent: April 25, 2023
    Assignee: NEX Services North America LLC
    Inventors: Michael Merold, John E. Schoen
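    Sketch (illustrative only, not taken from the patent): random queue assignment followed by a pass-by-pass drain of the queues at the end of a batch. The function names, the grouping of related parties by a shared key, and the reading that related parties share one queue are assumptions; the random batch duration is not modelled.

```python
import random
from collections import deque


def assign(orders, n_queues, rng):
    """orders: (party_group, order_id) pairs in time-of-receipt order."""
    queues = [deque() for _ in range(n_queues)]
    group_queue = {}
    for group, order_id in orders:
        if group not in group_queue:
            # The first order of a group picks a queue at random.
            group_queue[group] = rng.randrange(n_queues)
        # Later orders from related parties join the same queue, in arrival order.
        queues[group_queue[group]].append(order_id)
    return queues


def drain_round_robin(queues):
    """Remove one order from each queue per pass until every queue is empty."""
    released = []
    while any(queues):
        for q in queues:
            if q:
                released.append(q.popleft())
    return released


queues = assign([("A", 1), ("B", 2), ("A", 3)], n_queues=3, rng=random.Random(0))
order = drain_round_robin([deque([1, 2, 3]), deque([4]), deque([5, 6])])
```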
  • Patent number: 11630607
    Abstract: Memory devices and a memory controller that controls such memory devices. Multiple memory devices receive commands and addresses on a command/address (C/A) bus that is relayed point-to-point by each memory device. Data is received and sent from these devices to/from a memory controller in a point-to-point configuration by adjusting the width of each individual data bus coupled between the individual memory devices and the memory controller. Along with the C/A bus are clock signals that are regenerated by each memory device and relayed. The memory controller and memory devices may be packaged on a single substrate using package-on-package technology. Using package-on-package technology allows the relayed C/A signals to connect from memory device to memory device using wire bonding. Wirebond connections provide a short, high-performance signaling environment for the chip-to-chip relaying of the C/A signals and clocks from one memory device to the next in the daisy-chain.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: April 18, 2023
    Assignee: Rambus Inc.
    Inventor: Frederick Ware
  • Patent number: 11573726
    Abstract: A device may include a plurality of data processing engines. Each of the data processing engines may include a memory pool having a plurality of memory banks, a plurality of cores each coupled to the memory pool and configured to access the plurality of memory banks, a memory mapped switch coupled to the memory pool and a memory mapped switch of at least one neighboring data processing engine, and a stream switch coupled to each of the plurality of cores and to a stream switch of the at least one neighboring data processing engine.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: February 7, 2023
    Assignee: Xilinx, Inc.
    Inventors: Juan J. Noguera Serra, Goran H K Bilski, Jan Langer, Baris Ozgul, Richard L. Walke, Ralph D. Wittig, Kornelis A. Vissers, Christopher H. Dick, Philip B. James-Roxby
  • Patent number: 11567777
    Abstract: A storage system and method for implementing an encoder, decoder, and/or buffer using a field programmable gate array are provided. In one embodiment, a storage system is provided with a field programmable gate array and a memory that stores sets of instruction code for the field programmable gate array. The sets of instruction code can be for different error decoder implementations, for providing an additional encoder and/or decoder, or for implementing a host memory buffer or a controller memory buffer.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: January 31, 2023
    Assignee: Western Digital Technologies, Inc.
    Inventors: Ariel Navon, Ran Zamir, Shay Benisty
  • Patent number: 11520719
    Abstract: A memory controller includes a host interface circuit connectable to a host device by a bus conforming to a memory card system specification, a data buffer circuit including a buffer memory, a tag information generation circuit configured to generate tag information associated with a command received by the host interface circuit, and a first register in which the tag information generated by the tag information generation circuit is stored, and a second register into which the tag information stored in the first register is copied after the command is fetched from the host interface circuit for processing. When a read request is made from the host interface circuit to the data buffer circuit, the data buffer circuit returns read data stored in the buffer memory upon confirming that the tag information stored in the first register and the tag information stored in the second register match each other.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: December 6, 2022
    Assignee: KIOXIA CORPORATION
    Inventors: Tamio Saimen, Kenji Sakaue
  • Patent number: 11507375
    Abstract: In an example, an apparatus comprises a plurality of execution units, and a first general register file (GRF) communicatively coupled to the plurality of execution units, wherein the first GRF is shared by the plurality of execution units. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: November 22, 2022
    Assignee: INTEL CORPORATION
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kamal Sinha, Kiran C. Veernapu, Subramaniam Maiyuran, Prasoonkumar Surti, Guei-Yuan Lueh, David Puffer, Supratim Pal, Eric J. Hoekstra, Travis T. Schluessler, Linda L. Hurd
  • Patent number: 11461243
    Abstract: An apparatus (2) comprises processing circuitry (4) to perform speculative execution of instructions; a main cache storage region (30); a speculative cache storage region (32); and cache control circuitry (34) to allocate an entry, for which allocation is caused by a speculative memory access triggered by the processing circuitry, to the speculative cache storage region instead of the main cache storage region while the speculative memory access remains speculative. This can help protect against potential security attacks which exploit cache timing side-channels to gain information about allocations into the cache caused by speculative memory accesses.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: October 4, 2022
    Assignee: Arm Limited
    Inventor: Richard Roy Grisenthwaite
  • Patent number: 11461107
    Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a streaming multiprocessor having a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The streaming multiprocessor comprises multiple processing blocks including multiple processing cores. The processing cores include independent integer and floating-point data paths that are configurable to concurrently execute multiple independent instructions. A memory is coupled with the multiple processing blocks.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: October 4, 2022
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma
  • Patent number: 11449576
    Abstract: Embodiments of this application provide a convolution operation processing method and a related product. The integrated chip includes a control unit, at least one convolutional processing element, an input cache, and an output cache. The control unit loads a sectioned convolution kernel and sectioned convolution input data into the input cache, the sectioned convolution kernel being generated by sectioning a convolution kernel and including a plurality of convolution kernel segments, and the sectioned convolution input data being generated by sectioning convolution input data and including a plurality of convolution input data segments; and the at least one convolutional processing element performs a sectioned convolution operation on the sectioned convolution kernel and the sectioned convolution input data to obtain a sectioned convolution result, and stores the sectioned convolution result into the output cache.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: September 20, 2022
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Heng Zhang, Yangming Zhang
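    Illustrative sketch (not taken from the patent): the sectioning idea in the abstract above can be shown in one dimension by splitting both the kernel and the matching input window into segments and processing them segment by segment. The function names, the fixed segment length, and the choice to combine segments by summing partial products are assumptions made for illustration only; the abstract does not say how segment results are combined.

```python
def split_into_segments(values, segment_len):
    """Cut a sequence into consecutive segments of at most segment_len items."""
    return [values[i:i + segment_len] for i in range(0, len(values), segment_len)]


def sectioned_dot(kernel, window, segment_len):
    """Dot product computed one kernel/input segment pair at a time."""
    kernel_segments = split_into_segments(kernel, segment_len)
    window_segments = split_into_segments(window, segment_len)
    total = 0
    for k_seg, w_seg in zip(kernel_segments, window_segments):
        # Partial result for this pair of segments (assumed accumulation step).
        total += sum(k * w for k, w in zip(k_seg, w_seg))
    return total


def sectioned_conv1d(signal, kernel, segment_len):
    """Valid-mode 1-D sliding-window convolution (no kernel flip), built from sectioned dot products."""
    out_len = len(signal) - len(kernel) + 1
    return [
        sectioned_dot(kernel, signal[i:i + len(kernel)], segment_len)
        for i in range(out_len)
    ]


if __name__ == "__main__":
    print(sectioned_conv1d([1, 2, 3, 4, 5], [1, 0, -1, 2], segment_len=2))
```

    Because each segment pair is handled independently, only one kernel segment and one input segment need to be resident at a time, which is the property a small input cache would rely on.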
  • Patent number: 11442863
    Abstract: Data processing apparatuses and methods of processing data are disclosed. The operations comprise: storing copies of data items; and storing, in a producer pattern history table, a plurality of producer-consumer relationships, each defining an association between a producer load indicator and a plurality of consumer load entries, each consumer load entry comprising a consumer load indicator and one or more usefulness metrics. Further steps comprise: initiating, in response to a data load from an address corresponding to the producer load indicator in the producer pattern history table and when at least one of the corresponding one or more usefulness metrics meets a criterion, a producer prefetch of data to be prefetched for storing as a local copy; and issuing, when the data is returned, one or more consumer prefetches to return consumer data from a consumer address generated from the data returned by the producer prefetch and a consumer load indicator of a consumer load entry.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: September 13, 2022
    Assignee: Arm Limited
    Inventors: Alexander Cole Shulyak, Adrian Montero, Joseph Michael Pusdesris, Karthik Sundaram, Yasuo Ishii
  • Patent number: 11403394
    Abstract: Detecting and preventing selected events within a computing environment. A determination is made as to whether a selected event of the computing environment is consistent with a historical pattern of selected events of the computing environment. Based on determining the selected event is inconsistent with the historical pattern of selected events, processing associated with the selected event is delayed. Based on delaying processing associated with the selected event, a determination is made as to whether the selected event is valid. Based on determining that the selected event is valid, processing associated with the selected event is resumed.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: August 2, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: William O'Farrell
  • Patent number: 11379241
    Abstract: System includes at least one computer processor having a load store execution unit (LSU) for processing load and store instructions, wherein the LSU includes (a) a store queue having a plurality of entries for storing data, each store queue entry having a data field for storing the data, the data field having a width for storing the data; and (b) a gather buffer for holding data, wherein the processor is configured to: process oversize data larger than the width of the data field of the store queue, and process an oversize load instruction for oversize data by executing two passes through the LSU, a first pass through the LSU configured to store a first portion of the oversize data in the gather buffer and a second pass through the LSU configured to merge the first portion of the oversize data with a second portion of the oversize data.
    Type: Grant
    Filed: July 30, 2020
    Date of Patent: July 5, 2022
    Assignee: International Business Machines Corporation
    Inventors: Bryan Lloyd, Brian Chen, Kimberly M. Fernsler, Robert A. Cordes, David A. Hrusecky
  • Patent number: 11354177
    Abstract: A memory system having a set of media, a plurality of inter-process communication channels, and a controller configured to run a plurality of processes that communicate with each other using inter-process communication messages transmitted via the plurality of inter-process communication channels, in response to requests from a host system to store data in the media or retrieve data from the media. The memory system has a message manager that examines requests from the host system, identifies a plurality of combinable requests, generates a combined request, and provides the combined request to the plurality of processes as a substitute for the plurality of combinable requests.
    Type: Grant
    Filed: August 8, 2019
    Date of Patent: June 7, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Alex Frolikov
  • Patent number: 11347680
    Abstract: A processor includes a widest set of data registers that corresponds to a given logical processor. Each of the data registers of the widest set has a first width in bits. A decode unit that corresponds to the given logical processor is to decode instructions that specify the data registers of the widest set, and is to decode an atomic store to memory instruction. The atomic store to memory instruction is to indicate data that is to have a second width in bits that is wider than the first width in bits. The atomic store to memory instruction is to indicate memory address information associated with a memory location. An execution unit is coupled with the decode unit. The execution unit, in response to the atomic store to memory instruction, is to atomically store the indicated data to the memory location.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: May 31, 2022
    Assignee: Intel Corporation
    Inventors: Vedvyas Shanbhogue, Stephen J. Robinson, Christopher D. Bryant, Jason W. Brandt
  • Patent number: 11327881
    Abstract: Technologies for media management for providing column data layouts for clustered data include a device having a column-addressable memory and circuitry connected to the memory. The circuitry is configured to store a data cluster of a logical matrix in the column-addressable memory with a column-based format and to read a logical column of the data cluster from the column-addressable memory with a column read operation. Reading the logical column may include reading logical column data diagonally from the column-address memory, including reading from the data cluster and a duplicate copy of the data cluster. Reading the logical column may include reading from multiple complementary logical columns. Reading the logical column may include reading logical column data diagonally with a modulo counter. The column data may be read from a partition of the column-address memory selected based on the logical column number. Other embodiments are described and claimed.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: May 10, 2022
    Assignee: Intel Corporation
    Inventors: Chetan Chauhan, Sourabh Dongaonkar, Rajesh Sundaram, Jawad Khan, Sandeep Guliani, Dipanjan Sengupta, Mariano Tepper
  • Patent number: 11314660
    Abstract: A system comprises a processor including a CPU core, first and second memory caches, and a memory controller subsystem. The memory controller subsystem speculatively determines a hit or miss condition of a virtual address in the first memory cache and speculatively translates the virtual address to a physical address. Associated with the hit or miss condition and the physical address, the memory controller subsystem configures a status to a valid state. Responsive to receipt of a first indication from the CPU core that no program instructions associated with the virtual address are needed, the memory controller subsystem reconfigures the status to an invalid state and, responsive to receipt of a second indication from the CPU core that a program instruction associated with the virtual address is needed, the memory controller subsystem reconfigures the status back to a valid state.
    Type: Grant
    Filed: November 25, 2019
    Date of Patent: April 26, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Bipin Prasad Heremagalur Ramaprasad, David Matthew Thompson, Abhijeet Ashok Chachad, Hung Ong
  • Patent number: 11301250
    Abstract: The disclosure provides a data prefetching auxiliary circuit, a data prefetching method, and a microprocessor. The data prefetching auxiliary circuit includes a stride calculating circuit, a comparing module, a stride selecting module, and a prefetching output module. The stride calculating circuit receives an access address to calculate and provide a stride. The comparing module receives the access address and the stride, generates a reference address based on a first multiple, the access address and the stride, determines whether the reference address matches any of a plurality of history access addresses, and generates and outputs a hit indicating bit value. The stride selecting module receives the hit indicating bit value, and determines whether to output the hit indicating bit value based on a prefetch enabling bit value. The prefetching output module determines a prefetch address according to the output of the stride selecting module.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: April 12, 2022
    Assignee: Shanghai Zhaoxin Semiconductor Co., Ltd.
    Inventors: Xianpei Zheng, Zhongmin Chen, Weilin Wang, Jiin Lai
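    Illustrative sketch (not taken from the patent): the check described in the abstract above can be modeled in software as computing a stride, forming a reference address from the access address, the stride, and a multiple, and testing whether that reference address appears among recent accesses. The class name, the history size, the formula `address - multiple * stride`, and the choice of `address + stride` as the prefetch address are all assumptions made for illustration; the abstract does not give these details.

```python
from collections import deque


class StridePrefetchSketch:
    """Toy software model of a history-checked stride prefetcher."""

    def __init__(self, multiple=2, history_size=8, prefetch_enabled=True):
        self.multiple = multiple                    # assumed "first multiple"
        self.history = deque(maxlen=history_size)   # recent access addresses
        self.prefetch_enabled = prefetch_enabled    # assumed prefetch enabling bit
        self.last_address = None

    def access(self, address):
        """Record an access and return a prefetch address, or None."""
        prefetch = None
        if self.last_address is not None:
            stride = address - self.last_address
            # Assumed reference address: where the stream would have been
            # `multiple` strides ago if the stride were stable.
            reference = address - self.multiple * stride
            hit = stride != 0 and reference in self.history
            if self.prefetch_enabled and hit:
                prefetch = address + stride
        self.history.append(address)
        self.last_address = address
        return prefetch


if __name__ == "__main__":
    prefetcher = StridePrefetchSketch(multiple=2)
    for addr in (100, 164, 228, 292):
        print(addr, prefetcher.access(addr))
```

    In this model a prefetch is suggested only after the same stride has been observed long enough for the reference address to show up in the history, which filters out one-off address jumps.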
  • Patent number: 11294686
    Abstract: An apparatus for computing, comprising a processing circuitry configured for computing an outcome of executing a set of computer instructions comprising a group of data variables, by: identifying an initial state of the processing circuitry; executing a set of anticipated computer instructions produced based on the set of computer instructions and a likely data value, where the likely data value is a value of one of the group of data variables anticipated while executing the set of computer instructions; and when identifying, while executing the set of anticipated computer instructions, a failed prediction where the data variable is not equal to the likely data value: restoring the initial state of the processing circuitry; and executing a set of alternative computer instructions, produced based on the set of computer instructions and the at least one likely data value.
    Type: Grant
    Filed: January 11, 2021
    Date of Patent: April 5, 2022
    Assignee: Next Silicon Ltd
    Inventors: Elad Raz, Ilan Tayari
  • Patent number: 11294672
    Abstract: Techniques are disclosed relating to routing circuitry configured to perform permute operations for operands of threads in a single-instruction multiple-data group. In some embodiments, an apparatus includes hierarchical operand routing circuitry configured to route operands between a set of single-instruction multiple-data (SIMD) pipelines based on a permute instruction. In some embodiments, the routing circuitry includes a first level and a second level. The first level may include a set of multiple crossbar circuits each configured to receive operands from a respective subset of the pipelines and output one or more of the received operands on multiple output lines based on the permute instruction, where the crossbar circuits support full permutation within a respective subset.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: April 5, 2022
    Assignee: Apple Inc.
    Inventors: Robert D. Kenney, Liang-Kai Wang, Terence M. Potter
  • Patent number: 11281586
    Abstract: The invention provides a processor including a prediction table, a prediction logic circuit, and a prediction verification circuit. The prediction table has a plurality of sets respectively corresponding to a plurality of cache sets of a cache memory in the cache system, each of the sets has a plurality of confidence values, and the prediction table provides the confidence values of a selected set according to the index. The prediction logic circuit receives the confidence values of the selected set, and generates a prediction result by judging whether each of the confidence values of the selected set is larger than a threshold value or not. The prediction verification circuit receives the prediction result, generates correct/incorrect information according to the prediction result, and generates update information according to the correct/incorrect information. The prediction verification circuit updates the confidence values of the prediction table according to the update information.
    Type: Grant
    Filed: May 9, 2017
    Date of Patent: March 22, 2022
    Assignee: ANDES TECHNOLOGY CORPORATION
    Inventors: Kun-Ho Liu, Chieh-Jen Cheng, Chuan-Hua Chang, I-Cheng Kevin Chen
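    Illustrative sketch (not taken from the patent): the predict, verify, and update loop in the abstract above can be modeled with a table of per-set confidence values compared against a threshold. The class name, the saturating-counter update rule, and the definition of a "correct" prediction (the way that actually hit was predicted) are assumptions made for illustration; the abstract does not specify them.

```python
class ConfidencePredictionSketch:
    """Toy model: one list of confidence values per cache set."""

    def __init__(self, num_sets, num_ways, threshold=1, max_confidence=3):
        self.table = [[0] * num_ways for _ in range(num_sets)]
        self.threshold = threshold
        self.max_confidence = max_confidence

    def predict(self, index):
        """Return, per way, whether its confidence exceeds the threshold."""
        return [conf > self.threshold for conf in self.table[index]]

    def verify_and_update(self, index, prediction, actual_way):
        """Check the prediction against the way that hit, then update.

        Assumed rule: raise the confidence of the way that hit and lower
        the others, saturating at 0 and max_confidence.
        """
        correct = prediction[actual_way]
        row = self.table[index]
        for way in range(len(row)):
            if way == actual_way:
                row[way] = min(self.max_confidence, row[way] + 1)
            else:
                row[way] = max(0, row[way] - 1)
        return correct


if __name__ == "__main__":
    predictor = ConfidencePredictionSketch(num_sets=2, num_ways=2)
    for _ in range(3):
        guess = predictor.predict(0)
        print(guess, predictor.verify_and_update(0, guess, actual_way=1))
```

    With these assumed parameters, a way must hit twice in a row before its confidence rises above the threshold and it starts being predicted.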
  • Patent number: 11277149
    Abstract: Systems, apparatuses, and methods related to bit string compression are described. A method for bit string compression can include determining that a particular operation is to be performed using a bit string formatted according to a universal number format or a posit format to alter a bit width associated with the bit string from a first bit width to a second bit width and performing a compression operation on a bit string formatted according to a universal number format or a posit format to alter a bit width associated with the bit string from a first bit width to a second bit width. The method can further include writing the bit string having the second bit width to a first register, performing an arithmetic operation or a logical operation, or both using the bit string having the second bit width, and monitoring a quantity of bits of a result of the operation.
    Type: Grant
    Filed: November 19, 2020
    Date of Patent: March 15, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Vijay S. Ramesh
  • Patent number: 11275701
    Abstract: Various embodiments include methods and systems performed by a processor of a first function block for providing secure timer synchronization with a second function block. Various embodiments may include storing, in a shared register space, a first time counter value in which the first time counter value is based on a global counter of the second function block, transmitting, from the shared register space, the stored first time counter value to a preload register of the first function block, receiving, by the first function block, a strobe signal from the second function block configured to enable the first time counter value in the preload register to be loaded into a global counter of the first function block, and configuring the global counter with the first time counter value from the preload register.
    Type: Grant
    Filed: June 24, 2020
    Date of Patent: March 15, 2022
    Assignee: QUALCOMM Incorporated
    Inventor: Naveen Kumar Narala
  • Patent number: 11269636
    Abstract: A digital data processor includes an instruction memory storing instructions each specifying a data processing operation and at least one data operand field, an instruction decoder coupled to the instruction memory for sequentially recalling instructions from the instruction memory and determining the data processing operation and the at least one data operand, and at least one operational unit coupled to a data register file and to the instruction decoder to perform a data processing operation upon at least one operand corresponding to an instruction decoded by the instruction decoder and storing results of the data processing operation. The at least one operational unit is configured to perform a table write in response to a look up table write instruction by writing at least one data element from a source data register to a specified location in a specified number of at least one table.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: March 8, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Naveen Bhoria, Duc Bui, Dheera Balasubramanian Samudrala
  • Patent number: 11256987
    Abstract: A method for selectively dropping out feature elements from a tensor is disclosed. The method includes generating a mask that has a plurality of mask elements arranged in a first order. A compressed mask is generated, which includes a plurality of compressed mask elements arranged in a second order that is different from the first order. For example, each mask element of the plurality of mask elements of the mask is compressed to generate a corresponding compressed mask element of the plurality of compressed mask elements of the compressed mask. Each compressed mask element of the plurality of compressed mask elements is indicative of whether a corresponding feature element of the tensor output by a neural network layer is to be dropped out or retained. Feature elements are selectively dropped from the tensor, based on the compressed mask.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: February 22, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Sathish Terakanambi Sheshadri, Ram Sivaramakrishnan, Raghu Prabhakar
  • Patent number: 11243814
    Abstract: Machine learning is utilized to analyze respective execution times of a plurality of tasks in a job performed in a distributed computing system to determine that a subset of the plurality of tasks are straggler tasks in the job, where the distributed computing system includes a plurality of computing devices. A supervised machine-learning algorithm is performed using a set of inputs including performance attributes of the plurality of tasks, where the supervised machine learning algorithm uses labels generated from determination of the set of straggler tasks, the performance attributes include respective attributes of the plurality of tasks observed during performance of the job, and applying the supervised learning algorithm results in identification of a set of rules defining conditions, based on the performance attributes of the plurality of tasks, indicative of which tasks will be straggler tasks in a job. Rule data is generated to describe the set of rules.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: February 8, 2022
    Assignee: Intel Corporation
    Inventors: Huanxing Shen, Cong Li, Tai Huang
  • Patent number: 11210102
    Abstract: An apparatus comprises processing circuitry to execute instructions from one or more of a plurality of execution contexts each associated with a respective execution context identifier; a cache; and a speculative buffer. Control circuitry controls allocation of data to the cache and the speculative buffer. A speculative entry, for which allocation is caused by a speculative memory access associated with a given execution context, is allocated to the speculative buffer instead of to the cache while the speculatively executed memory access instruction remains speculative. The speculative entry specifies, as a tagged execution context identifier, the execution context identifier associated with the given execution context. Presence of the speculative entry in the speculative buffer is prevented from being observable to execution contexts other than the execution context identified by the tagged execution context identifier.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: December 28, 2021
    Assignee: Arm Limited
    Inventor: Roko Grubisic