Patents Examined by Michael Sun
  • Patent number: 11182168
    Abstract: A computer data processing system includes an instruction pipeline having a front end and a back end, a decoding and dispatch unit to dispatch a current instruction, and a pipeline by-pass unit to invoke an out-of-order pipeline by-pass operation. The pipeline by-pass unit by-passes a section of the instruction pipeline such that the current instruction architecturally completes before initiating instruction execution. The computer data processing system further includes a post-completion execution unit that executes the current instruction after the current instruction architecturally completes.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: November 23, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Avery Francois, Christian Jacobi, Gregory William Alexander
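    Illustrative sketch: a minimal, single-threaded Python model (not from the patent) in which a bypass-eligible instruction is architecturally completed at dispatch and only executed later by a post-completion unit; the class and method names are hypothetical.

      from collections import deque

      class PostCompletionPipeline:
          def __init__(self):
              self.completed = []                 # architectural completion order
              self.post_completion_q = deque()    # work deferred past completion

          def dispatch(self, insn, bypass_eligible):
              if bypass_eligible:
                  self.completed.append(insn)     # completes before executing
                  self.post_completion_q.append(insn)
              else:
                  self.execute(insn)              # normal path: execute first
                  self.completed.append(insn)

          def execute(self, insn):
              print("executing", insn)

          def drain(self):
              while self.post_completion_q:
                  self.execute(self.post_completion_q.popleft())

      p = PostCompletionPipeline()
      p.dispatch("long-latency-op", bypass_eligible=True)
      p.dispatch("add", bypass_eligible=False)
      p.drain()                                   # the bypassed op executes last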
  • Patent number: 11176068
    Abstract: Methods and apparatus for a synchronized multi-directional transfer on an inter-processor communication (IPC) link. In one embodiment, the synchronized multi-directional transfer utilizes one or more buffers which are configured to accumulate data during a first state. The one or more buffers are further configured to transfer the accumulated data during a second state. Data is accumulated during a low power state where one or more processors are inactive, and the data transfer occurs during an operational state where the processors are active. Additionally, in some variants, the data transfer may be performed for currently available transfer resources, and halted until additional transfer resources are made available. In still other variants, one or more of the independently operable processors may execute traffic monitoring processes so as to optimize data throughput of the IPC link.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: November 16, 2021
    Assignee: Apple Inc.
    Inventors: Karan Sanghi, Vladislav Petkov, Radha Kumar Pulyala, Saurabh Garg, Haining Zhang
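    Illustrative sketch: a minimal Python model of the accumulate-then-transfer behavior, assuming a simple active/low-power flag; the IPCLinkModel class and its methods are hypothetical stand-ins, not Apple's API.

      class IPCLinkModel:
          def __init__(self):
              self.buffer = []        # accumulates while the link is asleep
              self.active = False

          def submit(self, payload):
              self.buffer.append(payload)
              if self.active:
                  self.flush()

          def wake(self):
              self.active = True      # processors became active
              self.flush()            # transfer everything accumulated so far

          def sleep(self):
              self.active = False

          def flush(self):
              while self.buffer:
                  print("transferring", self.buffer.pop(0))

      link = IPCLinkModel()
      link.submit("pkt-1")            # accumulated: link is in low-power state
      link.submit("pkt-2")
      link.wake()                     # both packets are transferred together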
  • Patent number: 11176081
    Abstract: Various embodiments include systems and methods of operating the systems that include operation of a plurality of first nodes and second nodes in response to a request, where each first node is a first type of processing unit and each second node is a second type of processing unit, where the second type of processing unit is different from the first type of processing unit. Each of the first and second nodes can be operable in parallel with the other nodes of their respective plurality. Each second node may be operable to respond to the request using data and/or metadata it holds and/or operable in response to data and/or metadata from one or more of the first nodes. Additional apparatus, systems, and methods are disclosed.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: November 16, 2021
    Assignee: Halliburton Energy Services, Inc.
    Inventors: Joseph Blake Winston, Scott David Senften, Keshava Prasad Rangarajan
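    Illustrative sketch: a minimal Python model, under the assumption that a request fans out to two differently typed node pools and that second-type nodes may combine their own data with results from first-type nodes; the node functions are hypothetical.

      from concurrent.futures import ThreadPoolExecutor

      def first_node(node_id, request):
          return {"node": f"first-{node_id}", "data": request * 2}

      def second_node(node_id, request, first_results):
          local = request + node_id                    # answer from local data
          derived = sum(r["data"] for r in first_results)
          return {"node": f"second-{node_id}", "answer": local + derived}

      request = 10
      with ThreadPoolExecutor() as pool:
          # first-type nodes run in parallel with one another
          first_results = list(pool.map(lambda i: first_node(i, request), range(2)))
          # second-type nodes also run in parallel, using the first-node results
          second_results = list(pool.map(
              lambda i: second_node(i, request, first_results), range(2)))
      print(second_results)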
  • Patent number: 11175950
    Abstract: Methods, systems, and computer-readable media for dynamic regulation of parallelism for job scheduling are disclosed. A job scheduler sends a resource manager a request to execute a first set of compute jobs using a set of computing resources. The number of jobs corresponds to a first parallelism value. The job scheduler receives a response indicating that the number of computing resources was sufficient or insufficient to schedule the jobs. The job scheduler sends another request to execute another set of compute jobs whose number corresponds to a new parallelism value determined based at least in part on the response. If the number of computing resources was sufficient, the new parallelism value represents an increase over the first parallelism value. If the number of computing resources was insufficient, the new parallelism value represents a decrease from the first parallelism value.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: November 16, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Xu Yang, Jason Roy Rupard, Aswin Damodar, Devendra D Chavan, Ujjwal Kamal Kabra, Brian W Barrett, Stephen William Kendrex
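    Illustrative sketch: a minimal Python model of the parallelism adjustment; the additive-increase and multiplicative-decrease step sizes are assumptions for illustration, since the abstract only specifies the direction of each change.

      def next_parallelism(current, resources_sufficient,
                           step_up=2, factor_down=0.5, floor=1):
          if resources_sufficient:
              return current + step_up                   # increase after success
          return max(floor, int(current * factor_down))  # decrease after shortfall

      parallelism = 8
      for sufficient in (True, True, False, True):       # resource-manager responses
          parallelism = next_parallelism(parallelism, sufficient)
          print("jobs in next request:", parallelism)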
  • Patent number: 11169812
    Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the number of cache misses continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to throttle the thread more restrictively by reducing the selection rate, increasing the time period, or both. Otherwise, the throttling unit notifies the upstream computation unit to throttle the thread less restrictively.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: November 9, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Paul James Moyer, Douglas Benson Hunt, Kai Troester
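    Illustrative sketch: a minimal Python model of the threshold check with escalating and relaxing throttling; the halving/doubling factors and the floor values are illustrative assumptions.

      class ThreadThrottle:
          def __init__(self, miss_threshold, selection_rate=1.0, period=100):
              self.miss_threshold = miss_threshold
              self.selection_rate = selection_rate   # fraction of cycles the thread is selected
              self.period = period                   # cycles between re-evaluations

          def evaluate(self, misses_in_period):
              if misses_in_period > self.miss_threshold:
                  # more restrictive: reduce selection rate and lengthen the period
                  self.selection_rate = max(0.1, self.selection_rate * 0.5)
                  self.period *= 2
              else:
                  # less restrictive
                  self.selection_rate = min(1.0, self.selection_rate * 2)
                  self.period = max(100, self.period // 2)

      t = ThreadThrottle(miss_threshold=1000)
      t.evaluate(misses_in_period=2500)   # contention persists: throttle harder
      t.evaluate(misses_in_period=200)    # contention eased: relax throttling
      print(t.selection_rate, t.period)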
  • Patent number: 11169802
    Abstract: In some embodiments, packed data elements of first and second packed data source operands are of a first, different size than a second size of packed data elements of a third packed data operand. Execution circuitry executes a decoded single instruction to perform, for each packed data element position of a destination operand, a multiplication of M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, an addition of the results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position of the destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.
    Type: Grant
    Filed: October 20, 2016
    Date of Patent: November 9, 2021
    Assignee: Intel Corporation
    Inventors: Robert Valentine, Galina Ryvchin, Piotr Majcher, Mark J. Charney, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Milind B. Girkar, Zeev Sperber, Simon Rubanovich, Amit Gradstein
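    Illustrative sketch: a minimal Python model of the multiply-add, assuming 8-bit source elements and 32-bit third-source/destination elements, so M = 32/8 = 4 products are summed into each destination lane.

      def packed_mul_add(src1, src2, acc, n_bits=8, full_bits=32):
          m = full_bits // n_bits                      # M = full size / N
          out = []
          for lane, a in enumerate(acc):
              total = a                                # full-sized third-source element
              for k in range(m):
                  idx = lane * m + k
                  total += src1[idx] * src2[idx]       # M pairwise products
              out.append(total & (2**full_bits - 1))   # wrap to the lane width
          return out

      src1 = [1, 2, 3, 4, 5, 6, 7, 8]     # eight N-sized (8-bit) elements
      src2 = [1, 1, 1, 1, 2, 2, 2, 2]
      acc  = [10, 20]                     # two full-sized (32-bit) elements
      print(packed_mul_add(src1, src2, acc))   # [20, 72]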
  • Patent number: 11163581
    Abstract: Apparatuses and methods of data processing are disclosed for tagging instructions on-line. Instruction tag storage stores information indicative of a tag applied to certain instruction identifiers. A data processing operation performed by the data processing circuitry in response to an executed instruction is dependent on whether there is a corresponding instruction identifier for the executed instruction in the instruction tag storage which has the instruction tag. Register writer storage is maintained, and an entry is created for each register writing instruction encountered which causes a result value to be written to a destination register, where the entry comprises an indication of the destination register and the register writing instruction. An instruction tagging queue buffers instruction identifiers and an instruction identifier is added to the queue for a predetermined type of instruction when it is encountered.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: November 2, 2021
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Michiel Willem Van Tol
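    Illustrative sketch: a minimal Python model of register writer storage plus an instruction tagging queue; propagating tags back through the writers of source registers is an assumption about how the structures might be used, not a statement of the claimed method.

      from collections import deque

      register_writer = {}    # destination register -> id of the writing instruction
      sources = {}            # instruction id -> its source registers
      tag_queue = deque()     # instruction identifiers buffered for tagging
      tagged = set()          # instruction tag storage

      def execute(insn_id, srcs=(), dest=None, predetermined_type=False):
          sources[insn_id] = srcs
          if dest is not None:
              register_writer[dest] = insn_id        # register writer storage entry
          if predetermined_type:
              tag_queue.append(insn_id)              # queued for on-line tagging

      def propagate_tags():
          # tag queued instructions and the writers of their source registers
          while tag_queue:
              insn_id = tag_queue.popleft()
              if insn_id in tagged:
                  continue
              tagged.add(insn_id)
              for reg in sources.get(insn_id, ()):
                  if reg in register_writer:
                      tag_queue.append(register_writer[reg])

      execute("i1", srcs=(), dest="r3")
      execute("i2", srcs=("r3",), predetermined_type=True)
      propagate_tags()
      print(tagged)           # contains 'i1' and 'i2'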
  • Patent number: 11157432
    Abstract: A host device configures a first plurality of block devices each corresponding to a path between the host device and a control device of a storage system. The host device submits an inquiry to the storage system using a given block device of the first plurality of block devices. The host device receives from the storage system an indication that the given block device corresponds to a data logical volume that has been provisioned for use by the host device and issues a command to remove the first plurality of block devices based at least in part on receiving the indication. The host device performs a scan of the plurality of paths and configures, based at least in part on the scan, a second plurality of block devices each corresponding to a path of the plurality of paths between the host device and the data logical volume.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: October 26, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Gopinath Marappan, Vinay G. Rao
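    Illustrative sketch: a minimal Python model of the reconfiguration flow (configure per-path devices for the control device, inquire, tear the devices down, rescan for the data volume); the function names and the inquiry response are hypothetical stand-ins, not a real multipathing API.

      paths = ["path-0", "path-1"]

      def configure_devices(target):
          return [f"/dev/{target}-{p}" for p in paths]   # one block device per path

      def inquiry(device):
          # stand-in for an inquiry answered by the storage system
          return {"provisioned_volume": "datavol-17"}

      control_devs = configure_devices("ctrl")
      response = inquiry(control_devs[0])
      if response["provisioned_volume"]:
          control_devs.clear()                           # remove first set of devices
          data_devs = configure_devices(response["provisioned_volume"])  # after rescan
          print(data_devs)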
  • Patent number: 11157424
    Abstract: Systems, apparatuses, and methods related to a computing tile are described. The computing tile may perform operations on received data to extract some of the received data. The computing tile may perform operations without intervening commands. The computing tile may perform operations on data streamed through the computing tile to extract relevant data from data received by the computing tile. In an example, the computing tile is configured to receive a command to initiate an operation to reduce a size of a block of data from a first size to a second size. The computing tile can then receive a block of data from a memory device coupled to the apparatus. The computing tile can then perform an operation on the block of data to extract predetermined data from the block of data to reduce a size of the block of data from a first size to a second size.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: October 26, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Richard C. Murphy, Glen E. Hush, Vijay Ramesh, Allan Porterfield, Anton Korzh
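    Illustrative sketch: a minimal Python model in which the tile keeps only one predetermined field per record, shrinking the block from its first size to a smaller second size; the record layout and field name are hypothetical.

      def reduce_block(block, wanted_field):
          # extract the predetermined data from each record of the streamed block
          return [record[wanted_field] for record in block]

      block = [{"id": 1, "temp": 21.5, "raw": b"\x00" * 64},
               {"id": 2, "temp": 22.0, "raw": b"\x00" * 64}]
      reduced = reduce_block(block, "temp")
      print(reduced)          # [21.5, 22.0] -- far smaller than the original block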
  • Patent number: 11150907
    Abstract: An execution unit circuit for use in a processor core provides efficient use of area and energy by reducing the per-entry storage requirement of a load-store unit issue queue. The execution unit circuit includes a recirculation queue that stores the effective address of the load and store operations and the values to be stored by the store operations. Queue control logic controls the recirculation queue and issue queue so that, after the effective address of a load or store operation has been computed, the effective address of the load operation or the store operation is written to the recirculation queue and the operation is removed from the issue queue, so that address operands and other values that were in the issue queue entry no longer require storage. When a load or store operation is rejected by the cache unit, it is subsequently reissued from the recirculation queue.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: October 19, 2021
    Assignee: International Business Machines Corporation
    Inventors: Salma Ayub, Sundeep Chadha, Robert Allen Cordes, David Allen Hrusecky, Hung Qui Le, Dung Quoc Nguyen, Brian William Thompto
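    Illustrative sketch: a minimal Python model in which only the computed effective address (plus any store data) survives into a recirculation queue after issue, so the issue-queue entry and its operands can be freed; the dictionaries stand in for hardware queue entries.

      from collections import deque

      issue_queue = deque()    # holds full operations, including address operands
      recirc_queue = deque()   # holds only effective addresses (and store data)

      def issue(cache_accepts):
          op = issue_queue.popleft()                 # entry freed after address generation
          ea = op["base"] + op["offset"]
          recirc_queue.append({"kind": op["kind"], "ea": ea, "data": op.get("data")})
          if not cache_accepts:
              print("cache rejected, will reissue EA", hex(ea))

      def reissue():
          entry = recirc_queue.popleft()
          print("reissuing", entry["kind"], "to EA", hex(entry["ea"]))

      issue_queue.append({"kind": "load", "base": 0x1000, "offset": 0x20})
      issue(cache_accepts=False)
      reissue()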
  • Patent number: 11150835
    Abstract: According to one embodiment, a memory system includes a nonvolatile memory and a controller. The controller acquires from a host write data having the same first size as a data write unit of the nonvolatile memory, obtained either by dividing the write data associated with one write command having a first identifier (which indicates a first write destination block among a plurality of write destination blocks) into a plurality of write data, or by combining write data associated with two or more write commands having the first identifier. The controller writes the acquired write data having the first size to the first write destination block by a first write operation.
    Type: Grant
    Filed: July 14, 2020
    Date of Patent: October 19, 2021
    Assignee: TOSHIBA MEMORY CORPORATION
    Inventors: Shinichi Kanno, Hideki Yoshida, Naoki Esaka, Hiroshi Nishimura
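    Illustrative sketch: a minimal Python model that combines writes sharing a destination-block identifier and splits them into write-unit-sized chunks; the 16 KiB write unit and the function names are illustrative assumptions.

      WRITE_UNIT = 16 * 1024                     # assumed data write unit of the flash

      pending = {}                               # block identifier -> buffered bytes

      def program(block_id, chunk):
          print(f"programming {len(chunk)} bytes to write destination block {block_id}")

      def submit_write(block_id, data):
          buf = pending.get(block_id, b"") + data    # combine with earlier writes
          while len(buf) >= WRITE_UNIT:
              program(block_id, buf[:WRITE_UNIT])    # a full write unit is ready
              buf = buf[WRITE_UNIT:]                 # split off the remainder
          pending[block_id] = buf                    # remainder waits for more data

      submit_write(7, b"a" * 10_000)             # buffered: still below the write unit
      submit_write(7, b"b" * 40_000)             # combined, then split into three units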
  • Patent number: 11144324
    Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: October 12, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
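    Illustrative sketch: a minimal Python model that packs consecutive operations into shared retire-queue entries when a compression condition holds; the two-ops-per-entry limit and the "not a branch" condition are illustrative assumptions.

      def build_retire_entries(ops, compressible, max_per_entry=2):
          entries = []
          for op in ops:
              if (entries and compressible(op)
                      and len(entries[-1]) < max_per_entry
                      and all(compressible(o) for o in entries[-1])):
                  entries[-1].append(op)          # pack into the open entry
              else:
                  entries.append([op])            # allocate a new retire-queue entry
          return entries

      ops = ["add", "add", "mul", "branch", "add"]
      print(build_retire_entries(ops, compressible=lambda op: op != "branch"))
      # [['add', 'add'], ['mul'], ['branch'], ['add']]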
  • Patent number: 11132329
    Abstract: An electronic control device includes: a partially reconfigurable logic circuit in which a calculation unit, which is reconfigured and executes calculation, and a storage unit, which stores calculation target data to be calculated by the calculation unit, are configured; and a processing control unit which transmits circuit data for reconfiguring the calculation unit and the calculation target data to the logic circuit. When the processing control unit obtains next calculation target data, that is, the calculation target data relating to a next calculation unit (the calculation unit after completion of reconfiguration), transmission of the next calculation target data to the storage unit is started regardless of whether the reconfiguration of the next calculation unit is completed, and upon completion of the reconfiguration, the next calculation unit performs calculation using the next calculation target data.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: September 28, 2021
    Assignee: HITACHI AUTOMOTIVE SYSTEMS, LTD.
    Inventors: Taisuke Ueta, Satoshi Tsutsumi, Hideki Endo, Hideyuki Sakamoto
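    Illustrative sketch: a minimal Python model using threads to show the data transfer starting without waiting for the partial reconfiguration to finish; the sleep durations merely stand in for reconfiguration and transfer latencies.

      import threading
      import time

      def reconfigure(unit):
          time.sleep(0.2)                          # stand-in for reconfiguration time
          print(f"{unit} reconfigured")

      def transfer_data(unit):
          time.sleep(0.1)                          # stand-in for transfer time
          print(f"calculation target data for {unit} staged in the storage unit")

      reconf = threading.Thread(target=reconfigure, args=("unit-B",))
      xfer = threading.Thread(target=transfer_data, args=("unit-B",))
      reconf.start()
      xfer.start()             # starts regardless of reconfiguration progress
      reconf.join()
      xfer.join()
      print("unit-B calculates using the staged data")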
  • Patent number: 11126573
    Abstract: Systems and methods of managing variable size load units of application codes in a processing system include identifying pages of a random access memory (RAM) device to store copies of load units from an external memory device upon request by a bus master in the processing system. The RAM device is internal to an integrated circuit device that includes the bus masters, and the external memory device is external to the integrated circuit device. The bus masters execute the application codes, and each of the application codes comprise one or more load units that include executable program instructions. At least some of the load units have different sizes from one another. A page type indicator is determined for an identified page. A first page type indicates whether the identified page is a split page to store a segment of each of two different load units.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: September 21, 2021
    Assignee: NXP USA, Inc.
    Inventors: Michael Rohleder, Cristian Macario, Dirk Moeller
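    Illustrative sketch: a minimal Python model that packs variable-size load units into 4 KiB pages and marks any page holding segments of two or more load units as a split page; the page size and the dictionary layout are illustrative assumptions.

      PAGE_SIZE = 4096

      def plan_pages(load_unit_sizes):
          pages, offset = [], 0
          for unit, size in enumerate(load_unit_sizes):
              start_page = offset // PAGE_SIZE
              offset += size
              end_page = (offset - 1) // PAGE_SIZE
              for p in range(start_page, end_page + 1):
                  while len(pages) <= p:
                      pages.append({"units": [], "split": False})
                  pages[p]["units"].append(unit)
                  pages[p]["split"] = len(pages[p]["units"]) > 1   # page type indicator
          return pages

      for i, page in enumerate(plan_pages([6000, 3000, 5000])):
          print(i, page)       # pages 1 and 2 come out as split pages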
  • Patent number: 11119777
    Abstract: Techniques for an extended prefix including a routing bit for an extended instruction format are described herein. An aspect includes generating, by an instruction preprocessing module, a first extended instruction corresponding to an internal operation including a first routing bit. Another aspect includes generating, by the instruction preprocessing module, a second extended instruction corresponding to a prefixed instruction set architecture (ISA) instruction including a second routing bit, wherein a value of the second routing bit is opposite a value of the first routing bit. Another aspect includes providing the first extended instruction and the second extended instruction to a central processing unit (CPU). Another aspect includes, based on the value of the first routing bit, routing the internal operation directly to an execution unit of the CPU, and based on the value of the second routing bit, routing the prefixed ISA instruction to a decode/execute path of the CPU.
    Type: Grant
    Filed: April 22, 2020
    Date of Patent: September 14, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Giles Roger Frazier, Hung Q. Le
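    Illustrative sketch: a minimal Python model of routing on the extended-prefix bit; assigning bit value 0 to internal operations and 1 to prefixed ISA instructions is an assumption, since the abstract only states that the two values are opposite.

      ROUTE_DIRECT = 0          # assumed: internal operation
      ROUTE_DECODE = 1          # assumed: prefixed ISA instruction

      def route(extended_instruction):
          if extended_instruction["routing_bit"] == ROUTE_DIRECT:
              return "execution unit"               # bypasses decode
          return "decode/execute path"

      internal_op  = {"routing_bit": ROUTE_DIRECT, "payload": "iop"}
      prefixed_isa = {"routing_bit": ROUTE_DECODE, "payload": "prefixed add"}
      print(route(internal_op), "|", route(prefixed_isa))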
  • Patent number: 11119908
    Abstract: Methods of mapping memory regions to processes based on thermal data of memory regions are described. In some embodiments, a memory controller may receive a memory allocation request. The memory allocation request may include a logical memory address. The method may further include mapping the logical memory address to an address in a memory region of the memory system based on thermal data for memory regions of the memory system. Additional methods and systems are also described.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: September 14, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Robert Walker, David A. Roberts
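    Illustrative sketch: a minimal Python model that maps an allocation to the coolest region with free space; the coolest-first policy is an illustrative assumption, as the abstract only says the mapping is based on thermal data.

      regions = [
          {"name": "region-0", "temp_c": 71, "free": True},
          {"name": "region-1", "temp_c": 54, "free": True},
          {"name": "region-2", "temp_c": 48, "free": False},
      ]

      def map_allocation(logical_address):
          candidates = [r for r in regions if r["free"]]
          coolest = min(candidates, key=lambda r: r["temp_c"])
          return logical_address, coolest["name"]

      print(map_allocation(0x7F00))   # selects 'region-1', the coolest free region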
  • Patent number: 11113061
    Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information in the register usage mask with information indicating the registers used by the function, in order to determine which registers need to be saved, and to save those registers.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Michael John Bedy
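    Illustrative sketch: a minimal Python model of the mask combination, assuming the save set is the bitwise AND of the caller's live-register mask and the callee's used-register mask; the mask values themselves are made up.

      def registers_to_save(caller_live_mask, callee_used_mask):
          # save only registers that are live in the caller AND used by the callee
          return caller_live_mask & callee_used_mask

      CALLER_LIVE = 0b0000_1011_0110      # set by the calling code before the call
      CALLEE_USES = 0b0000_0011_1100      # known statically for the called function

      save_mask = registers_to_save(CALLER_LIVE, CALLEE_USES)
      print(f"save registers: {save_mask:012b}")   # 000000110100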
  • Patent number: 11106795
    Abstract: Embodiments of the specification provide a method and an apparatus for updating shared data in a multi-core processor environment. The multi-processor environment comprises a multi-core processor. The multi-core processor comprises a plurality of separate processing units (referred to as cores, or core processing units (CPUs), in the specification); the multi-core processor is configured to process a multi-threaded task; the multi-threaded task has shared data to update. The method is executed by any CPU. The method may comprise: requesting, by a first CPU, a lock to execute a critical section function on the shared data, wherein the lock provides permission to update the shared data, and the critical section function updates the shared data; and setting, by the first CPU if the lock is occupied by a second CPU, a memory index corresponding to the critical section function in a memory of the lock for the second CPU to execute the critical section function based on the memory index.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: August 31, 2021
    Assignee: ADVANCED NEW TECHNOLOGIES CO., LTD.
    Inventors: Ling Ma, Changhua He
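    Illustrative sketch: a minimal, single-threaded Python model in which a CPU that finds the lock occupied records its critical-section function for the holder to run; a real implementation would need atomic operations, which are omitted here.

      class DelegatingLock:
          def __init__(self):
              self.holder = None
              self.pending = []            # delegated critical-section functions

          def run_critical(self, cpu, critical_section):
              if self.holder is None:
                  self.holder = cpu
                  critical_section()
                  while self.pending:      # holder drains delegated sections
                      self.pending.pop(0)()
                  self.holder = None
              else:
                  self.pending.append(critical_section)   # delegate to the holder

      shared = {"counter": 0}
      lock = DelegatingLock()

      def bump():
          shared["counter"] += 1

      lock.holder = "cpu-2"              # pretend another CPU currently holds the lock
      lock.run_critical("cpu-1", bump)   # cpu-1 delegates its update instead of waiting
      lock.holder = None
      lock.run_critical("cpu-2", bump)   # cpu-2 runs its own and the delegated section
      print(shared["counter"])           # 2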
  • Patent number: 11106598
    Abstract: The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes a storage unit, a controller unit, an operation unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to the one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: August 31, 2021
    Assignee: Shanghai Cambricon Information Technology Co., Ltd.
    Inventors: Yao Zhang, Bingrui Wang
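    Illustrative sketch: a minimal Python model of performing a multiply-accumulate on fixed-point integers instead of floats; the Q8 scaling and the dot-product example are illustrative assumptions, not the disclosed format.

      FRAC_BITS = 8
      SCALE = 1 << FRAC_BITS

      def to_fixed(x):
          return int(round(x * SCALE))            # float -> fixed-point integer

      def from_fixed(x, frac_bits=FRAC_BITS):
          return x / (1 << frac_bits)             # fixed-point integer -> float

      def fixed_dot(a, b):
          acc = 0
          for x, y in zip(a, b):
              acc += to_fixed(x) * to_fixed(y)    # product carries 2 * FRAC_BITS
          return from_fixed(acc, 2 * FRAC_BITS)

      weights = [0.5, -1.25, 2.0]
      inputs  = [1.0, 0.5, 0.25]
      print(fixed_dot(weights, inputs))           # 0.375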
  • Patent number: 11099848
    Abstract: An apparatus comprises: processing circuitry, an instruction decoder, and a plurality of registers. In response to an overlapped-immediate/register-field-specifying (OIRFS) instruction comprising an opcode field specifying an OIRFS-indicating opcode value, and an overlapped immediate/register field specifying an immediate value and a register specifier, the instruction decoder controls the processing circuitry to use a selected register of the plurality of registers corresponding to the register specifier as a source register or destination register when performing a processing operation depending on the immediate value. The overlapped immediate/register field includes at least one shared bit decoded as part of the immediate value for at least one encoding of the OIRFS instruction and decoded as part of the register specifier for at least one other encoding of the OIRFS instruction.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: August 24, 2021
    Assignee: Arm Limited
    Inventor: Neil Burgess
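    Illustrative sketch: a minimal Python model of an 8-bit overlapped field in which bits 3-4 are shared, decoded as register-specifier bits in one encoding and as immediate bits in the other; the field widths are illustrative assumptions.

      def decode_overlapped(field, wide_immediate):
          if wide_immediate:
              reg = (field >> 5) & 0b111       # 3-bit register specifier
              imm = field & 0b11111            # 5-bit immediate (shared bits 3-4 included)
          else:
              reg = (field >> 3) & 0b11111     # 5-bit register specifier (shared bits 3-4 included)
              imm = field & 0b111              # 3-bit immediate
          return reg, imm

      field = 0b10111010                       # the same raw bits, two readings
      print(decode_overlapped(field, wide_immediate=True))    # (5, 26)
      print(decode_overlapped(field, wide_immediate=False))   # (23, 2)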