Patents Examined by Cheng-Yuan Tseng
  • Patent number: 11205118
    Abstract: A deep neural network (DNN) module utilizes parallel kernel and parallel input processing to decrease bandwidth utilization, reduce power consumption, improve neuron multiplier stability, and provide other technical benefits. Parallel kernel processing enables the DNN module to load input data only once for processing by multiple kernels. Parallel input processing enables the DNN module to load kernel data only once for processing with multiple input data. The DNN module can implement other power-saving techniques like clock-gating (i.e. removing the clock from) and power-gating (i.e. removing the power from) banks of accumulators based upon usage of the accumulators. For example, individual banks of accumulators can be power-gated when all accumulators in a bank are not in use, and do not store data for a future calculation. Banks of accumulators can also be clock-gated when all accumulators in a bank are not in use, but store data for a future calculation.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: December 21, 2021
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Amol Ashok Ambardekar, Chad Balling McBRIDE, George Petre, Larry Marvin Wall, Kent D. Cedola, Boris Bobrov
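    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the accumulator-bank gating policy the abstract describes; a bank is power-gated when no accumulator is in use and none holds data for a future calculation, and clock-gated when none is in use but some data must be retained. Class names, bank size, and flag names are assumptions for illustration only.
      from dataclasses import dataclass, field
      from typing import List

      @dataclass
      class Accumulator:
          in_use: bool = False               # currently accumulating partial sums
          holds_pending_data: bool = False   # stores data needed by a future calculation

      @dataclass
      class AccumulatorBank:
          accumulators: List[Accumulator] = field(
              default_factory=lambda: [Accumulator() for _ in range(16)])

          def gating_state(self) -> str:
              if any(a.in_use for a in self.accumulators):
                  return "active"
              if any(a.holds_pending_data for a in self.accumulators):
                  return "clock_gated"   # keep power so the stored values survive
              return "power_gated"       # nothing to preserve, so cut power entirely

      bank = AccumulatorBank()
      bank.accumulators[3].holds_pending_data = True
      print(bank.gating_state())   # -> clock_gated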
  • Patent number: 11200064
    Abstract: Methods and parallel processing units for avoiding inter-pipeline data hazards wherein inter-pipeline data hazards are identified at compile time. For each identified inter-pipeline data hazard the primary instruction and secondary instruction(s) thereof are identified as such and are linked by a counter which is used to track that inter-pipeline data hazard. Then, when a primary instruction is output by the instruction decoder for execution, the value of the counter associated therewith is adjusted (e.g. incremented) to indicate that there is a hazard related to the primary instruction, and when the primary instruction has been resolved by one of multiple parallel processing pipelines the value of the counter associated therewith is adjusted (e.g. decremented) to indicate that the hazard related to the primary instruction has been resolved.
    Type: Grant
    Filed: October 14, 2020
    Date of Patent: December 14, 2021
    Assignee: Imagination Technologies Limited
    Inventors: Luca Iuliano, Simon Nield, Yoong-Chert Foo, Ollie Mower
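    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the counter-based hazard tracking described above; dispatching a primary instruction increments its hazard counter, resolution by a pipeline decrements it, and a secondary instruction may only issue when the counter reads zero. The class and method names are assumptions for illustration.
      from collections import defaultdict

      class HazardCounters:
          def __init__(self):
              self.counters = defaultdict(int)   # hazard id -> outstanding primaries

          def on_primary_dispatched(self, hazard_id: int) -> None:
              self.counters[hazard_id] += 1      # hazard now outstanding

          def on_primary_resolved(self, hazard_id: int) -> None:
              self.counters[hazard_id] -= 1      # one pipeline resolved the primary

          def secondary_may_issue(self, hazard_id: int) -> bool:
              return self.counters[hazard_id] == 0

      hc = HazardCounters()
      hc.on_primary_dispatched(hazard_id=7)
      print(hc.secondary_may_issue(7))   # False: primary still in flight
      hc.on_primary_resolved(hazard_id=7)
      print(hc.secondary_may_issue(7))   # True: hazard resolved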
  • Patent number: 11194743
    Abstract: A method of accessing a dual line solid-state drive (SSD) device through a network interface and a PCIe EP simultaneously. The method includes: (1) establishing, by the dual line SSD device, a connection with a remote server through the network interface, (2) establishing, by the remote server, an administrative queue with the dual line SSD device, (3) establishing, by the remote server, an input/output queue with the dual line SSD device by posting a command in the administrative queue over the network interface to initiate transfer of data, (4) establishing, by the dual line SSD device, a connection with a local server over the PCIe EP, (5) establishing, by the local server, the administrative queue over the PCIe EP, and (6) establishing, by the local server, the input/output queue by posting the command in the administrative queue over the PCIe EP to initiate transfer of the data.
    Type: Grant
    Filed: June 7, 2019
    Date of Patent: December 7, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Anil Desmal Solanki, Venkataratnam Nimmagadda, Prashant Vishwanath Mahendrakar
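    Illustrative sketch (hypothetical, not from the patent): a minimal Python walk-through of the dual-path queue setup described above; the same drive is reached by a remote server over the network and by a local server over the PCIe EP, each first creating an administrative queue and then creating an input/output queue by posting a command on that administrative queue. All class and function names are assumptions for illustration.
      class DualLineSSD:
          def __init__(self):
              self.queues = {}   # (host, kind) -> list of posted commands

          def connect(self, host: str, transport: str) -> None:
              print(f"{host} connected over {transport}")

          def create_queue(self, host: str, kind: str) -> None:
              self.queues[(host, kind)] = []

          def post(self, host: str, kind: str, command: str) -> None:
              self.queues[(host, kind)].append(command)

      def bring_up(ssd: DualLineSSD, host: str, transport: str) -> None:
          ssd.connect(host, transport)
          ssd.create_queue(host, "admin")
          # the I/O queue is established by posting a command on the admin queue
          ssd.post(host, "admin", "create_io_queue")
          ssd.create_queue(host, "io")

      ssd = DualLineSSD()
      bring_up(ssd, "remote_server", "network")   # steps (1)-(3)
      bring_up(ssd, "local_server", "PCIe EP")    # steps (4)-(6)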
  • Patent number: 11194576
    Abstract: The present disclosure relates to a method of automating a process to process a task or an object, comprising: defining elements of the process in one or more human-intelligible, editable, and machine-interpretable workflow program documents, the workflow program documents each including a plurality of actors who perform actions or take decisions, a sequence of action steps each associated with an actor and having at least one expected outcome and a corresponding next step for each expected outcome, and wherein the action steps may include a first type of action further defined within the workflow program documents and a second type of action implemented by a computer according to code defined other than in said workflow program documents; and executing the process by a processor running the code defined by the workflow program documents, wherein if an exception is detected in the processing of a task or object according to the code, the exception is passed to a supervisory function to perform a remedial action.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: December 7, 2021
    Inventors: Tielman Francois Botha, Dawid Eduard Botha, Philip Viljoen
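    Illustrative sketch (hypothetical, not from the patent): a minimal Python interpreter for document-driven execution as described above; each step names an actor, its expected outcomes, and the next step per outcome, and an unexpected outcome is handed to a supervisory function for remedial action. The dictionary layout and step names are assumptions, not the patent's document format.
      workflow = {
          "check_invoice": {"actor": "clerk",   "outcomes": {"ok": "approve", "mismatch": "escalate"}},
          "approve":       {"actor": "manager", "outcomes": {"ok": None}},
          "escalate":      {"actor": "auditor", "outcomes": {"ok": None}},
      }

      def supervisory_function(step: str, outcome: str) -> str:
          print(f"exception at {step}: unexpected outcome {outcome!r}, remediating")
          return "ok"   # remedial action maps the task back onto an expected outcome

      def run(workflow: dict, start: str, outcomes: list) -> None:
          step, it = start, iter(outcomes)
          while step is not None:
              outcome = next(it)
              spec = workflow[step]
              if outcome not in spec["outcomes"]:
                  outcome = supervisory_function(step, outcome)
              print(f"{spec['actor']} performed {step}, outcome {outcome}")
              step = spec["outcomes"][outcome]

      run(workflow, "check_invoice", ["mismatch", "damaged"])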
  • Patent number: 11194575
    Abstract: Provided is a method, computer program product, and system for performing data address prediction. The method comprises receiving a first instruction for execution by a processor. A load address predictor (LAP) accesses a LAP table entry for a section of an instruction cache. The section is associated with a plurality of instructions that includes the first instruction. The LAP predicts a set of data addresses that will be loaded using the LAP table entry. The method further comprises sending a recommendation to prefetch the set of data addresses to a load-store unit (LSU).
    Type: Grant
    Filed: November 7, 2019
    Date of Patent: December 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Mohit Karve, Naga P. Gorti, Edmund Joseph Gieske
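    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the load-address-prediction flow described above; the instruction's address selects a LAP table entry for its instruction-cache section, the entry yields a set of predicted data addresses, and a prefetch recommendation for those addresses is sent to the load-store unit. Section size and table contents are assumptions for illustration.
      SECTION_SIZE = 128   # bytes of instruction cache covered by one LAP table entry

      lap_table = {
          # section index -> data addresses previously loaded by that section
          0x40: [0x1000, 0x1040, 0x1080],
      }

      def section_of(instruction_addr: int) -> int:
          return instruction_addr // SECTION_SIZE

      def recommend_prefetch(lsu_queue: list, instruction_addr: int) -> None:
          predicted = lap_table.get(section_of(instruction_addr), [])
          for addr in predicted:
              lsu_queue.append(("prefetch", addr))   # recommendation sent to the LSU

      lsu_queue: list = []
      recommend_prefetch(lsu_queue, instruction_addr=0x2010)   # falls in section 0x40
      print(lsu_queue)   # [('prefetch', 4096), ('prefetch', 4160), ('prefetch', 4224)]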
  • Patent number: 11188329
    Abstract: Systems, apparatuses, and methods related to dynamic precision bit string accumulation are described. Dynamic bit string accumulation can be performed using an edge computing device. In an example method, dynamic precision bit string accumulation can include performing an iteration of a recursive operation using a first bit string and a second bit string and determining that a result of the iteration of the recursive operation contains a quantity of bits in a particular bit sub-set of the result that is greater than a threshold quantity of bits associated with the particular bit sub-set. The method can further include writing a result of the iteration of the recursive operation to a first register and writing at least a portion of the bits associated with the particular bit sub-set of the result to a second register.
    Type: Grant
    Filed: June 24, 2020
    Date of Patent: November 30, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Vijay S. Ramesh, Richard C. Murphy
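    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the register-splitting step described above; after each iteration of a recursive accumulation, if a designated bit sub-set of the result (here, the low-order bits) grows wider than its threshold, the result is written to a first register and the excess sub-set bits are captured in a second register. The field boundaries and threshold are assumptions for illustration.
      LOW_FIELD_THRESHOLD = 8   # allowed width, in bits, of the tracked bit sub-set

      def accumulate(values: list) -> tuple:
          first_register, second_register = 0, 0
          for v in values:
              first_register += v                       # one iteration of the recursive operation
              low_field = first_register & 0xFFFF       # the particular bit sub-set being tracked
              if low_field.bit_length() > LOW_FIELD_THRESHOLD:
                  # spill the overflowing portion of the sub-set to the second register
                  second_register = low_field >> LOW_FIELD_THRESHOLD
          return first_register, second_register

      print(accumulate([100, 100, 100]))   # (300, 1): the low field exceeded 8 bits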
  • Patent number: 11182320
    Abstract: The following description is directed to a configurable logic platform. In one example, a configurable logic platform includes host logic and a plurality of reconfigurable logic regions. Each reconfigurable region can include hardware that is configurable to implement an application logic design. The host logic can be used for separately encapsulating each of the reconfigurable logic regions. The host logic can include a plurality of data path functions where each data path function can include a layer for formatting data transfers between a host interface and the application logic of a corresponding reconfigurable logic region. The host interface can be configured to apportion bandwidth of the data transfers generated by the application logic of the respective reconfigurable logic regions.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: November 23, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Asif Khan, Islam Mohamed Hatem Abdulfattah Mohamed Atta, Robert Michael Johnson, Mark Bradley Davis, Christopher Joseph Pettey, Nafea Bshara, Erez Izenberg
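    Illustrative sketch (hypothetical, not from the patent): a minimal Python example of the bandwidth apportionment described above; the host interface divides the available transfer bandwidth among the application logic of the encapsulated reconfigurable regions. Equal shares are an assumption, since the abstract does not name a policy.
      def apportion_bandwidth(total_gbps: float, regions: list) -> dict:
          share = total_gbps / len(regions)      # each region's slice of the host interface
          return {region: share for region in regions}

      print(apportion_bandwidth(16.0, ["region_a", "region_b", "region_c", "region_d"]))
      # {'region_a': 4.0, 'region_b': 4.0, 'region_c': 4.0, 'region_d': 4.0}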
  • Patent number: 11175957
    Abstract: The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit of the set of computation units is configured to receive input data of an operation of the set of operations and to perform the operation, wherein the input data is represented with a distinct bit length associated with each computation unit. The controller is configured to receive the input data represented with a certain bit length of the bit lengths and to select one of the set of computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to the certain bit length.
    Type: Grant
    Filed: September 22, 2020
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Dionysios Diamantopoulos, Florian Michael Scheidegger, Adelmo Cristiano Innocenza Malossi, Christoph Hagleitner, Konstantinos Bekas
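    Illustrative sketch (hypothetical, not from the patent): a minimal Python version of the unit-selection rule described above; among the computation units whose bit length does not exceed the input's bit length and which can still deliver a valid result, the controller picks one (choosing the narrowest is an assumption added here, not stated in the abstract).
      computation_units = {4: "int4_unit", 8: "int8_unit", 16: "fp16_unit", 32: "fp32_unit"}

      def select_unit(input_bits: int, delivers_valid_result) -> str:
          candidates = [bits for bits in computation_units
                        if bits <= input_bits and delivers_valid_result(bits)]
          if not candidates:
              raise ValueError("no computation unit can handle this input")
          return computation_units[min(candidates)]   # narrowest valid unit

      # Assume, for illustration, that results narrower than 8 bits lose required accuracy.
      print(select_unit(16, delivers_valid_result=lambda bits: bits >= 8))   # int8_unit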
  • Patent number: 11169801
    Abstract: A hybrid quantum classical (HQC) computer, which includes both a classical computer component and a quantum computer component, solves linear systems. The HQC decomposes the linear system to be solved into subsystems that are small enough to be solved by the quantum computer component, under control of the classical computer component. The classical computer component synthesizes the outputs of the quantum computer component to generate the complete solution to the linear system.
    Type: Grant
    Filed: October 4, 2019
    Date of Patent: November 9, 2021
    Assignee: Zapata Computing, Inc.
    Inventor: Yudong Cao
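    Illustrative sketch (hypothetical, not from the patent): a minimal Python picture of the decompose/solve/synthesize flow described above, shown for the easy case of a block-diagonal system; each small block is handed to a stand-in subroutine sized for the quantum component, and the classical component stitches the partial solutions together. numpy's dense solver stands in for the quantum step purely for illustration.
      import numpy as np

      def quantum_subsolver(A_block: np.ndarray, b_block: np.ndarray) -> np.ndarray:
          # placeholder for the quantum component solving a small subsystem
          return np.linalg.solve(A_block, b_block)

      def hqc_solve(A: np.ndarray, b: np.ndarray, block: int) -> np.ndarray:
          x = np.zeros_like(b)
          for i in range(0, len(b), block):
              sl = slice(i, i + block)
              x[sl] = quantum_subsolver(A[sl, sl], b[sl])   # quantum part, one subsystem
          return x                                          # classical synthesis of the outputs

      A = np.kron(np.eye(2), np.array([[4.0, 1.0], [1.0, 3.0]]))   # block-diagonal example
      b = np.array([1.0, 2.0, 3.0, 4.0])
      print(hqc_solve(A, b, block=2))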
  • Patent number: 11157283
    Abstract: A graphics processing device comprises a set of compute units to execute multiple threads of a workload, a cache coupled with the set of compute units, and a prefetcher to prefetch instructions associated with the workload. The prefetcher is configured to use a thread dispatch command that is used to dispatch threads to execute a kernel to prefetch instructions, parameters, and/or constants that will be used during execution of the kernel. Prefetch operations for the kernel can then occur concurrently with thread dispatch operations.
    Type: Grant
    Filed: January 9, 2019
    Date of Patent: October 26, 2021
    Assignee: Intel Corporation
    Inventors: James Valerio, Vasanth Ranganathan, Joydeep Ray, Pradeep Ramani
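    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the dispatch-triggered prefetch described above; the same thread dispatch command that launches a kernel's threads also tells the prefetcher which instructions, parameters, and constants to pull into the cache, so prefetching proceeds concurrently with thread dispatch. Python threads stand in for the hardware units purely for illustration.
      import threading

      cache: set = set()

      def prefetch(kernel: dict) -> None:
          for region in ("instructions", "parameters", "constants"):
              cache.add((kernel["name"], region))      # warm the cache for the kernel

      def dispatch_threads(kernel: dict) -> None:
          print(f"dispatching {kernel['num_threads']} threads for {kernel['name']}")

      def thread_dispatch_command(kernel: dict) -> None:
          worker = threading.Thread(target=prefetch, args=(kernel,))
          worker.start()              # prefetch runs concurrently with thread dispatch
          dispatch_threads(kernel)
          worker.join()

      thread_dispatch_command({"name": "gemm_kernel", "num_threads": 1024})
      print(cache)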
  • Patent number: 11157285
    Abstract: A system and method including a processor configured to, based on encountering an instruction being executed by the processor that does not modify the architectural state of the processor, preferably a prefetch instruction, determine whether utilization of a first queue used in processing the instruction is over a first queue utilization limit; in response to the first queue utilization being over the first queue utilization limit, not execute the prefetch instruction; and in response to the first queue utilization being under the first queue utilization limit, at least partially process the prefetch instruction.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: October 26, 2021
    Assignee: International Business Machines Corporation
    Inventors: Bryant Cockcroft, John A. Schumann, Karen Yokum, Vivek Britto, Debapriya Chatterjee
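    Illustrative sketch (hypothetical, not from the patent): a minimal Python version of the drop-on-pressure rule described above; a prefetch instruction, which does not change architectural state, is simply discarded when the queue used to process it is over its utilization limit and processed otherwise. The limit value and queue shape are assumptions for illustration.
      QUEUE_UTILIZATION_LIMIT = 0.75

      def handle_prefetch(queue: list, capacity: int, address: int) -> bool:
          utilization = len(queue) / capacity
          if utilization > QUEUE_UTILIZATION_LIMIT:
              return False                      # over the limit: do not execute it
          queue.append(("prefetch", address))   # under the limit: process it
          return True

      queue = [("prefetch", a) for a in range(7)]
      print(handle_prefetch(queue, capacity=8, address=0xBEEF))       # False, dropped
      print(handle_prefetch(queue[:4], capacity=8, address=0xBEEF))   # True, processed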
  • Patent number: 11151064
    Abstract: A computing device includes a memory and a processor connected to the memory and configured to: create, in a first memory space of the memory, a first I/O submission queue associated with a first application running in user space; create, in a second memory space of the memory, a second I/O submission queue associated with a second application running in user space; in response to a first I/O request from the first application, store the first I/O request in the first I/O submission queue for access by a semiconductor storage device; and in response to a second I/O request from the second application, store the second I/O request in the second I/O submission queue for access by the semiconductor storage device.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: October 19, 2021
    Assignee: KIOXIA CORPORATION
    Inventor: Hideki Yoshida
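    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the per-application submission queues described above; each user-space application gets its own I/O submission queue in its own memory space, and a request from an application is stored in that application's queue for the storage device to consume. Names and request strings are assumptions for illustration.
      class ComputingDevice:
          def __init__(self):
              self.submission_queues = {}        # one I/O submission queue per application

          def create_io_submission_queue(self, app: str) -> None:
              self.submission_queues[app] = []   # placed in that application's memory space

          def submit(self, app: str, request: str) -> None:
              self.submission_queues[app].append(request)   # the storage device polls this queue

      dev = ComputingDevice()
      dev.create_io_submission_queue("app_one")
      dev.create_io_submission_queue("app_two")
      dev.submit("app_one", "read  lba=0x100 len=8")
      dev.submit("app_two", "write lba=0x200 len=4")
      print(dev.submission_queues)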
  • Patent number: 11126438
    Abstract: In one embodiment, a reservation station of a processor includes: a plurality of first lanes having a plurality of entries to store information for instructions having in-order dependencies; a variable latency tracking table including a second plurality of entries to store information for instructions having a variable latency; and a scheduler circuit to access a head entry of the plurality of first lanes to schedule, for execution on at least one execution unit, at least one instruction from the head entry of at least one of the plurality of first lanes. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: September 21, 2021
    Assignee: Intel Corporation
    Inventors: Srikanth Srinivasan, Thomas Mullins, Ammon Christiansen, James Hadley, Robert S. Chappell, Sean Mirkes
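    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the lane-based scheduling described above; in-order-dependent instructions sit in per-lane FIFOs, variable-latency instructions are tracked in a separate table, and the scheduler only ever examines the head entry of each lane when picking work to issue. The readiness rule and instruction strings are assumptions for illustration.
      from collections import deque

      lanes = [deque(["load r1", "add r2, r1"]), deque(["mul r4, r5"])]
      variable_latency_table = {"load r1": 12}    # variable-latency instruction -> remaining cycles

      def schedule_one():
          for lane in lanes:
              if not lane:
                  continue
              head = lane[0]                       # only the head entry of the lane is visible
              if variable_latency_table.get(head, 0) == 0:
                  return lane.popleft()            # issue to an execution unit
          return None

      print(schedule_one())   # 'mul r4, r5' issues; the variable-latency load is still outstanding
      variable_latency_table["load r1"] = 0
      print(schedule_one())   # now 'load r1' can issue from its lane head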
  • Patent number: 11120330
    Abstract: The present disclosure relates to a communication method and system for converging a 5th-Generation (5G) communication system for supporting higher data rates beyond a 4th-Generation (4G) system with a technology for Internet of Things (IoT). The present disclosure may be applied to intelligent services based on the 5G communication technology and the IoT-related technology, such as smart home, smart building, smart city, smart car, connected car, health care, digital education, smart retail, security and safety services. A Processing Element (PE) implemented in an accelerator in a convolutional neural network includes a first buffer configured to transfer input data to one other PE; a second buffer configured to transmit, to the outside, output data that is processed on the basis of the input data; and an operation unit configured to generate the output data.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: September 14, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Younghwan Park, Kyounghoon Kim, Seungwon Lee, Hansu Cho, Sukjin Kim
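    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the processing element described above; each PE has a first buffer that forwards its input data to the next PE, a second buffer holding the output produced from that input, and an operation unit that generates the output data (a multiply-accumulate is an assumption used here for illustration).
      class ProcessingElement:
          def __init__(self, weight: float):
              self.weight = weight
              self.first_buffer = None     # input data to be forwarded to the next PE
              self.second_buffer = None    # output data to be transmitted outside

          def operate(self, value: float, partial_sum: float) -> float:
              self.first_buffer = value                          # pass the input downstream
              self.second_buffer = partial_sum + value * self.weight
              return self.second_buffer

      # Chain of PEs: each one forwards its input to the next (systolic style).
      pes = [ProcessingElement(w) for w in (0.5, 1.5, 2.0)]
      value, acc = 3.0, 0.0
      for pe in pes:
          acc = pe.operate(value, acc)
          value = pe.first_buffer          # the next PE receives the forwarded input
      print(acc)   # 3*0.5 + 3*1.5 + 3*2.0 = 12.0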
  • Patent number: 11106463
    Abstract: A digital signal processor having a CPU with a program counter register and, optionally, an event context stack pointer register for saving and restoring the event handler context when a higher priority event preempts a lower priority event handler. The CPU is configured to use a minimized set of addressing modes that includes using the event context stack pointer register and program counter register to compute an address for storing data in memory. The CPU may also eliminate post-decrement, pre-increment and pre-decrement addressing and rely only on post-increment addressing.
    Type: Grant
    Filed: May 24, 2019
    Date of Patent: August 31, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Timothy David Anderson, Duc Quang Bui, Joseph Zbiciak, Kai Chirca
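    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the reduced addressing described above; addresses are formed from the program counter or the event context stack pointer plus an offset, and pointer-relative accesses use post-increment only (the pointer's current value is used first, then bumped). Register names and values are assumptions for illustration.
      class MinimalAddressing:
          def __init__(self, pc: int, ecsp: int):
              self.regs = {"PC": pc, "ECSP": ecsp}

          def base_plus_offset(self, base_reg: str, offset: int) -> int:
              return self.regs[base_reg] + offset        # e.g. store context at ECSP + offset

          def post_increment(self, base_reg: str, step: int) -> int:
              addr = self.regs[base_reg]                  # use the current pointer value...
              self.regs[base_reg] += step                 # ...then increment it afterwards
              return addr

      cpu = MinimalAddressing(pc=0x8000, ecsp=0x1F00)
      print(hex(cpu.base_plus_offset("ECSP", 0x10)))                     # 0x1f10
      print(hex(cpu.post_increment("ECSP", 8)), hex(cpu.regs["ECSP"]))   # 0x1f00 0x1f08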
  • Patent number: 11106615
    Abstract: A single-wire bus (SuBUS) slave circuit is provided. The SuBUS slave circuit is coupled to a SuBUS bridge circuit via a SuBUS and can be configured to perform a slave task that may block communication on the SuBUS. Notably, the SuBUS slave circuit may not be equipped with an accurate timing reference source that can determine a precise timing for terminating the slave task and unblock the SuBUS. Instead, the SuBUS slave circuit is configured to terminate the slave task and unblock the SuBUS based on a self-determined slave free-running-oscillator count derived from a start-of-sequence training sequence that precedes any SuBUS telegram of a predefined SuBUS operation, even though the SuBUS operation is totally unrelated to the slave task. As such, it may be possible to eliminate the accurate timing reference source from the SuBUS slave circuit, thus helping to reduce cost and current drain in the SuBUS slave circuit.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: August 31, 2021
    Assignee: Qorvo US, Inc.
    Inventors: Christopher Truong Ngo, Alexander Wayne Hietala, Puneet Paresh Nipunage
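    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the self-timed termination described above; the slave has no accurate timing reference, so it counts its own free-running oscillator during a start-of-sequence training sequence of known nominal duration and scales that count to decide when to terminate the bus-blocking slave task. The durations and the simple scaling rule are assumptions for illustration.
      TRAINING_SEQUENCE_US = 10.0     # nominal duration of the start-of-sequence training sequence
      SLAVE_TASK_US = 250.0           # how long the slave task should run before unblocking the bus

      def count_during_training(oscillator_hz: float) -> int:
          # ticks of the (inaccurate) free-running oscillator observed in the training window
          return round(oscillator_hz * TRAINING_SEQUENCE_US * 1e-6)

      def task_terminate_count(training_count: int) -> int:
          # same oscillator, whatever its true rate, scaled to the task window
          return round(training_count * SLAVE_TASK_US / TRAINING_SEQUENCE_US)

      ticks = count_during_training(oscillator_hz=3.2e6)   # true rate is unknown to the slave
      print(ticks, task_terminate_count(ticks))            # 32 800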
  • Patent number: 11106464
    Abstract: Systems, methods, and apparatuses relating to access synchronization in a shared memory are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction, and an execution unit to execute the decoded instruction to: receive a first input operand of a memory address to be tracked and a second input operand of an allowed sequence of memory accesses to the memory address, and cause a block of a memory access that violates the allowed sequence of memory accesses to the memory address. In one embodiment, a circuit separate from the execution unit compares a memory address for a memory access request to one or more memory addresses in a tracking table, and blocks a memory access for the memory access request when a type of access violates a corresponding allowed sequence of memory accesses to the memory address for the memory access request.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: August 31, 2021
    Assignee: Intel Corporation
    Inventors: Swagath Venkataramani, Dipankar Das, Sasikanth Avancha, Ashish Ranjan, Subarno Banerjee, Bharat Kaul, Anand Raghunathan
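    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the tracked-address check described above; an instruction registers a memory address together with its allowed sequence of accesses in a tracking table, and a later access to that address is blocked if its type is not the next one allowed. The sequence encoding is an assumption for illustration.
      tracking_table = {}   # address -> list of access types still allowed, in order

      def track(address: int, allowed_sequence: list) -> None:
          tracking_table[address] = list(allowed_sequence)

      def access(address: int, kind: str) -> bool:
          allowed = tracking_table.get(address)
          if allowed is None:
              return True                      # untracked addresses are unaffected
          if not allowed or allowed[0] != kind:
              return False                     # violates the allowed sequence: block the access
          allowed.pop(0)                       # consume this step and allow the access
          return True

      track(0x7000, ["write", "read"])         # producer must write before the consumer reads
      print(access(0x7000, "read"))            # False: read attempted before the write
      print(access(0x7000, "write"))           # True
      print(access(0x7000, "read"))            # True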
  • Patent number: 11086722
    Abstract: There are provided a memory system and an operating method thereof. The memory system includes: a memory device including a plurality of semiconductor memories; and a controller for generating a plurality of command queues respectively corresponding to the plurality of semiconductor memories by queuing a plurality of commands received from a host, and controlling the plurality of semiconductor memories to perform overall operations by outputting the plurality of commands queued in the plurality of command queues, wherein the controller holds a first command queue, among the plurality of command queues, corresponding to a first semiconductor memory, among the plurality of semiconductor memories, in which a program fail has occurred.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: August 10, 2021
    Assignee: SK hynix Inc.
    Inventors: Joo Young Lee, Hoe Seung Jung
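    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the hold-on-fail behavior described above; the controller keeps one command queue per semiconductor memory and, when a program fail is reported by one of them, holds that memory's queue while the others keep draining. Class and method names are assumptions for illustration.
      class Controller:
          def __init__(self, num_dies: int):
              self.queues = {die: [] for die in range(num_dies)}   # one command queue per memory
              self.held = set()                                    # memories whose queue is on hold

          def queue_command(self, die: int, cmd: str) -> None:
              self.queues[die].append(cmd)

          def report_program_fail(self, die: int) -> None:
              self.held.add(die)               # hold the command queue of the failed memory

          def issue_next(self):
              for die, q in self.queues.items():
                  if q and die not in self.held:
                      return die, q.pop(0)
              return None

      ctrl = Controller(num_dies=2)
      ctrl.queue_command(0, "program blk=3")
      ctrl.queue_command(1, "program blk=9")
      ctrl.report_program_fail(0)
      print(ctrl.issue_next())   # (1, 'program blk=9'): memory 0's queue is held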
  • Patent number: 11086632
    Abstract: A computer system is presented. The computer system comprises a memory system that stores data, a computer processor, and a memory access engine. The memory access engine is configured to: receive a first instruction of a computing process from the computer processor, wherein the first instruction is for accessing the data from the memory system; acquire at least a part of the data from the memory system based on the first instruction; and after the acquisition of at least a first part of the data, transmit an indication to the computer processor to enable the computer processor to execute a second instruction of the computing process.
    Type: Grant
    Filed: February 10, 2017
    Date of Patent: August 10, 2021
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventor: Xiaowei Jiang
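    Illustrative sketch (hypothetical, not from the patent): a minimal Python model of the hand-off described above; the processor delegates a data access to a memory access engine and, once the engine has acquired at least the first part of the data, the engine signals the processor so it can move on to the next instruction of the same computing process. Using a thread and an event to model the indication is an assumption for illustration.
      import threading

      def memory_access_engine(data: list, out: list, first_part_ready: threading.Event):
          for i, chunk in enumerate(data):       # acquire the data from the memory system
              out.append(chunk)
              if i == 0:
                  first_part_ready.set()         # indication: at least the first part is acquired

      acquired, ready = [], threading.Event()
      engine = threading.Thread(target=memory_access_engine,
                                args=([10, 20, 30], acquired, ready))
      engine.start()          # "first instruction": start the access
      ready.wait()            # the processor waits only for the indication...
      print("processor executes second instruction; acquired so far:", acquired)
      engine.join()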
  • Patent number: 11087233
    Abstract: Methods, systems, and apparatus for operating a system of qubits. In one aspect, a method includes operating a first qubit from a first plurality of qubits at a first qubit frequency from a first qubit frequency region, and operating a second qubit from the first plurality of qubits at a second qubit frequency from a second qubit frequency region, the second qubit frequency and the second qubit frequency region being different to the first qubit frequency and the first qubit frequency region, respectively, wherein the second qubit is diagonal to the first qubit in a two-dimensional grid of qubits.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: August 10, 2021
    Assignee: Google LLC
    Inventors: John Martinis, Rami Barends, Austin Greig Fowler
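    Illustrative sketch (hypothetical, not from the patent): a minimal Python check of the constraint described above, that a qubit and any qubit diagonal to it in a two-dimensional grid operate at frequencies from different frequency regions. Picking the region by row parity (so diagonal neighbours always land in different regions) and the two example frequency ranges are assumptions, not the patent's scheme.
      FREQUENCY_REGIONS_GHZ = [(5.0, 5.5), (6.0, 6.5)]   # two disjoint frequency regions

      def region_for(row: int, col: int) -> tuple:
          return FREQUENCY_REGIONS_GHZ[row % 2]   # diagonal neighbours differ in row parity

      def check_diagonals(rows: int, cols: int) -> bool:
          ok = True
          for r in range(rows):
              for c in range(cols):
                  for dr, dc in ((1, 1), (1, -1)):
                      rr, cc = r + dr, c + dc
                      if 0 <= rr < rows and 0 <= cc < cols:
                          ok &= region_for(r, c) != region_for(rr, cc)
          return ok

      print(region_for(0, 0), region_for(1, 1))   # different regions for diagonal qubits
      print(check_diagonals(4, 4))                # True: the constraint holds on the grid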