Patents Examined by Jacob A. Petranek
  • Patent number: 11875154
    Abstract: Systems, methods, and apparatuses relating to instructions to multiply floating-point values of about zero are described.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: January 16, 2024
    Assignee: Intel Corporation
    Inventors: Mohamed Elmalaki, Elmoustapha Ould-Ahmed-Vall
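    Sketch: a minimal Python model of what an "about zero" multiply fast path could look like, assuming the instruction treats a source operand below some magnitude cutoff as zero and skips the full multiply; the cutoff, the fallback behavior, and the name mul_about_zero are illustrative and not taken from the patent.
```python
import math

# Assumed cutoff: operands below 2**-1022 in magnitude (the double-precision
# subnormal boundary) count as "about zero"; the patent's criterion may differ.
ABOUT_ZERO_EXP_THRESHOLD = -1022

def mul_about_zero(a: float, b: float) -> float:
    """Toy model of the instruction: b is expected to be 'about zero'."""
    _, exp = math.frexp(b)
    if b == 0.0 or exp <= ABOUT_ZERO_EXP_THRESHOLD:
        # Fast path: skip the full multiply and return a correctly signed zero.
        return math.copysign(0.0, a) * math.copysign(1.0, b)
    return a * b  # assumption does not hold: fall back to an ordinary multiply

print(mul_about_zero(3.5, 1e-310))  # 0.0 via the fast path
print(mul_about_zero(3.5, 2.0))     # 7.0 via the fallback
```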
  • Patent number: 11875150
    Abstract: Systems, apparatuses, and methods related to arithmetic and logical operations in a multi-user network are described. Circuitry may be part of a pool of shared computing resources in a multi-user network. Data (e.g., one or more bit strings) received by the circuitry may be selectively operated upon. The circuitry can perform operations on data to convert the data between one or more formats, such as floating-point and/or universal number (e.g., posit) formats and can further perform arithmetic and/or logical operations on the converted data. For instance, the circuitry may be configured to receive a request to perform an arithmetic operation and/or a logical operation using at least one posit bit string operand. The request can include a parameter corresponding to performance of the operation. The circuitry can perform the arithmetic operation and/or the logical operation based, at least in part, on the parameter.
    Type: Grant
    Filed: March 25, 2021
    Date of Patent: January 16, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Vijay S. Ramesh
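    Sketch: the request shape described in the abstract, reduced to a small dispatcher; the class and field names are assumptions, and plain Python numbers stand in for the posit/IEEE-754 bit-string conversion the circuitry would perform.
```python
from dataclasses import dataclass

@dataclass
class OperationRequest:
    operation: str       # e.g. "add" or "multiply"
    operands: list       # bit-string operands (plain numbers stand in here)
    operand_format: str  # "posit" or "ieee754"
    parameter: dict      # e.g. {"precision": 16, "exponent_bits": 1}

def execute(request: OperationRequest) -> float:
    # Stand-in for the conversion circuitry: a real implementation would
    # decode posit bit strings according to request.parameter first.
    values = [float(op) for op in request.operands]
    if request.operation == "add":
        return sum(values)
    if request.operation == "multiply":
        result = 1.0
        for v in values:
            result *= v
        return result
    raise ValueError(f"unsupported operation: {request.operation}")

req = OperationRequest("multiply", [1.5, 2.0], "posit",
                       {"precision": 16, "exponent_bits": 1})
print(execute(req))  # 3.0
```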
  • Patent number: 11868811
    Abstract: An application (or process) may have an amount of steady-state work to perform per unit time, as well as one or more mechanisms for doing a lower-quality job of that work in the event the application falls behind. Approaches presented herein can utilize a clock monitor that enables the application to determine whether a clock loss was encountered that was due to an external source and is of an amount of time that may be naturally recoverable by the application. If so, the application can enter a mode of operation wherein the activation of one or more recovery mechanisms is postponed for a period of time to give the application time to recover. If, after operating in that mode for the period, the application has not recovered from the real-time clock loss, then the recovery mechanism(s) can be activated as appropriate.
    Type: Grant
    Filed: October 6, 2022
    Date of Patent: January 9, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Erik Jason Johnson, Ryan Hegar
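    Sketch: the postponement logic in the abstract, assuming the application can tell how much clock was lost, whether the loss came from an external source, and whether it has caught up on its own; the thresholds and function names are illustrative.
```python
import time

GRACE_PERIOD_S = 5.0      # assumed window in which the app may catch up naturally
RECOVERABLE_LOSS_S = 2.0  # assumed upper bound on a "naturally recoverable" loss

def activate_recovery_mechanisms() -> None:
    print("recovery mechanisms activated (lower-quality processing)")

def handle_clock_loss(loss_seconds: float, external: bool, is_caught_up) -> None:
    """Postpone degraded-quality recovery if the loss looks naturally recoverable."""
    if external and loss_seconds <= RECOVERABLE_LOSS_S:
        deadline = time.monotonic() + GRACE_PERIOD_S
        while time.monotonic() < deadline:
            if is_caught_up():          # the application recovered on its own
                return
            time.sleep(0.1)
    activate_recovery_mechanisms()      # grace period expired or loss not recoverable

handle_clock_loss(1.0, external=True, is_caught_up=lambda: True)     # no recovery needed
handle_clock_loss(10.0, external=False, is_caught_up=lambda: False)  # activates recovery
```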
  • Patent number: 11868307
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with DDRs, a data exchange interface connecting a host device to the accelerator, and a three-layer NoC architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDRs, a middle-layer NoC configured to transfer data among the plurality of cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Grant
    Filed: May 15, 2023
    Date of Patent: January 9, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
  • Patent number: 11868777
    Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: January 9, 2024
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
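    Sketch: the ordering behavior described in the abstract, where offload requests carrying an instruction and its architected target register are transmitted in the order the instructions were received; the queue-based model and class names are assumptions.
```python
from collections import deque
from dataclasses import dataclass

@dataclass
class OffloadInstruction:
    opcode: str
    target_register: str  # architected virtual register in the target device

class ProcessingInMemoryDevice:
    def accept(self, inst: OffloadInstruction) -> None:
        print(f"remote execute {inst.opcode} -> {inst.target_register}")

class Processor:
    def __init__(self, target_device):
        self.target_device = target_device
        self.pending = deque()               # preserves program order

    def receive(self, inst: OffloadInstruction) -> None:
        self.pending.append(inst)

    def transmit_offload_requests(self) -> None:
        # Offload requests are sent in the order the instructions were received.
        while self.pending:
            self.target_device.accept(self.pending.popleft())

cpu = Processor(ProcessingInMemoryDevice())
cpu.receive(OffloadInstruction("pim_add", "vreg0"))
cpu.receive(OffloadInstruction("pim_store", "vreg1"))
cpu.transmit_offload_requests()
```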
  • Patent number: 11861484
    Abstract: A neural processing unit (NPU) is described. The NPU includes an NPU direct memory access (NDMA) core. The NDMA core includes a read engine having a read buffer. The NDMA core also includes a write engine having a write buffer. The NPU also includes a controller. The controller is configured to direct the NDMA core to perform hardware pre-processing of NDMA data in the read buffer and post-processing of NDMA data in the write buffer on blocks of a data stripe to process tensors in artificial neural networks.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: January 2, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Jinxia Bai, Rosario Cammarota, Michael Goldfarb
  • Patent number: 11861029
    Abstract: Methods, systems, and computer program products for managing workflows between multiple third-party systems. A content management system stores a workflow that operates over content objects by invoking third-party applications. The content management system invokes these third-party applications, and these applications make modifications to the progression of the workflows as they carry out their specific portions of the workflow. Upon receipt of a workflow variable value from a first third-party application, the content management system determines the semantics of the workflow variable value and carries out a next portion of the workflow. The content management system then invokes a further portion of the workflow to be carried out by a second third-party application.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: January 2, 2024
    Assignee: Box Inc.
    Inventors: Stephen Philip Hiller, Jón Tómas Grétarsson, Seth Morgan Luce Voltz, Ravneet Uberoi
  • Patent number: 11853762
    Abstract: Systems, apparatuses, and methods are disclosed for efficient management of registers in a graph stream processing (GSP) system. The GSP system includes a thread scheduler module operative to initiate a Single Instruction Multiple Data (SIMD) thread, the SIMD thread including a dispatch mask with an initial value; a thread arbiter module operative to select an instruction from the instructions and provide the instruction to each of one or more compute resources; and an instruction iterator module, associated with each of the one or more compute resources, operative to determine a data type of the instruction. The instruction iterator module iteratively executes the instruction based on the data type and the dispatch mask.
    Type: Grant
    Filed: May 20, 2022
    Date of Patent: December 26, 2023
    Assignee: Blaize, Inc.
    Inventors: Kamaraj Thangam, Srinivasulu Nagisetty, Venkata Divya Bharathi Palaparthy, Aswathy Asok, Satyaki Koneru
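    Sketch: a toy model of how a dispatch mask and operand data type could drive iterative execution of a SIMD instruction; the lanes-per-iteration table and names are assumptions, not Blaize's design.
```python
SIMD_WIDTH = 8
LANES_PER_ITERATION = {"int8": 8, "int16": 4, "fp32": 2}  # assumed widths

def iterate_instruction(op, operands, data_type, dispatch_mask):
    """Execute `op` lane by lane, skipping lanes masked off in dispatch_mask."""
    step = LANES_PER_ITERATION[data_type]
    results = [None] * SIMD_WIDTH
    for start in range(0, SIMD_WIDTH, step):      # one iteration per group of lanes
        for lane in range(start, start + step):
            if dispatch_mask & (1 << lane):       # lane is active
                results[lane] = op(*(v[lane] for v in operands))
    return results

# Example: add two vectors with the upper half of the lanes masked off.
a = list(range(8))
b = [10] * 8
print(iterate_instruction(lambda x, y: x + y, [a, b], "int16", 0b00001111))
# [10, 11, 12, 13, None, None, None, None]
```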
  • Patent number: 11853759
    Abstract: Disclosed are a neural network accelerator and an operating method thereof, which include an instruction analyzer that analyzes a first instruction instructing an operation with respect to a first layer of a neural network algorithm from an external device, a polymorphic operator array including a plurality of operators that performs the operation with respect to the first layer under a control of the instruction analyzer, an interface that communicates with the external device and an external memory under the control of the instruction analyzer, an internal memory, a type converter, a type conversion data mover that stores data received from the external memory through the interface in the internal memory under the control of the instruction analyzer, and an internal type converter that performs a conversion of data stored in the internal memory or data generated by the polymorphic operator array under the control of the instruction analyzer.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: December 26, 2023
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Jeongmin Yang
  • Patent number: 11853244
    Abstract: A reconfigurable hardware accelerator for computers combines a high-speed dataflow processor, having programmable functional units rapidly reconfigured in a network of programmable switches, with a stream processor that may autonomously access memory in predefined access patterns after receiving simple stream instructions. The result is a compact, high-speed processor that may exploit parallelism associated with many application-specific programs susceptible to acceleration.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: December 26, 2023
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
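    Sketch: the stream-processor half of the abstract, modeled as a stream instruction whose few fields describe a whole strided access pattern that can then be generated autonomously; the field names are generic stream parameters, not the patent's encoding.
```python
from dataclasses import dataclass

@dataclass
class StreamInstruction:
    base: int    # starting address
    stride: int  # bytes between consecutive elements
    length: int  # number of elements to access

def stream_addresses(s: StreamInstruction):
    """Yield the memory access pattern implied by one stream instruction."""
    for i in range(s.length):
        yield s.base + i * s.stride

# One short instruction covers a whole strided pattern, so the dataflow fabric
# does not have to issue per-element address arithmetic.
print(list(stream_addresses(StreamInstruction(base=0x1000, stride=64, length=4))))
# [4096, 4160, 4224, 4288]
```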
  • Patent number: 11847091
    Abstract: The present disclosure provides a data transmission method and device for a network on chip and an electronic apparatus. The method includes: receiving, by a second network node, a first data packet sent by a first network node, the first data packet including first identification information and a data packet payload; determining, by the second network node, valid transmission information and second identification information corresponding to the valid transmission information according to the first identification information; determining, by the second network node, a second data packet according to the second identification information and the data packet payload; and sending, by the second network node, the second data packet according to the valid transmission information.
    Type: Grant
    Filed: November 28, 2019
    Date of Patent: December 19, 2023
    Assignee: LYNXI TECHNOLOGIES CO., LTD.
    Inventors: Yangshu Shen, Luping Shi, Yaolong Zhu
  • Patent number: 11847185
    Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: December 19, 2023
    Assignee: Intel Corporation
    Inventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
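    Sketch: a functional model of the sparse multiply-accumulate described above, in which non-zero bitmasks are built for a row of the first matrix and a column of the second and products are accumulated only where both bits are set; this shows the arithmetic, not the PE grid or broadcast mechanism.
```python
def nz_bitmask(vector):
    """Bitmask with bit k set when vector[k] is non-zero."""
    mask = 0
    for k, v in enumerate(vector):
        if v != 0:
            mask |= 1 << k
    return mask

def sparse_mac(A, B, C):
    """C += A @ B, multiplying only matching non-zero element pairs."""
    K = len(B)
    for i, row in enumerate(A):
        row_mask = nz_bitmask(row)
        for j in range(len(B[0])):
            col = [B[k][j] for k in range(K)]
            matching = row_mask & nz_bitmask(col)  # non-zero in both operands
            for k in range(K):
                if matching & (1 << k):
                    C[i][j] += row[k] * col[k]
    return C

A = [[0, 2], [3, 0]]
B = [[0, 4], [5, 0]]
print(sparse_mac(A, B, [[0, 0], [0, 0]]))  # [[10, 0], [0, 12]]
```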
  • Patent number: 11842220
    Abstract: A parallelization method includes: generating a profiling result by performing profiling on a target neural network based on model information of the target neural network and architecture information of a manycore system; determining an assignment strategy to assign a plurality of cores of each of a plurality of clusters of the manycore system to a plurality of layers of the target neural network, based on the profiling result; and generating a parallelization strategy for parallel processing of the manycore system based on the assignment strategy.
    Type: Grant
    Filed: April 7, 2021
    Date of Patent: December 12, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jaeyeon Kim
  • Patent number: 11842191
    Abstract: The present disclosure includes apparatuses and methods related to microcode instructions indicating instruction types. One example apparatus comprises a memory storing a set of microcode instructions. Each microcode instruction of the set can comprise a first field comprising a number of control data units, and a second field comprising a number of type select data units. Each microcode instruction of the set can have a particular instruction type defined by a value of the number of type select data units, and particular functions corresponding to the number of control data units are variable based on the particular instruction type.
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: December 12, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Shawn Rosti, Timothy P. Finkbeiner
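    Sketch: a decode of the two-field microcode layout described in the abstract, assuming a 16-bit word with twelve control data units and four type select data units; the field widths and type names are illustrative.
```python
CONTROL_BITS = 12                                                 # assumed width
TYPE_BITS = 4                                                     # assumed width
INSTRUCTION_TYPES = {0x0: "compute", 0x1: "move", 0x2: "branch"}  # illustrative

def decode_microcode(word: int):
    """Split a microcode word into its instruction type and control field."""
    control = word & ((1 << CONTROL_BITS) - 1)
    type_select = (word >> CONTROL_BITS) & ((1 << TYPE_BITS) - 1)
    kind = INSTRUCTION_TYPES.get(type_select, "reserved")
    # What the control bits mean is variable and depends on the instruction type.
    return kind, control

print(decode_microcode((0x1 << CONTROL_BITS) | 0x0A5))  # ('move', 165)
```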
  • Patent number: 11836488
    Abstract: A method for a controller to execute a program comprising a sequence of functions on an accelerator with a pipelined architecture comprising a microcode buffer. The method comprises executing a function of the program as a sequence of operations, wherein the sequence of operations is represented by a sequence of templates; determining whether a template is non-colliding with previously inserted templates in the microcode buffer; determining whether data in local memory will be referenced before all previously inserted templates have taken effect; and determining whether registers will be referenced before all previously inserted templates in the microcode buffer have taken effect. When it is determined that the template fits, that resources are available, that local data memory accesses will not collide, and that register accesses will not collide, the method creates a sequence of microcode instructions in the template and inserts the template into the microcode buffer.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: December 5, 2023
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Anders Wesslén, Michael Breschel
  • Patent number: 11829752
    Abstract: Processor cores using packet identifiers for routing and computation are disclosed. One method includes executing a complex computation using a set of processing cores. The method includes routing a set of packets using a set of packet identifiers and executing a set of instructions. The set of instructions are defined using a set of operand identifiers. The operand identifiers represent packet identifiers in the set of packet identifiers. In specific implementations the set of the operand identifiers represent packet identifiers in the set of packet identifiers in that a set of memories on the set of processing cores stores data values in common association with both the set of packets, and a set of operands identified by the set of operand identifiers. In specific implementations the set of operand identifiers and packet identifiers are unambiguously mapped to an underlying set of application datums of the complex computation.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: November 28, 2023
    Assignee: Tenstorrent Inc.
    Inventors: Davor Capalija, Ljubisa Bajic, Jasmina Vasiljevic
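    Sketch: the shared identifier space described above, modeled as a core-local memory keyed by identifier, where arriving packets deposit values under their packet identifiers and instructions name the same identifiers as operands; the dictionary model and names are assumptions.
```python
class CoreMemory:
    """Stores data values in common association with packet and operand IDs."""
    def __init__(self):
        self.values = {}

    def deliver_packet(self, packet_id: str, payload) -> None:
        self.values[packet_id] = payload  # routed here by packet identifier

    def execute(self, op, operand_ids, dest_id: str):
        # Operand identifiers refer to the same identifier space as packets.
        args = [self.values[i] for i in operand_ids]
        self.values[dest_id] = op(*args)
        return self.values[dest_id]

core = CoreMemory()
core.deliver_packet("tensor_a", 3)
core.deliver_packet("tensor_b", 4)
print(core.execute(lambda a, b: a * b, ["tensor_a", "tensor_b"], "product"))  # 12
```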
  • Patent number: 11822924
    Abstract: In accordance with an embodiment, described herein is a system and method for providing a reactive flattening map for use with a microservices or other computing environment. In a cloud computing environment, reactive programming can be used with publishers and subscribers, to abstract execution away from the thread of execution while providing rigorous coordination of various state transitions. The described approach provides support for processing streams of data involving one or more publishers and subscribers, by use of a multi-flat-map publisher component, to flatten or otherwise combine events emitted by multiple publishers concurrently, into a single stream of events for use by a downstream subscriber.
    Type: Grant
    Filed: August 24, 2021
    Date of Patent: November 21, 2023
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventor: Oleksandr Otenko
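    Sketch: the flattening behavior in the abstract, reduced to merging events from several upstream publishers into one downstream stream as they arrive; the thread-and-queue model below omits backpressure and the reactive-streams subscription protocol, and all names are illustrative.
```python
import queue
import threading

def multi_flat_map(publishers):
    """Merge events emitted by several publishers into a single event stream."""
    events = queue.Queue()
    done = object()                      # sentinel marking one publisher finished

    def pump(publisher):
        for event in publisher:
            events.put(event)
        events.put(done)

    for p in publishers:
        threading.Thread(target=pump, args=(p,)).start()

    finished = 0
    while finished < len(publishers):
        item = events.get()
        if item is done:
            finished += 1
        else:
            yield item                   # the single stream seen downstream

upstream = [iter(["a1", "a2"]), iter(["b1"]), iter(["c1", "c2", "c3"])]
print(sorted(multi_flat_map(upstream)))  # ['a1', 'a2', 'b1', 'c1', 'c2', 'c3']
```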
  • Patent number: 11816484
    Abstract: In an embodiment, dynamically-generated code may be supported in the system by ensuring that the code either remains executing within a predefined region of memory or exits to one of a set of valid exit addresses. Software embodiments are described in which the dynamically-generated code is scanned prior to permitting execution of the dynamically-generated code to ensure that various criteria are met, including exclusion of certain disallowed instructions and control of branch target addresses. Hardware embodiments are described in which the dynamically-generated code is permitted to execute but is monitored to ensure that the execution criteria are met.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: November 14, 2023
    Assignee: Apple Inc.
    Inventors: Jeffrey E. Gonion, Michael D. Snyder, Filip J. Pizlo
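    Sketch: the software variant as a pre-execution scan, where generated code passes only if it contains no disallowed instructions and every branch either stays inside the generated region or targets an approved exit address; the toy instruction encoding and rule set are assumptions, not Apple's criteria.
```python
DISALLOWED = {"syscall", "wrmsr"}  # assumed set of banned opcodes

def scan_generated_code(code, region_start, region_end, valid_exits):
    """Return True if the dynamically-generated code meets the (assumed) criteria."""
    for addr, (opcode, operand) in code.items():
        if opcode in DISALLOWED:
            return False
        if opcode in {"jmp", "call"}:
            inside = region_start <= operand < region_end
            if not inside and operand not in valid_exits:
                return False           # branch escapes the region illegally
    return True

# Toy program occupying [0x1000, 0x1010): one internal jump, one approved exit.
code = {
    0x1000: ("add", None),
    0x1004: ("jmp", 0x100C),
    0x1008: ("mul", None),
    0x100C: ("call", 0x8000),
}
print(scan_generated_code(code, 0x1000, 0x1010, valid_exits={0x8000}))  # True
```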
  • Patent number: 11816061
    Abstract: A system includes a processing device that includes a vector arithmetic logic unit comprising a plurality of arithmetic logic units (ALUs) and a first processor core operatively coupled to the vector arithmetic logic unit. The processing device is to receive a first vector instruction from the first processor core, wherein the first vector instruction specifies at least one first input vector having a first vector length; identify a first subset of the ALUs in view of the first vector length and one or more allocation criteria; and execute, using the first subset of the ALUs, one or more first ALU operations specified by the first vector instruction. The vector arithmetic logic unit executes the first ALU operations in parallel with one or more second ALU operations specified by a second vector instruction received from a second processor core.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: November 14, 2023
    Assignee: Red Hat, Inc.
    Inventor: Ulrich Drepper
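    Sketch: the ALU-subset allocation in the abstract, with the vector length sizing the subset and a first-fit policy standing in for the unspecified allocation criteria; class names and policy are assumptions.
```python
class VectorALU:
    def __init__(self, num_alus: int):
        self.free = set(range(num_alus))

    def allocate(self, vector_length: int):
        """Pick a subset of ALUs for one vector instruction (first-fit policy)."""
        needed = min(vector_length, len(self.free))
        subset = sorted(self.free)[:needed]
        self.free -= set(subset)
        return subset

    def release(self, subset) -> None:
        self.free |= set(subset)

valu = VectorALU(num_alus=8)
core0_alus = valu.allocate(vector_length=4)  # first vector instruction (core 0)
core1_alus = valu.allocate(vector_length=4)  # second instruction (core 1), in parallel
print(core0_alus, core1_alus)                # [0, 1, 2, 3] [4, 5, 6, 7]
```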
  • Patent number: 11809870
    Abstract: In a processor supporting execution of a plurality of functions of an instruction, an instruction blocking value is set for blocking one or more of the plurality of functions, such that an attempt to execute one of the blocked functions will result in a program exception and the instruction will not execute; however, the same instruction will be able to execute any of the functions that are not blocked.
    Type: Grant
    Filed: April 7, 2021
    Date of Patent: November 7, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan Greiner, Damian Osisek, Timothy Slegel, Lisa Cranton Heller
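    Sketch: the blocking behavior modeled as a per-function mask checked at execution time, where blocked function codes raise a program exception and unblocked function codes of the same instruction execute normally; the bitmask encoding and names are illustrative.
```python
class ProgramException(Exception):
    pass

# Assumed encoding: bit i of the blocking value blocks function code i.
instruction_blocking_value = 0b0100  # function code 2 is blocked

FUNCTIONS = {0: lambda: "query", 1: lambda: "encrypt", 2: lambda: "export"}

def execute_function(function_code: int):
    if instruction_blocking_value & (1 << function_code):
        raise ProgramException(f"function {function_code} is blocked")
    return FUNCTIONS[function_code]()

print(execute_function(0))      # unblocked function runs normally: 'query'
try:
    execute_function(2)         # blocked: program exception, instruction not executed
except ProgramException as e:
    print(e)
```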