Patents Examined by Corey S Faherty
-
Patent number: 11907723
Abstract: A data processing apparatus is provided. Rename circuitry performs a register rename stage of a pipeline by storing, in storage circuitry, mappings between registers. Each of the mappings is associated with an elimination field value. Operation elimination circuitry replaces an operation that indicates an action is to be performed on data from a source register and stored in a destination register, with a new mapping in the storage circuitry that references the destination register and has the elimination field value set. Operation circuitry responds to a subsequent operation that accesses the destination register when the elimination field value is set by obtaining the contents of the source register, performing the action on those contents to obtain a result, and returning the result.
Type: Grant
Filed: March 21, 2022
Date of Patent: February 20, 2024
Assignee: Arm Limited
Inventors: Nicholas Andrew Plante, Joseph Michael Pusdesris, Jungsoo Kim
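A minimal Python sketch of the lazy-elimination idea described above (class and method names are hypothetical, not the patented circuitry):

```python
# Hypothetical sketch of operation elimination in a rename table.
# An op "dst = f(src)" is not executed immediately; instead dst is mapped
# to src with an elimination flag and the pending action f. A later read
# of dst performs f on src's contents at that point.

class RenameTable:
    def __init__(self):
        self.regs = {}      # register name -> value
        self.mapping = {}   # dst -> (src, action) for eliminated ops

    def write(self, reg, value):
        self.mapping.pop(reg, None)   # a real write cancels any pending elimination
        self.regs[reg] = value

    def eliminate_op(self, dst, src, action):
        # Record a mapping instead of executing the op now.
        self.mapping[dst] = (src, action)

    def read(self, reg):
        if reg in self.mapping:       # elimination field set: do the action lazily
            src, action = self.mapping[reg]
            return action(self.regs[src])
        return self.regs[reg]
```

For example, `eliminate_op('r2', 'r1', lambda x: x + 1)` records the increment without executing it; a later `read('r2')` applies it to `r1`'s current contents.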
-
Patent number: 11907717
Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor that need data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.
Type: Grant
Filed: February 8, 2023
Date of Patent: February 20, 2024
Assignee: NVIDIA Corporation
Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
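The register-bypassing load can be illustrated with a toy Python model (the class and counters are invented for illustration, not NVIDIA's implementation):

```python
# Toy model of block data transfer: one load path stages each element
# through a per-thread register before shared memory; the other moves
# global-memory data straight into shared memory, avoiding that traffic.

class Multiprocessor:
    def __init__(self, global_mem):
        self.global_mem = global_mem
        self.shared = {}
        self.register_writes = 0   # counts per-thread register traffic

    def load_via_registers(self, addr):
        value = self.global_mem[addr]
        self.register_writes += 1            # staged in a register first
        self.shared[addr] = value

    def load_direct_to_shared(self, addr):
        self.shared[addr] = self.global_mem[addr]   # bypasses registers
```

Both paths end with the data in shared memory; the direct path simply never touches the register-traffic counter.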
-
Patent number: 11907720
Abstract: There is provided a data processing apparatus comprising a plurality of registers, each of the registers having data bits to store data and metadata bits to store metadata. Each of the registers is adapted to operate in a metadata mode in which the metadata bits and the data bits are valid, and a data mode in which the data bits are valid and the metadata bits are invalid. Mode bit storage circuitry indicates whether each of the registers is in the data mode or the metadata mode. Execution circuitry is responsive to a memory operation that is a store operation on one or more given registers.
Type: Grant
Filed: November 26, 2020
Date of Patent: February 20, 2024
Assignee: Arm Limited
Inventors: Bradley John Smith, Thomas Christopher Grocutt
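A small Python sketch of the two register modes described above (names are illustrative):

```python
# A register with data bits, metadata bits, and a mode bit saying whether
# the metadata bits are currently valid.

class TaggedRegister:
    def __init__(self):
        self.data = 0
        self.metadata = 0
        self.metadata_mode = False   # False: data mode, metadata invalid

    def store_data(self, value):
        self.data = value
        self.metadata_mode = False   # a plain data write invalidates metadata

    def store_with_metadata(self, value, meta):
        self.data, self.metadata = value, meta
        self.metadata_mode = True

    def load_metadata(self):
        if not self.metadata_mode:
            raise ValueError("metadata bits are invalid in data mode")
        return self.metadata
```

The mode bit here plays the role of the mode bit storage circuitry: it records, per register, which of the two modes is in effect.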
-
Patent number: 11907160
Abstract: This disclosure relates to a distributed processing system for configuring multiple processing channels. The distributed processing system includes a main processor, such as an ARM processor, communicatively coupled to a plurality of co-processors, such as stream processors. The co-processors can execute instructions in parallel with each other and interrupt the ARM processor. Longer latency instructions can be executed by the main processor and lower latency instructions can be executed by the co-processors. There are several ways that a stream can be triggered in the distributed processing system. In an embodiment, the distributed processing system is a stream processor system that includes an ARM processor and stream processors configured to access different register sets. The stream processors can include a main stream processor and stream processors in respective transmit and receive channels. The stream processor system can be implemented in a radio system to configure the radio for operation.
Type: Grant
Filed: August 5, 2022
Date of Patent: February 20, 2024
Assignee: Analog Devices, Inc.
Inventors: Manish J. Manglani, Shipra Bhal, Christopher Mayer
-
Patent number: 11900174
Abstract: Techniques are disclosed for processing unit virtualization with scalable over-provisioning in an information processing system. For example, the method accesses a data structure that maps a correspondence between a plurality of virtualized processing units and a plurality of abstracted processing units, wherein the plurality of abstracted processing units are configured to decouple an allocation decision from the plurality of virtualized processing units, and further wherein at least one of the virtualized processing units is mapped to multiple ones of the abstracted processing units. The method allocates one or more virtualized processing units to execute a given application by allocating one or more abstracted processing units identified from the data structure. The method also enables migration of one or more virtualized processing units across the system.
Type: Grant
Filed: June 22, 2022
Date of Patent: February 13, 2024
Assignee: Dell Products L.P.
Inventors: Anzhou Hou, Zhen Jia, Qiang Chen, Victor Fong, Michael Robillard
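The indirection the abstract describes can be sketched in Python (a simple table-based mapper; the names are made up for illustration):

```python
# Over-provisioning via an indirection table: each virtualized unit maps
# to one or more abstracted units, and allocation hands out abstracted
# units, decoupling the decision from the virtualized layer.

class UnitMapper:
    def __init__(self):
        self.table = {}   # virtualized unit -> abstracted units it maps to

    def map_unit(self, vunit, abstracted):
        self.table[vunit] = list(abstracted)

    def allocate(self, count):
        # Allocation is expressed in abstracted units drawn from the table.
        pool = [a for units in self.table.values() for a in units]
        if len(pool) < count:
            raise RuntimeError("not enough abstracted units")
        return pool[:count]
```

Because one virtualized unit may map to several abstracted units, the pool can be larger than the number of virtualized units, which is the over-provisioning.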
-
Patent number: 11900123
Abstract: A system includes a processing unit such as a GPU that itself includes a command processor configured to receive instructions for execution from a software application. A processor pipeline coupled to the processing unit includes a set of parallel processing units for executing the instructions in sets. A set manager is coupled to one or more of the processor pipeline and the command processor. The set manager includes at least one table for storing a set start time, a set end time, and a set execution time. The set manager determines an execution time for one or more sets of instructions of a first window of sets of instructions submitted to the processor pipeline. Based on the execution time of the one or more sets of instructions, a set limit is determined and applied to one or more sets of instructions of a second window subsequent to the first window.
Type: Grant
Filed: December 13, 2019
Date of Patent: February 13, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: Alexander Fuad Ashkar, Manu Rastogi, Harry J. Wise
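A toy Python version of the set manager's timing table (the mean-as-limit policy below is an assumption for illustration, not the patented policy):

```python
# Record start/end times for instruction sets in one window, then derive
# a limit to apply to sets in the next window.

class SetManager:
    def __init__(self):
        self.times = {}   # set id -> (start, end)

    def record(self, set_id, start, end):
        self.times[set_id] = (start, end)

    def execution_time(self, set_id):
        start, end = self.times[set_id]
        return end - start

    def set_limit(self):
        # Illustrative policy: cap the next window at the mean set time.
        durations = [end - start for start, end in self.times.values()]
        return sum(durations) / len(durations)
```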
-
Patent number: 11900107
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
Type: Grant
Filed: March 25, 2022
Date of Patent: February 13, 2024
Assignee: Intel Corporation
Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
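A scalar Python sketch of the asymmetric FMA loop, with widths as plain integers rather than packed bit fields (the function name and defaults are illustrative):

```python
# Process as many second-source elements as fit in one SIMD lane,
# multiply each by the matching first-source element, and accumulate
# into the destination.

def asymmetric_fma(dest, src1, src2, lane_width=32, width2=2):
    per_lane = lane_width // width2      # elements of src2 per lane
    for i in range(min(per_lane, len(src1), len(src2))):
        dest += src1[i] * src2[i]
    return dest
```

With a 32-bit lane and 2-bit second-source elements, up to 16 products are accumulated per lane; narrower second widths pack more elements per lane.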
-
Patent number: 11893392
Abstract: A method for processing floating point operations in a multi-processor system including a plurality of single processor cores is provided. In this method, upon receiving a group setting for performing an operation, the plurality of single processor cores are grouped into at least one group according to the group setting, and a single processor core set as a master in the group loads an instruction for performing the operation from an external memory, and performs parallel operations by utilizing floating point units (FPUs) of all single processor cores in the group according to the instruction.
Type: Grant
Filed: November 30, 2021
Date of Patent: February 6, 2024
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Ju-Yeob Kim, Jin Ho Han
-
Patent number: 11868804
Abstract: A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and process the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions, and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions in the set of raw instructions.
Type: Grant
Filed: November 18, 2020
Date of Patent: January 9, 2024
Assignee: Groq, Inc.
Inventors: Brian Lee Kurtz, Dinesh Maheshwari, James David Sprach
-
Patent number: 11868782
Abstract: Methods and systems are disclosed using an execution pipeline on a multi-processor platform for deep learning network execution. In one example, a network workload analyzer receives a workload, analyzes a computation distribution of the workload, and groups the network nodes into groups. A network executor assigns each group to a processing core of the multi-core platform so that the respective processing core handles the computation tasks of the received workload for the respective group.
Type: Grant
Filed: August 15, 2022
Date of Patent: January 9, 2024
Assignee: Intel Corporation
Inventors: Liu Yang, Anbang Yao
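The grouping step could look like the greedy load balancer below (an illustrative heuristic; the patent does not specify this algorithm):

```python
# Group network nodes into one group per core so that per-core
# computation load is roughly balanced.

def group_nodes(node_costs, num_cores):
    groups = [[] for _ in range(num_cores)]
    loads = [0] * num_cores
    # Greedy: place each node (heaviest first) on the least-loaded core.
    for node, cost in sorted(node_costs.items(), key=lambda kv: -kv[1]):
        i = loads.index(min(loads))
        groups[i].append(node)
        loads[i] += cost
    return groups
```

Each returned group would then be assigned to one processing core by the network executor.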
-
Patent number: 11868769
Abstract: Deployments of microservices executing in a cloud are automatically managed. Some microservices are deployed on dedicated nodes, others in serverless configurations. Rates of invocation and runtime data of microservices are monitored. Responsive to the monitored rate of invocation of a microservice running serverless exceeding a given threshold, the microservice is automatically redeployed to a dedicated node. A microservice executing on a dedicated node may be redeployed serverless if the infrequency with which it is called is sufficient. Microservices can be automatically redeployed between different dedicated nodes with different capacities based on monitored usage. The underlying cloud service provider may be automatically monitored for changes in serverless support functionality. Responsive to these changes, the thresholds at which microservices are redeployed can be automatically adjusted.
Type: Grant
Filed: July 27, 2022
Date of Patent: January 9, 2024
Assignee: PANGEA CYBER CORPORATION, INC.
Inventors: Akshay Dongaonkar, Prashant Pathak, Sourabh Satish
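A minimal Python sketch of the threshold-driven placement decision (the threshold values are invented for illustration):

```python
# Decide whether a microservice should run serverless or on a dedicated
# node, based on its monitored invocation rate and its current placement.

def placement(rate_per_min, to_dedicated=100, to_serverless=10,
              current="serverless"):
    if current == "serverless" and rate_per_min > to_dedicated:
        return "dedicated"       # hot: move to a dedicated node
    if current == "dedicated" and rate_per_min < to_serverless:
        return "serverless"      # cold: move back to serverless
    return current               # otherwise, leave it where it is
```

Using two thresholds with a gap between them avoids flapping between placements when the rate hovers near a single cutoff.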
-
Patent number: 11861366
Abstract: Disclosed in some examples are methods, systems, devices, and machine-readable mediums which provide for more efficient CGRA execution by assigning different initiation intervals to different PEs executing a same code base. The initiation intervals may be a multiple of each other, and the PE with the lowest initiation interval may be used to execute instructions of the code that are to be executed at a greater frequency than other instructions, which may be assigned to PEs with higher initiation intervals.
Type: Grant
Filed: August 11, 2021
Date of Patent: January 2, 2024
Assignee: Micron Technology, Inc.
Inventors: Douglas Vanesko, Tony M. Brewer
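The multiple-of-each-other initiation intervals can be illustrated in Python:

```python
# A PE with initiation interval II starts a new iteration every II cycles,
# so a PE with II=1 fires twice as often as one with II=2. Hot
# instructions go to the lower-II element.

def firings(initiation_interval, cycles):
    return [c for c in range(cycles) if c % initiation_interval == 0]
```

Over the same cycle budget, the II=1 element issues exactly twice as many iterations as the II=2 element, matching the multiple relationship described above.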
-
Patent number: 11847457
Abstract: A master processor is configured to execute a first thread and a second thread designated to run a program in sequence. A slave processor is configured to execute a third thread to run the program in sequence. An instruction fetch compare engine is provided. The first thread initiates a first thread instruction fetch for the program, which is stored in an instruction fetch storage. Retrieved data associated with the fetched first thread instruction is stored in a retrieved data storage. The second thread initiates a second thread instruction fetch for the program. The instruction fetch compare logic compares the second thread instruction fetch for the program with the first thread instruction fetch stored in the instruction fetch storage for a match. When there is a match, the retrieved data associated with the fetched first thread instruction is presented from the retrieved data storage, in response to the second thread instruction fetch.
Type: Grant
Filed: May 31, 2022
Date of Patent: December 19, 2023
Assignee: Ceremorphic, Inc.
Inventors: Heonchul Park, Sri Hari Nemani, Patel Urvishkumar Jayrambhai, Dhruv Maheshkumar Patel
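A toy Python model of the fetch-compare shortcut (names are hypothetical):

```python
# The leading thread's fetch and its data are cached; when the trailing
# thread issues a matching fetch, data is served from the cache instead
# of refetching from memory.

class FetchCompareEngine:
    def __init__(self, memory):
        self.memory = memory
        self.fetch_store = {}    # address -> data from the leading thread
        self.memory_reads = 0

    def fetch(self, addr):
        if addr in self.fetch_store:          # match: reuse stored data
            return self.fetch_store[addr]
        self.memory_reads += 1                # miss: go to memory once
        data = self.memory[addr]
        self.fetch_store[addr] = data
        return data
```

Two threads running the same program in sequence thus cost one memory access per instruction instead of two.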
-
Patent number: 11836497
Abstract: There is provided an operation module, which includes a memory, a register unit, a dependency relationship processing unit, an operation unit, and a control unit. The memory is configured to store a vector, the register unit is configured to store an extension instruction, and the control unit is configured to acquire and parse the extension instruction, so as to obtain a first operation instruction and a second operation instruction. An execution sequence of the first operation instruction and the second operation instruction can be determined, and an input vector of the first operation instruction can be read from the memory. The operation unit is configured to convert an expression mode of the input data index of the first operation instruction and to screen data, and to execute the first and second operation instructions according to the execution sequence, so as to obtain a result of the extension instruction.
Type: Grant
Filed: July 23, 2018
Date of Patent: December 5, 2023
Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
Inventors: Bingrui Wang, Shengyuan Zhou, Yao Zhang
-
Patent number: 11836495
Abstract: The present invention provides a method of implementing an ARM64-bit floating point emulator on a Linux system, which includes: running an ARM64-bit instruction on the Linux system; applying an instruction classifier to a first feature code of a machine code indicated by the ARM64-bit instruction to determine whether the ARM64-bit instruction is an ARM64-bit floating point instruction; and, if the ARM64-bit instruction is an ARM64-bit floating point instruction, applying the instruction classifier to a second feature code of the machine code indicated by the ARM64-bit instruction to determine the ARM64-bit floating point instruction to be a specific ARM64-bit floating point instruction.
Type: Grant
Filed: May 4, 2022
Date of Patent: December 5, 2023
Assignee: AIROHA TECHNOLOGY (SUZHOU) LIMITED
Inventors: Fei Yan, Peng Du
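The two-stage feature-code test can be sketched in Python; note the masks and encodings below are invented for illustration, not real AArch64 encodings:

```python
# Stage 1: a mask/value test decides whether a 32-bit word is a
# floating-point instruction at all. Stage 2: a second feature code
# narrows it to a specific instruction.

FP_MASK, FP_VALUE = 0x0F000000, 0x0E000000           # made-up encodings
SPECIFIC = {0x0E201000: "fadd", 0x0E203000: "fmul"}  # made-up encodings

def classify(word):
    if word & FP_MASK != FP_VALUE:                  # first feature code
        return None                                 # not a FP instruction
    return SPECIFIC.get(word & 0x0FFFF000, "unknown-fp")  # second feature code
```

An emulator would then dispatch to a software implementation of the specific instruction that `classify` returns.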
-
Patent number: 11822959
Abstract: Methods and systems for processing requests with load-dependent throttling. The system compares a count of active job requests being currently processed for a user associated with a new job request with an active job cap number for that user. When the count of active job requests being currently processed for that user does not exceed the active job cap number specific to that user, the job request is added to an active job queue for processing. However, when the count of active job requests being currently processed for that user exceeds the active job cap number, the job request is placed on a throttled queue to await later processing when an updated count of active job requests being currently processed for that user is below the active job cap number. Once the count is below the cap, the throttled request is moved to the active job queue for processing.
Type: Grant
Filed: February 18, 2022
Date of Patent: November 21, 2023
Assignee: Shopify Inc.
Inventors: Robert Mic, Aline Fatima Manera, Timothy Willard, Nicole Simone, Scott Weber
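A small Python sketch of the two-queue throttling policy (per-user bookkeeping is omitted to keep the example short):

```python
# New jobs go to the active queue only while the active count is under
# the cap; otherwise they wait on a throttled queue and are promoted as
# active jobs finish.

from collections import deque

class Throttler:
    def __init__(self, cap):
        self.cap = cap
        self.active = []
        self.throttled = deque()

    def submit(self, job):
        if len(self.active) < self.cap:
            self.active.append(job)
        else:
            self.throttled.append(job)       # over the cap: wait

    def finish(self, job):
        self.active.remove(job)
        if self.throttled and len(self.active) < self.cap:
            self.active.append(self.throttled.popleft())   # promote
```

In the patented system the cap and counts are tracked per user; this sketch shows a single user's queues.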
-
Patent number: 11809867
Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
Type: Grant
Filed: September 21, 2020
Date of Patent: November 7, 2023
Assignee: Intel Corporation
Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
-
Patent number: 11809868
Abstract: Systems, apparatuses, and methods related to bit string operations using a computing tile are described. An example apparatus includes a computing device (or "tile") that includes a processing unit and a memory resource configured as a cache for the processing unit. A data structure can be coupled to the computing device. The data structure can be configured to receive a bit string that represents a result of an arithmetic operation, a logical operation, or both, and store the bit string that represents the result of the arithmetic operation, the logical operation, or both. The bit string can be formatted in a format different from a floating-point format.
Type: Grant
Filed: January 21, 2022
Date of Patent: November 7, 2023
Assignee: Micron Technology, Inc.
Inventor: Vijay S. Ramesh
-
Patent number: 11803382
Abstract: A digital data processor includes a multi-stage butterfly network, which is configured to, in response to a look up table read instruction, receive look up table data from an intermediate register, reorder the look up table data based on control signals comprising look up table configuration register data, and write the reordered look up table data to a destination register specified by the look up table read instruction.
Type: Grant
Filed: September 2, 2022
Date of Patent: October 31, 2023
Assignee: Texas Instruments Incorporated
Inventors: Naveen Bhoria, Duc Bui, Dheera Balasubramanian Samudrala, Rama Venkatasubramanian
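A pure-Python sketch of a log2(N)-stage butterfly permutation driven by per-stage control bits (the control-bit layout is an assumption for illustration):

```python
# At stage s, element i may swap with element i XOR (1 << s). One control
# bit per pair per stage decides whether that pair swaps, so the network
# can realize many reorderings of the input data.

def butterfly(data, controls):
    # controls[s][p] == 1 means: at stage s, swap the p-th pair.
    n = len(data)
    out = list(data)
    dist, stage = 1, 0
    while dist < n:
        pair = 0
        for i in range(n):
            j = i ^ dist
            if i < j:                         # visit each pair once
                if controls[stage][pair]:
                    out[i], out[j] = out[j], out[i]
                pair += 1
        dist <<= 1
        stage += 1
    return out
```

In the processor described above, the control bits would come from the look up table configuration register, and the reordered data would be written to the destination register.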
-
Patent number: 11803385
Abstract: An array processor includes processor element arrays (PEAs) distributed in rows and columns. The PEAs are configured to perform operations on parameter values. A first sequencer receives a first direct memory access (DMA) instruction that includes a request to read data from at least one address in memory. A texture address (TA) engine requests the data from the memory based on the at least one address, and a texture data (TD) engine provides the data to the PEAs. The PEAs provide first synchronization signals to the TD engine to indicate availability of registers for receiving the data. The TD engine provides second synchronization signals to the first sequencer in response to receiving acknowledgments that the PEAs have consumed the data.
Type: Grant
Filed: December 10, 2021
Date of Patent: October 31, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Sateesh Lagudu, Arun Vaidyanathan Ananthanarayan, Michael Mantor, Allen H. Rush