Patents Examined by Jacob A. Petranek

Apparatuses, methods, and systems for selective linear address masking based on processor privilege level and control register bits

Patent number: 11966334

Abstract: Systems, methods, and apparatuses relating to linear address masking architecture are described. In one embodiment, a hardware processor includes an address generation unit to generate a linear address for a memory access request to a memory, at least one control register comprising a user mode masking bit and a supervisor mode masking bit, a register comprising a current privilege level indication, and a memory management unit to mask out a proper subset of bits inside an address space of the linear address for the memory access request based on the current privilege level indication and either of the user mode masking bit or the supervisor mode masking bit to produce a resultant linear address, and output the resultant linear address.

Type: Grant

Filed: January 11, 2021

Date of Patent: April 23, 2024

Assignee: Intel Corporation

Inventors: Ron Gabor, Igor Yanover
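
The masking decision described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patented design: the function name, the 64-bit address width, the masked bit range (bits 48 to 56), the convention that privilege level 3 means user mode, and the choice to clear the masked bits to zero are all assumptions made for the example.

```python
ADDR_MASK_64 = 0xFFFF_FFFF_FFFF_FFFF  # assumed 64-bit linear address


def mask_linear_address(addr, cpl, user_mask_bit, supervisor_mask_bit,
                        lo=48, hi=56):
    """Return the resultant linear address after optional masking.

    cpl                 -- current privilege level (assumed: 3 = user mode)
    user_mask_bit       -- control bit enabling masking for user accesses
    supervisor_mask_bit -- control bit enabling masking for supervisor accesses
    lo, hi              -- hypothetical inclusive range of bits to mask out
    """
    # Select the control bit that applies to the current privilege level.
    enabled = user_mask_bit if cpl == 3 else supervisor_mask_bit
    if not enabled:
        return addr & ADDR_MASK_64
    # Build a field covering bits lo..hi and clear it in the address.
    field = ((1 << (hi - lo + 1)) - 1) << lo
    return addr & ~field & ADDR_MASK_64
```

With this sketch, a user-mode access with the user bit set drops the tag carried in the masked bits, while a supervisor-mode access with its bit clear passes the address through unchanged.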

Systems and methods for multi-branch routing for interconnected chip networks

Patent number: 11960437

Abstract: A system includes a high-bandwidth inter-chip network (ICN) that allows communication between parallel processing units (PPUs) in the system. For example, the ICN allows a PPU to communicate with other PPUs on the same compute node or server and also with PPUs on other compute nodes or servers. In embodiments, communication may be at the command level (e.g., at the direct memory access level) and at the instruction level (e.g., the finer-grained load/store instruction level). The ICN allows PPUs in the system to communicate without using a PCIe bus, thereby avoiding its bandwidth limitations and relative lack of speed. The respective routing tables comprise information of multiple paths to any given other PPU.

Type: Grant

Filed: July 15, 2022

Date of Patent: April 16, 2024

Assignee: T-Head (Shanghai) Semiconductor Co., Ltd.

Inventors: Liang Han, Yunxiao Zou

System, apparatus and method for user space object coherency in a processor

Patent number: 11960922

Abstract: In an embodiment, a processor comprises: an execution circuit to execute instructions; at least one cache memory coupled to the execution circuit; and a table storage element coupled to the at least one cache memory, the table storage element to store a plurality of entries each to store object metadata of an object used in a code sequence. The processor is to use the object metadata to provide user space multi-object transactional atomic operation of the code sequence. Other embodiments are described and claimed.

Type: Grant

Filed: September 24, 2020

Date of Patent: April 16, 2024

Assignee: Intel Corporation

Inventors: Joshua B. Fryman, Jason M. Howard, Ibrahim Hur, Robert Pawlowski

Techniques, devices, and instruction set architecture for efficient modular division and inversion

Patent number: 11954487

Abstract: Disclosed are apparatuses, systems, and techniques to perform and facilitate fast and efficient modular computational operations, such as modular division and modular inversion, using shared platforms, including hardware accelerator engines.

Type: Grant

Filed: March 29, 2022

Date of Patent: April 9, 2024

Assignee: Nvidia Corporation

Inventors: Shuai Wang, Chen Yao, Xiao Wu, Rongzhe Zhu, Yuji Qian, Xixi Xie

Neural processing device, processing element included therein and method for operating various formats of neural processing device

Patent number: 11954488

Abstract: A neural processing device, a processing element included therein and a method for operating various formats of the neural processing device are provided. The neural processing device includes at least one neural processor, a shared memory shared by the at least one neural processor, and a global interconnection configured to transmit data between the at least one neural processor and the shared memory, wherein each of the at least one neural processor comprises at least one processing element, each of the at least one processing element receives an input in a first format and thereby performs an operation, and receives an input in a second format that is different from the first format and thereby performs an operation if a format conversion signal is received, and the first format and the second format have a same number of bits.

Type: Grant

Filed: August 31, 2023

Date of Patent: April 9, 2024

Assignee: Rebellions Inc.

Inventors: Karim Charfi, Jinwook Oh

Fence enforcement techniques based on stall characteristics

Patent number: 11954492

Abstract: Techniques are disclosed relating to channel stalls or deactivations based on the latency of prior operations. In some embodiments, a processor includes a plurality of channel pipelines for a plurality of channels and a plurality of execution pipelines shared by the channel pipelines and configured to perform different types of operations provided by the channel pipelines. First scheduler circuitry may assign threads to channels and second scheduler circuitry may assign an operation from a given channel to a given execution pipeline based on decode of an operation for that channel. Dependency circuitry may, for a first operation that depends on a prior operation that uses one of the execution pipelines, determine, based on status information for the prior operation from the one of the execution pipelines, whether to stall the first operation or to deactivate a thread that includes the first operation from its assigned channel.

Type: Grant

Filed: November 10, 2022

Date of Patent: April 9, 2024

Assignee: Apple Inc.

Inventors: Benjiman L. Goodman, Dzung Q. Vu, Robert Kenney

Automated orchestration of large-scale flow log transformation

Patent number: 11947965

Abstract: When a transformation job of flow logs generated for a cloud environment is triggered, a security service determines a parameterized template for batch data processing operations offered by the cloud service provider (CSP) to use based on the type of transformation job. The security service communicates an indication of the template and the corresponding parameter values to a data processing service/pipeline offered by the CSP. The provisioned processing resources retrieve the flow logs from a designated location in cloud storage, complete the transformation, and store the transformed flow logs in a new storage location. If the CSP does not provide a data processing service/pipeline which can perform bulk data transformation, the security service uses a generic parameterized template specifying a transformation job to be run on a cluster. Upon completion, the security service retrieves and analyzes the transformed flow logs as part of threat detection performed for securing the cloud environment.

Type: Grant

Filed: August 1, 2022

Date of Patent: April 2, 2024

Assignee: Palo Alto Networks, Inc.

Inventor: Krishnan Shankar Narayan

Carry chain for SIMD operations

Patent number: 11947964

Abstract: Examples of a carry chain for performing an operation on operands each including elements of a selectable size are provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.

Type: Grant

Filed: October 25, 2022

Date of Patent: April 2, 2024

Assignee: Marvell Asia Pte, Ltd.

Inventor: David Kravitz
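
The mask-controlled carry described in the abstract above can be illustrated with a small software model. This is a sketch under stated assumptions, not the patented circuit: the operation is taken to be addition, the operands are processed in 8-bit portions, and the function name and parameters are hypothetical.

```python
def simd_add(a, b, elem_bits, total_bits=64, chunk_bits=8):
    """Add two packed operands lane by lane using one shared carry chain.

    The operands are split into chunk_bits-wide portions. A mask derived
    from the selected element size decides, at each portion boundary,
    whether the carry continues into the next portion (same element) or
    is dropped (element boundary).
    """
    n_chunks = total_bits // chunk_bits
    chunks_per_elem = elem_bits // chunk_bits
    # carry_mask[i] is True when portion i and portion i + 1 belong to
    # the same element, so the carry out of portion i must propagate.
    carry_mask = [(i + 1) % chunks_per_elem != 0 for i in range(n_chunks)]

    chunk_max = (1 << chunk_bits) - 1
    result = 0
    carry = 0
    for i in range(n_chunks):
        shift = i * chunk_bits
        x = (a >> shift) & chunk_max
        y = (b >> shift) & chunk_max
        s = x + y + carry
        result |= (s & chunk_max) << shift
        # Carry the partial result forward only if the mask allows it.
        carry = (s >> chunk_bits) if carry_mask[i] else 0
    return result
```

For example, adding `0x00000001` to `0xFFFFFFFF` over 32 bits gives `0xFFFFFF00` with 8-bit elements, `0xFFFF0000` with 16-bit elements, and `0x00000000` with a single 32-bit element, because the carry stops at a different boundary in each case.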

Hardware accelerated anomaly detection using a min/max collector in a system on a chip

Patent number: 11940947

Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

Type: Grant

Filed: January 6, 2023

Date of Patent: March 26, 2024

Assignee: NVIDIA Corporation

Inventors: Ching-Yu Hung, Ravi P. Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani

Fast mapper restore for flush in processor

Patent number: 11941398

Abstract: A method for restoring a mapper of a processor core includes saving first information in a staging latch. The first information represents a newly dispatched first instruction of the processor core and is saved in an entry latch of a save-and-restore buffer. In response to reception of a flush command of the processor core, the restoration of the mapper is begun with the first information from the staging latch without waiting for a comparison of a flush tag of the flush command with the entry latch of the save-and-restore buffer. A processor core configured to perform the method described above is also provided. A processor core is also provided that includes a dispatch, a mapper, a save-and-restore buffer that includes entry latches and is connected to the mapper via at least one pipeline, and a register disposed in the at least one pipeline.

Type: Grant

Filed: December 5, 2022

Date of Patent: March 26, 2024

Assignee: International Business Machines Corporation

Inventors: Brian D. Barrick, Steven J. Battle, Dung Q. Nguyen, Susan E. Eisen, Cliff Kucharski, Salma Ayub

AI processor simulation

Patent number: 11934940

Abstract: The present disclosure discloses a data processing method and related products, in which the data processing method includes: generating, by a general-purpose processor, a binary instruction according to device information of an AI processor, and generating an AI learning task according to the binary instruction; transmitting, by the general-purpose processor, the AI learning task to the cloud AI processor for running; receiving, by the general-purpose processor, a running result corresponding to the AI learning task; and determining, by the general-purpose processor, an offline running file according to the running result, where the offline running file is generated according to the device information of the AI processor and the binary instruction when the running result satisfies a preset requirement. By implementing the present disclosure, the debugging between the AI algorithm model and the AI processor can be achieved in advance.

Type: Grant

Filed: December 19, 2019

Date of Patent: March 19, 2024

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Yao Zhang, Xiaofu Meng, Shaoli Liu

Atomic operation predictor to predict whether an atomic operation will complete successfully

Patent number: 11928467

Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.

Type: Grant

Filed: September 13, 2021

Date of Patent: March 12, 2024

Assignee: Apple Inc.

Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
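
The forwarding decision described in the abstract above could be modeled roughly as follows. The abstract does not say how the predictor is built, so everything structural here is an assumption: a table of 2-bit saturating counters indexed by the address of the atomic instruction, the class and method names, and the initial counter value.

```python
class AtomicPredictor:
    """Hypothetical predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries=64):
        # Start at 2 ("weakly predict success"); counters range over 0..3.
        self.counters = [2] * entries

    def _index(self, pc):
        # Assumed indexing scheme: instruction address modulo table size.
        return pc % len(self.counters)

    def predict_success(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, succeeded):
        # Train the counter with the actual outcome of the atomic operation.
        i = self._index(pc)
        if succeeded:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def may_forward_store_data(self, pc):
        # Forward store data to a later load only when success is predicted,
        # which avoids a load flush if the atomic keeps failing.
        return self.predict_success(pc)
```

In this model, an atomic that has failed repeatedly drives its counter below the threshold, so a subsequent load to the same location is not given the store data.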

Hybrid model for time series data processing

Patent number: 11922208

Abstract: Systems and methods are disclosed for switching between batch processing and real-time processing of time series data, with a system being configured to switch between a batch processing module and a real-time processing module to process time series data. The system includes an orchestration service to indicate when to switch, which may be based on a switching event identified by the orchestration service. In some implementations, the orchestration service identifies a switching event in incoming time series data to be processed. When a batch processing module is to be used to batch process time series data, the real-time processing module may be disabled, with the real-time processing module being enabled when it is used to process the time series data. In some implementations, the real-time processing module includes the same processing models as the batch processing module such that the two modules' outputs have a similar accuracy.

Type: Grant

Filed: May 31, 2023

Date of Patent: March 5, 2024

Assignee: Intuit Inc.

Inventors: Immanuel David Buder, Shashank Shashikant Rao

Network command coalescing on GPUs

Patent number: 11922207

Abstract: An approach is provided for coalescing network commands in a GPU that implements a SIMT architecture. Compatible next network operations from different threads are coalesced into a single network command packet. This reduces the number of network command packets generated and issued by threads, thereby increasing efficiency, and improving throughput. The approach is applicable to any number of threads and any thread organization methodology, such as wavefronts, warps, etc.

Type: Grant

Filed: August 13, 2020

Date of Patent: March 5, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael W. LeBeane, Khaled Hamidouche, Brandon K. Potter
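
The coalescing idea in the abstract above can be sketched as a grouping step over the next network operation of each thread. The abstract does not define what makes two operations compatible, so this example assumes they are compatible when they share an operation kind and a destination; the function name and the packet layout are likewise hypothetical.

```python
def coalesce_network_ops(thread_ops):
    """Group compatible per-thread network operations into packets.

    thread_ops -- one entry per thread: None if the thread has no pending
                  network operation, else a (kind, dest, payload) tuple.
    Returns a list of packets, one per (kind, dest) pair, in the order in
    which each pair was first seen.
    """
    packets = {}
    for thread_id, op in enumerate(thread_ops):
        if op is None:
            continue
        kind, dest, payload = op
        # Assumed compatibility rule: same operation kind and destination.
        packet = packets.setdefault(
            (kind, dest), {"kind": kind, "dest": dest, "entries": []}
        )
        packet["entries"].append((thread_id, payload))
    return list(packets.values())
```

Five threads issuing three puts to one node and one get to another would produce two packets instead of four, which is the reduction in command packets the abstract refers to.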

Generation of trace messages including an instruction retirement count and a stall count

Patent number: 11907100

Abstract: A method of tracing instruction execution on a processor of an integrated circuit chip in real time whilst the processor continues to execute instructions during clock cycles of the processor. The instruction execution of the processor is monitored by counting the number of successive instructions which are retired contiguously in time to form an instruction count, and counting the number of subsequent contiguous clock cycles of the processor during which no instruction is retired to form a stall count. A trace message is generated which includes the instruction count and the stall count, and the trace message is outputted.

Type: Grant

Filed: April 16, 2020

Date of Patent: February 20, 2024

Assignee: Siemens Industry Software Inc.

Inventor: Iain Robertson
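
The counting scheme in the abstract above can be modeled over a per-cycle retirement log. This is an illustrative sketch only: the input representation (number of instructions retired in each clock cycle), the function name, and the tuple message format are assumptions, not the trace encoding used in the patent.

```python
def trace_messages(retired_per_cycle):
    """Turn a per-cycle retirement log into (instruction_count, stall_count) pairs.

    retired_per_cycle -- iterable giving the number of instructions retired
                         in each clock cycle (0 means a stall cycle).
    Each message covers a run of retiring cycles followed by the run of
    stall cycles that comes directly after it.
    """
    messages = []
    instr_count = 0
    stall_count = 0
    for retired in retired_per_cycle:
        if retired > 0:
            if stall_count > 0:
                # A stall run just ended: emit the completed message.
                messages.append((instr_count, stall_count))
                instr_count = 0
                stall_count = 0
            instr_count += retired
        else:
            stall_count += 1
    # Emit whatever is left when the log ends.
    if instr_count or stall_count:
        messages.append((instr_count, stall_count))
    return messages
```

For the log `[1, 1, 2, 0, 0, 1, 0, 1, 1]` this yields `[(4, 2), (1, 1), (2, 0)]`: four instructions then two stall cycles, one instruction then one stall cycle, and a final run of two instructions with no stall.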

Methods, systems, and apparatuses for out-of-order access to a shared microcode sequencer by a clustered decode pipeline

Patent number: 11907712

Abstract: Systems, methods, and apparatuses relating to circuitry to implement out-of-order access to a shared microcode sequencer by a clustered decode pipeline are described.

Type: Grant

Filed: September 25, 2020

Date of Patent: February 20, 2024

Assignee: Intel Corporation

Inventors: Thomas Madaelil, Jonathan Combs, Vikash Agarwal

Methods, systems, and apparatuses to perform a compute operation according to a configuration packet and comparing the result to data in local memory

Patent number: 11893398

Abstract: Methods, apparatuses, and systems for implementing data flows in a processor are described herein. A data flow manager may be configured to generate a configuration packet for a compute operation based on status information regarding multiple processing elements of the processor. Accordingly, multiple processing elements of a processor may concurrently process data flows based on the configuration packet. For example, the multiple processing elements may implement a mapping of processing elements to memory, while also implementing identified paths, through the processor, for the data flows. After executing the compute operation at certain processing elements of the processor, the processing results may be provided. In speech signal processing operations, the processing results may be compared to phonemes to identify such components of human speech in the processing results.

Type: Grant

Filed: February 16, 2022

Date of Patent: February 6, 2024

Assignee: MICRON TECHNOLOGY, INC.

Inventors: Jeremy Chritz, Tamara Schmitz, Fa-Long Luo, David Hulton

Synchronizing coprocessors using synchronization instructions to force a second coprocessor to wait until receiving an acknowledgement signal from a first coprocessor

Patent number: 11892970

Abstract: A method for data processing and a processor chip are provided. The method includes: acquiring a first relationship instruction; executing at least one first computing instruction acquired before the first relationship instruction based on the first relationship instruction; and sending acknowledgment information based on the first relationship instruction in response to completing executing the at least one first computing instruction, to cause a second coprocessor receiving the acknowledgment information to revert to a state of acquiring a second computing instruction after the second relationship instruction acquired by a second coprocessor based on the acknowledgment information.

Type: Grant

Filed: July 19, 2022

Date of Patent: February 6, 2024

Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY

Inventors: Jing Wang, Jiaxin Shi, Hanlin Xie, Xiaozhang Gong

Decoupled access-execute processing and prefetching control

Patent number: 11886881

Abstract: Apparatuses and methods are provided, relating to the control of data processing in devices which comprise both decoupled access-execute processing circuitry and prefetch circuitry. Control of the access portion of the decoupled access-execute processing circuitry may be dependent on a performance metric of the prefetch circuitry. Alternatively or in addition, control of the prefetch circuitry may be dependent on a performance metric of the access portion.

Type: Grant

Filed: December 21, 2020

Date of Patent: January 30, 2024

Assignee: Arm Limited

Inventors: Mbou Eyole, Michiel Willem Van Tol, Stefanos Kaxiras

Computer architecture with resistive processing units

Patent number: 11886378

Abstract: A processor includes an array of resistive processing units connected between row and column lines with a resistive element. A first single instruction, multiple data processing unit (SIMD) is connected to the row lines. A second SIMD is connected to the column lines. A first instruction issuer is connected to the first SIMD to issue instructions to the first SIMD, and a second instruction issuer is connected to the second SIMD to issue instructions to the second SIMD such that the processor is programmable and configurable for specific operations depending on an issued instruction set.

Type: Grant

Filed: December 28, 2020

Date of Patent: January 30, 2024

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Tayfun Gokmen