Patents Examined by Jacob A. Petranek
  • Patent number: 11966334
    Abstract: Systems, methods, and apparatuses relating to linear address masking architecture are described. In one embodiment, a hardware processor includes an address generation unit to generate a linear address for a memory access request to a memory, at least one control register comprising a user mode masking bit and a supervisor mode masking bit, a register comprising a current privilege level indication, and a memory management unit to mask out a proper subset of bits inside an address space of the linear address for the memory access request based on the current privilege level indication and either of the user mode masking bit or the supervisor mode masking bit to produce a resultant linear address, and output the resultant linear address.
    Type: Grant
    Filed: January 11, 2021
    Date of Patent: April 23, 2024
    Assignee: Intel Corporation
    Inventors: Ron Gabor, Igor Yanover
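The masking described in this abstract can be sketched in a few lines. The masked bit range, privilege-level encoding, and variable names below are illustrative assumptions, not details taken from the patent:

```python
# Hypothetical sketch of linear address masking: clear a proper subset of
# upper address bits when masking is enabled for the current privilege level.

USER_MODE = 3        # assumed CPL value for user code
SUPERVISOR_MODE = 0  # assumed CPL value for supervisor code

def mask_linear_address(linear_addr, cpl, user_mask_bit, supervisor_mask_bit,
                        masked_bits=range(57, 63)):
    """Select the masking enable by privilege level; if enabled, clear the
    masked bits and return the resultant linear address unchanged otherwise."""
    enabled = user_mask_bit if cpl == USER_MODE else supervisor_mask_bit
    if not enabled:
        return linear_addr
    mask = 0
    for b in masked_bits:
        mask |= 1 << b
    return linear_addr & ~mask

# A tagged pointer with metadata in bits 57-62 resolves to the plain address.
tagged = (0x2A << 57) | 0x7FFE_DEAD_BEEF
assert mask_linear_address(tagged, USER_MODE, True, False) == 0x7FFE_DEAD_BEEF
```

With the relevant enable bit clear, the address passes through untouched, which matches the abstract's conditioning on the privilege level and the mode-specific masking bits.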
  • Patent number: 11960437
    Abstract: A system includes a high-bandwidth inter-chip network (ICN) that allows communication between parallel processing units (PPUs) in the system. For example, the ICN allows a PPU to communicate with other PPUs on the same compute node or server and also with PPUs on other compute nodes or servers. In embodiments, communication may be at the command level (e.g., at the direct memory access level) and at the instruction level (e.g., the finer-grained load/store instruction level). The ICN allows PPUs in the system to communicate without using a PCIe bus, thereby avoiding its bandwidth limitations and relative lack of speed. Each PPU's routing table holds information about multiple paths to any given other PPU.
    Type: Grant
    Filed: July 15, 2022
    Date of Patent: April 16, 2024
    Assignee: T-Head (Shanghai) Semiconductor Co., Ltd.
    Inventors: Liang Han, Yunxiao Zou
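The multi-path routing tables mentioned at the end of the abstract can be illustrated with a small sketch; the topology, PPU identifiers, and least-loaded selection policy are assumptions made for the example:

```python
# Illustrative per-PPU routing table: multiple candidate paths per destination,
# with a simple least-loaded selection among them.

class RoutingTable:
    def __init__(self):
        self.paths = {}   # dest PPU id -> list of candidate paths
        self.load = {}    # path (tuple of hop PPU ids) -> outstanding traffic

    def add_path(self, dest, hops):
        path = tuple(hops)
        self.paths.setdefault(dest, []).append(path)
        self.load[path] = 0

    def select_path(self, dest):
        """Pick the least-loaded of the known paths to `dest`."""
        return min(self.paths[dest], key=lambda p: self.load[p])

table = RoutingTable()
table.add_path(dest=3, hops=[0, 1, 3])   # one route via PPU 1
table.add_path(dest=3, hops=[0, 2, 3])   # alternate route via PPU 2
table.load[(0, 1, 3)] = 5                # first route is busy
assert table.select_path(3) == (0, 2, 3)
```

Having more than one recorded path per destination is what lets traffic route around a congested or failed link without involving the PCIe bus.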
  • Patent number: 11960922
    Abstract: In an embodiment, a processor comprises: an execution circuit to execute instructions; at least one cache memory coupled to the execution circuit; and a table storage element coupled to the at least one cache memory, the table storage element to store a plurality of entries each to store object metadata of an object used in a code sequence. The processor is to use the object metadata to provide user space multi-object transactional atomic operation of the code sequence. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: April 16, 2024
    Assignee: Intel Corporation
    Inventors: Joshua B. Fryman, Jason M. Howard, Ibrahim Hur, Robert Pawlowski
  • Patent number: 11954487
    Abstract: Disclosed are apparatuses, systems, and techniques to perform and facilitate fast and efficient modular computational operations, such as modular division and modular inversion, using shared platforms, including hardware accelerator engines.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: April 9, 2024
    Assignee: Nvidia Corporation
    Inventors: Shuai Wang, Chen Yao, Xiao Wu, Rongzhe Zhu, Yuji Qian, Xixi Xie
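The patent concerns hardware accelerator engines, but the underlying arithmetic is easy to show in software. A minimal sketch of modular inversion via the extended Euclidean algorithm; modular division a/b mod m is then a * inverse(b) mod m:

```python
def mod_inverse(a, m):
    """Return x such that (a * x) % m == 1, via the extended Euclidean
    algorithm; raises if a has no inverse modulo m."""
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a has no inverse modulo m")
    return old_s % m

def mod_divide(a, b, m):
    """Modular division: a / b mod m."""
    return (a * mod_inverse(b, m)) % m

assert (mod_inverse(3, 7) * 3) % 7 == 1
assert mod_divide(6, 3, 7) == 2
```

A hardware engine would typically use a constant-time variant of this recurrence; the software form above only shows the mathematics, not the accelerator design.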
  • Patent number: 11954488
    Abstract: A neural processing device, a processing element included therein and a method for operating various formats of the neural processing device are provided. The neural processing device includes at least one neural processor, a shared memory shared by the at least one neural processor, and a global interconnection configured to transmit data between the at least one neural processor and the shared memory, wherein each of the at least one neural processor comprises at least one processing element, each of the at least one processing element receives an input in a first format and thereby performs an operation, and receives an input in a second format that is different from the first format and thereby performs an operation if a format conversion signal is received, and the first format and the second format have a same number of bits.
    Type: Grant
    Filed: August 31, 2023
    Date of Patent: April 9, 2024
    Assignee: Rebellions Inc.
    Inventors: Karim Charfi, Jinwook Oh
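The idea of one processing element accepting two same-width formats under a conversion signal can be sketched as follows; the two 8-bit fixed-point formats (Q4.4 and Q2.6) are purely illustrative assumptions, not the formats used by the device:

```python
# A processing element that multiplies two 8-bit operands, interpreting them
# in a second format of the same bit width when the conversion signal is set.

def to_real(bits, fmt):
    """Decode an 8-bit unsigned fixed-point value in the given format."""
    frac_bits = 4 if fmt == "q4.4" else 6   # q2.6 otherwise
    return bits / (1 << frac_bits)

def pe_multiply(a_bits, b_bits, convert=False):
    """Multiply two operands; `convert` selects the second format."""
    fmt = "q2.6" if convert else "q4.4"
    return to_real(a_bits, fmt) * to_real(b_bits, fmt)

assert pe_multiply(0x10, 0x10) == 1.0                # Q4.4: 1.0 * 1.0
assert pe_multiply(0x40, 0x40, convert=True) == 1.0  # Q2.6: 1.0 * 1.0
```

The point mirrored from the abstract is that both formats occupy the same number of bits, so only the interpretation of the operand changes, not the datapath width.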
  • Patent number: 11954492
    Abstract: Techniques are disclosed relating to channel stalls or deactivations based on the latency of prior operations. In some embodiments, a processor includes a plurality of channel pipelines for a plurality of channels and a plurality of execution pipelines shared by the channel pipelines and configured to perform different types of operations provided by the channel pipelines. First scheduler circuitry may assign threads to channels and second scheduler circuitry may assign an operation from a given channel to a given execution pipeline based on decode of an operation for that channel. Dependency circuitry may, for a first operation that depends on a prior operation that uses one of the execution pipelines, determine, based on status information for the prior operation from the one of the execution pipelines, whether to stall the first operation or to deactivate a thread that includes the first operation from its assigned channel.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: April 9, 2024
    Assignee: Apple Inc.
    Inventors: Benjiman L. Goodman, Dzung Q. Vu, Robert Kenney
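The stall-or-deactivate decision can be modeled as a simple latency threshold on the status reported by the execution pipeline; the threshold value and the "cycles until ready" signal are assumptions for illustration:

```python
# Dependency resolution sketch: stall the channel for short waits, deactivate
# the thread (freeing its channel for other threads) for long ones.

STALL_THRESHOLD = 4   # assumed break-even point, in cycles

def resolve_dependency(cycles_until_ready):
    """Return 'stall' for short waits, 'deactivate' for long ones."""
    return "stall" if cycles_until_ready <= STALL_THRESHOLD else "deactivate"

assert resolve_dependency(2) == "stall"        # e.g. short-latency ALU result
assert resolve_dependency(20) == "deactivate"  # e.g. long-latency memory load
```

Deactivating on long waits is what keeps the shared execution pipelines fed: the channel can be reassigned instead of idling behind one slow operation.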
  • Patent number: 11947965
    Abstract: When a transformation job of flow logs generated for a cloud environment is triggered, a security service determines, based on the type of transformation job, which parameterized template to use for the batch data processing operations offered by the cloud service provider (CSP). The security service communicates an indication of the template and the corresponding parameter values to a data processing service/pipeline offered by the CSP. The provisioned processing resources retrieve the flow logs from a designated location in cloud storage, complete the transformation, and store the transformed flow logs in a new storage location. If the CSP does not provide a data processing service/pipeline that can perform bulk data transformation, the security service uses a generic parameterized template specifying a transformation job to be run on a cluster. Upon completion, the security service retrieves and analyzes the transformed flow logs as part of threat detection performed for securing the cloud environment.
    Type: Grant
    Filed: August 1, 2022
    Date of Patent: April 2, 2024
    Assignee: Palo Alto Networks, Inc.
    Inventor: Krishnan Shankar Narayan
  • Patent number: 11947964
    Abstract: Examples of a carry chain for performing an operation on operands, each including elements of a selectable size, are provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.
    Type: Grant
    Filed: October 25, 2022
    Date of Patent: April 2, 2024
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Kravitz
  • Patent number: 11940947
    Abstract: In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
    Type: Grant
    Filed: January 6, 2023
    Date of Patent: March 26, 2024
    Assignee: NVIDIA Corporation
    Inventors: Ching-Yu Hung, Ravi P. Singh, Jagadeesh Sankaran, Yen-Te Shih, Ahmad Itani
  • Patent number: 11941398
    Abstract: A method for restoring a mapper of a processor core includes saving first information in a staging latch. The first information represents a newly dispatched first instruction of the processor core and is saved in an entry latch of a save-and-restore buffer. In response to reception of a flush command of the processor core, the restoration of the mapper is begun with the first information from the staging latch without waiting for a comparison of a flush tag of the flush command with the entry latch of the save-and-restore buffer. A processor core configured to perform the method described above is also provided. A processor core is also provided that includes a dispatch, a mapper, a save-and-restore buffer that includes entry latches and is connected to the mapper via at least one pipeline, and a register disposed in the at least one pipeline.
    Type: Grant
    Filed: December 5, 2022
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Steven J. Battle, Dung Q. Nguyen, Susan E. Eisen, Cliff Kucharski, Salma Ayub
  • Patent number: 11934940
    Abstract: The present disclosure describes a data processing method and related products. The data processing method includes: generating, by a general-purpose processor, a binary instruction according to device information of an AI processor, and generating an AI learning task according to the binary instruction; transmitting, by the general-purpose processor, the AI learning task to a cloud AI processor for running; receiving, by the general-purpose processor, a running result corresponding to the AI learning task; and determining, by the general-purpose processor, an offline running file according to the running result, where the offline running file is generated according to the device information of the AI processor and the binary instruction when the running result satisfies a preset requirement. By implementing the present disclosure, debugging between the AI algorithm model and the AI processor can be performed in advance.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: March 19, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Yao Zhang, Xiaofu Meng, Shaoli Liu
  • Patent number: 11928467
    Abstract: In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: March 12, 2024
    Assignee: Apple Inc.
    Inventors: Brian R. Mestan, Gideon N. Levinsky, Michael L. Karm
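A toy model of the predictor: a 2-bit saturating counter per atomic instruction address decides whether store data may be forwarded to a younger load. The counter width and indexing scheme are assumptions, not details from the patent:

```python
# Atomic-success predictor sketch: forward store data to a subsequent load
# only when the atomic at this PC has been completing successfully.

class AtomicPredictor:
    def __init__(self):
        self.counters = {}   # atomic PC -> 2-bit saturating counter (0..3)

    def predict_success(self, pc):
        """Forward only when the counter leans toward success."""
        return self.counters.get(pc, 2) >= 2

    def update(self, pc, succeeded):
        """Train the counter with the atomic's actual outcome."""
        c = self.counters.get(pc, 2)
        self.counters[pc] = min(c + 1, 3) if succeeded else max(c - 1, 0)

pred = AtomicPredictor()
pred.update(0x400, False)   # the atomic at PC 0x400 keeps failing,
pred.update(0x400, False)
assert not pred.predict_success(0x400)   # so suppress forwarding
```

Suppressing forwarding for a failing atomic is the payoff described in the abstract: the dependent load never consumes stale store data, so it never has to be flushed and replayed.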
  • Patent number: 11922208
    Abstract: Systems and methods are disclosed for switching between batch processing and real-time processing of time series data, with a system being configured to switch between a batch processing module and a real-time processing module to process time series data. The system includes an orchestration service to indicate when to switch, which may be based on a switching event identified by the orchestration service. In some implementations, the orchestration service identifies a switching event in incoming time series data to be processed. When a batch processing module is to be used to batch process time series data, the real-time processing module may be disabled, with the real-time processing module being enabled when it is used to process the time series data. In some implementations, the real-time processing module includes the same processing models as the batch processing module such that the two modules' outputs have a similar accuracy.
    Type: Grant
    Filed: May 31, 2023
    Date of Patent: March 5, 2024
    Assignee: Intuit Inc.
    Inventors: Immanuel David Buder, Shashank Shashikant Rao
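The orchestration service's switching behavior can be sketched as a router that flips the active module when it identifies a switching event; detecting the event via a `mode` field in the incoming records is an assumption for the example:

```python
# Orchestration sketch: exactly one of the batch / real-time modules is
# enabled at a time, and a switching event flips which one handles the data.

class Orchestrator:
    def __init__(self):
        self.active = "batch"   # start with the batch module enabled

    def route(self, record):
        """Return which module should process this record, switching modes
        first if the record carries a switching event."""
        if record.get("mode") in ("batch", "realtime"):
            self.active = record["mode"]
        return self.active

orc = Orchestrator()
assert orc.route({"value": 1}) == "batch"
assert orc.route({"mode": "realtime", "value": 2}) == "realtime"
assert orc.route({"value": 3}) == "realtime"   # stays until the next event
```

Because the disabled module is only re-enabled at a switching event, and both modules share the same processing models, the output accuracy stays consistent across the switch, as the abstract notes.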
  • Patent number: 11922207
    Abstract: An approach is provided for coalescing network commands in a GPU that implements a SIMT architecture. Compatible next network operations from different threads are coalesced into a single network command packet. This reduces the number of network command packets generated and issued by threads, thereby increasing efficiency and improving throughput. The approach is applicable to any number of threads and any thread organization methodology, such as wavefronts, warps, etc.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: March 5, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael W. LeBeane, Khaled Hamidouche, Brandon K. Potter
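The coalescing idea can be sketched in a few lines; treating operations as compatible when they target the same remote node at contiguous offsets is an assumed rule for the example, not necessarily the patent's compatibility test:

```python
# Coalescing sketch: per-thread network operations from one wavefront are
# merged into fewer command packets when they are compatible.

def coalesce(ops):
    """ops: list of (dest_node, offset, nbytes), one per thread.
    Returns coalesced command packets as (dest_node, base_offset, nbytes)."""
    packets = []
    for dest, off, n in sorted(ops):
        if packets and packets[-1][0] == dest and \
           packets[-1][1] + packets[-1][2] == off:
            d, base, size = packets[-1]
            packets[-1] = (d, base, size + n)   # extend the open packet
        else:
            packets.append((dest, off, n))      # start a new packet
    return packets

# Four 8-byte puts from four threads collapse into one 32-byte packet:
ops = [(1, 0, 8), (1, 8, 8), (1, 16, 8), (1, 24, 8)]
assert coalesce(ops) == [(1, 0, 32)]
```

One packet instead of four is exactly the reduction in generated-and-issued packets that the abstract credits for the throughput gain.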
  • Patent number: 11907100
    Abstract: A method of tracing instruction execution on a processor of an integrated circuit chip in real time whilst the processor continues to execute instructions during clock cycles of the processor. The instruction execution of the processor is monitored by counting the number of successive instructions which are retired contiguously in time to form an instruction count, and counting the number of subsequent contiguous clock cycles of the processor during which no instruction is retired to form a stall count. A trace message is generated which includes the instruction count and the stall count, and the trace message is outputted.
    Type: Grant
    Filed: April 16, 2020
    Date of Patent: February 20, 2024
    Assignee: Siemens Industry Software Inc.
    Inventor: Iain Robertson
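The trace encoding follows directly from the abstract: count contiguous retirements, then contiguous stall cycles, and emit the pair as one message. The per-cycle retire-flag input and the message layout below are simplifying assumptions:

```python
# Real-time instruction trace sketch: compress per-cycle retire activity into
# (instruction_count, stall_count) messages.

def trace_messages(retired_per_cycle):
    """retired_per_cycle: iterable of booleans, one per clock cycle, True
    when an instruction retired that cycle. Returns the trace messages as
    (instruction_count, stall_count) pairs."""
    msgs, insts, stalls = [], 0, 0
    for retired in retired_per_cycle:
        if retired:
            if stalls:                     # a stall run just ended: emit
                msgs.append((insts, stalls))
                insts, stalls = 0, 0
            insts += 1
        else:
            stalls += 1
    if insts or stalls:
        msgs.append((insts, stalls))       # flush the final run
    return msgs

# Three instructions, two stall cycles, then one more instruction:
assert trace_messages([1, 1, 1, 0, 0, 1]) == [(3, 2), (1, 0)]
```

Because each message covers a whole run of cycles rather than a single cycle, the trace stream stays compact enough to emit in real time while the processor keeps executing.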
  • Patent number: 11907712
    Abstract: Systems, methods, and apparatuses relating to circuitry to implement out-of-order access to a shared microcode sequencer by a clustered decode pipeline are described.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Thomas Madaelil, Jonathan Combs, Vikash Agarwal
  • Patent number: 11893398
    Abstract: Methods, apparatuses, and systems for implementing data flows in a processor are described herein. A data flow manager may be configured to generate a configuration packet for a compute operation based on status information regarding multiple processing elements of the processor. Accordingly, multiple processing elements of a processor may concurrently process data flows based on the configuration packet. For example, the multiple processing elements may implement a mapping of processing elements to memory, while also implementing identified paths, through the processor, for the data flows. After executing the compute operation at certain processing elements of the processor, the processing results may be provided. In speech signal processing operations, the processing results may be compared to phonemes to identify those components of human speech.
    Type: Grant
    Filed: February 16, 2022
    Date of Patent: February 6, 2024
    Assignee: MICRON TECHNOLOGY, INC.
    Inventors: Jeremy Chritz, Tamara Schmitz, Fa-Long Luo, David Hulton
  • Patent number: 11892970
    Abstract: A method for data processing and a processor chip are provided. The method includes: acquiring a first relationship instruction; executing, based on the first relationship instruction, at least one first computing instruction acquired before the first relationship instruction; and sending acknowledgment information based on the first relationship instruction in response to completing execution of the at least one first computing instruction, so that a second coprocessor receiving the acknowledgment information reverts to a state of acquiring a second computing instruction that follows a second relationship instruction acquired by the second coprocessor.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: February 6, 2024
    Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY
    Inventors: Jing Wang, Jiaxin Shi, Hanlin Xie, Xiaozhang Gong
  • Patent number: 11886881
    Abstract: Apparatuses and methods are provided, relating to the control of data processing in devices which comprise both decoupled access-execute processing circuitry and prefetch circuitry. Control of the access portion of the decoupled access-execute processing circuitry may be dependent on a performance metric of the prefetch circuitry. Alternatively or in addition, control of the prefetch circuitry may be dependent on a performance metric of the access portion.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: January 30, 2024
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Michiel Willem Van Tol, Stefanos Kaxiras
  • Patent number: 11886378
    Abstract: A processor includes an array of resistive processing units connected between row and column lines with a resistive element. A first single instruction, multiple data processing unit (SIMD) is connected to the row lines. A second SIMD is connected to the column lines. A first instruction issuer is connected to the first SIMD to issue instructions to the first SIMD, and a second instruction issuer is connected to the second SIMD to issue instructions to the second SIMD such that the processor is programmable and configurable for specific operations depending on an issued instruction set.
    Type: Grant
    Filed: December 28, 2020
    Date of Patent: January 30, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Tayfun Gokmen