Patents Examined by Shawn Doman
- Patent number: 12380057
  Abstract: Techniques for improving the computing efficiency of a processor by optimizing the computational size of each computing core in the processor are provided. The techniques include obtaining a configuration space for a target parameter; obtaining a computational time model of the processor, the computational time model being a function of the target parameter and the number of computing cores of the processor; traversing the target parameter in the configuration space and calculating, based on the computational time model, a computational time for each selected value of the target parameter; in response to the k-th value of the target parameter yielding the minimum computational time, determining the target parameter to be that k-th value; and improving the computing efficiency of the processor by configuring the computational size of each computing core in the processor based on the k-th value.
  Type: Grant
  Filed: September 25, 2024
  Date of Patent: August 5, 2025
  Assignee: Beijing Zitiao Network Technology Co., Ltd.
  Inventors: Yunfeng Shi, Hangjian Yuan, Tao Li, Jing Xing, Jian Wang
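
  A minimal sketch of the parameter sweep the abstract describes, assuming a caller-supplied cost model `compute_time(param, num_cores)` and an explicit list of candidate values standing in for the configuration space; the names and the toy model are illustrative, not taken from the patent:

  ```python
  def select_best_parameter(config_space, compute_time, num_cores):
      """Traverse the configuration space and return the parameter value
      that minimizes the modeled computational time."""
      best_param, best_time = None, float("inf")
      for param in config_space:                 # traverse the target parameter
          t = compute_time(param, num_cores)     # evaluate the computational time model
          if t < best_time:
              best_param, best_time = param, t   # remember the value with minimum time
      return best_param

  # Toy cost model: per-core work that shrinks with tile size plus a per-tile overhead.
  toy_model = lambda tile, cores: 1e4 / (tile * cores) + 0.5 * tile
  print("selected tile size:", select_best_parameter(range(1, 129), toy_model, num_cores=16))
  ```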
- Patent number: 12360766
  Abstract: A hardware accelerator for running an instruction set of a recurrent neural network, a data processing method, a system-level chip, and a medium are provided. The hardware accelerator is configured to process the instruction set.
  Type: Grant
  Filed: May 24, 2023
  Date of Patent: July 15, 2025
  Assignee: DAPUSTOR CORPORATION
  Inventors: Yan Wang, Yunxin Huang, Jixing Zhang, Weijun Li
- Patent number: 12360768
  Abstract: Methods and apparatus relating to throttling a code fetch for speculative code paths are described. In an embodiment, a first storage structure stores a reference to a code line in response to a request received from a cache. A second storage structure stores a reference to the code line in response to an update to an Instruction Dispatch Queue (IDQ). Logic circuitry controls additional code line fetch operations based at least in part on a comparison of a number of ongoing speculative code fetches and a determination that the code line is speculative. Other embodiments are also disclosed and claimed.
  Type: Grant
  Filed: December 16, 2021
  Date of Patent: July 15, 2025
  Assignee: Intel Corporation
  Inventors: Anant Vithal Nori, Prathmesh Kallurkar, Sreenivas Subramoney, Niranjan Kumar Soundararajan
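
  A rough software model of the throttling decision, assuming a hypothetical cap on in-flight speculative fetches; the class, method names, and threshold are illustrative rather than anything specified by the patent:

  ```python
  class FetchThrottle:
      """Toy model: allow an additional code-line fetch only when the line is
      non-speculative or the count of ongoing speculative fetches is low."""

      def __init__(self, max_speculative_fetches=4):
          self.max_speculative_fetches = max_speculative_fetches
          self.ongoing_speculative = 0            # fetches issued but not yet resolved

      def may_fetch(self, line_is_speculative: bool) -> bool:
          if not line_is_speculative:
              return True                         # demand fetches are never throttled here
          return self.ongoing_speculative < self.max_speculative_fetches

      def issue(self, line_is_speculative: bool) -> None:
          if line_is_speculative:
              self.ongoing_speculative += 1

      def resolve(self, line_was_speculative: bool) -> None:
          if line_was_speculative:
              self.ongoing_speculative = max(0, self.ongoing_speculative - 1)
  ```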
- Patent number: 12360804
  Abstract: A processing system flexibly schedules workgroups across kernels based on data dependencies between workgroups to enhance processing efficiency. The workgroups are partitioned into subsets based on the data dependencies and workgroups of a first subset that produces data are scheduled to execute immediately before workgroups of a second subset that consumes the data generated by the first subset. Thus, the processing system does not execute one kernel at a time, but instead schedules workgroups across kernels based on data dependencies across kernels. By limiting the sizes of the subsets to the amount of data that can be stored at local caches, the processing system increases the probability that data to be consumed by workgroups of a subset will be resident in a local cache and will not require a memory access.
  Type: Grant
  Filed: December 30, 2022
  Date of Patent: July 15, 2025
  Assignee: Advanced Micro Devices, Inc.
  Inventor: Harris Gasparakis
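
  A simplified sketch of the cross-kernel scheduling idea, assuming each producer workgroup lists the consumer workgroups that depend on its output and that `subset_size` has been chosen so a subset's output fits in the local cache; all names are illustrative:

  ```python
  from collections import deque

  def schedule_across_kernels(producers, consumers_of, subset_size):
      """Interleave producer and consumer workgroups so that consumers run
      immediately after the subset of producers whose data they need.

      producers:    ordered producer workgroup ids
      consumers_of: dict mapping a producer id to its consumer workgroup ids
      subset_size:  number of producers whose output fits in the local cache
      """
      schedule = []
      pending_consumers = deque()
      for i, wg in enumerate(producers, start=1):
          schedule.append(("producer", wg))
          pending_consumers.extend(consumers_of.get(wg, []))
          if i % subset_size == 0:                # subset complete: its data is cache-resident
              while pending_consumers:
                  schedule.append(("consumer", pending_consumers.popleft()))
      while pending_consumers:                    # drain any leftover consumers
          schedule.append(("consumer", pending_consumers.popleft()))
      return schedule

  plan = schedule_across_kernels(
      producers=["p0", "p1", "p2", "p3"],
      consumers_of={"p0": ["c0"], "p1": ["c1"], "p2": ["c2"], "p3": ["c3"]},
      subset_size=2)
  print(plan)   # p0, p1 then c0, c1, followed by p2, p3 then c2, c3
  ```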
- Patent number: 12353880
  Abstract: In an embodiment a One-Time Programmable (OTP) memory controller includes a data register, a given number K of shadow-registers, wherein the number K is smaller than a given number N of memory slots of an OTP memory area, a communication interface configured to receive a read request requesting the data of a given memory slot and a control circuit configured to receive a preload start signal and a shadow-register preload enable signal, wherein the control circuit is configured to manage a preload phase and a data-read phase.
  Type: Grant
  Filed: May 30, 2023
  Date of Patent: July 8, 2025
  Assignee: STMicroelectronics S.r.l.
  Inventors: Antonino Giuseppe Fontana, Giuseppe Guarnaccia, Stefano Catalano
- Patent number: 12346728
  Abstract: Systems and methods are provided related to a scheduler that receives a job request from a virtual function associated with a tenant for execution by at least one processing unit. The scheduler validates the job request in accordance with one or more defined restrictions associated with the tenant and, responsive to successful validation, provides the job request for execution by the processing unit via one or more physical functions associated with the processing unit. In certain embodiments, multi-level enforcement of the defined restrictions is provided via user-mode and kernel-mode drivers associated with the virtual function, which are also enabled to validate job requests based on the defined restrictions.
  Type: Grant
  Filed: December 1, 2022
  Date of Patent: July 1, 2025
  Assignee: ATI TECHNOLOGIES ULC
  Inventors: Ahmed M. Abdelkhalek, Rutao Zhang, Bokun Zhang, Min Zhang, Yinan Jiang, Jeffrey G. Cheng
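
  A high-level sketch of the validate-then-dispatch flow, assuming per-tenant restrictions kept in a plain dictionary and a callable standing in for the physical function; both are placeholders, not the patent's interfaces:

  ```python
  class JobScheduler:
      def __init__(self, tenant_restrictions, physical_function):
          self.tenant_restrictions = tenant_restrictions   # tenant id -> allowed limits
          self.physical_function = physical_function       # callable that runs a validated job

      def submit(self, tenant_id, job):
          limits = self.tenant_restrictions.get(tenant_id)
          if limits is None:
              raise PermissionError(f"unknown tenant {tenant_id}")
          # Validate the job request against the tenant's defined restrictions.
          if job["queue"] not in limits["allowed_queues"]:
              raise PermissionError("queue not permitted for this tenant")
          if job["memory_bytes"] > limits["max_memory_bytes"]:
              raise PermissionError("job exceeds the tenant's memory limit")
          # Successful validation: hand the job to the processing unit's physical function.
          return self.physical_function(job)
  ```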
- Patent number: 12346790
  Abstract: A state machine engine having a program buffer. The program buffer is configured to receive configuration data via a bus interface for configuring a state machine lattice. The state machine engine also includes a repair map buffer configured to provide repair map data to an external device via the bus interface. The state machine lattice includes multiple programmable elements. Each programmable element includes multiple memory cells configured to analyze data and to output a result of the analysis.
  Type: Grant
  Filed: February 15, 2023
  Date of Patent: July 1, 2025
  Assignee: Micron Technology, Inc.
  Inventors: Harold B Noyes, David R. Brown
- Patent number: 12346698
  Abstract: A streaming engine employed in a digital signal processor specifies a fixed data stream. Once started, the data stream is read only and cannot be written. Once fetched, the data stream is stored in a first-in-first-out buffer for presentation to functional units in the fixed order. Data use by the functional unit is controlled using the input operand fields of the corresponding instruction. A read only operand coding supplies the data to an input of the functional unit. A read/advance operand coding supplies the data and also advances the stream to the next sequential data elements. The read only operand coding permits reuse of data without requiring a register of the register file for temporary storage.
  Type: Grant
  Filed: June 12, 2023
  Date of Patent: July 1, 2025
  Assignee: Texas Instruments Incorporated
  Inventor: Joseph Zbiciak
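
  A toy model of the two operand codings, with a Python deque standing in for the first-in-first-out buffer; the method names are illustrative:

  ```python
  from collections import deque

  class StreamBuffer:
      """FIFO holding a fetched, read-only data stream."""

      def __init__(self, elements):
          self.fifo = deque(elements)

      def read(self):
          """'Read only' coding: supply the head element without consuming it,
          so the same data can be reused by later instructions."""
          return self.fifo[0]

      def read_advance(self):
          """'Read/advance' coding: supply the head element and advance the
          stream to the next sequential data element."""
          return self.fifo.popleft()

  stream = StreamBuffer([10, 20, 30])
  assert stream.read() == 10           # reuse without a temporary register
  assert stream.read_advance() == 10   # consume and move on
  assert stream.read() == 20
  ```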
- Patent number: 12340223
  Abstract: Branch prediction techniques for pipelined microprocessors are disclosed. A microprocessor for branch predictor selection includes a fetch stage configured to retrieve instructions from a memory, a buffer configured to store instructions retrieved by the fetch stage, and one or more pipelined stages configured to execute the instructions stored in the buffer. The branch predictor, communicatively coupled to the buffer and the one or more pipelined stages, is configured to select a branch target predictor from a set of branch target predictors. Each of the branch target predictors comprises a trained model associated with a previously executed instruction to identify a target branch path for the instruction currently being executed based on the selected branch target predictor.
  Type: Grant
  Filed: December 6, 2021
  Date of Patent: June 24, 2025
  Assignee: HUAWEI TECHNOLOGIES CO., LTD.
  Inventors: Da Qi Ren, Qian Wang, XingYu Jiang
- Patent number: 12288069
  Abstract: A streaming engine employed in a digital signal processor specifies a fixed read only data stream. Once fetched, the data stream is stored in two head registers for presentation to functional units in the fixed order. Data use by the functional unit is preferably controlled using the input operand fields of the corresponding instruction. A first read only operand coding supplies data from the first head register. A first read/advance operand coding supplies data from the first head register and also advances the stream to the next sequential data elements. Corresponding second read only operand coding and second read/advance operand coding operate similarly with the second head register. A third read only operand coding supplies double width data from both head registers.
  Type: Grant
  Filed: March 18, 2024
  Date of Patent: April 29, 2025
  Assignee: Texas Instruments Incorporated
  Inventor: Joseph Zbiciak
- Patent number: 12282772
  Abstract: A processor includes a time counter, a vector coprocessor, and a vector data buffer for executing vector load and store instructions. The processor handles unit-stride, strided, or index-based addressing of data elements of a vector register. The vector data buffer includes crossbar switches for coupling between a plurality of data elements of a vector register and a plurality of data banks of the vector data buffer.
  Type: Grant
  Filed: June 30, 2023
  Date of Patent: April 22, 2025
  Assignee: Simplex Micro, Inc.
  Inventor: Thang Minh Tran
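
  A sketch of the three access patterns mentioned in the abstract, expressed as simple gathers from a flat memory list; the crossbar switches and data banks are not modeled:

  ```python
  def vector_gather(memory, base, num_elems, stride=1, indices=None):
      """Collect the data elements of a vector register from memory.
      - unit access:    stride == 1
      - strided access: stride > 1
      - indexed access: explicit per-element indices relative to base
      """
      if indices is not None:
          return [memory[base + i] for i in indices]
      return [memory[base + i * stride] for i in range(num_elems)]

  mem = list(range(100))
  print(vector_gather(mem, base=0, num_elems=4))                        # unit:    [0, 1, 2, 3]
  print(vector_gather(mem, base=0, num_elems=4, stride=8))              # strided: [0, 8, 16, 24]
  print(vector_gather(mem, base=10, num_elems=3, indices=[5, 0, 7]))    # indexed: [15, 10, 17]
  ```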
- Patent number: 12282776
  Abstract: Hybrid parallelized tagged geometric (TAGE) branch prediction, including: selecting, based on a branch instruction, a first plurality of counts from at least one TAGE table; selecting, based on the branch instruction, a second plurality of counts from at least one non-TAGE branch prediction table; and generating a branch prediction based on the first plurality of counts and the second plurality of counts; wherein selecting the first plurality of counts and selecting the second plurality of counts are performed during the same branch prediction pipeline stage.
  Type: Grant
  Filed: March 30, 2022
  Date of Patent: April 22, 2025
  Assignee: ADVANCED MICRO DEVICES, INC.
  Inventors: Anthony Jarvis, Thomas Clouqueur
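
  A very rough sketch of combining counts from TAGE and non-TAGE tables into a single prediction, assuming signed saturating counters where a non-negative sum means "taken"; the table indexing, tag matching, and update logic are omitted:

  ```python
  def hybrid_predict(tage_counts, non_tage_counts):
      """Combine counters selected from TAGE tables and from a non-TAGE table
      (for example a bimodal predictor) into one taken/not-taken prediction.
      Both selections are assumed to be available in the same pipeline stage,
      so the combination reduces to summing already-selected counters."""
      combined = sum(tage_counts) + sum(non_tage_counts)
      return combined >= 0          # True = predict taken

  # Counters selected by the branch address/history for one branch instruction.
  print(hybrid_predict(tage_counts=[2, -1], non_tage_counts=[1]))    # predict taken
  print(hybrid_predict(tage_counts=[-3], non_tage_counts=[-1, 0]))   # predict not taken
  ```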
- Patent number: 12260218
  Abstract: There is provided an apparatus and method for data processing. The apparatus comprises post-decode cracking circuitry responsive to receipt of decoded instructions from decode circuitry of a processing pipeline, to crack the decoded instructions into micro-operations to be processed by processing circuitry of the processing pipeline. The post-decode cracking circuitry is responsive to receipt of a decoded instruction suitable for cracking into a plurality of micro-operations including at least one pair of micro-operations having a producer-consumer data dependency, to generate the plurality of micro-operations including a producer micro-operation and a consumer micro-operation, and to assign a transfer register to transfer data between the producer micro-operation and the consumer micro-operation.
  Type: Grant
  Filed: June 28, 2023
  Date of Patent: March 25, 2025
  Assignee: Arm Limited
  Inventors: Quentin Éric Nouvel, Luca Nassi, Nicola Piano, Albin Pierrick Tonnerre, Geoffray Matthieu Lacourba
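
  An illustrative sketch of cracking one decoded instruction into a producer/consumer micro-op pair linked by a newly assigned transfer register; the instruction format and register naming are invented for the example:

  ```python
  import itertools

  _transfer_regs = itertools.count()          # stand-in for a pool of transfer registers

  def crack(decoded):
      """Crack a decoded load-and-add style instruction into two micro-ops:
      a producer that writes a transfer register and a consumer that reads it."""
      t = f"T{next(_transfer_regs)}"          # assign a transfer register for the dependency
      producer = {"op": "load", "dst": t, "addr": decoded["addr"]}
      consumer = {"op": "add", "dst": decoded["dst"], "srcs": [t, decoded["src"]]}
      return [producer, consumer]

  for uop in crack({"addr": "[r3]", "src": "r4", "dst": "r5"}):
      print(uop)
  ```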
- Patent number: 12260221
  Abstract: Circuitry comprises processing circuitry configured to execute program instructions in dependence upon respective trigger conditions matching a current trigger state and to set a next trigger state in response to program instruction execution; the processing circuitry comprising: instruction storage configured to selectively provide a group of two or more program instructions for execution in parallel; and trigger circuitry responsive to the generation of a trigger state by execution of program instructions and to a trigger condition associated with a given group of program instructions, to control the instruction storage to provide program instructions of the given group of program instructions for execution.
  Type: Grant
  Filed: January 19, 2022
  Date of Patent: March 25, 2025
  Assignee: Arm Limited
  Inventors: Mbou Eyole, Giacomo Gabrielli, Balaji Venu
- Patent number: 12242851
  Abstract: Methods and apparatus relating to verifying a compressed stream fused with copy or transform operation(s) are described. In an embodiment, compression logic circuitry compresses input data and stores the compressed data in a temporary buffer. The compression logic circuitry determines a first checksum value corresponding to the compressed data stored in the temporary buffer. Decompression logic circuitry performs a decompress-verify operation and a copy operation. The decompress-verify operation decompresses the compressed data stored in the temporary buffer to determine a second checksum value corresponding to the decompressed data from the temporary buffer. The copy operation transfers the compressed data from the temporary buffer to a destination buffer in response to a match between the first checksum value and the second checksum value. Other embodiments are also disclosed and claimed.
  Type: Grant
  Filed: September 9, 2021
  Date of Patent: March 4, 2025
  Assignee: Intel Corporation
  Inventors: Vinodh Gopal, James D. Guilford, Daniel F. Cutter
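
  The decompress-verify-then-copy flow can be mirrored in ordinary software; a sketch using zlib and CRC-32, with the temporary and destination buffers modeled as byte strings (buffer management and the checksum details are simplified):

  ```python
  import zlib

  def compress_verify_copy(input_data: bytes) -> bytes:
      # Compress into a temporary buffer and record a checksum of the source
      # data that the temporary buffer encodes.
      temp_buffer = zlib.compress(input_data)
      first_checksum = zlib.crc32(input_data)

      # Decompress-verify: decompress the temporary buffer and recompute the checksum.
      second_checksum = zlib.crc32(zlib.decompress(temp_buffer))

      # Copy: only transfer the compressed data to the destination on a checksum match.
      if first_checksum != second_checksum:
          raise ValueError("verification failed; compressed stream is corrupt")
      destination_buffer = bytes(temp_buffer)
      return destination_buffer

  out = compress_verify_copy(b"some input data " * 100)
  print(len(out), "compressed bytes copied to destination")
  ```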
- Patent number: 12242848
  Abstract: Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
  Type: Grant
  Filed: May 25, 2023
  Date of Patent: March 4, 2025
  Inventor: Sanjay Tiwari
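
  The underlying operation is a per-element popcount; a plain-Python sketch over a vector of fixed-width elements (the in-memory sensing-circuitry aspect is not modeled):

  ```python
  def vector_population_count(vector, element_bits=16):
      """Return the population count (number of set bits) of each fixed-length
      element of the vector."""
      mask = (1 << element_bits) - 1
      return [bin(elem & mask).count("1") for elem in vector]

  print(vector_population_count([0b1011, 0xFFFF, 0], element_bits=16))   # [3, 16, 0]
  ```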
- Patent number: 12210876
  Abstract: Instruction set architectures (ISAs) and apparatus and methods related thereto comprise an instruction set that includes one or more instructions which identify the global pointer (GP) register as an operand (e.g., base register or source register) of the instruction. Identification can be implicit. By implicitly identifying the GP register as an operand of the instruction, one or more bits of the instruction that were dedicated to explicitly identifying the operand (e.g., base register or source register) can be used to extend the size of one or more other operands, such as the offset or immediate, to provide longer offsets or immediates.
  Type: Grant
  Filed: August 31, 2018
  Date of Patent: January 28, 2025
  Assignee: MIPS Tech, LLC
  Inventors: James Hippisley Robinson, Morgyn Taylor, Matthew Fortune, Richard Fuhler, Sanjay Patel
- Patent number: 12204908
  Abstract: A branch predictor predicts a first outcome of a first branch in a first block of instructions. Fetch logic fetches instructions for speculative execution along a first path indicated by the first outcome. Information representing a remainder of the first block is stored in response to the first predicted outcome being taken. In response to the first branch instruction being not taken, the branch predictor is restarted based on the remainder block. In some cases, entries corresponding to second blocks along speculative paths from the first block are accessed using an address of the first block as an index into a branch prediction structure. Outcomes of branch instructions in the second blocks are concurrently predicted using a corresponding set of instances of branch conditional logic and the predicted outcomes are used in combination with the remainder block to restart the branch predictor in response to mispredictions.
  Type: Grant
  Filed: June 4, 2018
  Date of Patent: January 21, 2025
  Assignee: Advanced Micro Devices, Inc.
  Inventors: Marius Evers, Douglas Williams, Ashok T. Venkatachar, Sudherssen Kalaiselvan
- Patent number: 12198222
  Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides for data aware sparsity via compressed bitstreams. One embodiment provides for block sparse dot product instructions. One embodiment provides for a depth-wise adapter for a systolic array.
  Type: Grant
  Filed: December 7, 2023
  Date of Patent: January 14, 2025
  Assignee: Intel Corporation
  Inventors: Abhishek Appu, Subramaniam Maiyuran, Mike Macpherson, Fangwen Fu, Jiasheng Chen, Varghese George, Vasanth Ranganathan, Ashutosh Garg, Joydeep Ray
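
  A small sketch of a block sparse dot product, assuming the sparse operand is stored as a bitmask of non-zero blocks plus the packed non-zero block values; the systolic-array mapping and instruction encoding are not modeled:

  ```python
  def block_sparse_dot(dense, block_mask, packed_blocks, block_size):
      """Dot product of a dense vector with a block-sparse vector.
      block_mask[i] is 1 if block i of the sparse operand is non-zero;
      packed_blocks holds only the non-zero blocks, in order."""
      total = 0.0
      packed_iter = iter(packed_blocks)
      for i, present in enumerate(block_mask):
          if not present:
              continue                              # zero block: skip its multiplies entirely
          block = next(packed_iter)
          start = i * block_size
          for j in range(block_size):
              total += dense[start + j] * block[j]
      return total

  dense = [1.0] * 8
  print(block_sparse_dot(dense, block_mask=[1, 0, 0, 1],
                         packed_blocks=[[1, 2], [3, 4]], block_size=2))   # 1+2+3+4 = 10.0
  ```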
- Patent number: 12197308
  Abstract: On-circuit utilization monitoring may be performed for a systolic array. A current utilization measurement may be determined for processing elements of a systolic array and compared with a prior utilization measurement. Based on the comparison, a throttling recommendation may be provided to a management component to determine whether to perform the throttling recommendation.
  Type: Grant
  Filed: November 6, 2020
  Date of Patent: January 14, 2025
  Assignee: Amazon Technologies, Inc.
  Inventors: Thomas A Volpe, Ron Diamant
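
  A software analogue of the monitoring loop, assuming utilization is a fraction of busy processing elements and a fixed rise threshold triggers the recommendation; the threshold and interface are illustrative:

  ```python
  def throttling_recommendation(current_utilization, prior_utilization, rise_threshold=0.2):
      """Compare the current utilization measurement of the systolic array's
      processing elements with the prior measurement and recommend throttling
      when utilization has risen sharply; the management component still
      decides whether to act on the recommendation."""
      if current_utilization - prior_utilization > rise_threshold:
          return "throttle"
      return "no-throttle"

  print(throttling_recommendation(0.95, 0.60))   # throttle
  print(throttling_recommendation(0.70, 0.65))   # no-throttle
  ```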