Patents Examined by Corey S Faherty
-
Patent number: 11687341Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.Type: GrantFiled: August 29, 2019Date of Patent: June 27, 2023Assignee: Intel CorporationInventors: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
-
Patent number: 11687344Abstract: One aspect provides a system for hardware-assisted pre-execution. During operation, the system determines a pre-execution code region comprising one or more instructions. The system increments a global counter upon initiating the one or more instructions. The system issues a first instruction, which involves setting, in a first entry for the first instruction in a data structure, a first prefetch region identifier with a current value of the global counter. Responsive to a head pointer of the data structure reaching the first entry, the system: determines, based on a non-zero value for the first prefetch region identifier, that the first entry is not available to be allocated; and advances the head pointer to a next entry in the data structure, which renders a load associated with the first entry as a non-blocking load. The system resets the global counter upon completing the one or more instructions.Type: GrantFiled: August 25, 2021Date of Patent: June 27, 2023Assignee: Hewlett Packard Enterprise Development LPInventor: Sanyam Mehta
-
Patent number: 11681806Abstract: In an approach to protecting against out-of-bounds buffer references, an apparatus comprises one or more processor cores and a bounds-checking functional unit in each processor core configured to manage bounds information for one or more memory buffers. When a buffer is allocated, an address range of the buffer is stored. When a pointer is assigned an address within the address range of the buffer, the address range of the buffer is associated with the pointer. When the pointer is used to compute an address for an operation, whether the address for the operation is within the address range associated with the pointer is determined. If the address is not within the address range associated with the pointer, signaling that an error has occurred.Type: GrantFiled: October 15, 2019Date of Patent: June 20, 2023Assignee: International Business Machines CorporationInventors: Richard H. Boivie, Alper Buyuktosunoglu, Tong Chen
-
Patent number: 11675596Abstract: Various example embodiments for supporting message processing are presented. Various example embodiments for supporting message processing are configured to support message processing by a processor. Various example embodiments for supporting message processing by a processor are configured to support message processing by the processor based on dynamic control over processor instruction sets of the processor.Type: GrantFiled: November 4, 2020Date of Patent: June 13, 2023Assignee: NOKIA SOLUTIONS AND NETWORKS OYInventor: Pranjal Kumar Dutta
-
Patent number: 11675598Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.Type: GrantFiled: March 15, 2022Date of Patent: June 13, 2023Assignee: Micron Technology, Inc.Inventor: Tony M. Brewer
-
Patent number: 11675734Abstract: Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array.Type: GrantFiled: March 4, 2022Date of Patent: June 13, 2023Assignee: Micron Technology, Inc.Inventor: Tony M. Brewer
-
Patent number: 11675594Abstract: Embodiments of instructions are detailed herein including one or more of 1) a branch fence instruction, prefix, or variants (BFENCE); 2) a predictor fence instruction, prefix, or variants (PFENCE); 3) an exception fence instruction, prefix, or variants (EFENCE); 4) an address computation fence instruction, prefix, or variants (AFENCE); 5) a register fence instruction, prefix, or variants (RFENCE); and, additionally, modes that apply the above semantics to some or all ordinary instructions.Type: GrantFiled: December 28, 2018Date of Patent: June 13, 2023Assignee: Intel CorporationInventors: Robert S. Chappell, Jason W. Brandt, Alan Cox, Asit Mallick, Joseph Nuzman, Arjan Van De Ven
-
Patent number: 11669333Abstract: In certain aspects of the disclosure, an apparatus comprises a first scheduling pool associated with a first minimum scheduling latency and a second scheduling pool associated with a second minimum scheduling latency, the second minimum scheduling latency greater than the first minimum scheduling latency. A common instruction picker is coupled to both the first scheduling pool and the second scheduling pool. The common instruction picker may be configured to select a first instruction from the first scheduling pool and a second instruction from the second scheduling pool, and then choose either the first instruction or second instruction for dispatch according to a picking policy.Type: GrantFiled: April 26, 2018Date of Patent: June 6, 2023Assignee: Qualcomm IncorporatedInventors: Rodney Wayne Smith, Raghavan Madhavan, Luke Yen, Shivam Priyadarshi, Yusuf Cagatay Tekmen
-
Patent number: 11669328Abstract: A method for converting instructions is provided. The method is used in a processor and includes: receiving an instruction, wherein the instruction is an unknown instruction; determining whether the received instruction is a new instruction; and converting the received instruction into at least one old instruction when the received instruction is a new instruction.Type: GrantFiled: September 10, 2021Date of Patent: June 6, 2023Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.Inventors: Weilin Wang, Mengchen Yang, Yingbing Guan
-
Patent number: 11663004Abstract: An instruction to perform converting and scaling operations is provided. Execution of the instruction includes converting an input value in one format to provide a converted result in another format. The converted result is scaled to provide a scaled result. A result obtained from the scaled result is placed in a selected location. Further, an instruction to perform scaling and converting operations is provided. Execution of the instruction includes scaling an input value in one format to provide a scaled result and converting the scaled result from the one format to provide a converted result in another format. A result obtained from the converted result is placed in a selected location.Type: GrantFiled: February 26, 2021Date of Patent: May 30, 2023Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Eric Mark Schwarz, Kerstin Claudia Schelm, Petra Leber, Silvia Melitta Mueller, Reid Copeland, Xin Guo, Cedric Lichtenau
-
Patent number: 11663006Abstract: Methods and apparatuses relating to switching of a shadow stack pointer are described. In one embodiment, a hardware processor includes a hardware decode unit to decode an instruction, and a hardware execution unit to execute the instruction to: pop a token for a thread from a shadow stack, wherein the token includes a shadow stack pointer for the thread with at least one least significant bit (LSB) of the shadow stack pointer overwritten with a bit value of an operating mode of the hardware processor for the thread, remove the bit value in the at least one LSB from the token to generate the shadow stack pointer, and set a current shadow stack pointer to the shadow stack pointer from the token when the operating mode from the token matches a current operating mode of the hardware processor.Type: GrantFiled: June 7, 2021Date of Patent: May 30, 2023Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Jason W. Brandt, Ravi L. Sahita, Barry E. Huntley, Baiju V. Patel, Deepak K. Gupta
-
Patent number: 11663009Abstract: A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.Type: GrantFiled: October 14, 2021Date of Patent: May 30, 2023Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Sandhya Koteshwara, Kattamuri Ekanadham, Manoj Kumar, Jose E. Moreira, Pratap C. Pattnaik
-
Patent number: 11656873Abstract: An apparatus and method for efficiently managing shadow stacks. For example, one embodiment of a processor comprises: a plurality of registers to store a plurality of shadow stack pointers (SSPs); event processing circuitry to select a first SSP of the plurality of SSPs from a first register of the plurality of registers responsive to receipt of a first event associated with a first event priority level, the first SSP usable to identify a top of a first shadow stack; verification and utilization checking circuitry to determine whether the first SSP has been previously verified, wherein if the first SSP has not been previously verified then initiating a set of atomic operations to verify the first SSP and confirm that the first SSP is not in use, the set of atomic operations using a locking operation to lock data until the set of atomic operations are complete.Type: GrantFiled: February 1, 2022Date of Patent: May 23, 2023Assignee: Intel CorporationInventors: Vedvyas Shanbhogue, Gilbert Neiger, Deepak K. Gupta, H. Peter Anvin
-
Patent number: 11656877Abstract: Techniques are provided for executing wavefronts. The techniques include at a first time for issuing instructions for execution, performing first identifying, including identifying that sufficient processing resources exist to execute a first set of instructions together within a processing lane; in response to the first identifying, executing the first set of instructions together; at a second time for issuing instructions for execution, performing second identifying, including identifying that no instructions are available for which sufficient processing resources exist for execution together within the processing lane; and in response to the second identifying, executing an instruction independently of any other instruction.Type: GrantFiled: March 31, 2021Date of Patent: May 23, 2023Assignee: Advanced Micro Devices, Inc.Inventor: Maxim V. Kazakov
-
Patent number: 11656874Abstract: An asymmetrical processing system is provided. The processor has a vector unit comprised of one or more computational units coupled with a vector memory space and a scalar unit coupled with a data memory space and the vector memory space, the scalar unit accessing one or more memory locations within the vector memory space.Type: GrantFiled: October 8, 2015Date of Patent: May 23, 2023Assignee: NXP USA, Inc.Inventors: Malcolm Douglas Stewart, Daniel Claude Laroche, Trevor Graydon Burton, Ali Osman Ors
-
Patent number: 11650821Abstract: A system can include a microprocessor having a prefetch queue including a plurality of slots configured to store program counter values (PCVs) and instructions, a pipeline configured to receive instructions from the prefetch queue, and a select circuit coupled to the prefetch queue. The select circuit may selectively freeze a first slot of the plurality of slots and selectively output a frozen PCV and a frozen instruction from the first slot while frozen. The microprocessor can include write logic coupled to the prefetch queue and a comparator circuit coupled to the prefetch queue and the select circuit. The write logic may load data into unfrozen slots of the prefetch queue. The comparator circuit may compare a target PCV with the frozen PCV to determine a match. The select circuit indicates, to the pipeline, whether the frozen instruction is valid based on the comparing.Type: GrantFiled: May 19, 2021Date of Patent: May 16, 2023Assignee: Xilinx, Inc.Inventor: Stefan Asserhall
-
Patent number: 11650824Abstract: An integrated circuit including memory to store image data and filter weights, and a plurality of multiply-accumulator execution pipelines, each multiply-accumulator execution pipeline coupled to the memory to receive (i) image data and (ii) filter weights, wherein each multiply-accumulator execution pipeline processes the image data, using associated filter weights, via a plurality of multiply and accumulate operations. In one embodiment, the multiply-accumulator circuitry of each multiply-accumulator execution pipeline, in operation, receives a different set of image data, each set including a plurality of image data, and, using filter weights associated with the received set of image data, processes the set of image data associated therewith, via performing a plurality of multiply and accumulate operations concurrently with the multiply-accumulator circuitry of the other multiply-accumulator execution pipelines, to generate output data.Type: GrantFiled: November 18, 2021Date of Patent: May 16, 2023Assignee: Flex Logix Technologies, Inc.Inventors: Frederick A. Ware, Cheng C. Wang
-
Patent number: 11650820Abstract: A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed.Type: GrantFiled: December 14, 2020Date of Patent: May 16, 2023Assignee: INTEL CORPORATIONInventors: Elmoustapha Ould-Ahmed-Vall, Seth Abraham, Robert Valentine, Zeev Sperber, Amit Gradstein
-
Patent number: 11645076Abstract: Provided are embodiments for a method of performing register pressure targeted function splitting. The method can include determining a candidate region of a function, the candidate region comprising variables, and determining a number of available registers in a computing system for allocating the variables of the function. The method can also include grouping the variables in the candidate region into first variables and second variables based at least in part on the number of available registers, and splitting the candidate region of the function into split functions based at least in part on the grouping of the variables. Also provided are embodiments for a computer program product and a system for performing register pressure targeted function splitting.Type: GrantFiled: July 26, 2021Date of Patent: May 9, 2023Assignee: International Business Machines CorporationInventors: Jinsong Ji, Zheng Chen, Ke Wen Lin
-
Patent number: 11645077Abstract: Embodiments detailed herein relate to systems and methods to zero a tile register pair. In one example, a processor includes decode circuitry to decode a matrix pair zeroing instruction having fields for an opcode and an identifier to identify a destination matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded matrix pair zeroing instruction to zero every element of a left matrix and a right matrix of the identified destination matrix.Type: GrantFiled: June 1, 2021Date of Patent: May 9, 2023Assignee: Intel CorporationInventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman, Eyal Hadas