Patents Examined by Corey S Faherty
-
Patent number: 11797306Abstract: In accordance with an embodiment, a method verifies contents of a plurality of registers having two first registers, where each of the plurality of registers is configured to store a data word and a verification bit. The method includes determining whether a value of the verification bit of each respective register of the plurality of registers corresponds to the data word of its respective register. The data words stored in the two first registers are selected so that the bits of a same rank of the two first registers include two complementary bits, each bit of a common binary word is associated with a respective register of the plurality of registers, and the value of the verification bit of each respective register depends on the data word of the respective register and the bit of the common binary word associated with the respective register.Type: GrantFiled: April 26, 2022Date of Patent: October 24, 2023Assignee: STMicroelectronics (Grenoble 2) SASInventors: Gregory Trunde, Denis Dutey
-
Patent number: 11789737Abstract: Systems, methods, and apparatuses for generating a protected stack allocation pointer. In certain examples, a hardware processor core comprises a decoder circuit to decode a single instruction into a decoded single instruction, the single instruction comprising one or more fields to indicate a stack allocation index as an operand, and an opcode to indicate that an execution circuit is to generate a stack allocation pointer to reference an address in a stack and an address in a shadow stack; and an execution circuit to execute the decoded single instruction according to the opcode.Type: GrantFiled: March 24, 2022Date of Patent: October 17, 2023Assignee: Intel CorporationInventors: Michael Lemay, David M. Durham
-
Patent number: 11775302Abstract: A digital data processor includes an instruction memory storing instructions each specifying a data processing operation and at least one data operand field, an instruction decoder coupled to the instruction memory for sequentially recalling instructions from the instruction memory and determining the data processing operation and the at least one data operand, and at least one operational unit coupled to a data register file and to an instruction decoder to perform a data processing operation upon at least one operand corresponding to an instruction decoded by the instruction decoder and storing results of the data processing operation. The operational unit is configured to increment histogram values in response to a histogram instruction by incrementing a bin entry at a specified location in a specified number of at least one histogram.Type: GrantFiled: October 25, 2021Date of Patent: October 3, 2023Assignee: Texas Instruments IncorporatedInventors: Naveen Bhoria, Duc Bui, Rama Venkatasubramanian, Dheera Balasubramanian Samudrala, Alan Davis
-
Patent number: 11768681Abstract: An apparatus and method for performing multiply-accumulate operations.Type: GrantFiled: January 24, 2018Date of Patent: September 26, 2023Assignee: Intel CorporationInventors: Alexander Heinecke, Dipankar Das, Robert Valentine, Mark Charney
-
Patent number: 11768688Abstract: Methods and circuitry for efficient management of local branch history registers are described. An example processor includes a pipeline comprising a plurality of stages and a bit-vector associated with each of in-flight branches associated with the pipeline. The processor includes a recovery counter for tracking a number of bits needing recovery before a local branch history register is valid for participation in branch prediction. The processor includes branch predictor circuitry configured to, in response to an update of a local branch history register by a branch, set a bit in a corresponding bit-vector indicative of the update of the local branch history register. The branch predictor circuitry is configured to, upon a flush, determine a value indicative of an extent of recovery required for each local branch history register affected by the flush, and set a corresponding recovery counter to the value indicative of the extent of recovery required.Type: GrantFiled: June 2, 2022Date of Patent: September 26, 2023Assignee: Microsoft Technology Licensing, LLCInventors: Rami Mohammad Al Sheikh, Ahmed Helmi Mahmoud Osman Abulila, Daren Eugene Streett, Michael Scott McIlvaine
-
Patent number: 11755321Abstract: A circuit includes a data input that is configured to receive a data word, the data word including at least one operand which is rotated by a number of bits given by a rotation parameter, a first control input that is configured to receive the rotation parameter, a second control input that is configured to receive an indication of an operation to be performed, a first subcircuit that is configured to generate an operation- and rotation-dependent bit mask from the rotation parameter and the indication of the operation to be performed, a second subcircuit which is configured to process the at least one operand as a function of the bit mask and the operation to be performed, wherein the operand and the operation result generated by the processing remain in the rotated state, and a data output which is configured to output the operation result.Type: GrantFiled: January 13, 2022Date of Patent: September 12, 2023Assignee: INFINEON TECHNOLOGIES AGInventors: Florian Mendel, Martin Schlaeffer, Erich Wenger
-
Patent number: 11748651Abstract: An apparatus and method for scalable qubit addressing. For example, one embodiment of a processor comprises: a decoder comprising quantum instruction decode circuitry to decode quantum instructions to generate quantum microoperations (uops) and non-quantum decode circuitry to decode non-quantum instructions to generate non-quantum uops; execution circuitry comprising: an address generation unit (AGU) to generate a system memory address responsive to execution of one or more of the non-quantum uops; and quantum index generation circuitry to generate quantum index values responsive to execution of one or more of the quantum uops, each quantum index value uniquely identifying a quantum bit (qubit) in a quantum processor; wherein to generate a first quantum index value for a first quantum uop, the quantum index generation circuitry is to read the first quantum index value from a first architectural register identified by the first quantum uop.Type: GrantFiled: November 17, 2022Date of Patent: September 5, 2023Assignee: Intel CorporationInventor: Xiang Zou
-
Patent number: 11748270Abstract: In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.Type: GrantFiled: November 21, 2022Date of Patent: September 5, 2023Assignee: Texas Instruments IncorporatedInventors: Duc Quang Bui, Joseph Raymond Michael Zbiciak
-
Patent number: 11734225Abstract: a Systems and methods are provided for matrix tiling to accelerate computing in redundant matrices. The method may include identifying unique submatrices in the matrix; loading values of elements of each unique submatrix into a respective one of the array processors; applying the vector to inputs of each of the array processors; and adding outputs of the array processors according to locations of the unique submatrices in the matrix.Type: GrantFiled: July 31, 2020Date of Patent: August 22, 2023Assignee: Hewlett Packard Enterprise Development LPInventors: Suhas Kumar, Rui Liu
-
Patent number: 11726789Abstract: Embodiments of a multithreaded processor and a method of assigning blocks of register files for hardware threads of multithreaded processors are disclosed.Type: GrantFiled: January 27, 2022Date of Patent: August 15, 2023Assignee: NXP B.V.Inventor: Michael Andrew Fischer
-
Patent number: 11726785Abstract: A graphics processing unit and methods for comping and executing instructions with opportunistic inter-path reconvergence are provided. A graphics processing unit may access computer executable instructions mapped to code blocks of a control flow for a warp. The code blocks may include an immediate dominator block and an intermediate post dominator block. The graphics processing unit may store a first thread mask associated with the first code block. The first thread mask may include a plurality of bits indicative of the active or non-active status for the threads of the warp, respectively. The graphics processing unit may a second thread mask corresponding to an intermediate code block between the immediate dominator block and intermediate post dominator block. The graphics processing unit may execute, with threads indicated as active by the first thread mask, instructions of the intermediate code block with a first operand or a second operand depending on the second thread mask.Type: GrantFiled: September 30, 2021Date of Patent: August 15, 2023Assignee: Purdue Research FoundationInventors: Milind Kulkarni, Jad Hbeika
-
Patent number: 11720355Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.Type: GrantFiled: June 7, 2022Date of Patent: August 8, 2023Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 11720357Abstract: The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.Type: GrantFiled: December 16, 2019Date of Patent: August 8, 2023Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTDInventors: Yao Zhang, Bingrui Wang
-
Patent number: 11709672Abstract: The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.Type: GrantFiled: December 16, 2019Date of Patent: July 25, 2023Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTDInventors: Yao Zhang, Bingrui Wang
-
Patent number: 11704125Abstract: The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.Type: GrantFiled: December 16, 2019Date of Patent: July 18, 2023Assignee: Cambricon (Xi'an) Semiconductor Co., Ltd.Inventors: Yao Zhang, Bingrui Wang
-
Patent number: 11687344Abstract: One aspect provides a system for hardware-assisted pre-execution. During operation, the system determines a pre-execution code region comprising one or more instructions. The system increments a global counter upon initiating the one or more instructions. The system issues a first instruction, which involves setting, in a first entry for the first instruction in a data structure, a first prefetch region identifier with a current value of the global counter. Responsive to a head pointer of the data structure reaching the first entry, the system: determines, based on a non-zero value for the first prefetch region identifier, that the first entry is not available to be allocated; and advances the head pointer to a next entry in the data structure, which renders a load associated with the first entry as a non-blocking load. The system resets the global counter upon completing the one or more instructions.Type: GrantFiled: August 25, 2021Date of Patent: June 27, 2023Assignee: Hewlett Packard Enterprise Development LPInventor: Sanyam Mehta
-
Patent number: 11687341Abstract: In one embodiment, a matrix processor comprises a memory to store a matrix operand and a strided read sequence, wherein: the matrix operand is stored out of order in the memory; and the strided read sequence comprises a sequence of read operations to read the matrix operand in a correct order from the memory. The matrix processor further comprises circuitry to: receive a first instruction to be executed by the matrix processor, wherein the first instruction is to instruct the matrix processor to perform a first operation on the matrix operand; read the matrix operand from the memory based on the strided read sequence; and execute the first instruction by performing the first operation on the matrix operand.Type: GrantFiled: August 29, 2019Date of Patent: June 27, 2023Assignee: Intel CorporationInventors: Nitin N. Garegrat, Tony L. Werner, Jeff DelChiaro, Michael Rotzin, Robert T. Rhoades, Ujwal Basavaraj Sajjanar, Anne Q. Ye
-
Patent number: 11681806Abstract: In an approach to protecting against out-of-bounds buffer references, an apparatus comprises one or more processor cores and a bounds-checking functional unit in each processor core configured to manage bounds information for one or more memory buffers. When a buffer is allocated, an address range of the buffer is stored. When a pointer is assigned an address within the address range of the buffer, the address range of the buffer is associated with the pointer. When the pointer is used to compute an address for an operation, whether the address for the operation is within the address range associated with the pointer is determined. If the address is not within the address range associated with the pointer, signaling that an error has occurred.Type: GrantFiled: October 15, 2019Date of Patent: June 20, 2023Assignee: International Business Machines CorporationInventors: Richard H. Boivie, Alper Buyuktosunoglu, Tong Chen
-
Patent number: 11675596Abstract: Various example embodiments for supporting message processing are presented. Various example embodiments for supporting message processing are configured to support message processing by a processor. Various example embodiments for supporting message processing by a processor are configured to support message processing by the processor based on dynamic control over processor instruction sets of the processor.Type: GrantFiled: November 4, 2020Date of Patent: June 13, 2023Assignee: NOKIA SOLUTIONS AND NETWORKS OYInventor: Pranjal Kumar Dutta
-
Patent number: 11675594Abstract: Embodiments of instructions are detailed herein including one or more of 1) a branch fence instruction, prefix, or variants (BFENCE); 2) a predictor fence instruction, prefix, or variants (PFENCE); 3) an exception fence instruction, prefix, or variants (EFENCE); 4) an address computation fence instruction, prefix, or variants (AFENCE); 5) a register fence instruction, prefix, or variants (RFENCE); and, additionally, modes that apply the above semantics to some or all ordinary instructions.Type: GrantFiled: December 28, 2018Date of Patent: June 13, 2023Assignee: Intel CorporationInventors: Robert S. Chappell, Jason W. Brandt, Alan Cox, Asit Mallick, Joseph Nuzman, Arjan Van De Ven