Patents Examined by Keith E Vicary
-
Patent number: 11755324
Abstract: A computer system, processor, programming instructions and/or method for managing operations of a gather buffer for a processor core load storage unit. The processor core includes a processing pipeline having one or more execution units for processing unaligned load instructions that executes in two phases to satisfy. A buffer storage element is provided having a plurality of entries for temporarily collecting partial writeback results retrieved from the memory that are associated with first phase accesses for each of a plurality of unaligned load instructions. An associated logic controller device tracks two parts of the unaligned load to be gathered at independent times, wherein said partial result stored at said buffer storage element comprises a first part of an unaligned load. The second phase load access for the same instruction is independently accessed and later merged with first part of the load data at byte granularity to satisfy the load.
Type: Grant
Filed: August 31, 2021
Date of Patent: September 12, 2023
Assignee: International Business Machines Corporation
Inventors: Kimberly M. Fernsler, Bryan Lloyd, David A. Hrusecky, David A. Campbell
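As a rough illustration of the two-phase gather described in this abstract, the following Python sketch models a buffer that holds the first-phase bytes of an unaligned load until the second-phase bytes arrive, then merges them byte by byte. The class name `GatherBuffer`, its methods, and the tag-keyed layout are assumptions made for illustration; the patent's hardware design is not reproduced here.

```python
class GatherBuffer:
    """Hypothetical software model of a gather buffer (names are illustrative)."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = {}  # instruction tag -> bytes from the first-phase access

    def store_first_phase(self, tag, data):
        """Hold the first part of an unaligned load; False if no entry is free."""
        if tag not in self.entries and len(self.entries) >= self.num_entries:
            return False
        self.entries[tag] = bytes(data)
        return True

    def complete(self, tag, second_phase, load_size):
        """Merge the stored first part with the second part at byte granularity."""
        first = self.entries.pop(tag)
        merged = first + bytes(second_phase)
        if len(merged) != load_size:
            raise ValueError("the two parts do not cover the load exactly")
        return merged


# An 8-byte load at address 12 crosses an assumed 16-byte boundary.
memory = bytes(range(32))
buf = GatherBuffer(num_entries=2)
buf.store_first_phase(tag=7, data=memory[12:16])                    # phase one
result = buf.complete(tag=7, second_phase=memory[16:20], load_size=8)  # phase two
```

The two phases are deliberately separate calls, mirroring the abstract's point that the parts are gathered at independent times.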
-
Patent number: 11755323
Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction.
Type: Grant
Filed: February 15, 2022
Date of Patent: September 12, 2023
Assignee: Intel Corporation
Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
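The abstract does not spell out the arithmetic, so the sketch below only shows the textbook complex product applied to packed operands. The interleaved layout `[re0, im0, re1, im1, ...]` and the function name `multiply_packed_complex` are assumptions for illustration, not the instruction's actual encoding or semantics.

```python
def multiply_packed_complex(src1, src2):
    """Multiply complex numbers packed as [re0, im0, re1, im1, ...] (assumed layout)."""
    if len(src1) != len(src2) or len(src1) % 2 != 0:
        raise ValueError("operands must hold the same even number of elements")
    dest = []
    for i in range(0, len(src1), 2):
        a, b = src1[i], src1[i + 1]  # a + bi from the first source
        c, d = src2[i], src2[i + 1]  # c + di from the second source
        dest.append(a * c - b * d)   # real part of the product
        dest.append(a * d + b * c)   # imaginary part of the product
    return dest


# (1 + 2i) * (3 + 4i) = -5 + 10i
product = multiply_packed_complex([1, 2], [3, 4])
```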
-
Patent number: 11755333
Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
Type: Grant
Filed: December 10, 2021
Date of Patent: September 12, 2023
Assignee: Apple Inc.
Inventors: Brandon H. Dwiel, Andrew J. Beaumont-Smith, Eric J. Furbish, John D. Pape, Stephen G. Meier, Tyler J. Huberty
-
Patent number: 11748106
Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.
Type: Grant
Filed: March 1, 2022
Date of Patent: September 5, 2023
Assignee: INTEL CORPORATION
Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 11709678
Abstract: In one embodiment, a processor includes fetch logic to fetch instructions, decode logic to decode the fetched instructions, and execution logic to execute at least some of the instructions. The decode logic may determine whether a flag portion of a first instruction to be folded is to be performed, and if not, accumulate a first immediate value of the first instruction with a folded immediate value obtained from an entry of an immediate buffer.
Type: Grant
Filed: June 1, 2021
Date of Patent: July 25, 2023
Assignee: Intel Corporation
Inventors: Zeev Sperber, Tomer Weiner, Amit Gradstein, Simon Rubanovich, Alex Gerber, Itai Ravid
-
Patent number: 11704128
Abstract: An execution method includes supplying of a machine code, the machine code being formed by a succession of base blocks and each base block being associated with a signature and comprising instructions to be protected. Each instruction to be protected is immediately preceded or followed by an instruction for constructing the value of the signature associated with the base block. Each construction instruction is coded on strictly less than N bits, and each word of the machine code which comprises at least one portion of one of said instructions to be protected also comprises one of the construction instructions so that it is not possible to load an instruction to be protected into an execution file, without at the same time loading a construction instruction which modifies the value of the signature associated with the base block when it is executed.
Type: Grant
Filed: March 20, 2018
Date of Patent: July 18, 2023
Assignees: COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENERGIES ALTERNATIVES, SORBONNE UNIVERSITE, CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE
Inventors: Damien Courousse, Karine Heydemann, Thierno Barry
-
Patent number: 11693666
Abstract: A predicated-loop-terminating branch instruction controls, based on whether a loop termination condition is satisfied, whether the processing circuitry should process a further iteration of a predicated loop body or process a following instruction. If at least one unnecessary iteration of the predicated loop body is processed following a mispredicted-non-termination branch misprediction when the loop termination condition is mispredicted as unsatisfied for a given iteration when it should have been satisfied, processing of the at least one unnecessary iteration of the predicated loop body is predicated to suppress an effect of the at least one unnecessary iteration.
Type: Grant
Filed: October 20, 2021
Date of Patent: July 4, 2023
Assignee: Arm Limited
Inventors: Joseph Michael Pusdesris, Nicholas Andrew Plante, Yasuo Ishii, Chris Abernathy
-
Patent number: 11681531
Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of memory access instruction in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes selecting a next memory load or memory store instruction to execute based on dependencies encoded within the block, and on a store vector that stores data indicating which memory load and memory store instructions in the instruction block have executed. The store vector can be masked using a store mask. The store mask can be generated when decoding the instruction block, or copied from an instruction block header. Based on the encoded dependencies and the masked store vector, the next instruction can issue when its dependencies are available.
Type: Grant
Filed: October 23, 2015
Date of Patent: June 20, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Douglas C. Burger, Aaron L. Smith
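One possible reading of the masked store vector can be sketched with plain bit masks. Everything below is an assumption made for illustration: memory instructions are numbered by a relative-order ID, bit `k` of `store_mask` marks ID `k` as a store, bit `k` of `store_vector` marks ID `k` as executed, and an instruction may issue once every earlier store has executed. The function name `can_issue` is hypothetical and the patent's actual issue logic may differ.

```python
def can_issue(order_id, store_vector, store_mask):
    """Return True when all stores ordered before `order_id` have executed.

    order_id     -- relative position of the memory instruction in the block
    store_vector -- bit k set means the instruction with ID k has executed
    store_mask   -- bit k set means the instruction with ID k is a store
    """
    earlier = (1 << order_id) - 1                         # bits for IDs below order_id
    pending = store_mask & ~store_vector & earlier        # earlier stores not yet executed
    return pending == 0


store_mask = 0b0101    # IDs 0 and 2 are stores (assumed block layout)
store_vector = 0b0001  # only ID 0 has executed so far

load_1_ready = can_issue(1, store_vector, store_mask)  # only store 0 precedes it
load_3_ready = can_issue(3, store_vector, store_mask)  # store 2 is still pending
```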
-
Patent number: 11675595
Abstract: An apparatus includes instruction fetching circuitry to read a set of instructions, including a speculative execution instruction and a speculative condition determination instruction; cache the instructions; and read the speculative execution instruction corresponding to the speculative condition of the speculative condition determination instruction. If an execution result of the speculative condition determination instruction indicates the speculative condition is incorrect, clear the instructions cached in the instruction fetching circuitry. Instruction decoding circuitry decodes instructions. Executing circuitry executes instructions, including executing the speculative condition determination instruction to obtain the execution result.
Type: Grant
Filed: September 24, 2020
Date of Patent: June 13, 2023
Assignee: Alibaba Group Holding Limited
Inventors: Chang Liu, Ruqin Zhang
-
Patent number: 11663001
Abstract: Systems, apparatuses, and methods for implementing a family of lossy sparse load single instruction, multiple data (SIMD) instructions are disclosed. A lossy sparse load unit (LSLU) loads a plurality of values from one or more input vector operands and determines how many non-zero values are included in one or more input vector operands of a given instruction. If the one or more input vector operands have less than a threshold number of non-zero values, then the LSLU causes an instruction for processing the one or more input vector operands to be skipped. In this case, the processing of the instruction of the one or more input vector operands is deemed to be redundant. If the one or more input vector operands have greater than or equal to the threshold number of non-zero values, then the LSLU causes an instruction for processing the input vector operand(s) to be executed.
Type: Grant
Filed: November 19, 2018
Date of Patent: May 30, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Sanchari Sen, Derrick Allen Aguren, Joseph Lee Greathouse
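The skip-or-execute decision in this abstract can be illustrated in a few lines of Python. This is a software sketch under stated assumptions: non-zero values are counted across all operands together, the processing instruction is modeled as a dot product, and a skipped instruction contributes zero. The names `should_execute` and `lossy_sparse_dot` are hypothetical.

```python
def should_execute(operands, threshold):
    """Count non-zero values across the input vectors and compare to the threshold."""
    nonzero = sum(1 for vec in operands for value in vec if value != 0)
    return nonzero >= threshold


def lossy_sparse_dot(a, b, threshold):
    """Dot product that is skipped when the operands are too sparse (assumed behavior)."""
    if not should_execute([a, b], threshold):
        return 0  # treated as redundant work; this is where the loss comes from
    return sum(x * y for x, y in zip(a, b))


sparse_result = lossy_sparse_dot([0, 0, 0, 1], [0, 0, 0, 2], threshold=4)  # 2 non-zeros: skipped
dense_result = lossy_sparse_dot([1, 2, 0, 1], [3, 0, 1, 2], threshold=4)   # 6 non-zeros: executed
```

The first call shows why the scheme is lossy: the exact dot product would be 2, but the instruction is skipped because too few inputs are non-zero.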
-
Patent number: 11650825
Abstract: An instruction set architecture including instructions for a processor and instructions for a coprocessor may include synchronizing instructions that may be used to begin and end instruction sequences that include coprocessor instructions (coprocessor sequences). If a terminating synchronizing instruction is followed by an initial synchronizing instruction and the pair are detected in the coprocessor concurrently, the coprocessor may suppress execution of the pair of instructions.
Type: Grant
Filed: February 10, 2022
Date of Patent: May 16, 2023
Assignee: Apple Inc.
Inventors: Aditya Kesiraju, Rajdeep L. Bhuyar, Ran A. Chachick, Andrew J. Beaumont-Smith
-
Patent number: 11645083
Abstract: A system and method for reducing pipeline latency. In one embodiment, a processing system includes a processing pipeline. The processing pipeline includes a plurality of processing stages. Each stage is configured to further processing provided by a previous stage. A first of the stages is configured to perform a first function in a pipeline cycle. A second of the stages is disposed downstream of the first of the stages, and is configured to perform, in a pipeline cycle, a second function that is different from the first function. The first of the stages is further configured to selectably perform the first function and the second function in a pipeline cycle, and bypass the second of the stages.
Type: Grant
Filed: August 23, 2013
Date of Patent: May 9, 2023
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Christian Wiencke, Shrey Sudhir Bhatia, Jeroen Vilegen
-
Patent number: 11645080
Abstract: Systems, methods, and apparatuses relating to instructions to reset software thread runtime property histories in a hardware processor are described. In one embodiment, a hardware processor includes a hardware guide scheduler comprising a plurality of software thread runtime property histories; a decoder to decode a single instruction into a decoded single instruction, the single instruction having a field that identifies a model-specific register; and an execution circuit to execute the decoded single instruction to check that an enable bit of the model-specific register is set, and when the enable bit is set, to reset the plurality of software thread runtime property histories of the hardware guide scheduler.
Type: Grant
Filed: September 6, 2022
Date of Patent: May 9, 2023
Assignee: Intel Corporation
Inventors: Eliezer Weissmann, Mark Charney, Michael Mishaeli, Robert Valentine, Itai Ravid, Jason W. Brandt, Gilbert Neiger, Baruch Chaikin, Efraim Rotem
-
Patent number: 11635965
Abstract: Methods and apparatuses relating to mitigations for speculative execution side channels are described. Speculative execution hardware and environments that utilize the mitigations are also described. For example, three indirect branch control mechanisms and their associated hardware are discussed herein: (i) indirect branch restricted speculation (IBRS) to restrict speculation of indirect branches, (ii) single thread indirect branch predictors (STIBP) to prevent indirect branch predictions from being controlled by a sibling thread, and (iii) indirect branch predictor barrier (IBPB) to prevent indirect branch predictions after the barrier from being controlled by software executed before the barrier.
Type: Grant
Filed: October 31, 2018
Date of Patent: April 25, 2023
Assignee: Intel Corporation
Inventors: Jason W. Brandt, Deepak K. Gupta, Rodrigo Branco, Joseph Nuzman, Robert S. Chappell, Sergiu D. Ghetie, Wojciech Powiertowski, Jared W. Stark, IV, Ariel Sabba, Scott J. Cape, Hisham Shafi, Lihu Rappoport, Yair Berger, Scott P. Bobholz, Gilad Holzstein, Sagar V. Dalvi, Yogesh Bijlani
-
Patent number: 11630673
Abstract: A first processor for executing program code has a control interface mapped to the memory address space of a second processor and provides the second processor with direct mapped access to state information of the first processor. The first processor responds to an exception causing event to enter a halted mode stopping execution of the program code and issuing a trigger event. The second processor responds to the trigger to execute an exception handling routine during which the second processor accesses and modifies the state information via the control interface as required by the exception handling routine. On completion of the exception handling routine, the second processor causes the first processor to exit the halted mode and resume execution of the program code. Thus, the program code is physically separated from the software used to perform the exception handling routine to improve security.
Type: Grant
Filed: October 16, 2019
Date of Patent: April 18, 2023
Assignee: Arm Limited
Inventor: Alasdair Grant
-
Patent number: 11630670
Abstract: Techniques are disclosed relating to signature-based instruction prefetching. In some embodiments, processor pipeline circuitry executes a computer program that includes control transfer instructions, such that the execution follows a taken path through the computer program. First signature prefetch table circuitry indicates prefetch addresses for signatures generated using a first signature generation technique and second signature prefetch table circuitry indicates prefetch addresses for signatures generated using a second, different signature generation technique. Signature prefetch circuitry, in response to a prefetch training event, determines a first signature according to the first technique and a second signature according to the second technique and selects one but not both of the first and second signature prefetch tables to train using the first signature or the second signature.
Type: Grant
Filed: July 21, 2021
Date of Patent: April 18, 2023
Assignee: Apple Inc.
Inventors: Douglas C. Holman, Ian D. Kountanis, Amit Kumar, Muawya M. Al-Otoom
-
Patent number: 11620132
Abstract: Various embodiments are provided reusing an operand in an instruction set architecture (ISA) by one or more processors in a computing system. An instruction may specify that an operand register for a selected operand retain operand data used by a previous instruction. The operand data in the operand register may be reused by the instruction.
Type: Grant
Filed: May 8, 2019
Date of Patent: April 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bruce Fleischer, Sunil Shukla, Vijayalakshmi Srinivasan, Jungwook Choi
-
Patent number: 11614944
Abstract: In one embodiment, a branch prediction control system is configured to move a mispredicted conditional branch from a smaller cache side that uses the lower complexity conditional branch predictor to one of the two large cache sides that uses the higher complexity conditional branch predictors. The move (write) is achieved according to a configurable probability or chance to escape misprediction recurrence and results in a reduced amount of mispredictions for the given branch instruction.
Type: Grant
Filed: November 9, 2020
Date of Patent: March 28, 2023
Assignee: CENTAUR TECHNOLOGY, INC.
Inventor: Thomas C. McDonald
-
Patent number: 11599361
Abstract: A data processing apparatus is provided. It includes control flow detection prediction circuitry that performs a presence prediction of whether a block of instructions contains a control flow instruction. A fetch queue stores, in association with prediction information, a queue of indications of the instructions and the prediction information comprises the presence prediction. An instruction cache stores fetched instructions that have been fetched according to the fetch queue. Post-fetch correction circuitry receives the fetched instructions prior to the fetched instructions being received by decode circuitry, the post-fetch correction circuitry includes analysis circuitry that causes the fetch queue to be at least partly flushed in dependence on a type of a given fetched instruction and the prediction information associated with the given fetched instruction.
Type: Grant
Filed: May 10, 2021
Date of Patent: March 7, 2023
Assignee: Arm Limited
Inventors: Jaekyu Lee, Yasuo Ishii, Krishnendra Nathella, Dam Sunwoo
-
Patent number: 11586443
Abstract: Devices and techniques for thread-based processor halting are described herein. A processor monitors control-status register (CSR) values that correspond to a halt condition for a thread. The processor then compares the halt condition to a current state of the thread and halts in response to the current state of the thread meeting the halt condition.
Type: Grant
Filed: October 20, 2020
Date of Patent: February 21, 2023
Assignee: Micron Technology, Inc.
Inventors: Christopher Baronne, Dean E. Walker