Patents by Inventor Andrew Waterman
Andrew Waterman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240104024Abstract: Systems and methods are disclosed for atomic memory operations for address translation. For example, an integrated circuit (e.g., a processor) for executing instructions includes a memory system including random access memory; a bus connected to the memory system; and an atomic memory operation circuitry configured to receive a request from the bus to access an entry in a page table stored in the memory system, wherein the request includes an indication of whether an instruction that references an address being translated using the entry is a store instruction; access the entry in the page table; responsive to the indication indicating that the instruction is a store instruction, set a dirty bit of the entry in the page table; and transmit contents of the entry on the bus in response to the request.Type: ApplicationFiled: September 27, 2023Publication date: March 28, 2024Inventors: John Ingalls, Andrew Waterman
-
Publication number: 20240020012Abstract: A processor core may include circuitry that fetches a first instruction followed by a second instruction. The first instruction may be configured to cause a first memory request, and the second instruction may be configured to cause a second memory request. The circuitry may determine that the first memory request is a candidate for combination with the second memory request. Responsive to the determination, the circuitry may send an indication, from the processor core via a bus, that the first memory request is a candidate for combination.Type: ApplicationFiled: May 31, 2023Publication date: January 18, 2024Inventors: Eric Andrew Gouldey, Wesley Waylon Terpstra, Andrew Waterman
-
Publication number: 20240020124Abstract: Systems and methods are disclosed for supporting multiple vector lengths with a configurable vector register file. For example, an integrated circuit (e.g., a processor) includes a data store configured to store a vector length parameter; a processor core including a vector register, wherein the processor core is configured to: while a first value of the vector length parameter is stored in the data store, store a single architectural register of an instruction set architecture in the vector register; and, while a second value of the vector length parameter is stored in the data store, store multiple architectural registers of the instruction set architecture in respective disjoint portions of the vector register. For example, the integrated circuit may be used to emulate a processor with smaller vector registers for the purpose of migrating a thread to a processor core of the integrated circuit for continued execution.Type: ApplicationFiled: June 30, 2023Publication date: January 18, 2024Inventors: Andrew Waterman, Krste Asanovic
-
Publication number: 20240020126Abstract: Systems and methods are disclosed for fusion with destructive instructions. For example, an integrated circuit (e.g., a processor) for executing instructions includes a fusion circuitry that is configured to detect a sequence of macro-ops stored in a processor pipeline of the processor core, the sequence of macro-ops including a first macro-op identifying a first register as a destination register followed by a second macro-op identifying the first register as both a source register and as a destination register, wherein one or more intervening macro-ops occur between the first macro-op and the second macro-op in the program order; determine a micro-op that is equivalent to the first macro-op followed by the second macro-op; and forward the micro-op to at least one of the one or more execution resource circuitries for execution. For example, the sequence of macro-ops may be detected in a vector dispatch stage of a processor pipeline.Type: ApplicationFiled: June 30, 2023Publication date: January 18, 2024Inventors: Andrew Waterman, Krste Asanovic
-
Publication number: 20240012948Abstract: Disclosed herein are systems and methods for processing masked memory accesses including handling fault exceptions and checking memory attributes of memory region(s) to be accessed. Implementations perform a two-level memory protection violation scheme for masked vector memory instructions. The first level memory check ignores mask information associated with a masked vector memory instruction and operates on a memory footprint associated with the masked vector memory instruction. If a memory protection violation is detected or speculative access is denied with respect to the memory footprint, a second level memory check evaluates mask information at a vector element level to determine whether a fault exception should be raised. If a mask bit for a vector element is set and a memory violation is detected, then a fault exception is raised for the masked vector memory instruction. If a mask bit is not set, execution of the masked vector memory instruction can continue.Type: ApplicationFiled: September 1, 2021Publication date: January 11, 2024Inventors: Andrew Waterman, Krste Asanovic
-
Patent number: 11861365Abstract: Systems and methods are disclosed for macro-op fusion. Sequences of macro-ops that include a control-flow instruction are fused into single micro-ops for execution. The fused micro-ops may avoid the use of control-flow instructions, which may improve performance. A fusion predictor may be used to facilitate macro-op fusion.Type: GrantFiled: May 3, 2021Date of Patent: January 2, 2024Assignee: SiFive, Inc.Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20230367715Abstract: Systems and methods are disclosed for load-store pipeline selection for vectors. For example, an integrated circuit (e.g., a processor) for executing instructions includes an L1 cache that provides an interface to a memory system; an L2 cache connected to the L1 cache that implements a cache coherency protocol with the L1 cache; a first store unit configured to write data to the memory system via the L1 cache; a second store unit configured to bypass the L1 cache and write data to the memory system via the L2 cache; and a store pipeline selection circuitry configured to: identify an address associated with a first beat of a store instruction with a vector argument; select between the first store unit and the second store unit based on the address associated with the first beat of the store instruction; and dispatch the store instruction to the selected store unit.Type: ApplicationFiled: April 30, 2023Publication date: November 16, 2023Inventors: Andrew Waterman, Krste Asanovic
-
Publication number: 20230367599Abstract: Systems and methods are disclosed for vector gather with a narrow datapath. For example, some methods may include reading b bits of a vector of indices into a first operand buffer; reading b bits of the vector of source data into a second operand buffer, including an element indexed by a first index stored in the first operand buffer; checking whether other indices stored in the first operand buffer point to elements of the vector of source data stored in the second operand buffer; during a single clock cycle, copying a plurality of elements stored in the second operand buffer that are pointed to by indices stored in the first operand buffer to a third operand buffer; and updating flags in a completion flags buffer corresponding to those indices to indicate that handling of those indices has completed.Type: ApplicationFiled: April 30, 2023Publication date: November 16, 2023Inventors: Andrew Waterman, Krste Asanovic
-
Patent number: 11797308Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.Type: GrantFiled: April 11, 2022Date of Patent: October 24, 2023Assignee: SiFive, Inc.Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
-
Publication number: 20230315649Abstract: Systems and methods are disclosed for memory protection for vector operations. For example, a method includes fetching a vector memory instruction using a processor core including a pipeline configured to execute instructions, including constant-stride vector memory instructions; partitioning a vector that is identified by the vector memory instruction into a subvector of a maximum length, greater than one, and one or more additional subvectors with lengths less than or equal to the maximum length; checking, using a memory protection circuit, whether accessing elements of the subvector will cause a memory protection violation; and accessing the elements of the subvector before checking, using the memory protection circuit, whether accessing elements of one of the one or more additional subvectors will cause a memory protection violation.Type: ApplicationFiled: September 1, 2021Publication date: October 5, 2023Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20230305852Abstract: Systems and methods are disclosed for register renaming. For example, an integrated circuit is described that includes a first cluster including a first set of physical registers and a first execution resource circuit, wherein the inputs for operations of the first execution resource circuit are of a first data type; a second cluster including a second set of physical registers and a second execution resource circuit, wherein the inputs for operations of the second execution resource circuit are of a second data type that is different than the first data type; and a register renaming circuit configured to: determine a data type prediction for a result of a first instruction that will be stored in a first logical register; and, based on the data type prediction matching the first data type, rename the first logical register to be stored in a physical register of the first set of physical registers.Type: ApplicationFiled: July 23, 2021Publication date: September 28, 2023Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20230305969Abstract: Systems and methods are disclosed for memory protection for memory protection for gather-scatter operations. For example, an integrated circuit may include a processor core; a memory protection circuit configured to check for memory protection violations with a protection granule; and an index range circuit configured to: memoize a maximum value and a minimum value of a tuple of indices stored in a vector register of the processor core as the tuple of indices is written to the vector register; determine a range of addresses for a gather-scatter memory instruction that takes the vector register as a set of indices based on a base address of a vector in memory, the memoized minimum value, and the memoized maximum value; and check, using the memory protection circuit during a single clock cycle, whether accessing elements of the vector within the range of addresses will cause a memory protection violation.Type: ApplicationFiled: September 1, 2021Publication date: September 28, 2023Inventors: Andrew Waterman, Krste Asanovic
-
Patent number: 11687342Abstract: Disclosed herein are systems and method for instruction tightly-coupled memory (iTIM) and instruction cache (iCache) access prediction. A processor may use a predictor to enable access to the iTIM or the iCache and a particular way (a memory structure) based on a location state and program counter value. The predictor may determine whether to stay in an enabled memory structure, move to and enable a different memory structure, or move to and enable both memory structures. Stay and move predictions may be based on whether a memory structure boundary crossing has occurred due to sequential instruction processing, branch or jump instruction processing, branch resolution, and cache miss processing. The program counter and a location state indicator may use feedback and be updated each instruction-fetch cycle to determine which memory structure(s) needs to be enabled for the next instruction fetch.Type: GrantFiled: December 12, 2019Date of Patent: June 27, 2023Assignee: SiFive, Inc.Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20230195647Abstract: Systems and methods are disclosed for logging guest physical address for memory access faults. For example, a method for logging guest physical address includes receiving a first address translation request from a processor pipeline at a translation lookaside buffer for a first guest virtual address; identifying a hit with a fault condition corresponding to the first guest virtual address; responsive to the fault condition, invoking a single-stage page table walk with the first guest virtual address to obtain a first guest physical address; and storing the first guest physical address with the first guest virtual address in a data store, wherein the data store is separate from an entry in the translation lookaside buffer that includes a tag that includes the first guest virtual address and data that includes a physical address.Type: ApplicationFiled: December 21, 2022Publication date: June 22, 2023Inventors: John Ingalls, Andrew Waterman
-
Publication number: 20230019271Abstract: Systems and methods are disclosed for processor power management using instruction throttling. For example, an integrated circuit may include a processor core including a processor pipeline configured to execute instructions; a register configured to store a power dial value that indicates a portion of available clock cycles for throttling of instruction flow through the processor pipeline; and an instruction throttling circuit configured to periodically stall removal of instructions from a queue in the processor pipeline for a number of clock cycles that is determined based on the power dial value.Type: ApplicationFiled: July 13, 2022Publication date: January 19, 2023Inventors: Shubhendu Sekhar Mukherjee, Andrew Waterman
-
Patent number: 11429392Abstract: Systems and methods are disclosed for secure predictors for speculative execution. Some implementations may eliminate or mitigate side-channel attacks, such as the Spectre-class of attacks, in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a predictor circuit that, when operating in a first mode, uses data stored in a set of predictor entries to generate predictions. For example, the integrated circuit may be configured to: detect a security domain transition for software being executed by the integrated circuit; responsive to the security domain transition, change a mode of the predictor circuit from the first mode to a second mode and invoke a reset of the set of predictor entries, wherein the second mode prevents the use of a first subset of the predictor entries of the set of predictor entries; and, after completion of the reset, change the mode back to the first mode.Type: GrantFiled: March 22, 2019Date of Patent: August 30, 2022Assignee: SiFive, Inc.Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20220236993Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.Type: ApplicationFiled: April 11, 2022Publication date: July 28, 2022Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
-
Patent number: 11301251Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.Type: GrantFiled: April 23, 2020Date of Patent: April 12, 2022Assignee: SiFive, Inc.Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
-
Publication number: 20220083340Abstract: Disclosed herein are systems and method for instruction tightly-coupled memory (iTIM) and instruction cache (iCache) access prediction. A processor may use a predictor to enable access to the iTIM or the iCache and a particular way (a memory structure) based on a location state and program counter value. The predictor may determine whether to stay in an enabled memory structure, move to and enable a different memory structure, or move to and enable both memory structures. Stay and move predictions may be based on whether a memory structure boundary crossing has occurred due to sequential instruction processing, branch or jump instruction processing, branch resolution, and cache miss processing. The program counter and a location state indicator may use feedback and be updated each instruction-fetch cycle to determine which memory structure(s) needs to be enabled for the next instruction fetch.Type: ApplicationFiled: December 12, 2019Publication date: March 17, 2022Applicants: SiFive, Inc., SiFive, Inc.Inventors: Krste Asanovic, Andrew Waterman
-
Publication number: 20210303300Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.Type: ApplicationFiled: April 23, 2020Publication date: September 30, 2021Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman