Patents by Inventor Jay Fleischman
Jay Fleischman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240126552Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: ApplicationFiled: December 21, 2023Publication date: April 18, 2024Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA
-
Publication number: 20240095180Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 23, 2022Publication date: March 21, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Gabriel H. Loh, Michael Estlick, Jay Fleischman, Michael J. Schulte, Bradford Beckmann, Yasuko Eckert
-
Publication number: 20240045694Abstract: A coprocessor such as a floating-point unit includes a pipeline that is partitioned into a first portion and a second portion. A controller is configured to provide control signals to the first portion and the second portion of the pipeline. A first physical distance traversed by control signals propagating from the controller to the first portion of the pipeline is shorter than a second physical distance traversed by control signals propagating from the controller to the second portion of the pipeline. A scheduler is configured to cause a physical register file to provide a first subset of bits of an instruction to the first portion at a first time. The physical register file provides a second subset of the bits of the instruction to the second portion at a second time subsequent to the first time.Type: ApplicationFiled: June 16, 2023Publication date: February 8, 2024Inventors: Jay FLEISCHMAN, Michael ESTLICK, Michael Christopher SEDMAK, Erik SWANSON, Sneha V. DESAI
-
Patent number: 11868777Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: GrantFiled: December 16, 2020Date of Patent: January 9, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
-
Patent number: 11847062Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: GrantFiled: December 16, 2021Date of Patent: December 19, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Publication number: 20230393855Abstract: An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.Type: ApplicationFiled: June 6, 2022Publication date: December 7, 2023Inventors: Gabriel H. Loh, Yasuko Eckert, Bradford Beckmann, Michael Estlick, Jay Fleischman
-
Patent number: 11709681Abstract: A coprocessor such as a floating-point unit includes a pipeline that is partitioned into a first portion and a second portion. A controller is configured to provide control signals to the first portion and the second portion of the pipeline. A first physical distance traversed by control signals propagating from the controller to the first portion of the pipeline is shorter than a second physical distance traversed by control signals propagating from the controller to the second portion of the pipeline. A scheduler is configured to cause a physical register file to provide a first subset of bits of an instruction to the first portion at a first time. The physical register file provides a second subset of the bits of the instruction to the second portion at a second time subsequent to the first time.Type: GrantFiled: December 11, 2017Date of Patent: July 25, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai
-
Publication number: 20230205700Abstract: In response to generating one or more speculative prefetch requests for a last-level cache, a processor determines prefetch analytics for the generated speculative prefetch requests and compares the determined prefetch analytics of the speculative prefetch requests to selection thresholds. In response to a speculative prefetch request meeting or exceeding a selection threshold, the processor selects the speculative prefetch request for issuance to a memory-side cache controller. When one or more system conditions meet one or more condition thresholds, the processor issues the selected speculative prefetch request to the memory-side cache controller. The memory-side cache controller then retrieves the data indicated in the selected speculative prefetch request from a memory and stores it in a memory-side cache in the data fabric coupled to the last-level cache.Type: ApplicationFiled: December 28, 2021Publication date: June 29, 2023Inventors: Tarun NAKRA, Akhil ARUNKUMAR, Paul MOYER, Jay FLEISCHMAN
-
Publication number: 20230195643Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: ApplicationFiled: December 16, 2021Publication date: June 22, 2023Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Patent number: 11567554Abstract: A pipeline includes a first portion configured to process a first subset of bits of an instruction and a second portion configured to process a second subset of the bits of the instruction. A first clock mesh is configured to provide a first clock signal to the first portion of the pipeline. A second clock mesh is configured to provide a second clock signal to the second portion of the pipeline. The first and second clock meshes selectively provide the first and second clock signals based on characteristics of in-flight instructions that have been dispatched to the pipeline but not yet retired. In some cases, a physical register file is configured to store values of bits representative of instructions. Only the first subset is stored in the physical register file in response to the value of the zero high bit indicating that the second subset is equal to zero.Type: GrantFiled: December 11, 2017Date of Patent: January 31, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Jay Fleischman, Michael Estlick, Michael Christopher Sedmak, Erik Swanson, Sneha V. Desai
-
Patent number: 11550728Abstract: A processing system includes a processor, a memory, and an operating system that are used to allocate a page table caching memory object (PTCM) for a user of the processing system. An allocation of the PTCM is requested from a PTCM allocation system. In order to allocate the PTCM, a plurality of physical memory pages from a memory are allocated to store a PTCM page table that is associated with the PTCM. A lockable region of a cache is designated to hold a copy of the PTCM page table, after which the lockable region of the cache is subsequently locked. The PTCM page table is populated with page table entries associated with the PTCM and copied to the locked region of the cache.Type: GrantFiled: September 27, 2019Date of Patent: January 10, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Derrick Allen Aguren, Eric H. Van Tassell, Gabriel H. Loh, Jay Fleischman
-
Publication number: 20220188117Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: ApplicationFiled: December 16, 2020Publication date: June 16, 2022Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA
-
Publication number: 20220050785Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: ApplicationFiled: October 29, 2021Publication date: February 17, 2022Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 11163688Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: GrantFiled: September 24, 2019Date of Patent: November 2, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 11023241Abstract: Systems and methods selectively bypass address-generation hardware in processor instruction pipelines. In an embodiment, a processor includes an address-generation stage and an address-generation-bypass-determination unit (ABDU). The ABDU receives a load/store instruction. If an effective address for the load/store instruction is not known at the ABDU, the ABDU routes the load/store instruction via the address-generation stage of the processor. If, however, the effective address of the load/store instruction is known at the ABDU, the ABDU routes the load/store instruction to bypass the address-generation stage of the processor.Type: GrantFiled: August 21, 2018Date of Patent: June 1, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Andrej Kocev, Jay Fleischman, Kai Troester, Johnny C. Chu, Tim J. Wilkens, Neil Marketkar, Michael W. Long
-
Publication number: 20210097002Abstract: A processing system includes a processor, a memory, and an operating system that are used to allocate a page table caching memory object (PTCM) for a user of the processing system. An allocation of the PTCM is requested from a PTCM allocation system. In order to allocate the PTCM, a plurality of physical memory pages from a memory are allocated to store a PTCM page table that is associated with the PTCM. A lockable region of a cache is designated to hold a copy of the PTCM page table, after which the lockable region of the cache is subsequently locked. The PTCM page table is populated with page table entries associated with the PTCM and copied to the locked region of the cache.Type: ApplicationFiled: September 27, 2019Publication date: April 1, 2021Inventors: Derrick Allen AGUREN, Eric H. VAN TASSELL, Gabriel H. LOH, Jay FLEISCHMAN
-
Publication number: 20210089462Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: ApplicationFiled: September 24, 2019Publication date: March 25, 2021Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 10700954Abstract: A system includes a multi-core processor that includes a scheduler. The multi-core processor communicates with a system memory and an operating system. The multi-core processor executes a first process and a second process. The system uses the scheduler to control a use of a memory bandwidth by the second process until a current use in a control cycle by the first process meets a first setpoint of use for the first process when the first setpoint is at or below a latency sensitive (LS) floor or a current use in the control cycle by the first process exceeds the LS floor when the first setpoint exceeds the LS floor.Type: GrantFiled: December 20, 2017Date of Patent: June 30, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Douglas Benson Hunt, Jay Fleischman
-
Publication number: 20200065108Abstract: Systems and methods selectively bypass address-generation hardware in processor instruction pipelines. In an embodiment, a processor includes an address-generation stage and an address-generation-bypass-determination unit (ABDU). The ABDU receives a load/store instruction. If an effective address for the load/store instruction is not known at the ABDU, the ABDU routes the load/store instruction via the address-generation stage of the processor. If, however, the effective address of the load/store instruction is known at the ABDU, the ABDU routes the load/store instruction to bypass the address-generation stage of the processor.Type: ApplicationFiled: August 21, 2018Publication date: February 27, 2020Inventors: ANDREJ KOCEV, JAY FLEISCHMAN, KAI TROESTER, JOHNNY C. CHU, TIM J. WILKENS, NEIL MARKETKAR, MICHAEL W. LONG
-
Publication number: 20190190805Abstract: A system includes a multi-core processor that includes a scheduler. The multi-core processor communicates with a system memory and an operating system. The multi-core processor executes a first process and a second process. The system uses the scheduler to control a use of a memory bandwidth by the second process until a current use in a control cycle by the first process meets a first setpoint of use for the first process when the first setpoint is at or below a latency sensitive (LS) floor or a current use in the control cycle by the first process exceeds the LS floor when the first setpoint exceeds the LS floor.Type: ApplicationFiled: December 20, 2017Publication date: June 20, 2019Inventors: Douglas Benson HUNT, Jay FLEISCHMAN