Patents by Inventor Matthew T. Sobel
Matthew T. Sobel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240111674Abstract: Data reuse cache techniques are described. In one example, a load instruction is generated by an execution unit of a processor unit. In response to the load instruction, data is loaded by a load-store unit for processing by the execution unit and is also stored to a data reuse cache communicatively coupled between the load-store unit and the execution unit. Upon receipt of a subsequent load instruction for the data from the execution unit, the data is loaded from the data reuse cache for processing by the execution unit.Type: ApplicationFiled: September 29, 2022Publication date: April 4, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Alok Garg, Neil N Marketkar, Matthew T. Sobel
-
Publication number: 20220206798Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.Type: ApplicationFiled: March 18, 2022Publication date: June 30, 2022Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Patent number: 11334384Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.Type: GrantFiled: December 10, 2019Date of Patent: May 17, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
-
Patent number: 11294724Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a small number of active entries in the shared resource, indicative of a low level of parallelism, can operate efficiently with fewer entries in the shared resource, and have their allocation limit in the shared resource reduced.Type: GrantFiled: September 27, 2019Date of Patent: April 5, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Kai Troester, Neil Marketkar, Matthew T. Sobel, Srinivas Keshav
-
Patent number: 11294678Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.Type: GrantFiled: May 29, 2018Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Publication number: 20220027162Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.Type: ApplicationFiled: October 8, 2021Publication date: January 27, 2022Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Patent number: 11144324Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.Type: GrantFiled: September 27, 2019Date of Patent: October 12, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Patent number: 11113056Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.Type: GrantFiled: November 27, 2019Date of Patent: September 7, 2021Assignee: Advanced Micro Devices, Inc.Inventors: John M. King, Matthew T. Sobel
-
Publication number: 20210173702Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.Type: ApplicationFiled: December 10, 2019Publication date: June 10, 2021Inventors: Alok Garg, Scott Andrew McLelland, Marius Evers, Matthew T. Sobel
-
Publication number: 20210157590Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.Type: ApplicationFiled: November 27, 2019Publication date: May 27, 2021Applicant: Advanced Micro Devices, Inc.Inventors: John M. King, Matthew T. Sobel
-
Publication number: 20210096874Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, the control logic determines if the given instruction operation meets one or more conditions for being compressed with one or more other instruction operations into a single retire queue entry. If the one or more conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations together into an individual retire queue entry, the retire queue is able to be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.Type: ApplicationFiled: September 27, 2019Publication date: April 1, 2021Inventors: Matthew T. Sobel, Joshua James Lindner, Neil N. Marketkar, Kai Troester, Emil Talpes, Ashok Tirupathy Venkatachar
-
Publication number: 20210096920Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a small number of active entries in the shared resource, indicative of a low level of parallelism, can operate efficiently with fewer entries in the shared resource, and have their allocation limit in the shared resource reduced.Type: ApplicationFiled: September 27, 2019Publication date: April 1, 2021Inventors: Kai Troester, Neil Marketkar, Matthew T. Sobel, Srinivas Keshav
-
Publication number: 20190369991Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.Type: ApplicationFiled: May 29, 2018Publication date: December 5, 2019Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Patent number: 9563573Abstract: A memory can be a sum addressed memory (SAM) that receives, for each read access, two address values (e.g. a base address and an offset) having a sum that indicates the entry of the memory to be read (the read entry). A decoder adds the two address value to identify the read entry. Concurrently, a predecode module predecodes the two address values to identify a set of entries (e.g. two different entries) at the memory, whereby the set includes the entry to be read. The predecode module generates a precharge disable signal to terminate precharging at the set of entries which includes the entry to be read. Because the precharge disable signal is based on predecoded address information, it can be generated without waiting for a full decode of the read address entry.Type: GrantFiled: August 20, 2013Date of Patent: February 7, 2017Assignee: Advanced Micro Devices, Inc.Inventor: Matthew T. Sobel
-
Publication number: 20150058575Abstract: A memory can be a sum addressed memory (SAM) that receives, for each read access, two address values (e.g. a base address and an offset) having a sum that indicates the entry of the memory to be read (the read entry). A decoder adds the two address value to identify the read entry. Concurrently, a predecode module predecodes the two address values to identify a set of entries (e.g. two different entries) at the memory, whereby the set includes the entry to be read. The predecode module generates a precharge disable signal to terminate precharging at the set of entries which includes the entry to he read. Because the precharge disable signal is based on predecoded address information, it can be generated without waiting for a full decode of the read address entry.Type: ApplicationFiled: August 20, 2013Publication date: February 26, 2015Applicant: Advanced Micro Devices, Inc.Inventor: Matthew T. Sobel
-
Patent number: 6961403Abstract: A programmable frequency divider circuit with symmetrical output is disclosed. The frequency divider includes a non-symmetrical LFSR based component operated in series with a symmetrical divider component. Both the LFSR and the symmetrical divider may be programmed to provide flexibility. The frequency divider can dynamically adjust the divisor of the LFSR component to overcome limitations in the divide resolution due to the series combination of dividers, providing even and odd divisor values. The divider architecture can also provide higher level functions, including synchronization of multiple divider outputs, dynamic switching of divisor values and generation of multi-phased and spaced outputs. The linear feedback shift register (LFSR) component includes a feedback logic network decomposed into multiple stages to realize a maximum latch-to-latch operational latency of one gate delay regardless of the size of the LFSR.Type: GrantFiled: May 28, 2004Date of Patent: November 1, 2005Assignee: International Business Machines CorporationInventors: John S. Austin, Matthew T. Sobel