Patents by Inventor Michael W. Boyer
Michael W. Boyer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230130969
Abstract: A memory includes two or more portions of memory circuitry and two or more processor-in-memory (PIM) functional blocks, each PIM functional block associated with a respective portion of the memory circuitry. In operation, at least one PIM functional block other than a particular PIM functional block copies data from a source location accessible to the other PIM functional block. The other PIM functional block then provides the data to the particular PIM functional block. The particular PIM functional block acquires and stores the data in a destination location accessible to the particular PIM functional block. The particular PIM functional block next performs one or more PIM operations using the data.
Type: Application
Filed: October 27, 2021
Publication date: April 27, 2023
Inventors: Alexandru Dutu, Vaibhav Ramakrishnan Ramachandran, Michael W. Boyer
-
Publication number: 20230102296
Abstract: A processing unit decomposes a matrix for partial processing at a processor-in-memory (PIM) device. The processing unit receives a matrix to be used as an operand in an arithmetic operation (e.g., a matrix multiplication operation). In response, the processing unit decomposes the matrix into two component matrices: a sparse component matrix and a dense component matrix. The processing unit itself performs the arithmetic operation with the dense component matrix, but sends the sparse component matrix to the PIM device for execution of the arithmetic operation. The processing unit thereby offloads at least some of the processing overhead to the PIM device, improving overall efficiency of the processing system.
Type: Application
Filed: September 30, 2021
Publication date: March 30, 2023
Inventors: Michael W. Boyer, Ashish Gondimalla, Bradford M. Beckmann
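The decomposition described above relies on matrix multiplication distributing over addition, so the host's result and the PIM device's result can simply be summed. A minimal sketch of the idea, assuming a magnitude-threshold split as the (purely illustrative) decomposition criterion — the patent does not specify this rule:

```python
import numpy as np

def decompose(matrix, threshold=0.5):
    # Entries at or above the threshold form the dense component;
    # the remainder forms the sparse component. Illustrative only.
    dense = np.where(np.abs(matrix) >= threshold, matrix, 0.0)
    sparse = matrix - dense
    return dense, sparse

a = np.array([[0.9, 0.1],
              [0.0, 0.7]])
b = np.eye(2)

dense, sparse = decompose(a)
host_part = dense @ b    # computed by the processing unit itself
pim_part = sparse @ b    # offloaded to the PIM device

# The two partial products sum to the full product.
assert np.allclose(host_part + pim_part, a @ b)
```

Because the components partition the original matrix, correctness does not depend on how the split is chosen; only the load balance between host and PIM device does.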
-
Patent number: 11513802
Abstract: An electronic device includes a processor having a micro-operation queue, multiple scheduler entries, and scheduler compression logic. When a pair of micro-operations in the micro-operation queue is compressible in accordance with one or more compressibility rules, the scheduler compression logic acquires the pair of micro-operations from the micro-operation queue and stores information from both micro-operations of the pair of micro-operations into different portions in a single scheduler entry. In this way, the scheduler compression logic compresses the pair of micro-operations into the single scheduler entry.
Type: Grant
Filed: September 27, 2020
Date of Patent: November 29, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael W. Boyer, John Kalamatianos, Pritam Majumder
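The scheduler-compression mechanism above can be sketched as a queue walk that packs qualifying pairs into one entry. The compressibility rule used here (the second micro-operation must not consume the first one's result) is an assumption for illustration; the patent's actual rules are not specified in the abstract:

```python
# Illustrative sketch of scheduler-entry compression. The rule in
# compressible() is a hypothetical stand-in, not the patented rules.
def compressible(a, b):
    return b["src"] != a["dst"]  # b must not depend on a's result

def fill_scheduler(uop_queue, num_entries):
    entries = []
    i = 0
    while i < len(uop_queue) and len(entries) < num_entries:
        if i + 1 < len(uop_queue) and compressible(uop_queue[i], uop_queue[i + 1]):
            # Both micro-operations share one entry, in separate portions.
            entries.append((uop_queue[i], uop_queue[i + 1]))
            i += 2
        else:
            entries.append((uop_queue[i], None))
            i += 1
    return entries

uops = [
    {"op": "add", "dst": "r1", "src": "r2"},
    {"op": "mul", "dst": "r3", "src": "r4"},  # independent of r1: pairable
    {"op": "sub", "dst": "r5", "src": "r3"},  # left uncompressed
]
entries = fill_scheduler(uops, num_entries=4)
```

Three micro-operations occupy two scheduler entries here, which is the point of the technique: more in-flight micro-operations per unit of scheduler storage.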
-
Publication number: 20220100501
Abstract: An electronic device includes a processor having a micro-operation queue, multiple scheduler entries, and scheduler compression logic. When a pair of micro-operations in the micro-operation queue is compressible in accordance with one or more compressibility rules, the scheduler compression logic acquires the pair of micro-operations from the micro-operation queue and stores information from both micro-operations of the pair of micro-operations into different portions in a single scheduler entry. In this way, the scheduler compression logic compresses the pair of micro-operations into the single scheduler entry.
Type: Application
Filed: September 27, 2020
Publication date: March 31, 2022
Inventors: Michael W. Boyer, John Kalamatianos, Pritam Majumder
-
Patent number: 10838864
Abstract: A miss in a cache by a thread in a wavefront is detected. The wavefront includes a plurality of threads that are executing a memory access request concurrently on a corresponding plurality of processor cores. A priority is assigned to the thread based on whether the memory access request is addressed to a local memory or a remote memory. The memory access request for the thread is performed based on the priority. In some cases, the cache is selectively bypassed depending on whether the memory access request is addressed to the local or remote memory. A cache block is requested in response to the miss. The cache block is biased towards a least recently used position in response to requesting the cache block from the local memory and towards a most recently used position in response to requesting the cache block from the remote memory.
Type: Grant
Filed: May 30, 2018
Date of Patent: November 17, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael W. Boyer, Onur Kayiran, Yasuko Eckert, Steven Raasch, Muhammad Shoaib Bin Altaf
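The LRU/MRU biasing above can be illustrated with a simple recency list: locally fetched blocks are inserted at the LRU end (cheap to re-fetch, so evict them first), while remotely fetched blocks are inserted at the MRU end (expensive to re-fetch, so retain them longer). This is a minimal sketch, not the patented replacement policy:

```python
# Sketch of recency-biased cache insertion; class and field names
# are illustrative, not taken from the patent.
class BiasedLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []  # index 0 = LRU end, index -1 = MRU end

    def insert(self, block, from_remote):
        if len(self.blocks) >= self.capacity:
            self.blocks.pop(0)  # evict the block at the LRU end
        if from_remote:
            self.blocks.append(block)    # bias towards MRU position
        else:
            self.blocks.insert(0, block) # bias towards LRU position

cache = BiasedLRUCache(capacity=2)
cache.insert("local_a", from_remote=False)
cache.insert("remote_b", from_remote=True)
cache.insert("remote_c", from_remote=True)  # "local_a" is evicted first
```

After these three insertions the local block has already been evicted while both remote blocks remain, reflecting the higher cost of re-fetching remote data.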
-
Patent number: 10705958
Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.
Type: Grant
Filed: August 22, 2018
Date of Patent: July 7, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael W. Boyer, Gabriel H. Loh, Yasuko Eckert, William L. Walker
-
Publication number: 20200065246
Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.
Type: Application
Filed: August 22, 2018
Publication date: February 27, 2020
Inventors: Michael W. Boyer, Gabriel H. Loh, Yasuko Eckert, William L. Walker
-
Patent number: 10503641
Abstract: A cache coherence bridge protocol provides an interface between a cache coherence protocol of a host processor and a cache coherence protocol of a processor-in-memory, thereby decoupling coherence mechanisms of the host processor and the processor-in-memory. The cache coherence bridge protocol requires limited change to existing host processor cache coherence protocols. The cache coherence bridge protocol may be used to facilitate interoperability between host processors and processor-in-memory devices designed by different vendors, and both the host processors and processor-in-memory devices may implement coherence techniques among computing units within each processor. The cache coherence bridge protocol may support different granularity of cache coherence permissions than those used by cache coherence protocols of a host processor and/or a processor-in-memory.
Type: Grant
Filed: May 31, 2016
Date of Patent: December 10, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael W. Boyer, Nuwan Jayasena
-
Publication number: 20190370173
Abstract: A miss in a cache by a thread in a wavefront is detected. The wavefront includes a plurality of threads that are executing a memory access request concurrently on a corresponding plurality of processor cores. A priority is assigned to the thread based on whether the memory access request is addressed to a local memory or a remote memory. The memory access request for the thread is performed based on the priority. In some cases, the cache is selectively bypassed depending on whether the memory access request is addressed to the local or remote memory. A cache block is requested in response to the miss. The cache block is biased towards a least recently used position in response to requesting the cache block from the local memory and towards a most recently used position in response to requesting the cache block from the remote memory.
Type: Application
Filed: May 30, 2018
Publication date: December 5, 2019
Inventors: Michael W. Boyer, Onur Kayiran, Yasuko Eckert, Steven Raasch, Muhammad Shoaib Bin Altaf
-
Patent number: 10310981
Abstract: A method and apparatus for performing memory prefetching includes determining whether to initiate prefetching. Upon a determination to initiate prefetching, a first memory row is determined as a suitable prefetch candidate, and it is determined whether a particular set of one or more cachelines of the first memory row is to be prefetched.
Type: Grant
Filed: September 19, 2016
Date of Patent: June 4, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Yasuko Eckert, Nuwan Jayasena, Reena Panda, Onur Kayiran, Michael W. Boyer
-
Publication number: 20190163632
Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page represents a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
Type: Application
Filed: November 29, 2017
Publication date: May 30, 2019
Inventors: William L. Walker, Michael W. Boyer, Yasuko Eckert, Gabriel H. Loh
-
Patent number: 10282295
Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page represents a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
Type: Grant
Filed: November 29, 2017
Date of Patent: May 7, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: William L. Walker, Michael W. Boyer, Yasuko Eckert, Gabriel H. Loh
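The density-driven eviction above amounts to ranking cache pages by what fraction of their contiguous cachelines are actually cached and preferring sparse pages as victims. A minimal sketch, assuming the directory tracks a per-page presence bit vector (the representation and names here are assumptions, not the patented structure):

```python
# Hypothetical per-page presence bits: 1 = cacheline present in the
# cache hierarchy, 0 = absent. One list per cache page.
def utilization_density(presence_bits):
    return sum(presence_bits) / len(presence_bits)

def pick_victim_page(directory):
    # Prefer the page whose cachelines are least densely utilized.
    return min(directory, key=lambda page: utilization_density(directory[page]))

directory = {
    "page_a": [1, 1, 1, 1],  # fully utilized: density 1.0
    "page_b": [1, 0, 0, 0],  # sparsely utilized: density 0.25
}
victim = pick_victim_page(directory)
```

Evicting from sparsely utilized pages frees directory entries that each cover few live cachelines, which is why density rather than raw recency drives the choice.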
-
Publication number: 20170344479
Abstract: A cache coherence bridge protocol provides an interface between a cache coherence protocol of a host processor and a cache coherence protocol of a processor-in-memory, thereby decoupling coherence mechanisms of the host processor and the processor-in-memory. The cache coherence bridge protocol requires limited change to existing host processor cache coherence protocols. The cache coherence bridge protocol may be used to facilitate interoperability between host processors and processor-in-memory devices designed by different vendors, and both the host processors and processor-in-memory devices may implement coherence techniques among computing units within each processor. The cache coherence bridge protocol may support different granularity of cache coherence permissions than those used by cache coherence protocols of a host processor and/or a processor-in-memory.
Type: Application
Filed: May 31, 2016
Publication date: November 30, 2017
Inventors: Michael W. Boyer, Nuwan Jayasena
-
Publication number: 20170293560
Abstract: A method and apparatus for performing memory prefetching includes determining whether to initiate prefetching. Upon a determination to initiate prefetching, a first memory row is determined as a suitable prefetch candidate, and it is determined whether a particular set of one or more cachelines of the first memory row is to be prefetched.
Type: Application
Filed: September 19, 2016
Publication date: October 12, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Yasuko Eckert, Nuwan Jayasena, Reena Panda, Onur Kayiran, Michael W. Boyer
-
Publication number: 20160335064
Abstract: A method, a system, and a non-transitory computer readable medium for generating application code to be executed on an active storage device are presented. The parts of an application that can be executed on the active storage device are determined. The parts of the application that will not be executed on the active storage device are converted into code to be executed on a host device. The parts of the application that will be executed on the active storage device are converted into code of an instruction set architecture of a processor in the active storage device.
Type: Application
Filed: May 12, 2015
Publication date: November 17, 2016
Applicant: Advanced Micro Devices, Inc.
Inventors: Shuai Che, Sudhanva Gurumurthi, Michael W. Boyer
-
Patent number: 7207075
Abstract: A faucet spout assembly providing multiple spouts with different heights and curvatures which are interchangeable within the same faucet. Laminar flow from the spout is achieved with a flow control device upstream of the spout outlet. A check valve is optionally used in the inlet to the spout to promote greater flexibility in placement of the flow control device within the assembly.
Type: Grant
Filed: November 20, 2002
Date of Patent: April 24, 2007
Assignee: Speakman Company
Inventors: Graham H. Peterson, Michael W. Boyer
-
Publication number: 20030093857
Abstract: A faucet spout assembly providing multiple spouts with different heights and curvatures which are interchangeable within the same faucet. Laminar flow from the spout is achieved with a flow control device upstream of the spout outlet. A check valve is optionally used in the inlet to the spout to promote greater flexibility in placement of the flow control device within the assembly.
Type: Application
Filed: November 20, 2002
Publication date: May 22, 2003
Inventors: Graham H. Paterson, Michael W. Boyer