Patents by Inventor Benjamin T. Sander
Benjamin T. Sander has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220269620Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.Type: ApplicationFiled: February 8, 2022Publication date: August 25, 2022Inventors: Benjamin T. SANDER, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
-
Patent number: 11288205Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.Type: GrantFiled: June 23, 2015Date of Patent: March 29, 2022Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
-
Patent number: 11100004Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.Type: GrantFiled: June 23, 2015Date of Patent: August 24, 2021Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULCInventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
-
Patent number: 10467138Abstract: A processing system includes a first socket, a second socket, and an interface between the first socket and the second socket. A first memory is associated with the first socket and a second memory is associated with the second socket. The processing system also includes a controller for the first memory. The controller is to receive a first request for a first memory transaction with the second memory and perform the first memory transaction along a path that includes the interface and bypasses at least one second cache associated with the second memory.Type: GrantFiled: December 28, 2015Date of Patent: November 5, 2019Assignee: Advanced Micro Devices, Inc.Inventors: Paul Blinzer, Ali Ibrahim, Benjamin T. Sander, Vydhyanathan Kalyanasundharam
-
Patent number: 10423354Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.Type: GrantFiled: September 23, 2015Date of Patent: September 24, 2019Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Philip Rogers, Benjamin T. Sander, Anthony Asaro, Gongxian Jeffrey Cheng
-
Patent number: 9910788Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.Type: GrantFiled: September 22, 2015Date of Patent: March 6, 2018Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Philip J. Rogers, Benjamin T. Sander, Anthony Asaro
-
Publication number: 20170185514Abstract: A processing system includes a first socket, a second socket, and an interface between the first socket and the second socket. A first memory is associated with the first socket and a second memory is associated with the second socket. The processing system also includes a controller for the first memory. The controller is to receive a first request for a first memory transaction with the second memory and perform the first memory transaction along a path that includes the interface and bypasses at least one second cache associated with the second memory.Type: ApplicationFiled: December 28, 2015Publication date: June 29, 2017Inventors: Paul Blinzer, Ali Ibrahim, Benjamin T. Sander, Vydhyanathan Kalyanasundharam
-
Publication number: 20170083240Abstract: A memory manager of a processor identifies a block of data for eviction from a first memory module to a second memory module. In response, the processor copies only those portions of the data block that have been identified as modified portions to the second memory module. The amount of data to be copied is thereby reduced, improving memory management efficiency and reducing processor power consumption.Type: ApplicationFiled: September 23, 2015Publication date: March 23, 2017Inventors: Philip Rogers, Benjamin T. Sander, Anthony Asaro, Gongxian Jeffrey Cheng
-
Publication number: 20170083455Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.Type: ApplicationFiled: September 22, 2015Publication date: March 23, 2017Inventors: Philip J. Rogers, Benjamin T. Sander, Anthony Asaro
-
Publication number: 20160378682Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.Type: ApplicationFiled: June 23, 2015Publication date: December 29, 2016Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
-
Publication number: 20160378674Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.Type: ApplicationFiled: June 23, 2015Publication date: December 29, 2016Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
-
Patent number: 9501269Abstract: A programming model for a processor accelerator allows accelerated functions to be called from a main program directly without a management API for the accelerator. A compiler automatically generates wrapper source code for each accelerator function called by the application source code. The wrapper code is compiled, together with the accelerator source code, to generate an object file that is linked to an object file for the main program. By automatically generating the wrapper code, a programmer can simply and directly invoke accelerator functions without the use of a complex management API. In addition, because the wrapper code for the accelerator is generated automatically, a standard compiler can be used to compile the main program, using standard linkage conventions.Type: GrantFiled: September 30, 2014Date of Patent: November 22, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Gregory P. Rodgers, Benjamin T. Sander, Shreyas Ramalingam
-
Publication number: 20160092181Abstract: A programming model for a processor accelerator allows accelerated functions to be called from a main program directly without a management API for the accelerator. A compiler automatically generates wrapper source code for each accelerator function called by the application source code. The wrapper code is compiled, together with the accelerator source code, to generate an object file that is linked to an object file for the main program. By automatically generating the wrapper code, a programmer can simply and directly invoke accelerator functions without the use of a complex management API. In addition, because the wrapper code for the accelerator is generated automatically, a standard compiler can be used to compile the main program, using standard linkage conventions.Type: ApplicationFiled: September 30, 2014Publication date: March 31, 2016Inventors: Gregory P. Rodgers, Benjamin T. Sander, Shreyas Ramalingam
-
Patent number: 8667225Abstract: A system and method for efficient data prefetching. A data stream stored in lower-level memory comprises a contiguous block of data used in a computer program. A prefetch unit in a processor detects a data stream by identifying a sequence of storage accesses referencing a contiguous blocks of data in a monotonically increasing or decreasing manner. After a predetermined training period for a given data stream, the prefetch unit prefetches a portion of the given data stream from memory without write permission, in response to an access that does not request write permission. Also, after the training period, the prefetch unit prefetches a portion of the given data stream from lower-level memory with write permission, in response to determining there has been a prior access to the given data stream that requests write permission subsequent to a number of cache misses reaching a predetermined threshold.Type: GrantFiled: September 11, 2009Date of Patent: March 4, 2014Assignee: Advanced Micro Devices, Inc.Inventors: Benjamin T. Sander, Bharath Narasimha Swamy, Swamy Punyamurtula
-
Publication number: 20130339978Abstract: A method and an apparatus for performing load balancing in a heterogeneous computing system including a plurality of processing elements are presented. A program places tasks into a queue. A task from the queue is distributed to one of the plurality of processing elements, wherein the distributing includes the one processing element sending a task request to the queue and receiving a task to be done from the queue. The task is performed by the one processing element. A result of the task is sent from the one processing element to the program. The load balancing is performed by distributing tasks from the queue to processing elements that complete the tasks faster.Type: ApplicationFiled: June 13, 2013Publication date: December 19, 2013Inventor: Benjamin T. Sander
-
Patent number: 8195887Abstract: A data processing device is disclosed that includes multiple processing cores, where each core is associated with a corresponding cache. When a processing core is placed into a first sleep mode, the data processing device initiates a first phase. If any cache probes are received at the processing core during the first phase, the cache probes are serviced. At the end of the first phase, the cache corresponding to the processing core is flushed, and subsequent cache probes are not serviced at the cache. Because it does not service the subsequent cache probes, the processing core can therefore enter another sleep mode, allowing the data processing device to conserve additional power.Type: GrantFiled: January 21, 2009Date of Patent: June 5, 2012Inventors: William A. Hughes, Kiran K. Bondalapati, Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Benjamin T. Sander
-
Publication number: 20120023314Abstract: A method and mechanism for reducing latency of a multi-cycle scheduler within a processor. A processor comprises a front end pipeline that determines data dependencies between instructions prior to a scheduling pipe stage. For each data dependency, a distance value is determined based on a number of instructions a younger dependent instruction is located from a corresponding older (in program order) instruction. When the younger dependent instruction is allocated an entry in a multi-cycle scheduler, this distance value may be used to locate an entry storing the older instruction in the scheduler. When the older instruction is picked for issue, the younger dependent instruction is marked as pre-picked. In an immediately subsequent clock cycle, the younger dependent instruction may be picked for issue, thereby reducing the latency of the multi-cycle scheduler.Type: ApplicationFiled: July 21, 2010Publication date: January 26, 2012Inventors: Matthew M. Crum, Michael D. Achenbach, Betty A. McDaniel, Benjamin T. Sander
-
Patent number: 7937569Abstract: A system and method for scheduling operations using speculative data operands. In one embodiment, a system may include a scheduler configured to store a speculative source tag and a non-speculative source tag for an operand of an operation and an execution core configured to execute operations issued by the scheduler and to output result tags identifying operands generated by executing the operations. The scheduler may be configured to determine whether the operation is ready to issue by comparing the speculative source tag, but not the non-speculative source tag, to the result tags output by the execution core unless an incorrect speculation has been detected. If an incorrect speculation has been detected, the scheduler may be configured to determine whether the operation is ready to issue by comparing the non-speculative source tag, but not the speculative source tag, to the result tags output by the execution core.Type: GrantFiled: May 5, 2004Date of Patent: May 3, 2011Assignee: Advanced Micro Devices, Inc.Inventors: Benjamin T. Sander, Brian D. McMinn
-
Publication number: 20110066811Abstract: A system and method for efficient data prefetching. A data stream stored in lower-level memory comprises a contiguous block of data used in a computer program. A prefetch unit in a processor detects a data stream by identifying a sequence of storage accesses referencing a contiguous blocks of data in a monotonically increasing or decreasing manner. After a predetermined training period for a given data stream, the prefetch unit prefetches a portion of the given data stream from memory without write permission, in response to an access that does not request write permission. Also, after the training period, the prefetch unit prefetches a portion of the given data stream from lower-level memory with write permission, in response to determining there has been a prior access to the given data stream that requests write permission subsequent to a number of cache misses reaching a predetermined threshold.Type: ApplicationFiled: September 11, 2009Publication date: March 17, 2011Inventors: Benjamin T. Sander, Bharath Narasimha Swamy, Swamy Punyamurtula
-
Publication number: 20100185820Abstract: A data processing device is disclosed that includes multiple processing cores, where each core is associated with a corresponding cache. When a processing core is placed into a first sleep mode, the data processing device initiates a first phase. If any cache probes are received at the processing core during the first phase, the cache probes are serviced. At the end of the first phase, the cache corresponding to the processing core is flushed, and subsequent cache probes are not serviced at the cache. Because it does not service the subsequent cache probes, the processing core can therefore enter another sleep mode, allowing the data processing device to conserve additional power.Type: ApplicationFiled: January 21, 2009Publication date: July 22, 2010Applicant: ADVANCED MICRO DEVICES, INC.Inventors: William A. Hughes, Kiran K. Bondalapati, Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Benjamin T. Sander