Patents Assigned to Advanced Micro Device, Inc.
-
Publication number: 20240103745Abstract: A memory controller coupled to a memory module receives both processing-in-memory (PIM) requests and memory requests from a host (e.g., a host processor). The memory controller issues PIM requests to one group of memory banks and concurrently issues memory requests to one or more other groups of memory banks. Accordingly, memory requests are performed on groups of memory banks that would otherwise be idle while PIM requests are performed on the one group of memory banks. Optionally, the memory controller coupled to the memory module also takes various actions when switching between operating in a PIM mode and a non-processing-in-memory mode to reduce or hide overhead when switching between the two modes.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Niti Madan, Johnathan Robert Alsop, Alexandru Dutu, Mahzabeen Islam, Yasuko Eckert, Nuwan S Jayasena
-
Publication number: 20240103739Abstract: Multi-level cell memory management techniques are described. In one example, the memory controller is configured to control whether a single-level cell operation or a multi-level cell operation to be used using different mapping schemes. The single-level cell operation, for instance, is usable to store a data word using two states whereas the multi-level cell operation is usable to store the data word by also using an intermediate state. In order to store the data word using two states, the memory controller is configurable to separate the data word across two word lines in the physical memory. In an implementation, use of the different operations and corresponding mapping schemes by the memory controller alternates between adjacent word lines in physical memory.Type: ApplicationFiled: September 25, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventor: SeyedMohammad SeyedzadehDelcheh
-
Publication number: 20240104015Abstract: In accordance with the described techniques for data compression and decompression for processing in memory, a page address is received by a processing in memory component that maps to a first location in memory where data of a page is maintained. The data of the page is compressed by the processing in memory component. Further, compressed data of the page is written by the processing in memory component to a compressed block device responsive to the compressed data satisfying one or more compressibility criteria. The compressed block device is a portion of the memory dedicated to storing data in a compressed form.Type: ApplicationFiled: September 26, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Kishore Punniyamurthy, Jagadish B Kotra
-
Publication number: 20240103730Abstract: In accordance with described techniques for reduction of parallel memory operation messages, a computing system or computing device includes a memory system that receives memory operation messages. A shared response component in the memory system receives responses to the memory operation messages, and identifies a set of the responses that are coalesceable. The shared response component then coalesces the set of the responses into a combined message for communication completion through a communication path in the memory system.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim
-
Publication number: 20240103879Abstract: Block data load with transpose techniques are described. In one example, an input is received, at a control unit, specifying an instruction to load a block of data to at least one memory module using a transpose operation. Responsive to the receiving the input by the control unit, the block of data is caused to be loaded to the at least one memory module by transposing the block of data to form a transposed block of data and storing the transposed block of data in the at least one memory.Type: ApplicationFiled: September 25, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Bin He, Michael John Mantor, Brian Emberling, Liang Huang, Chao Liu
-
Publication number: 20240103763Abstract: In accordance with the described techniques for bank-level parallelism for processing in memory, a plurality of commands are received for execution by a processing in memory component embedded in a memory. The memory includes a first bank and a second bank. The plurality of commands include a first stream of commands which cause the processing in memory component to perform operations that access the first bank and a second stream of commands which cause the processing in memory component to perform operations that access the second bank. A next row of the first bank that is to be accessed by the processing in memory component is identified. Further, a precharge command is scheduled to close a first row of the first bank and an activate command is scheduled to open the next row of the first bank in parallel with execution of the second stream of commands.Type: ApplicationFiled: September 27, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Mahzabeen Islam, Shaizeen Dilawarhusen Aga, Johnathan Robert Alsop, MOHAMED ASSEM ABD ELMOHSEN IBRAHIM, Nuwan S Jayasena
-
Publication number: 20240103897Abstract: Systems and methods are disclosed for managing diversified virtual memory by an engine. Techniques disclosed include receiving one or more request messages, each request message including a job descriptor that specifies an operation to be performed on a respective virtual memory space, processing the job descriptors by generating one or more commands for transmission to one or more virtual memory managers, and transmitting the one or more commands to the one or more virtual memory managers (VMMs) for processing.Type: ApplicationFiled: September 27, 2022Publication date: March 28, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Norman Vernon Douglas Stewart, Mihir Shaileshbhai Doctor, Omar Fakhri Ahmed
-
Publication number: 20240103860Abstract: Predicates for processing in memory is described. In accordance with the described techniques, a predicate instruction to compute a conditional value based on data stored in a memory is provided to a processing-in-memory component. A response that includes the conditional value computed by the processing-in-memory component is received, and the conditional value is stored in a predicate register. One or more conditional instructions are provided to the processing-in-memory component based on the conditional value stored in the predicate register.Type: ApplicationFiled: September 26, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Nuwan S. Jayasena
-
Publication number: 20240106438Abstract: An integrated circuit includes a power supply monitor, a clock generator, and a divider. The power supply monitor is operable to provide a trigger signal in response to a power supply voltage dropping below a threshold voltage. The clock generator is operable to provide a first clock signal having a frequency dependent on a value of a frequency control word, and to change the frequency of the first clock signal over time using a native slope in response to a change in the frequency control word. The divider is responsive to an assertion of the trigger signal to divide a frequency of the first clock signal by a divide value to provide a second clock signal.Type: ApplicationFiled: November 30, 2023Publication date: March 28, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Kaushik Mazumdar, Ashish Jain, Joyce Cheuk Wai Wong, Mikhail Rodionov
-
Publication number: 20240106782Abstract: In accordance with described techniques for filtered responses to memory operation messages, a computing system or computing device includes a memory system that receives messages. A filter component in the memory system receives the responses to the memory operation messages, and filters one or more of the responses based on a filterable condition. A tracking logic component tracks the one or more responses as filtered responses for communication completion.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, MOHAMED ASSEM ABD ELMOHSEN IBRAHIM
-
Patent number: 11941723Abstract: Systems, methods, and techniques dynamically utilize load balancing for workgroup assignments between a group of shader engines by a command processor of a graphics processing unit (GPU). Based on one or more commands received for execution, a plurality of workgroups is generated for assignment to a plurality of shader engines for processing, each shader engine including a respective quantity of active compute units. Each workgroup of the plurality of workgroups is dynamically assigned to a respective shader engine for execution based at least in part on indications of available resources respectively associated with each of the shader engines. In various embodiments, the indications of available resources may include physical parameters regarding each shader engine, as well as current status information regarding the processing of workgroups assigned to each shader engine.Type: GrantFiled: December 29, 2021Date of Patent: March 26, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Randy Ramsey, Yash Ukidave
-
Patent number: 11940858Abstract: A data fabric routes requests between the plurality of requestors and the plurality of responders. The data fabric includes a crossbar router, a coherent slave controller coupled to the crossbar router, and a probe filter coupled to the coherent slave controller and tracking the state of cached lines of memory. Power state control circuitry operates, responsive to detecting any of a plurality of designated conditions, to cause the probe filter to enter a retention low power state in which a clock signal to the probe filter is gated while power is maintained to the probe filter. Entering the retention low power state is performed when all in-process probe filter lookups are complete.Type: GrantFiled: October 25, 2022Date of Patent: March 26, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Benjamin Tsien, Amit P. Apte
-
Patent number: 11942953Abstract: A apparatus includes a reference signal generator, a droop detection circuit, a digital frequency-locked loop (DFLL), and a DFLL control circuit. The reference signal generator that receives a digital value and produces a pulse-density modulated signal based on the digital value. The droop detection circuit converts the pulse-density modulated signal to an analog signal, compares the analog signal to a monitored supply voltage, and responsive to detecting a droop of the monitored supply voltage below a designated value relative to the analog signal, produces a droop detection signal. The DFLL provides a clock signal for synchronizing circuitry within a domain of the monitored supply voltage. The DFLL control circuit, responsive to receiving the droop detection signal, causes the DFLL to slow the clock signal.Type: GrantFiled: December 21, 2021Date of Patent: March 26, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Kaushik Mazumdar, Joyce Cheuk Wai Wong, Naeem Ibrahim Ally, Stephen Victor Kosonocky
-
Publication number: 20240095517Abstract: Methods and devices are provided for processing data using a neural network. Activations from a previous layer of the neural network are received by a layer of the neural network. Weighted values, to be applied to values of elements of the activations, are determined based on a spatial correlation of the elements and a task error output by the layer. The weighted values are applied to the values of the elements and a combined error is determined based on the task error and the spatial correlation.Type: ApplicationFiled: September 20, 2022Publication date: March 21, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Mehdi Saeedi, Ian Charles Colbert, Ihab M. A. Amer
-
Publication number: 20240095184Abstract: Address translation service management techniques are described. These techniques are based on metadata that is usable to provide a hint as insight into memory access, and based on this, use of a translation lookaside buffer is optimized to control which entries are maintained in the queue and manage address translation requests.Type: ApplicationFiled: September 21, 2022Publication date: March 21, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Anthony Thomas Gutierrez
-
Publication number: 20240095180Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 23, 2022Publication date: March 21, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Gabriel H. Loh, Michael Estlick, Jay Fleischman, Michael J. Schulte, Bradford Beckmann, Yasuko Eckert
-
Patent number: 11935153Abstract: Data processing methods and devices are provided. A processing device comprises memory and a processor. The memory, which comprises a cache, is configured to store portions of data. The processor is configured to issue a store instruction to store one of the portions of data, provide identifying information associated with the one portion of data, compress the one portion of data; and store the compressed one portion of data across multiple lines of the cache using the identifying information. In an example, the one portion of data is a block of pixels and pixels and the processor is configured to request pixel data for a pixel of a compressed block of pixels, send additional requests for data for other pixels determined to belong to the compressed pixel block and provide an indication that the requests are for pixel data belonging to the compressed block of pixels.Type: GrantFiled: December 28, 2020Date of Patent: March 19, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Sergey Korobkov, Jimshed B. Mirza, Anthony Hung-Cheong Chan
-
Patent number: 11936382Abstract: An output clock frequency of an adaptive oscillator circuit changes in response to noise on an integrated circuit power supply line. The circuit features two identical delay lines which are separately connected to a regulated supply and a droopy supply. In response to noise on the droopy supply, the delay lines cause a change in the output clock frequency. The adaptive oscillator circuit slows down the output clock frequency when the droopy supply droops or falls below the regulated supply. The adaptive oscillator circuit clamps the output clock frequency at a level determined by the regulated supply when the droopy supply overshoots or swings above the regulated supply.Type: GrantFiled: June 27, 2019Date of Patent: March 19, 2024Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Joyce Cheuk Wai Wong, Dragoljub Ignjatovic, Mikhail Rodionov, Ljubisa Bajic, Stephen V. Kosonocky, Steven J. Kommrusch
-
Patent number: 11936616Abstract: A controller assigns variable length addresses to addressable elements that are connected to a network. The variable length addresses are determined based on probabilities that packets are addressed to the corresponding addressable element. The controller transmits, to the addressable elements via the network, a routing table indicating the variable length addresses assigned to the addressable elements. Routers or addressable elements receive the routing table and route one or more packets over the network to an addressable element using variable length addresses included in a header of the one or more packets.Type: GrantFiled: October 7, 2021Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventor: David A. Roberts
-
Patent number: 11934873Abstract: A first processing unit such as a graphics processing unit (GPU) pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.Type: GrantFiled: September 16, 2022Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Rex Eldon McCrary