Patents Assigned to Advanced Micro Devics, Inc.
-
Publication number: 20240103719Abstract: Generating optimization instructions for data processing pipelines is described. A pipeline optimization system computes resource usage information that describes memory and compute usage metrics during execution of each stage of the data processing pipeline. The system additionally generates data storage information that describes how data output by each pipeline stage is utilized by other stages of the pipeline. The pipeline optimization system then generates the optimization instructions to control how memory operations are performed for a specific data processing pipeline during execution. In implementations, the optimization instructions cause a memory system to discard data (e.g., invalidate cache entries) without copying the discarded data to another storage location after the data is no longer needed by the pipeline. The optimization instructions alternatively or additionally control at least one of evicting, writing-back, or prefetching data to minimize latency during pipeline execution.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Harris Eleftherios Gasparakis
-
Publication number: 20240103739Abstract: Multi-level cell memory management techniques are described. In one example, the memory controller is configured to control whether a single-level cell operation or a multi-level cell operation to be used using different mapping schemes. The single-level cell operation, for instance, is usable to store a data word using two states whereas the multi-level cell operation is usable to store the data word by also using an intermediate state. In order to store the data word using two states, the memory controller is configurable to separate the data word across two word lines in the physical memory. In an implementation, use of the different operations and corresponding mapping schemes by the memory controller alternates between adjacent word lines in physical memory.Type: ApplicationFiled: September 25, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventor: SeyedMohammad SeyedzadehDelcheh
-
Publication number: 20240104015Abstract: In accordance with the described techniques for data compression and decompression for processing in memory, a page address is received by a processing in memory component that maps to a first location in memory where data of a page is maintained. The data of the page is compressed by the processing in memory component. Further, compressed data of the page is written by the processing in memory component to a compressed block device responsive to the compressed data satisfying one or more compressibility criteria. The compressed block device is a portion of the memory dedicated to storing data in a compressed form.Type: ApplicationFiled: September 26, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Kishore Punniyamurthy, Jagadish B Kotra
-
Publication number: 20240103745Abstract: A memory controller coupled to a memory module receives both processing-in-memory (PIM) requests and memory requests from a host (e.g., a host processor). The memory controller issues PIM requests to one group of memory banks and concurrently issues memory requests to one or more other groups of memory banks. Accordingly, memory requests are performed on groups of memory banks that would otherwise be idle while PIM requests are performed on the one group of memory banks. Optionally, the memory controller coupled to the memory module also takes various actions when switching between operating in a PIM mode and a non-processing-in-memory mode to reduce or hide overhead when switching between the two modes.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Niti Madan, Johnathan Robert Alsop, Alexandru Dutu, Mahzabeen Islam, Yasuko Eckert, Nuwan S Jayasena
-
Publication number: 20240106438Abstract: An integrated circuit includes a power supply monitor, a clock generator, and a divider. The power supply monitor is operable to provide a trigger signal in response to a power supply voltage dropping below a threshold voltage. The clock generator is operable to provide a first clock signal having a frequency dependent on a value of a frequency control word, and to change the frequency of the first clock signal over time using a native slope in response to a change in the frequency control word. The divider is responsive to an assertion of the trigger signal to divide a frequency of the first clock signal by a divide value to provide a second clock signal.Type: ApplicationFiled: November 30, 2023Publication date: March 28, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Kaushik Mazumdar, Ashish Jain, Joyce Cheuk Wai Wong, Mikhail Rodionov
-
Publication number: 20240103763Abstract: In accordance with the described techniques for bank-level parallelism for processing in memory, a plurality of commands are received for execution by a processing in memory component embedded in a memory. The memory includes a first bank and a second bank. The plurality of commands include a first stream of commands which cause the processing in memory component to perform operations that access the first bank and a second stream of commands which cause the processing in memory component to perform operations that access the second bank. A next row of the first bank that is to be accessed by the processing in memory component is identified. Further, a precharge command is scheduled to close a first row of the first bank and an activate command is scheduled to open the next row of the first bank in parallel with execution of the second stream of commands.Type: ApplicationFiled: September 27, 2022Publication date: March 28, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Mahzabeen Islam, Shaizeen Dilawarhusen Aga, Johnathan Robert Alsop, MOHAMED ASSEM ABD ELMOHSEN IBRAHIM, Nuwan S Jayasena
-
Publication number: 20240106813Abstract: A method and system for distributing keys in a key distribution system includes receiving a connection for communication from a first component. A determination is made whether the first component requires a key be generated and distributed. Based upon a security mode for the communication, the key generated and distributed to the first component.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Norman Vernon Douglas Stewart, Mihir Shaileshbhai Doctor, Omar Fakhri Ahmed, Hemaprabhu Jayanna, John Traver
-
Publication number: 20240104844Abstract: Devices and methods for multi-resolution geometric representation for ray tracing are described which include casting a ray in a space comprising objects represented by geometric shapes and approximating a volume of the geometric shapes using an accelerated hierarchy structure. The accelerated hierarchy structure comprises first nodes each representing a volume of one of the geometric shapes in the space and second nodes each representing an approximate volume of a group of the geometric shapes. When the ray is determined to intersect a bounding box of a second node representing one group of the geometric shapes, a selection is made between traversal and non-traversal of other second nodes based on a LOD for representing the volume of the one group of geometric shapes.Type: ApplicationFiled: September 28, 2022Publication date: March 28, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Sho Ikeda, Paritosh Vijay Kulkarni, Takahiro Harada
-
Patent number: 11940858Abstract: A data fabric routes requests between the plurality of requestors and the plurality of responders. The data fabric includes a crossbar router, a coherent slave controller coupled to the crossbar router, and a probe filter coupled to the coherent slave controller and tracking the state of cached lines of memory. Power state control circuitry operates, responsive to detecting any of a plurality of designated conditions, to cause the probe filter to enter a retention low power state in which a clock signal to the probe filter is gated while power is maintained to the probe filter. Entering the retention low power state is performed when all in-process probe filter lookups are complete.Type: GrantFiled: October 25, 2022Date of Patent: March 26, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Benjamin Tsien, Amit P. Apte
-
Patent number: 11941723Abstract: Systems, methods, and techniques dynamically utilize load balancing for workgroup assignments between a group of shader engines by a command processor of a graphics processing unit (GPU). Based on one or more commands received for execution, a plurality of workgroups is generated for assignment to a plurality of shader engines for processing, each shader engine including a respective quantity of active compute units. Each workgroup of the plurality of workgroups is dynamically assigned to a respective shader engine for execution based at least in part on indications of available resources respectively associated with each of the shader engines. In various embodiments, the indications of available resources may include physical parameters regarding each shader engine, as well as current status information regarding the processing of workgroups assigned to each shader engine.Type: GrantFiled: December 29, 2021Date of Patent: March 26, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Randy Ramsey, Yash Ukidave
-
Patent number: 11942953Abstract: A apparatus includes a reference signal generator, a droop detection circuit, a digital frequency-locked loop (DFLL), and a DFLL control circuit. The reference signal generator that receives a digital value and produces a pulse-density modulated signal based on the digital value. The droop detection circuit converts the pulse-density modulated signal to an analog signal, compares the analog signal to a monitored supply voltage, and responsive to detecting a droop of the monitored supply voltage below a designated value relative to the analog signal, produces a droop detection signal. The DFLL provides a clock signal for synchronizing circuitry within a domain of the monitored supply voltage. The DFLL control circuit, responsive to receiving the droop detection signal, causes the DFLL to slow the clock signal.Type: GrantFiled: December 21, 2021Date of Patent: March 26, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Kaushik Mazumdar, Joyce Cheuk Wai Wong, Naeem Ibrahim Ally, Stephen Victor Kosonocky
-
Publication number: 20240095517Abstract: Methods and devices are provided for processing data using a neural network. Activations from a previous layer of the neural network are received by a layer of the neural network. Weighted values, to be applied to values of elements of the activations, are determined based on a spatial correlation of the elements and a task error output by the layer. The weighted values are applied to the values of the elements and a combined error is determined based on the task error and the spatial correlation.Type: ApplicationFiled: September 20, 2022Publication date: March 21, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Mehdi Saeedi, Ian Charles Colbert, Ihab M. A. Amer
-
Publication number: 20240095184Abstract: Address translation service management techniques are described. These techniques are based on metadata that is usable to provide a hint as insight into memory access, and based on this, use of a translation lookaside buffer is optimized to control which entries are maintained in the queue and manage address translation requests.Type: ApplicationFiled: September 21, 2022Publication date: March 21, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Anthony Thomas Gutierrez
-
Publication number: 20240095180Abstract: The disclosed computer-implemented method for interpolating register-based lookup tables can include identifying, within a set of registers, a lookup table that has been encoded for storage within the set of registers. The method can also include receiving a request to look up a value in the lookup table and responding to the request by interpolating, from the encoded lookup table stored in the set of registers, a representation of the requested value. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 23, 2022Publication date: March 21, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Gabriel H. Loh, Michael Estlick, Jay Fleischman, Michael J. Schulte, Bradford Beckmann, Yasuko Eckert
-
Patent number: 11934251Abstract: A memory controller couples to a data fabric clock domain, and to a physical layer interface circuit PHY clock domain. A first interface circuit adapts transfers between the data fabric clock domain (FCLK) and the memory controllers clock domain, and a second interface circuit couples the memory controller to the PHY clock domain. A power controller responds to a power state change request by sending commands to the second interface circuit to change parameters of a memory system and to update a set of timing parameters of the memory controller according to a selected power state of a plurality of power states. The power controller further responds to a request to synchronize with a new frequency on the FCLK domain by changing a set of timing parameters of the clock interface circuit without changing the set of timing parameters of the memory system or the selected power state.Type: GrantFiled: March 31, 2021Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventors: James R. Magro, Christopher Weaver, Abhishek Kumar Verma
-
Patent number: 11935153Abstract: Data processing methods and devices are provided. A processing device comprises memory and a processor. The memory, which comprises a cache, is configured to store portions of data. The processor is configured to issue a store instruction to store one of the portions of data, provide identifying information associated with the one portion of data, compress the one portion of data; and store the compressed one portion of data across multiple lines of the cache using the identifying information. In an example, the one portion of data is a block of pixels and pixels and the processor is configured to request pixel data for a pixel of a compressed block of pixels, send additional requests for data for other pixels determined to belong to the compressed pixel block and provide an indication that the requests are for pixel data belonging to the compressed block of pixels.Type: GrantFiled: December 28, 2020Date of Patent: March 19, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Sergey Korobkov, Jimshed B. Mirza, Anthony Hung-Cheong Chan
-
Patent number: 11936382Abstract: An output clock frequency of an adaptive oscillator circuit changes in response to noise on an integrated circuit power supply line. The circuit features two identical delay lines which are separately connected to a regulated supply and a droopy supply. In response to noise on the droopy supply, the delay lines cause a change in the output clock frequency. The adaptive oscillator circuit slows down the output clock frequency when the droopy supply droops or falls below the regulated supply. The adaptive oscillator circuit clamps the output clock frequency at a level determined by the regulated supply when the droopy supply overshoots or swings above the regulated supply.Type: GrantFiled: June 27, 2019Date of Patent: March 19, 2024Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Joyce Cheuk Wai Wong, Dragoljub Ignjatovic, Mikhail Rodionov, Ljubisa Bajic, Stephen V. Kosonocky, Steven J. Kommrusch
-
Patent number: 11934331Abstract: Systems, apparatuses, and methods for dynamically selecting between wired and wireless interconnects for sending packets are disclosed. A system includes at least a hybrid communication engine and a plurality of interconnects for connecting to various end-points. The communication engine dynamically discovers and utilizes the best interconnect technology available in between given end-points. The communication engine dynamically chooses the physical interconnect that is best suited at any given time to send data from one source to one or multiple destinations. This communication can be either on-chip or across nodes. The communication engine makes a decision based on a set of predetermined parameters that can be re-adjusted by the application layer, such as latency of the transmission, message data size, physical distance from source to destination, the energy cost, and the current congestion on the alternative interconnects.Type: GrantFiled: September 30, 2019Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Sergey Blagodurov, Antonio Maria Franques Garcia
-
Patent number: 11934873Abstract: A first processing unit such as a graphics processing unit (GPU) pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.Type: GrantFiled: September 16, 2022Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Rex Eldon McCrary
-
Patent number: 11936616Abstract: A controller assigns variable length addresses to addressable elements that are connected to a network. The variable length addresses are determined based on probabilities that packets are addressed to the corresponding addressable element. The controller transmits, to the addressable elements via the network, a routing table indicating the variable length addresses assigned to the addressable elements. Routers or addressable elements receive the routing table and route one or more packets over the network to an addressable element using variable length addresses included in a header of the one or more packets.Type: GrantFiled: October 7, 2021Date of Patent: March 19, 2024Assignee: Advanced Micro Devices, Inc.Inventor: David A. Roberts