Patents by Inventor Michael Mantor

Michael Mantor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250130767
    Abstract: The disclosed circuit can select micro-operations specifically for converting a value in a first number format to a second number format. The circuit can include micro-operations for various conversions between different number formats, including number formats of different floating-point precisions. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: October 21, 2024
    Publication date: April 24, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shubh Shah, Ashutosh Garg, Bin He, Michael Mantor, Shubra Marwaha, Subramaniam Maiyuran
  • Publication number: 20250130769
    Abstract: The disclosed circuit is configured to round a value in a first number format using a random value. Using the rounded value, the circuit can convert the rounded value to a second number format that has a lower precision than a precision of the first number format. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 22, 2023
    Publication date: April 24, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shubh Shah, Ashutosh Garg, Bin He, Michael Mantor, Shubra Marwaha, Subramaniam Maiyuran
  • Publication number: 20250130794
    Abstract: The disclosed processing circuit can perform an operation with a first operand having a first number format and a second operand having a second number format by directly using the first operand in the first number format and the second operand in the second number format to produce an output result. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 28, 2023
    Publication date: April 24, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shubh Shah, Ashutosh Garg, Bin He, Michael Mantor, Shubra Marwaha, Subramaniam Maiyuran
  • Publication number: 20250130774
    Abstract: The disclosed circuit can interpret a bit sequence as a value based on one of multiple floating point number formats in a bias mode indicated by a bias mode indicator. The circuit can and perform an operation using the value in the bias mode. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: December 22, 2023
    Publication date: April 24, 2025
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shubh Shah, Ashutosh Garg, Bin He, Michael Mantor, Shubra Marwaha, Subramaniam Maiyuran
  • Publication number: 20250117352
    Abstract: A processing system includes one or more accelerator units (AUs) each having a modular architecture. To this end, each AU includes a connection circuitry and one or more memory stacks disposed on the connection circuitry. Further, each AU includes one or more interposer dies each disposed on the connection circuitry such that each interposer die of the one or more interposer dies is communicatively coupled to a corresponding memory stack of the memory stacks via the connection circuitry. Further, each interposer die of each AU includes circuitry configured to concurrently support two or more types of compute dies.
    Type: Application
    Filed: October 9, 2024
    Publication date: April 10, 2025
    Inventors: Alan D. Smith, Michael Mantor, Mark Fowler, Vydhyanathan Kalyanasundharam, Samuel Naffziger
  • Publication number: 20250111578
    Abstract: Methods and systems are provided for generating a stylized representation of a non-player character (NPC) in a virtual environment. A multimodal plurality of inputs regarding characteristics of the NPC is received, which is processed to generate visual data representing the NPC's appearance and to generate behavior data representing the NPC's actions. The generated visual data and behavior data are adapted to a selected character model to create an adapted configuration model, which is used to generate rendering information for the NPC.
    Type: Application
    Filed: September 27, 2024
    Publication date: April 3, 2025
    Inventors: Karthik Mohan Kumar, Archana Ramalingam, Michael Mantor, Pedro Antonio Pena
  • Patent number: 12254527
    Abstract: A graphics processing unit (GPU) includes a plurality of programmable processing cores configured to process graphics primitives and corresponding data and a plurality of fixed-function hardware units. The plurality of processing cores and the plurality of fixed-function hardware units are configured to implement a configurable number of virtual pipelines to concurrently process different command flows. Each virtual pipeline includes a configurable number of fragments and an operational state of each virtual pipeline is specified by a different context. The configurable number of virtual pipelines can be modified from a first number to a second number that is different than the first number. An emulation of a fixed-function hardware unit can be instantiated on one or more of the graphics processing cores in response to detection of a bottleneck in a fixed-function hardware unit. One or more of the virtual pipelines can then be reconfigured to utilize the emulation instead of the fixed-function hardware unit.
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: March 18, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Timour T. Paltashev, Michael Mantor, Rex Eldon McCrary
  • Publication number: 20250068464
    Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.
    Type: Application
    Filed: November 8, 2024
    Publication date: February 27, 2025
    Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
  • Publication number: 20250068429
    Abstract: A Streaming Wave Coalescer (SWC) circuit stores a first set of state values associated with a first subset of threads of a first wave in a bin based on each of the first subset of threads including a first set of instructions to be executed. A second set of state values associated with a second subset of threads of a second wave is stored in the bin based on each of the second subset of threads including the first set of instructions to be executed and based on the first wave and the second wave both being associated with a hard key. A third wave is formed from the threads of the first subset and the second subset and is emitted for execution. As a result of reorganizing the threads and reconstituting a different wave, thread divergence of waves sent for execution is reduced.
    Type: Application
    Filed: December 12, 2023
    Publication date: February 27, 2025
    Inventors: John Stephen Junkins, Christopher J. Brennan, Ian Richard Beaumont, Kellie Marks, Matthaeus G. Chajdas, Max Oberberger, Michael John Bedy, Michael Mantor, Sean Keely
  • Patent number: 12217021
    Abstract: A parallel processing unit employs an arithmetic logic unit (ALU) having a relatively small footprint, thereby reducing the overall power consumption and circuit area of the processing unit. To support the smaller footprint, the ALU includes multiple stages to execute operations corresponding to a received instruction. The ALU executes at least one operation at a precision indicated by the received instruction, and then reduces the resulting data of the at least one operation to a smaller size before providing the results to another stage of the ALU to continue execution of the instruction.
    Type: Grant
    Filed: July 7, 2023
    Date of Patent: February 4, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Shubh Shah, Michael Mantor
  • Patent number: 12205218
    Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: January 21, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mark Leather, Michael Mantor
  • Publication number: 20240424407
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
  • Publication number: 20240428494
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
  • Publication number: 20240424398
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
  • Patent number: 12153957
    Abstract: A method for hierarchical work scheduling includes consuming a work item at a first scheduling domain having a local scheduler circuit and one or more workgroup processing elements. Consuming the work item produces a set of new work items. Subsequently, the local scheduler circuit distributes at least one new work item of the set of new work items to be executed locally at the first scheduling domain. If the local scheduler circuit of the first scheduling domain determines that the set of new work items includes one or more work items that would overload the first scheduling domain with work if scheduled for local execution, those work items are distributed to the next higher-level scheduler circuit in a scheduling domain hierarchy for redistribution to one or more other scheduling domains.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: November 26, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthaeus G. Chajdas, Christopher J. Brennan, Michael Mantor, Robert W. Martin, Nicolai Haehnle
  • Patent number: 12032487
    Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: July 9, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Michael Mantor
  • Publication number: 20240192994
    Abstract: Techniques for implementing accelerated draw indirect fetching are disclosed. A fetch accelerator enables streamlined data fetching by looping internally and filling a draw queue for a micro engine. By using a dedicated fetch accelerator rather than processing data fetches separately and individually using a conventional processor, significant processing overhead is eliminated and computational latency is reduced. Additionally, different types of aligned or unaligned data structures are usable with equivalent or nearly equivalent performance.
    Type: Application
    Filed: March 28, 2023
    Publication date: June 13, 2024
    Inventors: Alexander Fuad Ashkar, Michael Mantor, Rex Eldon McCrary, Yi Luo, Manu Rastogi, James Robert Klobcar
  • Publication number: 20240193844
    Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
    Type: Application
    Filed: December 8, 2022
    Publication date: June 13, 2024
    Inventors: Mark Fowler, Samuel Naffziger, Michael Mantor, Mark Leather
  • Patent number: 11995149
    Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: May 28, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor
  • Publication number: 20240168719
    Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general process register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
    Type: Application
    Filed: January 16, 2024
    Publication date: May 23, 2024
    Inventors: Bin HE, Brian EMBERLING, Mark LEATHER, Michael MANTOR