Patents by Inventor Maxim V. KAZAKOV

Maxim V. KAZAKOV has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10991146
    Abstract: A processor receives a request to access one or more levels of a partially resident texture (PRT) resource. The levels represent a texture at different levels of detail (LOD) and the request includes normalized coordinates indicating a location in the texture. The processor accesses a texture descriptor that includes dimensions of a first level of the levels and one or more offsets between a reference level and one or more second levels that are associated with one or more residency maps that indicate texels that are resident in the PRT resource. The processor translates the normalized coordinates to texel coordinates in the one or more residency maps based on the offset and accesses, in response to the request, the one or more residency maps based on the texel coordinates to determine whether texture data indicated by the normalized coordinates is resident in the PRT resource.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: April 27, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Maxim V. Kazakov, Mark Fowler
  • Publication number: 20210026686
    Abstract: Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.
    Type: Application
    Filed: July 20, 2020
    Publication date: January 28, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Swapnil P. Sakharshete, Andrew S. Pomianowski, Maxim V. Kazakov, Vineet Goel, Milind N. Nemlekar, Skyler Jonathon Saleh
  • Patent number: 10783694
    Abstract: A pipeline is configured to access a memory that stores a texture block and metadata that encodes compression parameters of the texture block and a residency status of the texture block. A processor requests access to the metadata in conjunction with requesting data in the texture block to perform a shading operation. The pipeline selectively returns the data in the texture block to the processor depending on whether the metadata indicates that the texture block is resident in the memory. A cache can also be included to store a copy of the metadata that encodes the compression parameters of the texture block. The residency status and the metadata stored in the cache can be modified in response to requests to access the metadata stored in the cache.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: September 22, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Maxim V. Kazakov, Skyler J. Saleh, Ruijin Wu, Sagar Shankar Bhandare
  • Patent number: 10761992
    Abstract: A processor reduces bus bandwidth consumption by employing a shared load scheme, whereby each shared load retrieves data for multiple compute units (CUs) of a processor. Each CU in a specified group monitors a bus for load accesses directed to a cache shared by the multiple CUs. In response to identifying a load access on the bus, a CU determines if the load access is a shared load access for its share group. In response to identifying a shared load access for its share group, the CU allocates an entry of a private cache associated with the CU for data responsive to the shared load access. The CU then monitors the bus for the data targeted by the shared load. In response to identifying the targeted data on the bus, the CU stores the data at the allocated entry of the private cache.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 1, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Maxim V. Kazakov
  • Publication number: 20200250877
    Abstract: A processor receives a request to access one or more levels of a partially resident texture (PRT) resource. The levels represent a texture at different levels of detail (LOD) and the request includes normalized coordinates indicating a location in the texture. The processor accesses a texture descriptor that includes dimensions of a first level of the levels and one or more offsets between a reference level and one or more second levels that are associated with one or more residency maps that indicate texels that are resident in the PRT resource. The processor translates the normalized coordinates to texel coordinates in the one or more residency maps based on the offset and accesses, in response to the request, the one or more residency maps based on the texel coordinates to determine whether texture data indicated by the normalized coordinates is resident in the PRT resource.
    Type: Application
    Filed: December 20, 2019
    Publication date: August 6, 2020
    Inventors: Maxim V. KAZAKOV, Mark FOWLER
  • Publication number: 20200184002
    Abstract: A processing device is provided which includes memory configured to store data and a processor configured to determine, based on convolutional parameters associated with an image, a virtual general matrix-matrix multiplication (GEMM) space of a virtual GEMM space output matrix and generate, in the virtual GEMM space output matrix, a convolution result by matrix multiplying the data corresponding to a virtual GEMM space input matrix with the data corresponding to a virtual GEMM space filter matrix. The processing device also includes convolutional mapping hardware configured to map, based on the convolutional parameters, positions of the virtual GEMM space input matrix to positions of an image space of the image.
    Type: Application
    Filed: August 30, 2019
    Publication date: June 11, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Swapnil P. Sakharshete, Samuel Lawrence Wasmundt, Maxim V. Kazakov, Vineet Goel
  • Publication number: 20200133868
    Abstract: A processor reduces bus bandwidth consumption by employing a shared load scheme, whereby each shared load retrieves data for multiple compute units (CUs) of a processor. Each CU in a specified group monitors a bus for load accesses directed to a cache shared by the multiple CUs. In response to identifying a load access on the bus, a CU determines if the load access is a shared load access for its share group. In response to identifying a shared load access for its share group, the CU allocates an entry of a private cache associated with the CU for data responsive to the shared load access. The CU then monitors the bus for the data targeted by the shared load. In response to identifying the targeted data on the bus, the CU stores the data at the allocated entry of the private cache.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Inventor: Maxim V. KAZAKOV
  • Publication number: 20200133991
    Abstract: A processor sequences the application of submatrices at a matrix multiplier to reduce the number of input changes at an input register of the matrix multiplier. The matrix multiplier is configured to perform a matrix multiplication for a relatively small matrix. To multiply two larger matrices the GPU decomposes the larger matrices into smaller submatrices and stores the submatrices at input registers of the matrix multiplier in a sequence, thereby calculating each column of a result matrix. The GPU sequences the storage of the submatrices at the input registers to maintain input data at one of the input registers over multiple calculation cycles of the matrix multiplier thereby reducing power consumption at the GPU.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Inventors: Maxim V. KAZAKOV, Jian MAO
  • Patent number: 10558499
    Abstract: Footprints, or resource allocations, of waves within resources that are shared by processor cores in a multithreaded processor are measured concurrently with the waves executing on the processor cores. The footprints are averaged over a time interval. A number of waves are spawned and dispatched for execution in the multithreaded processor based on the average footprint. In some cases, the waves are spawned at a rate that is determined based on the average value of the footprints of waves within the resources. The rate of spawning waves is modified in response to a change in the average value of the footprints of the waves within the resources.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Maxim V. Kazakov, Michael Mantor
  • Patent number: 10540802
    Abstract: A processor receives a request to access one or more levels of a partially resident texture (PRT) resource. The levels represent a texture at different levels of detail (LOD) and the request includes normalized coordinates indicating a location in the texture. The processor accesses a texture descriptor that includes dimensions of a first level of the levels and one or more offsets between a reference level and one or more second levels that are associated with one or more residency maps that indicate texels that are resident in the PRT resource. The processor translates the normalized coordinates to texel coordinates in the one or more residency maps based on the offset and accesses, in response to the request, the one or more residency maps based on the texel coordinates to determine whether texture data indicated by the normalized coordinates is resident in the PRT resource.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: January 21, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Maxim V. Kazakov, Mark Fowler
  • Publication number: 20200004585
    Abstract: Techniques for executing shader programs with divergent control flow on a single instruction multiple data (“SIMD”) processor are disclosed. These techniques includes detecting entry into a divergent section of a shader program and, for the work-items that enter the divergent section, placing a task entry into a task queue associated with the target of each work-item. The target is the destination, in code, of any particular work-item, and is also referred to as a code segment herein. The task queues store task entries for code segments generated by different (or the same) wavefronts. A command processor examines task lists and schedules wavefronts for execution by grouping together tasks in the same task list into wavefronts and launching those wavefronts. By grouping tasks from different wavefronts together for execution in the same front, serialization of execution is greatly reduced or eliminated.
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Maxim V. Kazakov
  • Publication number: 20190371043
    Abstract: A modified bilinear filter and method for use in a texture processor system are described herein. The system includes a texture processor, which includes a texture address unit and a texture data unit. The texture data unit includes a bilinear filter. An application sends a texture instruction which is processed by a texture address unit to obtain at least a level of detail (LOD) map and texel data. The texture data unit generates modified texel inputs from the LOD map texel data and at least two weights in a texture space region. The bilinear filter applies the at least two weights to the modified texel inputs, where the modified texel inputs and weights prevent finer LOD values from leaking into an area of coarser LOD values.
    Type: Application
    Filed: May 30, 2018
    Publication date: December 5, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Maxim V. Kazakov
  • Publication number: 20190197761
    Abstract: A texture processor based ray tracing accelerator method and system are described. The system includes a shader, texture processor (TP) and cache, which are interconnected. The TP includes a texture address unit (TA), a texture cache processor (TCP), a filter pipeline unit and a ray intersection engine. The shader sends a texture instruction which contains ray data and a pointer to a bounded volume hierarchy (BVH) node to the TA. The TCP uses an address provided by the TA to fetch BVH node data from the cache. The ray intersection engine performs ray-BVH node type intersection testing using the ray data and the BVH node data. The intersection testing results and indications for BVH traversal are returned to the shader via a texture data return path. The shader reviews the intersection results and the indications to decide how to traverse to the next BVH node.
    Type: Application
    Filed: December 22, 2017
    Publication date: June 27, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Maxim V. Kazakov, Vineet Goel
  • Publication number: 20190129756
    Abstract: Footprints, or resource allocations, of waves within resources that are shared by processor cores in a multithreaded processor are measured concurrently with the waves executing on the processor cores. The footprints are averaged over a time interval. A number of waves are spawned and dispatched for execution in the multithreaded processor based on the average footprint. In some cases, the waves are spawned at a rate that is determined based on the average value of the footprints of waves within the resources. The rate of spawning waves is modified in response to a change in the average value of the footprints of the waves within the resources.
    Type: Application
    Filed: October 26, 2017
    Publication date: May 2, 2019
    Inventors: Maxim V. KAZAKOV, Michael MANTOR
  • Publication number: 20190066352
    Abstract: A pipeline is configured to access a memory that stores a texture block and metadata that encodes compression parameters of the texture block and a residency status of the texture block. A processor requests access to the metadata in conjunction with requesting data in the texture block to perform a shading operation. The pipeline selectively returns the data in the texture block to the processor depending on whether the metadata indicates that the texture block is resident in the memory. A cache can also be included to store a copy of the metadata that encodes the compression parameters of the texture block. The residency status and the metadata stored in the cache can be modified in response to requests to access the metadata stored in the cache.
    Type: Application
    Filed: August 25, 2017
    Publication date: February 28, 2019
    Inventors: Maxim V. KAZAKOV, Skyler J. SALEH, Ruijin WU, Sagar Shankar BHANDARE