Patents by Inventor Nikolay Sakharnykh

Nikolay Sakharnykh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230229630
    Abstract: Apparatuses, systems, and techniques to decompress data in parallel. In at least one embodiment, decompressing a variable-length-coded data stream speculatively decodes overlapping portions of said data stream to determine locations to begin correctly decoding said data stream.
    Type: Application
    Filed: January 19, 2022
    Publication date: July 20, 2023
    Inventors: Eyal Soha, Elias Stehle, Nikolay Sakharnykh
  • Publication number: 20230214225
    Abstract: Described approaches provide for effectively and scalably using multiple GPUs to build and probe hash tables and materialize results of probes. Random memory accesses by the GPUs to build and/or probe a hash table may be distributed across GPUs and executed concurrently using global location identifiers. A global location identifier may be computed from data of an entry and identify a global location for an insertion and/or probe using the entry. The global location identifier may be used by a GPU to determine whether to perform an insertion or probe using an entry and/or where the insertion or probe is to be performed. To coordinate GPUs in materializing results of probing a hash table, a global offset to the global output buffer may be maintained in memory accessible to each of the GPUs, or the GPUs may compute global offsets using an exclusive sum of the local output buffer sizes.
    Type: Application
    Filed: March 13, 2023
    Publication date: July 6, 2023
    Inventors: Tim Kaldewey, Jiri Johannes Kraus, Nikolay Sakharnykh
  • Patent number: 11604654
    Abstract: Described approaches provide for effectively and scalably using multiple GPUs to build and probe hash tables and materialize results of probes. Random memory accesses by the GPUs to build and/or probe a hash table may be distributed across GPUs and executed concurrently using global location identifiers. A global location identifier may be computed from data of an entry and identify a global location for an insertion and/or probe using the entry. The global location identifier may be used by a GPU to determine whether to perform an insertion or probe using an entry and/or where the insertion or probe is to be performed. To coordinate GPUs in materializing results of probing a hash table, a global offset to the global output buffer may be maintained in memory accessible to each of the GPUs, or the GPUs may compute global offsets using an exclusive sum of the local output buffer sizes.
    Type: Grant
    Filed: October 17, 2019
    Date of Patent: March 14, 2023
    Assignee: NVIDIA Corporation
    Inventors: Tim Kaldewey, Jiri Johannes Kraus, Nikolay Sakharnykh
  • Publication number: 20200125368
    Abstract: Described approaches provide for effectively and scalably using multiple GPUs to build and probe hash tables and materialize results of probes. Random memory accesses by the GPUs to build and/or probe a hash table may be distributed across GPUs and executed concurrently using global location identifiers. A global location identifier may be computed from data of an entry and identify a global location for an insertion and/or probe using the entry. The global location identifier may be used by a GPU to determine whether to perform an insertion or probe using an entry and/or where the insertion or probe is to be performed. To coordinate GPUs in materializing results of probing a hash table, a global offset to the global output buffer may be maintained in memory accessible to each of the GPUs, or the GPUs may compute global offsets using an exclusive sum of the local output buffer sizes.
    Type: Application
    Filed: October 17, 2019
    Publication date: April 23, 2020
    Inventors: Tim Kaldewey, Jiri Johannes Kraus, Nikolay Sakharnykh
  • Patent number: 9418400
    Abstract: Systems and methods for rendering a depth-of-field visual effect on images with high computing efficiency and performance. A diffusion blurring process and a Fast Fourier Transform (FFT)-based convolution are combined to achieve a high-fidelity depth-of-field visual effect with Bokeh spots in real-time applications. The brightest regions in the background of an original image are enhanced with a Bokeh effect by virtue of FFT convolution with a convolution kernel. A diffusion solver can be used to blur the background of the original image. By blending the Bokeh spots with the image with a gradually blurred background, a resultant image can present an enhanced depth-of-field visual effect. The FFT-based convolution can be computed with multi-threaded parallelism.
    Type: Grant
    Filed: June 18, 2013
    Date of Patent: August 16, 2016
    Assignee: NVIDIA Corporation
    Inventors: Nikolay Sakharnykh, Holger Gruen
  • Publication number: 20140368494
    Abstract: Systems and methods for rendering a depth-of-field visual effect on images with high computing efficiency and performance. A diffusion blurring process and a Fast Fourier Transform (FFT)-based convolution are combined to achieve a high-fidelity depth-of-field visual effect with Bokeh spots in real-time applications. The brightest regions in the background of an original image are enhanced with a Bokeh effect by virtue of FFT convolution with a convolution kernel. A diffusion solver can be used to blur the background of the original image. By blending the Bokeh spots with the image with a gradually blurred background, a resultant image can present an enhanced depth-of-field visual effect. The FFT-based convolution can be computed with multi-threaded parallelism.
    Type: Application
    Filed: June 18, 2013
    Publication date: December 18, 2014
    Inventors: Nikolay Sakharnykh, Holger Gruen
  • Publication number: 20140257769
    Abstract: Systems and methods for molecular dynamics (MD) simulation with significantly increased multithreaded parallelism. A substance body is divided into a plurality of cells. With respect to a current center cell, its neighbor particles can be partitioned into groups, with the groups processed in sequence by a dedicated cooperative thread array (CTA) that comprises a plurality of warps. Within each CTA, each warp is assigned to a center particle in the center cell and calculates, in parallel with the other warps, the interaction forces between that center particle and the group of neighbor particles. Moreover, different levels of the memory hierarchy in a system, including local memories, shared memories, and global memory, are used to store intermediate and final results, respectively.
    Type: Application
    Filed: July 25, 2013
    Publication date: September 11, 2014
    Applicant: NVIDIA Corporation
    Inventor: Nikolay Sakharnykh
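
Illustrative sketch: the multi-GPU hash table abstracts above (patent 11604654 and its related applications) mention two mechanisms: a global location identifier computed from the data of an entry, and global output offsets computed as an exclusive sum of the local output buffer sizes. The Python below is a minimal CPU-side sketch of those two ideas only, not the patented implementation. The function names, the CRC32 hash, and the contiguous split of buckets across GPUs are all assumptions made for illustration; keys are assumed to be non-negative integers below 2**64.

```python
import zlib


def global_location(key: int, num_buckets: int) -> int:
    """Hypothetical global location identifier: a bucket index into the
    whole (multi-GPU) hash table, derived only from the entry's key."""
    digest = zlib.crc32(key.to_bytes(8, "little"))
    return digest % num_buckets


def owner_gpu(location: int, num_buckets: int, num_gpus: int) -> int:
    """Assumed partitioning: buckets are split into contiguous ranges,
    one range per GPU, so the owner follows from the global location."""
    buckets_per_gpu = -(-num_buckets // num_gpus)  # ceiling division
    return location // buckets_per_gpu


def global_offsets(local_sizes: list[int]) -> list[int]:
    """Exclusive sum of per-GPU local output sizes: element i is where
    GPU i starts writing in the shared global output buffer."""
    offsets = []
    running = 0
    for size in local_sizes:
        offsets.append(running)
        running += size
    return offsets


if __name__ == "__main__":
    num_buckets, num_gpus = 1024, 4
    keys = [7, 42, 1000, 123456]

    # Each simulated GPU keeps only the entries whose global location it owns.
    for gpu in range(num_gpus):
        mine = [k for k in keys
                if owner_gpu(global_location(k, num_buckets),
                             num_buckets, num_gpus) == gpu]
        print(f"GPU {gpu} handles keys {mine}")

    # Suppose the GPUs produced 3, 0, 5 and 2 probe results locally.
    print(global_offsets([3, 0, 5, 2]))
```

With the exclusive sum, each GPU can copy its local results into a disjoint slice of the global output buffer without further coordination; the alternative named in the abstract, a single global offset kept in memory accessible to every GPU, is not shown here.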