Patents by Inventor Narayan Kulshrestha

Narayan Kulshrestha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230266899
    Abstract: Various embodiments include a computer memory system that dynamically adjusts a memory device performance feature, such as dynamic assist control, dynamic turbo mode, and/or the like, to improve the performance of memory devices in the memory system. The memory system enables or disables the memory device performance feature based on the operating voltage relative to a threshold voltage. If the operating voltage crosses the threshold voltage in one direction, then the memory system enables the memory device performance feature. If the operating voltage crosses the threshold voltage in another direction, then the memory system disables the memory device performance feature. Various techniques enable the memory device performance feature to be employed even with complex integrated circuits that may include tens of thousands of devices that employ the memory device performance feature.
    Type: Application
    Filed: February 23, 2022
    Publication date: August 24, 2023
    Inventors: Anand Shanmugam SUNDARARAJAN, Narayan KULSHRESTHA, Ka Yun LEE, Brian SMITH, Madhukiran V. SWARNA, Ramachandiran V, Kevin WILDER
  • Patent number: 11106261
    Abstract: Integrated circuits, or computer chips, typically include multiple hardware components (e.g. memory, processors, etc.) operating under a shared power (e.g. thermal) constraint that is sourced by one or more power sources for the chip. Typically, the hardware components can be individually configured to operate at certain states (e.g. to operate at a certain frequency by setting a clock speed for a clock dedicated to the hardware component). Thus, each hardware component can be configured to operate at an operating point that is determined to be optimal, usually in terms of achieving some desired goal for a specific application (e.g. frame rates for gaming, etc.). In the context of chip hardware that operates under a shared power/thermal constraint, a method, computer readable medium, and system are provided for determining the optimal operating point for the chip that takes into consideration both performance of the chip and power consumption by the chip.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: August 31, 2021
    Assignee: NVIDIA CORPORATION
    Inventors: Aniket Naik, Siddharth Bhargav, Bardia Zandian, Narayan Kulshrestha, Amit Pabalkar, Arvind Gopalakrishnan, Justin Tai, Sachin Satish Idgunji
  • Patent number: 10957651
    Abstract: A die package is disclosed through which power domains within the chip may be isolated by removing vias within the package substrate, rather than power gating. Multiple substrate options may be configured without specific vias. This eliminates the need to design power gating circuitry into the die, freeing up that die area for more functional logic. The solution allows the die package to retain the same pinout for use by PCB designers, regardless of which power domains are gated.
    Type: Grant
    Filed: August 7, 2019
    Date of Patent: March 23, 2021
    Assignee: NVIDIA Corp.
    Inventors: Don Templeton, Luke Young Chang, Narayan Kulshrestha
  • Publication number: 20210043574
    Abstract: A die package is disclosed through which power domains within the chip may be isolated by removing vias within the package substrate, rather than power gating. Multiple substrate options may be configured without specific vias. This eliminates the need to design power gating circuitry into the die, freeing up that die area for more functional logic. The solution allows the die package to retain the same pinout for use by PCB designers, regardless of which power domains are gated.
    Type: Application
    Filed: August 7, 2019
    Publication date: February 11, 2021
    Applicant: NVIDIA Corp.
    Inventors: Don Templeton, Luke Young Chang, Narayan Kulshrestha
  • Publication number: 20200142466
    Abstract: Integrated circuits, or computer chips, typically include multiple hardware components (e.g. memory, processors, etc.) operating under a shared power (e.g. thermal) constraint that is sourced by one or more power sources for the chip. Typically, the hardware components can be individually configured to operate at certain states (e.g. to operate at a certain frequency by setting a clock speed for a clock dedicated to the hardware component). Thus, each hardware component can be configured to operate at an operating point that is determined to be optimal, usually in terms of achieving some desired goal for a specific application (e.g. frame rates for gaming, etc.). In the context of chip hardware that operates under a shared power/thermal constraint, a method, computer readable medium, and system are provided for determining the optimal operating point for the chip that takes into consideration both performance of the chip and power consumption by the chip.
    Type: Application
    Filed: November 2, 2018
    Publication date: May 7, 2020
    Inventors: Aniket Naik, Siddharth Bhargav, Bardia Zandian, Narayan Kulshrestha, Amit Pabalkar, Arvind Gopalakrishnan, Justin Tai, Sachin Satish Idgunji
  • Publication number: 20190163254
    Abstract: An optimized power saving technique is described for a processor, such as, for example, a graphics processing unit (GPU), which includes one or more processing cores and at least one data link interface. According to the technique, the processor is operable in a low power mode in which power to the at least one processing core is off and power to the at least one data link interface is on. This technique provides reduced exit latencies compared to currently available approaches in which the core power is turned off.
    Type: Application
    Filed: October 30, 2018
    Publication date: May 30, 2019
    Inventors: Thomas E. DEWEY, Narayan KULSHRESTHA, Ramachandiran V, Sachin IDGUNJI, Lordson YUE
  • Publication number: 20190163255
    Abstract: An optimized power saving technique is described for a processor, such as, for example, a graphics processing unit (GPU), which includes one or more processing cores and at least one data link interface. According to the technique, the processor is operable in a low power mode in which power to the at least one processing core is off and power to the at least one data link interface is on. This technique provides reduced exit latencies compared to currently available approaches in which the core power is turned off.
    Type: Application
    Filed: October 30, 2018
    Publication date: May 30, 2019
    Inventors: Thomas E. DEWEY, Narayan KULSHRESTHA, Ramachandiran V, Sachin IDGUNJI, Lordson YUE
  • Patent number: 9665920
    Abstract: One embodiment of the present invention sets forth a technique for distributing graphics commands and atomic commands to a color processing unit (CROP) in an efficient manner. The interleaving mechanism determines, at each clock cycle, which graphics command(s) or atomic command(s) is transmitted to the CROP based on different factors. First, the interleaving mechanism ensures that atomic commands or graphics commands associated with a multi-transaction command stream are processed together. Second, the interleaving mechanism selects consecutive graphics commands for transmission to the CROP that optimize the use of different memory caches. Third, the interleaving mechanism prioritizes atomic commands over graphics commands. At each clock cycle, the graphics command(s) or the atomic command(s) selected by the interleaving mechanism are transmitted to the CROP for processing.
    Type: Grant
    Filed: December 17, 2009
    Date of Patent: May 30, 2017
    Assignee: NVIDIA Corporation
    Inventors: Chad D. Walker, Rui M. Bastos, Narayan Kulshrestha
  • Patent number: 9530189
    Abstract: A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: December 27, 2016
    Assignee: NVIDIA CORPORATION
    Inventors: Jonathan Dunaisky, David Kirk McAllister, Steven E. Molnar, Narayan Kulshrestha, Rui Bastos, Joseph Detmer, William Craig McKnight
  • Patent number: 9406149
    Abstract: A system and method are described for compressing image data using a combination of compression methods. Compression method combinations are provided to compress image data of a particular frame buffer format and antialiasing mode. Each method in the compression method combination is tried in turn to compress the image data in a tile. The best method that succeeded in compressing the image data is encoded in the compression bit state associated with the tile. Together, the compression bits, the compression method combination, and the frame buffer format provide sufficient information to decompress a tile.
    Type: Grant
    Filed: October 7, 2010
    Date of Patent: August 2, 2016
    Assignee: NVIDIA Corporation
    Inventors: David Kirk McAllister, Narayan Kulshrestha, Steven E. Molnar
  • Patent number: 8624916
    Abstract: One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
    Type: Grant
    Filed: April 1, 2013
    Date of Patent: January 7, 2014
    Assignee: Nvidia Corporation
    Inventors: Narayan Kulshrestha, Adam Paul Dreyer, Chad D. Walker, Rui M. Bastos
  • Patent number: 8605104
    Abstract: One embodiment of the present invention sets forth a technique for compressing color data. Color data for a tile including multiple samples is compressed based on an equality comparison and a threshold comparison based on a programmable threshold value. The equality comparison is performed on a first portion of the color data that includes at least exponent and sign fields of floating point format values or high order bits of integer format values. The threshold comparison is performed on a second portion of the color data that includes mantissa fields of floating point format values or low order bits of integer format values. The equality comparison and threshold comparison are used to select either computed averages of the pixel components or the original color data as the output color data for the tile. When the threshold is set to zero, only tiles that can be compressed without loss are compressed.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: December 10, 2013
    Assignee: NVIDIA Corporation
    Inventors: David Kirk McAllister, Steven E. Molnar, Narayan Kulshrestha
  • Publication number: 20130293564
    Abstract: One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
    Type: Application
    Filed: April 1, 2013
    Publication date: November 7, 2013
    Inventors: Narayan KULSHRESTHA, Adam Paul DREYER, Chad D. WALKER, Rui M. BASTOS
  • Publication number: 20130249897
    Abstract: A method for compressing framebuffer data is presented. The method includes determining a reduction ratio for framebuffer data in a tile including multiple samples. The reduction ratio determined is independent of the sampling mode, where the sampling mode is the number of samples within each pixel in the tile. The method further includes comparing a first portion of the framebuffer data for each of the multiple samples to determine an equality comparison result and also comparing a second portion of the framebuffer data for each one of the multiple samples to compute per-channel differences for each one of the multiple samples and testing the per-channel differences against a threshold value to determine a threshold comparison result. Finally, the method comprises compressing the framebuffer data for the tile based on the reduction ratio, the equality comparison result and the threshold comparison result to produce output framebuffer data for the tile.
    Type: Application
    Filed: December 27, 2012
    Publication date: September 26, 2013
    Applicant: NVIDIA CORPORATION
    Inventors: Jonathan Dunaisky, David Kirk McAllister, Steven E. Molnar, Narayan Kulshrestha, Rui Bastos, Joseph Detmer, William Craig McKnight
  • Patent number: 8488890
    Abstract: One embodiment of the present invention sets forth a technique for compressing image data with high contrast between pixels within a tile and between samples within pixels without any data loss. Partial coverage layers are generated and written to a tile that includes multiple pixels without reading the existing image data that is stored for the tile. A partial coverage layer encodes image data, such as colors, and sub-pixel coverage information for each covered pixel in a tile. The use of partial coverage layers reduces the bandwidth used to store image data when a tile is not fully covered.
    Type: Grant
    Filed: June 11, 2010
    Date of Patent: July 16, 2013
    Assignee: Nvidia Corporation
    Inventors: David Kirk McAllister, Narayan Kulshrestha, Steven E. Molnar
  • Patent number: 8411103
    Abstract: One embodiment of the invention sets forth a CROP configured to perform both color raster operations and atomic transactions. Upon receiving an atomic transaction, the distribution unit within the CROP transmits a read request to the L2 cache for retrieving the destination operand. The distribution unit also transmits the source operands and the operation code to the latency buffer for storage until the destination operand is retrieved from the L2 cache. The processing pipeline transmits the operation code, the source and destination operands and an atomic flag to the blend unit for processing. The blend unit performs the atomic transaction on the source and destination operands based on the operation code and returns the result of the atomic transaction to the processing pipeline for storage in the internal cache. The processing pipeline writes the result of the atomic transaction to the L2 cache for storage at the memory location associated with the atomic transaction.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: April 2, 2013
    Assignee: Nvidia Corporation
    Inventors: Narayan Kulshrestha, Adam Paul Dreyer, Chad D. Walker, Rui M. Bastos
  • Publication number: 20110243469
    Abstract: A system and method are described for compressing image data using a combination of compression methods. Compression method combinations are provided to compress image data of a particular frame buffer format and antialiasing mode. Each method in the compression method combination is tried in turn to compress the image data in a tile. The best method that succeeded in compressing the image data is encoded in the compression bit state associated with the tile. Together, the compression bits, the compression method combination, and the frame buffer format provide sufficient information to decompress a tile.
    Type: Application
    Filed: October 7, 2010
    Publication date: October 6, 2011
    Inventors: David Kirk McAllister, Narayan Kulshrestha, Steven E. Molnar
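
The abstract of the last entry above (publication 20110243469, granted as patent 9406149) describes trying each method of a compression method combination on a tile and recording the best successful one in the tile's compression bits. The following is only an illustrative sketch of that idea, not the patented implementation. The tile representation (a flat list of integer sample values), the two toy methods, the bit encoding (0 means uncompressed, k means the k-th method in the combination), and all function names are assumptions made for this example.

```python
from typing import Callable, List, Optional, Tuple

Tile = List[int]  # assumption: one integer color value per sample
Compressor = Callable[[Tile], Optional[List[int]]]
Decompressor = Callable[[List[int], int], Tile]


def compress_constant(tile: Tile) -> Optional[List[int]]:
    """Hypothetical method: succeeds only if every sample is identical."""
    if tile and all(v == tile[0] for v in tile):
        return [tile[0]]
    return None


def decompress_constant(data: List[int], n: int) -> Tile:
    return [data[0]] * n


def compress_pairs(tile: Tile) -> Optional[List[int]]:
    """Hypothetical method: succeeds only if each adjacent pair is identical."""
    if len(tile) % 2 == 0 and all(
        tile[i] == tile[i + 1] for i in range(0, len(tile), 2)
    ):
        return tile[::2]
    return None


def decompress_pairs(data: List[int], n: int) -> Tile:
    return [v for v in data for _ in range(2)]


# Assumed "compression method combination" for one frame buffer format.
COMBINATION: List[Tuple[Compressor, Decompressor]] = [
    (compress_constant, decompress_constant),
    (compress_pairs, decompress_pairs),
]


def compress_tile(tile: Tile, combination=COMBINATION) -> Tuple[int, List[int]]:
    """Try each method in turn; keep the smallest successful result.

    Returns (compression_bits, data). Bits of 0 mean the tile is stored as-is.
    """
    best_bits, best_data = 0, list(tile)
    for bits, (compress, _) in enumerate(combination, start=1):
        data = compress(tile)
        if data is not None and len(data) < len(best_data):
            best_bits, best_data = bits, data
    return best_bits, best_data


def decompress_tile(bits: int, data: List[int], n: int,
                    combination=COMBINATION) -> Tile:
    """The bits plus the known combination are enough to pick the decoder."""
    if bits == 0:
        return list(data)
    _, decompress = combination[bits - 1]
    return decompress(data, n)
```

In this sketch, "best" is taken to mean the shortest output among the methods that succeed. The abstract does not say how the best method is ranked, so that criterion is also an assumption.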