Patents by Inventor Atul Kalambur

Atul Kalambur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9727521
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 8, 2017
    Assignee: NVIDIA Corporation
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Publication number: 20140082120
    Abstract: Techniques are disclosed for peer-to-peer data transfers where a source device receives a request to read data words from a target device. The source device creates a first and second read command for reading a first portion and a second portion of a plurality of data words from the target device, respectively. The source device transmits the first read command to the target device, and, before a first read operation associated with the first read command is complete, transmits the second read command to the target device. The first and second portions of the plurality of data words are stored in a first and second portion of a buffer memory, respectively. Advantageously, an arbitrary number of multiple read operations may be in progress at a given time without using multiple peer-to-peer memory buffers. Performance for large data block transfers is improved without consuming peer-to-peer memory buffers needed by other peer GPUs.
    Type: Application
    Filed: September 14, 2012
    Publication date: March 20, 2014
    Inventors: Dennis K. Ma, Karan Gupta, Lei Tian, Franck R. Diard, Praveen Jain, Wei-Je Huang, Atul Kalambur
  • Patent number: 8392667
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: March 5, 2013
    Assignee: NVIDIA Corporation
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
  • Publication number: 20100153658
    Abstract: Deadlocks are avoided by marking read requests issued by a parallel processor to system memory as “special.” Read completions associated with read requests marked as special are routed on virtual channel 1 of the PCIe bus. Data returning on virtual channel 1 cannot become stalled by write requests in virtual channel 0, thus avoiding a potential deadlock.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Inventors: Samuel H. Duncan, David B. Glasco, Wei-Je Huang, Atul Kalambur, Patrick R. Marchand, Dennis K. Ma
  • Patent number: 7237096
    Abstract: If a consumer instruction specifies a 64-bit source register comprised of results provided by two 32-bit producer instructions, the number of dependencies that must be tracked per source register can be decreased by transforming one or more of the 32-bit producer instructions so that rather than simply storing its result in a 32-bit destination register, the transformed instruction stores its result into a 64-bit logical register along with another 32-bit value held in another 32-bit register.
    Type: Grant
    Filed: April 5, 2004
    Date of Patent: June 26, 2007
    Assignee: Sun Microsystems, Inc.
    Inventors: Julian A. Prabhu, Atul Kalambur, Sudarshan Kadambi, Daniel L. Liebholz, Julie M. Staraitis
  • Patent number: 6954865
    Abstract: An integrated circuit that uses a functional unit that outputs one set of values when in a power saving mode is provided. The functional unit, generally pipelined, is capable of being in the power saving mode dependent on an instruction decode/issue unit, and when in the power saving mode, the functional unit, using power saving mode circuitry, outputs one set of values as seen by components external to the functional unit regardless of the state the functional unit is in when the functional unit is initially put in the power saving mode.
    Type: Grant
    Filed: June 18, 2002
    Date of Patent: October 11, 2005
    Assignee: Sun Microsystems, Inc.
    Inventors: Atul Kalambur, Michelle Wong
  • Publication number: 20030233593
    Abstract: An integrated circuit that uses a functional unit that outputs one set of values when in a power saving mode is provided. The functional unit, generally pipelined, is capable of being in the power saving mode dependent on an instruction decode/issue unit, and when in the power saving mode, the functional unit, using power saving mode circuitry, outputs one set of values as seen by components external to the functional unit regardless of the state the functional unit is in when the functional unit is initially put in the power saving mode.
    Type: Application
    Filed: June 18, 2002
    Publication date: December 18, 2003
    Inventors: Atul Kalambur, Michelle Wong
  • Patent number: 6549926
    Abstract: A Sweeney, Robertson, Tocher (SRT) divider for use in a computer system has recoding circuitry to recode the three most significant bits of the dividend into one-hot form as the dividend is loaded into a quotient/partial remainder register. With each clock, a partial remainder is generated also having its most significant three bits in one-hot form and the remaining bits in binary encoded form. The divider has several stages permitting it to generate several bits of quotient in each clock cycle. Each stage has circuitry for estimating a quotient digit, and for computing a partial remainder by subtracting the product of the quotient digit times the divisor from either the dividend or a previous partial remainder. This subtraction is performed upon a one-hot code in the most significant bits and in binary code on the least significant bits. The divider also has circuitry for assembling a plurality of quotient digits into a quotient.
    Type: Grant
    Filed: October 26, 1999
    Date of Patent: April 15, 2003
    Assignee: Sun Microsystems, Inc.
    Inventors: Atul Kalambur, Srinivasa Gopaladhine
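
The read scheme summarized in the abstract of patent 9727521 above (several read commands in flight at once, each filling its own portion of a single buffer) can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `make_read_commands` and `peer_read`, the fixed `chunk` size, and the shuffled completion order are all hypothetical and chosen only for illustration.

```python
import random


def make_read_commands(start, count, chunk):
    """Split one request for `count` words at `start` into read commands.

    Each command is (target_offset, buffer_offset, length). The chunking
    policy is an assumption of this sketch, not taken from the patent.
    """
    commands = []
    done = 0
    while done < count:
        length = min(chunk, count - done)
        commands.append((start + done, done, length))
        done += length
    return commands


def peer_read(target_memory, start, count, chunk, seed=0):
    """Toy model: issue every read command before any completes.

    Completions are applied in a shuffled order to mimic reads finishing
    out of order. Each one writes only its own portion of the single
    shared buffer, so no per-command buffer is needed.
    """
    buffer = [None] * count
    in_flight = make_read_commands(start, count, chunk)  # all issued up front
    outstanding = len(in_flight)

    rng = random.Random(seed)
    rng.shuffle(in_flight)  # hypothetical out-of-order completion
    for target_off, buf_off, length in in_flight:
        words = target_memory[target_off:target_off + length]
        buffer[buf_off:buf_off + length] = words
    return buffer, outstanding


if __name__ == "__main__":
    target = list(range(100, 132))  # stand-in for the target device's memory
    data, outstanding = peer_read(target, start=4, count=10, chunk=4)
    print(outstanding, data == target[4:14])
```

Because every command carries its own buffer offset, the order in which completions arrive does not affect the final buffer contents, which is the property the sketch is meant to show.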