Patents by Inventor William J. Dally

William J. Dally has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11847550
    Abstract: A method, computer program product, and system perform computations using a processor. A first instruction including a first index vector operand and a second index vector operand is received and the first index vector operand is decoded to produce first coordinate sets for a first array, each first coordinate set including at least a first coordinate and a second coordinate of a position of a non-zero element in the first array. The second index vector operand is decoded to produce second coordinate sets for a second array, each second coordinate set including at least a third coordinate and a fourth coordinate of a position of a non-zero element in the second array. The first coordinate sets are summed with the second coordinate sets to produce output coordinate sets and the output coordinate sets are converted into a set of linear indices.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: December 19, 2023
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Patent number: 11226820
    Abstract: Systems and methods for managing context switches among threads in a processing system. A processor may perform a context switch between threads using separate context registers. A context switch allows a processor to switch from processing a thread that is waiting for data to one that is ready for additional processing. The processor includes control registers with entries which may indicate that an associated context is waiting for data from an external source.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: January 18, 2022
    Assignee: ARM Finance Overseas Limited
    Inventors: Robert Gelinas, W. Patrick Hays, Sol Katzman, William J. Dally
  • Patent number: 10997496
    Abstract: A method, computer program product, and system perform computations using a sparse convolutional neural network accelerator. Compressed-sparse data is received for input to a processing element, wherein the compressed-sparse data encodes non-zero elements and corresponding multi-dimensional positions. The non-zero elements are processed in parallel by the processing element to produce a plurality of result values. The corresponding multi-dimensional positions are processed in parallel by the processing element to produce destination addresses for each result value in the plurality of result values. Each result value is transmitted to a destination accumulator associated with the destination address for the result value.
    Type: Grant
    Filed: March 14, 2017
    Date of Patent: May 4, 2021
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Publication number: 20210089864
    Abstract: A method, computer program product, and system perform computations using a processor. A first instruction including a first index vector operand and a second index vector operand is received and the first index vector operand is decoded to produce first coordinate sets for a first array, each first coordinate set including at least a first coordinate and a second coordinate of a position of a non-zero element in the first array. The second index vector operand is decoded to produce second coordinate sets for a second array, each second coordinate set including at least a third coordinate and a fourth coordinate of a position of a non-zero element in the second array. The first coordinate sets are summed with the second coordinate sets to produce output coordinate sets and the output coordinate sets are converted into a set of linear indices.
    Type: Application
    Filed: December 4, 2020
    Publication date: March 25, 2021
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Patent number: 10891538
    Abstract: A method, computer program product, and system perform computations using a processor. A first instruction including a first index vector operand and a second index vector operand is received and the first index vector operand is decoded to produce first coordinate sets for a first array, each first coordinate set including at least a first coordinate and a second coordinate of a position of a non-zero element in the first array. The second index vector operand is decoded to produce second coordinate sets for a second array, each second coordinate set including at least a third coordinate and a fourth coordinate of a position of a non-zero element in the second array. The first coordinate sets are summed with the second coordinate sets to produce output coordinate sets and the output coordinate sets are converted into a set of linear indices.
    Type: Grant
    Filed: July 25, 2017
    Date of Patent: January 12, 2021
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Patent number: 10860922
    Abstract: A method, computer program product, and system perform computations using a sparse convolutional neural network accelerator. A first vector comprising only non-zero weight values and first associated positions of the non-zero weight values within a 3D space is received. A second vector comprising only non-zero input activation values and second associated positions of the non-zero input activation values within a 2D space is received. The non-zero weight values are multiplied with the non-zero input activation values, within a multiplier array, to produce a third vector of products. The first associated positions are combined with the second associated positions to produce a fourth vector of positions, where each position in the fourth vector is associated with a respective product in the third vector. The products in the third vector are transmitted to adders in an accumulator array, based on the position associated with each one of the products.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: December 8, 2020
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Publication number: 20200082254
    Abstract: A method, computer program product, and system perform computations using a sparse convolutional neural network accelerator. A first vector comprising only non-zero weight values and first associated positions of the non-zero weight values within a 3D space is received. A second vector comprising only non-zero input activation values and second associated positions of the non-zero input activation values within a 2D space is received. The non-zero weight values are multiplied with the non-zero input activation values, within a multiplier array, to produce a third vector of products. The first associated positions are combined with the second associated positions to produce a fourth vector of positions, where each position in the fourth vector is associated with a respective product in the third vector. The products in the third vector are transmitted to adders in an accumulator array, based on the position associated with each one of the products.
    Type: Application
    Filed: November 18, 2019
    Publication date: March 12, 2020
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Patent number: 10528864
    Abstract: A method, computer program product, and system perform computations using a sparse convolutional neural network accelerator. A first vector comprising only non-zero weight values and first associated positions of the non-zero weight values within a 3D space is received. A second vector comprising only non-zero input activation values and second associated positions of the non-zero input activation values within a 2D space is received. The non-zero weight values are multiplied with the non-zero input activation values, within a multiplier array, to produce a third vector of products. The first associated positions are combined with the second associated positions to produce a fourth vector of positions, where each position in the fourth vector is associated with a respective product in the third vector. The products in the third vector are transmitted to adders in an accumulator array, based on the position associated with each one of the products.
    Type: Grant
    Filed: March 14, 2017
    Date of Patent: January 7, 2020
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Patent number: 10505451
    Abstract: A system and method are provided for controlling a modified buck converter circuit. A pull-up switching mechanism that is coupled to an upstream terminal of an inductor within a modified buck converter circuit is enabled. A load current at the output of the modified buck regulator circuit is measured. A capacitor current associated with a capacitor that is coupled to a downstream terminal of the inductor is continuously sensed and the pull-up switching mechanism is disabled when the capacitor current is greater than a sum of the load current and an enabling current value.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: December 10, 2019
    Assignee: NVIDIA Corporation
    Inventors: Sudhir Shrikantha Kudva, William J. Dally, Thomas Hastings Greer, III, Carl Thomas Gray
  • Patent number: 10361023
    Abstract: A magnetic power supply coupling system is disclosed. An integrated circuit module includes an integrated circuit die and a secondary winding that is configured to generate an induced, alternating current based on a magnetic flux. A primary winding is external to the integrated circuit module, proximate to the integrated circuit module, and coupled to a main power supply corresponding to an alternating current that generates the magnetic flux. The induced, alternating current is converted into a direct current at a voltage level to supply power to the integrated circuit die.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: July 23, 2019
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, Thomas Hastings Greer, III, Sudhir Shrikantha Kudva
  • Publication number: 20190173380
    Abstract: A system and method are provided for controlling a modified buck converter circuit. A pull-up switching mechanism that is coupled to an upstream terminal of an inductor within a modified buck converter circuit is enabled. A load current at the output of the modified buck regulator circuit is measured. A capacitor current associated with a capacitor that is coupled to a downstream terminal of the inductor is continuously sensed and the pull-up switching mechanism is disabled when the capacitor current is greater than a sum of the load current and an enabling current value.
    Type: Application
    Filed: January 15, 2019
    Publication date: June 6, 2019
    Inventors: Sudhir Shrikantha Kudva, William J. Dally, Thomas Hastings Greer, III, Carl Thomas Gray
  • Patent number: 10224813
    Abstract: A system and method are provided for controlling a modified buck converter circuit. A pull-up switching mechanism that is coupled to an upstream terminal of an inductor within a modified buck converter circuit is enabled. A load current at the output of the modified buck regulator circuit is measured. A capacitor current associated with a capacitor that is coupled to a downstream terminal of the inductor is continuously sensed and the pull-up switching mechanism is disabled when the capacitor current is greater than a sum of the load current and an enabling current value.
    Type: Grant
    Filed: March 24, 2016
    Date of Patent: March 5, 2019
    Assignee: NVIDIA Corporation
    Inventors: Sudhir Shrikantha Kudva, William J. Dally, Thomas Hastings Greer, III, Carl Thomas Gray
  • Patent number: 10153985
    Abstract: A multiprocessor computer system comprises a dragonfly processor interconnect network that comprises a plurality of processor nodes, a plurality of routers, each router directly coupled to a plurality of terminal nodes, the routers coupled to one another and arranged into a group, and a plurality of groups of routers, such that each group is connected to each other group via at least one direct connection.
    Type: Grant
    Filed: February 17, 2017
    Date of Patent: December 11, 2018
    Assignees: Intel Corporation, The Board of Trustees of the Leland Stanford Junior University
    Inventors: John Kim, Dennis C. Abts, Steven L. Scott, William J. Dally
  • Patent number: 10128904
    Abstract: A repeater circuit is disclosed. The repeater circuit is coupled to a transmission line driven by a first transmitter circuit and configured to detect a signal transition from a first voltage level to a second voltage level at a first position on the transmission line. The repeater circuit then reinforces the signal transition from the second voltage level to a third voltage level at the first position on the transmission line without interrupting a current through the transmission line.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: November 13, 2018
    Assignee: NVIDIA CORPORATION
    Inventor: William J. Dally
  • Patent number: 10096134
    Abstract: A method, computer program product, and system for sparse convolutional neural networks that improves efficiency is described. Multi-bit data for input to a processing element is received at a compaction engine. The multi-bit data is determined to equal zero and a single bit signal is transmitted from the memory interface to the processing element in lieu of the multi-bit data, where the single bit signal indicates that the multi-bit data equals zero. A compacted data sequence for input to a processing element is received by a memory interface. The compacted data sequence is transmitted from the memory interface to an expansion engine. Non-zero values are extracted from the compacted data sequence and zeros are inserted between the non-zero values by the expansion engine to generate an expanded data sequence that is output to the processing element.
    Type: Grant
    Filed: February 1, 2017
    Date of Patent: October 9, 2018
    Assignee: NVIDIA Corporation
    Inventors: Zhou Yan, Franciscus Wilhelmus Sijstermans, Yuanzhi Hua, Xiaojun Wang, Jeffrey Michael Pool, William J. Dally, Liang Chen
  • Patent number: 10056862
    Abstract: A method includes measuring one or more performance metrics of a set of solar cells coupled to an inverter. Based at least on the performance metrics meeting a first criterion, a first subset of the set of solar cells are disabled, reducing a voltage, power, or current provided to the inverter. Based at least on the performance metrics meeting a second criterion, a second subset of the set of solar cells are disabled, further reducing a voltage, power, or current provided to the inverter.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: August 21, 2018
    Assignee: SunPower Corporation
    Inventors: Andrew J Ponec, Darren Hau, Benjamin A. Johnson, Daniel J. M. Maren, William J. Dally
  • Publication number: 20180218518
    Abstract: A method, computer program product, and system for sparse convolutional neural networks that improves efficiency is described. Multi-bit data for input to a processing element is received at a compaction engine. The multi-bit data is determined to equal zero and a single bit signal is transmitted from the memory interface to the processing element in lieu of the multi-bit data, where the single bit signal indicates that the multi-bit data equals zero. A compacted data sequence for input to a processing element is received by a memory interface. The compacted data sequence is transmitted from the memory interface to an expansion engine. Non-zero values are extracted from the compacted data sequence and zeros are inserted between the non-zero values by the expansion engine to generate an expanded data sequence that is output to the processing element.
    Type: Application
    Filed: February 1, 2017
    Publication date: August 2, 2018
    Inventors: Zhou Yan, Franciscus Wilhelmus Sijstermans, Yuanzhi Hua, Xiaojun Wang, Jeffrey Michael Pool, William J. Dally, Liang Chen
  • Patent number: 9928104
    Abstract: A system, method, and computer program product are provided for accessing a queue. The method includes receiving a first request to reserve a data record entry in a queue, updating a queue state block based on the first request, and returning a response to the request. A second request is received to commit the data record entry and the queue state block is updated based on the second request.
    Type: Grant
    Filed: June 19, 2013
    Date of Patent: March 27, 2018
    Assignee: NVIDIA Corporation
    Inventors: William J. Dally, James David Balfour, Ignacio Llamas Ubieto
  • Publication number: 20180046916
    Abstract: A method, computer program product, and system perform computations using a sparse convolutional neural network accelerator. Compressed-sparse data is received for input to a processing element, wherein the compressed-sparse data encodes non-zero elements and corresponding multi-dimensional positions. The non-zero elements are processed in parallel by the processing element to produce a plurality of result values. The corresponding multi-dimensional positions are processed in parallel by the processing element to produce destination addresses for each result value in the plurality of result values. Each result value is transmitted to a destination accumulator associated with the destination address for the result value.
    Type: Application
    Filed: March 14, 2017
    Publication date: February 15, 2018
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison
  • Publication number: 20180046900
    Abstract: A method, computer program product, and system perform computations using a processor. A first instruction including a first index vector operand and a second index vector operand is received and the first index vector operand is decoded to produce first coordinate sets for a first array, each first coordinate set including at least a first coordinate and a second coordinate of a position of a non-zero element in the first array. The second index vector operand is decoded to produce second coordinate sets for a second array, each second coordinate set including at least a third coordinate and a fourth coordinate of a position of a non-zero element in the second array. The first coordinate sets are summed with the second coordinate sets to produce output coordinate sets and the output coordinate sets are converted into a set of linear indices.
    Type: Application
    Filed: July 25, 2017
    Publication date: February 15, 2018
    Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison