Patents by Inventor Vamsi Nalluri

Vamsi Nalluri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11256979
    Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version (where at least some portion of the hardware has to be redesigned if the multipliers change) or a non-instance specific version (where the CFMM circuitry can work with arbitrary multipliers without having to redesign the hardware). Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. The CFMM circuitry may include a multiplier circuit that effectively multiplies the common factor by predetermined fixed constants to generate partial products and may further include shifting and add/subtract circuits for processing and combining the partial products to generate corresponding final output products.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Thiam Khean Hah, Carl Ebeling, Vamsi Nalluri
  • Patent number: 10860760
    Abstract: Systems and methods are included for efficiently implementing learned parameter systems (LPSs) on a programmable integrated circuit (PIC) via a computing engine. The computing engine receives an input set of learned parameters corresponding to use instances of an LPS. The computing engine reduces at least some redundancies and/or unnecessary operations using instance specific parameter values of the LPS, to generate a less redundant set of learned parameters and a corresponding less redundant LPS. The computing engine generates a netlist based on these, which may share computing resources of the PIC across multiple computations in accordance with the less redundant set of learned parameters and the corresponding less redundant LPS. The computing engine then programs the PIC with the netlist. That is, the netlist replaces use instances of at least some of the original learned parameters and their corresponding LPS and is executed instead of the original.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: December 8, 2020
    Assignee: Intel Corporation
    Inventors: Thiam Khean Hah, Vamsi Nalluri, Herman Henry Schmit, Scott J. Weber, Randy Huang
  • Patent number: 10853034
    Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version or a non-instance specific version. The instance specific version might also be fully enumerated so that the hardware doesn't have to be redesigned assuming all possible unique multiplier values are implemented. Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. CFMM circuitry configured in this way can be used to support convolutional neural networks or any operation that requires a straight common factor multiply. Any adder component with the CFMM circuitry may be implemented using bit-serial adders. The bit-serial adders may be further connected in a tree in CNN applications to sum together many input streams.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Thiam Khean Hah, Jason Gee Hock Ong, Yeong Tat Liew, Carl Ebeling, Vamsi Nalluri
  • Publication number: 20190303748
    Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version (where at least some portion of the hardware has to be redesigned if the multipliers change) or a non-instance specific version (where the CFMM circuitry can work with arbitrary multipliers without having to redesign the hardware). Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. The CFMM circuitry may include a multiplier circuit that effectively multiplies the common factor by predetermined fixed constants to generate partial products and may further include shifting and add/subtract circuits for processing and combining the partial products to generate corresponding final output products.
    Type: Application
    Filed: March 30, 2018
    Publication date: October 3, 2019
    Applicant: Intel Corporation
    Inventors: Thiam Khean Hah, Carl Ebeling, Vamsi Nalluri
  • Publication number: 20190303103
    Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version or a non-instance specific version. The instance specific version might also be fully enumerated so that the hardware doesn't have to be redesigned assuming all possible unique multiplier values are implemented. Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. CFMM circuitry configured in this way can be used to support convolutional neural networks or any operation that requires a straight common factor multiply. Any adder component with the CFMM circuitry may be implemented using bit-serial adders. The bit-serial adders may be further connected in a tree in CNN applications to sum together many input streams.
    Type: Application
    Filed: September 28, 2018
    Publication date: October 3, 2019
    Inventors: Thiam Khean Hah, Jason Gee Hock Ong, Yeong Tat Liew, Carl Ebeling, Vamsi Nalluri
  • Publication number: 20180307783
    Abstract: Systems and methods are included for efficiently implementing learned parameter systems (LPSs) on a programmable integrated circuit (PIC) via a computing engine. The computing engine receives an input set of learned parameters corresponding to use instances of an LPS. The computing engine reduces at least some redundancies and/or unnecessary operations using instance specific parameter values of the LPS, to generate a less redundant set of learned parameters and a corresponding less redundant LPS. The computing engine generates a netlist based on these, which may share computing resources of the PIC across multiple computations in accordance with the less redundant set of learned parameters and the corresponding less redundant LPS. The computing engine then programs the PIC with the netlist. That is, the netlist replaces use instances of at least some of the original learned parameters and their corresponding LPS and is executed instead of the original.
    Type: Application
    Filed: March 30, 2018
    Publication date: October 25, 2018
    Inventors: Thiam Khean Hah, Vamsi Nalluri, Herman Henry Schmit, Scott J. Weber, Randy Huang
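
Illustrative note: the CFMM abstracts above describe multiplying one common factor by a set of predetermined fixed constants to form partial products, then shifting and adding/subtracting those partial products to obtain each final product. The software sketch below shows that general idea only and is not the patented circuit. The 4-bit digit width and the function names partial_products and cfmm are assumptions made for illustration; none of them appear in the abstracts.

```python
from typing import List

DIGIT_BITS = 4  # assumed digit width; the abstracts do not specify one


def partial_products(common_factor: int) -> List[int]:
    """Multiply the common factor by each fixed constant 0..15, once."""
    return [common_factor * c for c in range(1 << DIGIT_BITS)]


def cfmm(common_factor: int, multipliers: List[int]) -> List[int]:
    """Return common_factor * m for every m, reusing shared partial products."""
    table = partial_products(common_factor)
    mask = (1 << DIGIT_BITS) - 1
    products = []
    for m in multipliers:
        magnitude, total, shift = abs(m), 0, 0
        while magnitude:
            # Select the partial product for this digit, shift it into
            # position, and accumulate it.
            total += table[magnitude & mask] << shift
            magnitude >>= DIGIT_BITS
            shift += DIGIT_BITS
        # A negative multiplier becomes a subtraction of the accumulated sum.
        products.append(-total if m < 0 else total)
    return products


print(cfmm(7, [3, -20, 255, 0]))
```

In this sketch the only true multiplications happen once, when the table is built; every multiplier after that costs only table lookups, shifts, and additions, which is the saving a common factor shared across many multipliers makes possible.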