Patents by Inventor Yongning Sheng

Yongning Sheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Compiler for Mixed Precision in a Computational Graph

Publication number: 20250208839

Abstract: The disclosed technology relates to automatically optimizing the precision of data types in a computational graph, such as those used in machine learning and artificial intelligence applications. A representation of the computational graph is obtained. Nodes of the computational graph are assigned to one of three sets: a deny set, an allow set, or an infer set, based on a predefined policy. For nodes in the allow set, the method changes at least one of the input data precision, output data precision, or internal computation precision to a lower precision. For nodes in the infer set, the method propagates a data precision requirement from downstream nodes to upstream nodes. The method generates and stores computer instructions for executing the computational graph with the optimized precisions on one or more processors. This approach enhances performance and energy efficiency while maintaining model accuracy for the computational graph.

Type: Application

Filed: October 2, 2024

Publication date: June 26, 2025

Applicant: SambaNova Systems, Inc.

Inventors: Mark William Gottscho, Vidushi Goyal, Han Wang, Valentina Popescu, Yongning SHENG, Matthew William Ashcraft
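
The set assignment and downstream-to-upstream propagation described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the policy tables `DENY_OPS` and `ALLOW_OPS`, the precision labels `fp32` and `bf16`, the toy graph, and the rule that an infer node keeps the higher precision if any consumer needs it. None of these details come from the application itself.

```python
from collections import defaultdict

# Hypothetical policy: ops kept at high precision vs. ops lowered.
DENY_OPS = {"softmax", "layernorm"}
ALLOW_OPS = {"matmul", "conv"}

def topo_order(graph):
    """Return node names with producers before consumers."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for producer in graph[node][1]:
            visit(producer)
        order.append(node)

    for node in graph:
        visit(node)
    return order

def assign_precisions(graph):
    """graph maps node name -> (op, list of input node names)."""
    consumers = defaultdict(list)
    for node, (_, inputs) in graph.items():
        for producer in inputs:
            consumers[producer].append(node)

    precision = {}
    # Walk consumers first so infer nodes can read downstream requirements.
    for node in reversed(topo_order(graph)):
        op = graph[node][0]
        if op in DENY_OPS:
            precision[node] = "fp32"      # deny set: never lowered
        elif op in ALLOW_OPS:
            precision[node] = "bf16"      # allow set: lowered
        else:
            # infer set: take the requirement of downstream nodes
            # (assumed rule: any fp32 consumer, or no consumer, keeps fp32).
            needs = [precision[c] for c in consumers[node]]
            precision[node] = "fp32" if (not needs or "fp32" in needs) else "bf16"
    return precision

graph = {
    "x": ("input", []),
    "w": ("input", []),
    "mm": ("matmul", ["x", "w"]),
    "relu": ("relu", ["mm"]),
    "sm": ("softmax", ["relu"]),
}
result = assign_precisions(graph)
```

In this toy graph, `relu` inherits `fp32` from the denied `softmax` node, while the inputs `x` and `w` inherit `bf16` from the allowed `matmul` node.
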
Calculating a floating-point function using multiple lookup tables

Patent number: 12333270

Abstract: A computation unit includes input lines to provide a floating-point value, a first lookup table, a second lookup table, a range detector, and an output stage. The input lines include exponent lines and mantissa lines. The first lookup table has a first address input coupled to a first subset of the input lines to provide a first output. The second lookup table has a second address input coupled to a second subset of the input lines to provide a second output. The range detector is coupled to at least some of the input lines and indicates whether the floating-point value provided on the input lines is within a specified range on a range output. The output stage is operatively coupled to the first output, the second output and the range output, to generate a function output based on the first output, the second output, and the range output.

Type: Grant

Filed: May 5, 2022

Date of Patent: June 17, 2025

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
Dynamic Exponent Bias Method for Neural Network Training

Publication number: 20250117647

Abstract: A method that may be computer implemented converts a tensor value from a first format to a second format and trains a neural network. The method determines a maximum exponent code in the first format and subtracts a first bias to obtain the highest needed exponent (HNE). It determines a second bias from the highest available code (HAC) in the second format and the HNE, and converts the tensor value from the first format to the second format by using the second bias instead of the first bias. The method uses the second format to train the neural network. The method may round the mantissa of the tensor value in the first format to obtain a rounded mantissa of the tensor value for the second format.

Type: Application

Filed: October 5, 2023

Publication date: April 10, 2025

Applicant: SambaNova Systems, Inc.

Inventors: Valentina Popescu, Jeffrey S. Brooks, Ram SIVARAMAKRISHNAN, Matthew William Ashcraft, Vinh Quang Nguyen, Gang Liu, Raghu PRABHAKAR, Yongning SHENG
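
The exponent arithmetic in the abstract above can be worked through with a short sketch. The formula `second_bias = HAC - HNE`, the clamping of underflowing codes to zero, and the example numbers (a first bias of 127 and a highest available code of 15) are assumptions chosen for illustration; the abstract only states that the second bias is determined from the HAC and the HNE. Mantissa rounding is left out.

```python
def rebias_exponents(exp_codes, first_bias, highest_available_code):
    """Re-encode exponent codes of a tensor using a tensor-specific second bias."""
    # Highest needed exponent (HNE): largest unbiased exponent in the tensor.
    hne = max(exp_codes) - first_bias
    # Assumed rule: choose the second bias so the HNE lands on the HAC.
    second_bias = highest_available_code - hne
    # Apply the second bias instead of the first; codes that would go
    # negative are clamped to 0 (an illustrative underflow choice).
    converted = [max(0, code - first_bias + second_bias) for code in exp_codes]
    return second_bias, converted

second_bias, codes = rebias_exponents(
    [120, 125, 130], first_bias=127, highest_available_code=15
)
```

With these inputs the largest exponent code, 130, has an unbiased value of 3, so the second bias is 12 and the largest value in the tensor uses the top code of the narrower format.
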
Calculating a Floating-Point Function using Multiple Lookup Tables

Publication number: 20220261220

Abstract: A computation unit includes input lines to provide a floating-point value, a first lookup table, a second lookup table, a range detector, and an output stage. The input lines include exponent lines and mantissa lines. The first lookup table has a first address input coupled to a first subset of the input lines to provide a first output. The second lookup table has a second address input coupled to a second subset of the input lines to provide a second output. The range detector is coupled to at least some of the input lines and indicates whether the floating-point value provided on the input lines is within a specified range on a range output. The output stage is operatively coupled to the first output, the second output and the range output, to generate a function output based on the first output, the second output, and the range output.

Type: Application

Filed: May 5, 2022

Publication date: August 18, 2022

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan LI, Yongning SHENG
Sigmoid function in hardware and a reconfigurable data processor including same

Patent number: 11327923

Abstract: A functional unit for a data processor comprises an input register to store a variable X; a first circuit, having an input connected to the input register and an output, to generate a value e^X on its output; a second circuit, having an input connected to the input register and an output, to generate an output which is a value (tanh(X/2)+1)/2 on its output; a comparator, having an input connected to the input register and an output, to generate a line on its output based on a comparison between X and a constant; and a selector to select between inputs connected to the outputs of the first circuit and the second circuit, in response to the output of the comparator, and provide an output representing a value sigmoid(X).

Type: Grant

Filed: September 4, 2019

Date of Patent: May 10, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Mark Luttrell, Yongning Sheng
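
The selection between the two circuits in the abstract above can be mirrored in software. This is a sketch under assumptions: the comparator constant `THRESHOLD = -8.0` and the choice to use the exponential path below it are hypothetical. The identity sigmoid(X) = (tanh(X/2)+1)/2 is exact, and for strongly negative X the value of sigmoid(X) is close to e^X.

```python
import math

THRESHOLD = -8.0  # hypothetical comparator constant, not from the patent

def sigmoid_select(x, threshold=THRESHOLD):
    """Pick one of two approximations of sigmoid(x) based on a comparison."""
    use_exp_path = x < threshold                   # comparator
    if use_exp_path:
        return math.exp(x)                         # first circuit: e^X
    return (math.tanh(x / 2.0) + 1.0) / 2.0        # second circuit

def sigmoid_reference(x):
    """Textbook definition, used only to check the sketch."""
    return 1.0 / (1.0 + math.exp(-x))
```

For example, `sigmoid_select(0.0)` returns 0.5 through the tanh path, and `sigmoid_select(-20.0)` returns e^-20 through the exponential path.
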
Computational units for batch normalization

Patent number: 11328038

Abstract: Herein are disclosed computation units for batch normalization. A computation unit may include a first circuit to traverse a batch of input elements xi having a first format, to produce a mean μ1 in the first format and a mean μ2 in a second format, the second format having more bits than the first format. The computation unit may further include a second circuit operatively coupled to the first circuit to traverse the batch of input elements xi to produce a standard deviation σ for the batch using the mean μ1 in the first format. The computation unit may also include a third circuit operatively coupled to the second circuit to traverse the batch of input elements xi to produce a normalized set of values yi using the mean μ2 in the second format and the standard deviation σ.

Type: Grant

Filed: November 25, 2019

Date of Patent: May 10, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
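
The three traversals in the abstract above can be sketched in software. The formats are assumptions: IEEE half precision stands in for the first format and a Python float for the wider second format. The `eps` term is also an assumption added to avoid division by zero; the abstract does not mention it.

```python
import math
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def batch_norm(batch, eps=1e-5):
    """Normalize a batch using a narrow mean for sigma and a wide mean for y."""
    xs = [to_fp16(x) for x in batch]       # inputs held in the first format
    n = len(xs)

    # Traversal 1: mean in the wider format (mu2) and a copy rounded
    # to the first format (mu1).
    mu2 = sum(xs) / n
    mu1 = to_fp16(mu2)

    # Traversal 2: standard deviation computed with the narrow mean mu1.
    variance = sum((x - mu1) ** 2 for x in xs) / n
    sigma = math.sqrt(variance + eps)

    # Traversal 3: normalized outputs computed with the wide mean mu2.
    return [(x - mu2) / sigma for x in xs]

y = batch_norm([1.0, 2.0, 3.0, 4.0])
```
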
Computation units for functions based on lookup tables

Patent number: 11327713

Abstract: A computation unit comprises a floating point input having X bits including a sign bit, an E bit exponent and an M bit mantissa. A first circuit is operatively coupled to receive X-N bits of the input, including e1 bits of the exponent and m1 bits of the mantissa, where e1≤E, and m1≤M, to output values over a first domain of the input. A second circuit is operatively coupled to receive X-K bits of the input, including e2 bits of the exponent, e2<e1, and m2 bits of the mantissa, m2>m1, to output values, over a second domain of the input. A range detector is operatively coupled to the input, to indicate a range in response to a value of the input. A selector can select the output of the first circuit or of the second circuit in response to the range detector.

Type: Grant

Filed: October 1, 2019

Date of Patent: May 10, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
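
The two-table arrangement in the abstract above can be illustrated with a sketch on 16-bit half-precision inputs (X=16, E=5, M=10). All concrete choices are assumptions for illustration: `tanh` as the function, e1=5 and m1=3 for the coarse table, e2=2 and m2=6 for the fine table, and a fine domain covering biased exponents 12 to 15, which is the interval [0.125, 2) in magnitude.

```python
import math
import struct

def f16_to_bits(x):
    """Bit pattern of x rounded to IEEE half precision."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def fields_to_value(sign, exp, man):
    """Half-precision value built from sign, 5-bit exponent, 10-bit mantissa."""
    bits = (sign << 15) | (exp << 10) | man
    return struct.unpack("<e", struct.pack("<H", bits))[0]

F = math.tanh  # example function (assumption)

# First table: sign + 5 exponent bits + top 3 mantissa bits (9 address bits).
TABLE1 = [F(fields_to_value(a >> 8, (a >> 3) & 0x1F, (a & 0x7) << 7))
          for a in range(512)]

# Second table: sign + low 2 exponent bits + top 6 mantissa bits.
# The upper exponent bits are implied to be 0b011 (biased exponents 12..15).
TABLE2 = [F(fields_to_value(a >> 8, 0b01100 | ((a >> 6) & 0x3), (a & 0x3F) << 4))
          for a in range(512)]

def lut_eval(x):
    bits = f16_to_bits(x)
    sign, exp, man = bits >> 15, (bits >> 10) & 0x1F, bits & 0x3FF
    in_fine_range = (exp >> 2) == 0b011                          # range detector
    coarse = TABLE1[(sign << 8) | (exp << 3) | (man >> 7)]       # first circuit
    fine = TABLE2[(sign << 8) | ((exp & 0x3) << 6) | (man >> 4)] # second circuit
    return fine if in_fine_range else coarse                     # selector
```

Both tables have 512 entries, but the second spends its address bits on mantissa rather than exponent, so inputs inside the narrow domain are resolved more finely than the first table alone would allow.
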
Look-up table with input offsetting

Patent number: 11327717

Abstract: A computation unit computes a function f(I). The function f(I) has a target output range over a first domain of an input I encoded using a first format. A first circuit receives the encoded input I in the first format including X bits, to add an offset C to the encoded input I to generate an offset input SI=I+C, in a second format including fewer than X bits. The offset C is equal to a difference between the first domain in f(I) and a higher precision domain of the second format for the offset input SI. A second circuit is operatively coupled to receive the offset input SI in the second format, to output a value equal to a function f(SI) to provide an encoded output value f(I).

Type: Grant

Filed: November 19, 2019

Date of Patent: May 10, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
Computationally efficient general matrix-matrix multiplication (GeMM)

Patent number: 11250105

Abstract: A computation unit that comprises (i) a multiplicand vector decomposer that generates a decomposed multiplicand vector which uses a sequence of first and second concatenated multiplicand sub-elements (1st2ndCMCSE) in a lower-precision format (LPF) to represent corresponding ones of multiplicand elements in a multiplicand vector in a higher-precision format (HPF), (ii) a multiplier vector decomposer that generates a decomposed multiplier vector which uses a sequence of first and second concatenated multiplier sub-elements (1st2ndCMLSE) in the LPF to represent corresponding ones of multiplier elements in a multiplier vector in the HPF, (iii) a multiplicand tensor encoder that encodes double reads of the sequence of the 1st2ndCMCSE in a decomposed multiplicand tensor, and (iv) a product vector generator that generates a product vector containing a sequence of first and second concatenated product sub-elements by executing general matrix-matrix multiplication (GeMM) operations between the double reads of the 1st2

Type: Grant

Filed: May 12, 2020

Date of Patent: February 15, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
Computationally Efficient General Matrix-Matrix Multiplication (GeMM)

Publication number: 20210357475

Abstract: A computation unit that comprises (i) a multiplicand vector decomposer that generates a decomposed multiplicand vector which uses a sequence of first and second concatenated multiplicand sub-elements (1st2ndCMCSE) in a lower-precision format (LPF) to represent corresponding ones of multiplicand elements in a multiplicand vector in a higher-precision format (HPF), (ii) a multiplier vector decomposer that generates a decomposed multiplier vector which uses a sequence of first and second concatenated multiplier sub-elements (1st2ndCMLSE) in the LPF to represent corresponding ones of multiplier elements in a multiplier vector in the HPF, (iii) a multiplicand tensor encoder that encodes double reads of the sequence of the 1st2ndCMCSE in a decomposed multiplicand tensor, and (iv) a product vector generator that generates a product vector containing a sequence of first and second concatenated product sub-elements by executing general matrix-matrix multiplication (GeMM) operations between the double reads of the 1st2

Type: Application

Filed: May 12, 2020

Publication date: November 18, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan LI, Yongning SHENG
Computational units for element approximation

Patent number: 11150872

Abstract: Herein are disclosed computation units for element approximation. A computation unit may include a first circuit to compute a first projection of an input element xi from a first range to a second range. In the first circuit, the input element xi may have a first format and the projected element yi may have a second format. In addition, in the first circuit, the second format may have more bits than the first format. The computation unit may further include a second circuit operatively coupled to the first circuit to produce a reduction zi in the first format using the projected element yi in the second format. The computation unit may also include a third circuit operatively coupled to the second circuit to compute a second projection of the reduction zi from the second range to the first range to produce an approximation wi.

Type: Grant

Filed: December 17, 2019

Date of Patent: October 19, 2021

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Mark Luttrell, Yongning Sheng, Gregory Frederick Grohoski
Computational Units for Element Approximation

Publication number: 20210182021

Abstract: Herein are disclosed computation units for element approximation. A computation unit may include a first circuit to compute a first projection of an input element xi from a first range to a second range. In the first circuit, the input element xi may have a first format and the projected element yi may have a second format. In addition, in the first circuit, the second format may have more bits than the first format. The computation unit may further include a second circuit operatively coupled to the first circuit to produce a reduction zi in the first format using the projected element yi in the second format. The computation unit may also include a third circuit operatively coupled to the second circuit to compute a second projection of the reduction zi from the second range to the first range to produce an approximation wi.

Type: Application

Filed: December 17, 2019

Publication date: June 17, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan Li, Mark Luttrell, Yongning Sheng, Gregory Frederick Grohoski
Computational Units for Batch Normalization

Publication number: 20210157550

Abstract: Herein are disclosed computation units for batch normalization. A computation unit may include a first circuit to traverse a batch of input elements xi having a first format, to produce a mean μ1 in the first format and a mean μ2 in a second format, the second format having more bits than the first format. The computation unit may further include a second circuit operatively coupled to the first circuit to traverse the batch of input elements xi to produce a standard deviation σ for the batch using the mean μ1 in the first format. The computation unit may also include a third circuit operatively coupled to the second circuit to traverse the batch of input elements xi to produce a normalized set of values yi using the mean μ2 in the second format and the standard deviation σ.

Type: Application

Filed: November 25, 2019

Publication date: May 27, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan LI, Yongning SHENG
LOOK-UP TABLE WITH INPUT OFFSETTING

Publication number: 20210149634

Abstract: A computation unit computes a function f(I). The function f(I) has a target output range over a first domain of an input I encoded using a first format. A first circuit receives the encoded input I in the first format including X bits, to add an offset C to the encoded input I to generate an offset input SI=I+C, in a second format including fewer than X bits. The offset C is equal to a difference between the first domain in f(I) and a higher precision domain of the second format for the offset input SI. A second circuit is operatively coupled to receive the offset input SI in the second format, to output a value equal to a function f(SI) to provide an encoded output value f(I).

Type: Application

Filed: November 19, 2019

Publication date: May 20, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan LI, Yongning SHENG
COMPUTATION UNITS FOR FUNCTIONS BASED ON LOOKUP TABLES

Publication number: 20210096816

Abstract: A computation unit comprises a floating point input having X bits including a sign bit, an E bit exponent and an M bit mantissa. A first circuit is operatively coupled to receive X-N bits of the input, including e1 bits of the exponent and m1 bits of the mantissa, where e1≤E, and m1≤M, to output values over a first domain of the input. A second circuit is operatively coupled to receive X-K bits of the input, including e2 bits of the exponent, e2<e1, and m2 bits of the mantissa, m2>m1, to output values, over a second domain of the input. A range detector is operatively coupled to the input, to indicate a range in response to a value of the input. A selector can select the output of the first circuit or of the second circuit in response to the range detector.

Type: Application

Filed: October 1, 2019

Publication date: April 1, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Xiaoyan LI, Yongning SHENG
SIGMOID FUNCTION IN HARDWARE AND A RECONFIGURABLE DATA PROCESSOR INCLUDING SAME

Publication number: 20210064568

Abstract: A functional unit for a data processor comprises an input register to store a variable X; a first circuit, having an input connected to the input register and an output, to generate a value e^X on its output; a second circuit, having an input connected to the input register and an output, to generate an output which is a value (tanh(X/2)+1)/2 on its output; a comparator, having an input connected to the input register and an output, to generate a line on its output based on a comparison between X and a constant; and a selector to select between inputs connected to the outputs of the first circuit and the second circuit, in response to the output of the comparator, and provide an output representing a value sigmoid(X).

Type: Application

Filed: September 4, 2019

Publication date: March 4, 2021

Applicant: SambaNova Systems, Inc.

Inventors: Mingran WANG, Mark LUTTRELL, Yongning SHENG
Low leakage spare gates for integrated circuits

Patent number: 8810280

Abstract: Devices, systems, methods, and other embodiments associated with spare gates are described. In one embodiment, a spare gate in an integrated circuit has a disconnected discharge path to minimize or eliminate current leakage.

Type: Grant

Filed: October 6, 2011

Date of Patent: August 19, 2014

Assignee: Oracle International Corporation

Inventors: Rambabu Pyapali, Yongjun Zhang, Yongning Sheng
LOW LEAKAGE SPARE GATES FOR INTEGRATED CIRCUITS

Publication number: 20130088261

Abstract: Devices, systems, methods, and other embodiments associated with spare gates are described. In one embodiment, a spare gate in an integrated circuit has a disconnected discharge path to minimize or eliminate current leakage.

Type: Application

Filed: October 6, 2011

Publication date: April 11, 2013

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Rambabu PYAPALI, Yongjun ZHANG, Yongning SHENG
Estimating capacitances using information including feature sizes extracted from a netlist

Patent number: 7036096

Abstract: The capacitances of one or more inputs/outputs of a circuit are estimated by using an extraction tool (120) to extract information associated with the inputs/outputs from a netlist. The information includes information associated with circuit devices directly connected to the inputs/outputs, particularly information related to device connectivity and the feature sizes of the device. Once the information is extracted, a capacitance determination element (130) aggregates the feature sizes of all the circuit devices connected to each respective input or output, to obtain aggregate feature sizes for each respective input/output. The aggregate feature size is used in determining the total capacitance of the input/output. The total capacitance thus determined can be provided to a timing analysis tool (140), which uses the total capacitance of each input or output to generate a timing model for the circuit.

Type: Grant

Filed: September 8, 2003

Date of Patent: April 25, 2006

Assignee: Sun Microsystems, Inc.

Inventors: Aveek Sarkar, Yongning Sheng, Peter F. Lai, Rambabu Pyapali
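
The aggregation step in the abstract above can be sketched as follows. The device tuples, the pin names, and the constant `CAP_PER_UM` are hypothetical, and the linear relation between aggregate feature size and capacitance is an assumption; the abstract states only that the aggregate feature size is used in determining the total capacitance.

```python
from collections import defaultdict

CAP_PER_UM = 1.5e-15  # hypothetical capacitance in farads per micron of width

def estimate_pin_capacitance(devices, cap_per_um=CAP_PER_UM):
    """devices: iterable of (device name, connected pin, feature size in microns)."""
    aggregate_size = defaultdict(float)
    # Sum the feature sizes of all devices directly connected to each pin.
    for _name, pin, size_um in devices:
        aggregate_size[pin] += size_um
    # Assumed model: total capacitance scales linearly with aggregate size.
    return {pin: size * cap_per_um for pin, size in aggregate_size.items()}

# Information that an extraction tool might pull from a netlist (made up here).
devices = [("M1", "in_a", 0.5), ("M2", "in_a", 1.0), ("M3", "out_y", 2.0)]
caps = estimate_pin_capacitance(devices)
```

The resulting per-pin totals are the kind of values that would then be handed to a timing analysis tool.
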
Magnetic tunneling structure having ferromagnetic layers of different crystallographic structure

Patent number: 6535365

Abstract: A magnetic tunneling structure formed of first and second ferromagnetic layers and an insulating tunneling barrier layer sandwiched therebetween. The first and second ferromagnetic layers are preferably formed of the same ferromagnetic material, but have different crystallographic structures. The insulating tunneling barrier layer is preferably a nitride layer, for example, boron nitride, formed on the first ferromagnetic layer.

Type: Grant

Filed: February 17, 2000

Date of Patent: March 18, 2003

Assignee: The Regents of the University of Michigan

Inventors: Rosa A. Lukaszew, Yongning Sheng, Roy Clarke, Ctirad Uher