Patents by Inventor Cheng C. Wang

Cheng C. Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11960886
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configurable to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configurable to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 25, 2022
    Date of Patent: April 16, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Publication number: 20240111492
    Abstract: An integrated circuit device includes operand storage circuitry to output first and second operands each having a first standard floating point format, multiplier circuitry to multiply the first and second operands to generate a multiplication product having a second standard floating point format, and product accumulation circuitry. The product accumulation circuitry reformats the multiplication product to a coarse floating point format having a reduced numeric range relative to the originally generated multiplication product and then adds the reformatted multiplication product to a previously generated accumulation value, also having the coarse floating point format, to generate an updated accumulation value having the coarse floating point format, storing the updated accumulation value in place of the previously generated accumulation value.
    Type: Application
    Filed: September 27, 2023
    Publication date: April 4, 2024
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20240111491
    Abstract: An integrated circuit device includes a broadcast data path, a weighting-value memory, Winograd conversion circuitry and multiply-accumulate units. The Winograd conversion circuitry executes a first Winograd conversion function with respect to an input data set to render a converted input data set onto the broadcast data path and executes a second Winograd conversion function with respect to a filter-weight data set to store a converted weighting data set within the weighting-value memory. The multiply-accumulate units, coupled in common to the broadcast data path to receive the converted input data set and coupled to receive respective converted weighting data values from the weighting-value memory, execute a parallel sequence of multiply-accumulate operations to generate an interim output data set that is, in turn, converted to a final output data set through execution of a third Winograd conversion function within the Winograd conversion circuitry.
    Type: Application
    Filed: September 21, 2023
    Publication date: April 4, 2024
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20240103808
    Abstract: A floating-point summation circuit implemented within an integrated circuit device and having inputs to receive a first normalized floating-point operand having an exponent field and a fraction field, and a non-normalized floating-point operand having an exponent field and a fraction field, the fraction field of the non-normalized floating-point operand having a first significant bit in any of at least two different bit positions. Normalizing circuitry within the floating-point summation circuit generates, at least by normalizing the fraction field of the non-normalized floating-point operand, a second normalized floating-point operand having a value corresponding to that of the non-normalized floating-point operand, and adder circuitry within the floating-point summation circuit generates a floating-point sum by adding the first and second normalized floating-point operands.
    Type: Application
    Filed: September 27, 2023
    Publication date: March 28, 2024
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20240104165
    Abstract: An integrated circuit device includes one or more broadcast data paths, a weighting-value memory and multiply-accumulate (MAC) units. The MAC units are coupled in common to each of the broadcast data paths and coupled to receive respective weighting values from the weighting-value memory via respective weighting-value paths. Each of the MAC units includes MAC circuits that each receive an input data value via a respective one of the broadcast data paths and a shared one of the weighting values via a shared one of the respective weighting-value paths; generate a sequence of multiplication products by multiplying the input data value with the shared one of the weighting values; and accumulate a sum of the multiplication products. A configuration value stored within a programmable register controls the number of timing cycles over which the sum of the multiplication products is accumulated.
    Type: Application
    Filed: September 21, 2023
    Publication date: March 28, 2024
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Patent number: 11916551
    Abstract: A method of routing interconnects of a field programmable gate array including: a plurality of logic tiles, and a tile-to-tile interconnect network, having a plurality of tile-to-tile interconnects to interconnect logic tile networks of the logic tiles, the method comprises: routing a first plurality of tile-to-tile interconnects in a first plurality of logic tiles. After routing the first plurality of tile-to-tile interconnects, routing a second plurality of tile-to-tile interconnects in a second plurality of logic tiles. The start/end point of each tile-to-tile interconnect in the first plurality and the second plurality of tiles is independent of the start/end point of the other tile-to-tile interconnects in the first and second plurality, respectively.
    Type: Grant
    Filed: February 19, 2022
    Date of Patent: February 27, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Yongning Liu, Fan Mo, Cheng C. Wang
  • Patent number: 11893388
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configured to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configured to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 13, 2022
    Date of Patent: February 6, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Publication number: 20240004612
    Abstract: An integrated circuit device includes broadcast data paths, a weighting-value memory, multiply-accumulate (MAC) units, and shared shift-out circuitry. The MAC units are coupled in common to each of the broadcast data paths and coupled to receive respective weighting values from the weighting-value memory via respective weighting-value paths. Each of the MAC units includes MAC circuits that each receive an input data value via a respective one of the broadcast data paths and a shared one of the weighting values via a shared one of the respective weighting-value paths; generate a sequence of multiplication products by multiplying the input data value with the shared one of the weighting values; accumulate a sum of the multiplication products; and output the sum of the multiplication products to a respective one of a plurality of serially coupled storage elements within the shared shift-out path.
    Type: Application
    Filed: June 29, 2023
    Publication date: January 4, 2024
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20230359437
    Abstract: An integrated circuit device includes broadcast data paths, a weighting-value memory, multiply-accumulate (MAC) units, and shared shift-out circuitry. The MAC units are coupled in common to each of the broadcast data paths and coupled to receive respective weighting values from the weighting-value memory via respective weighting-value paths. Each of the MAC units includes MAC circuits that each receive an input data value via a respective one of the broadcast data paths and a shared one of the weighting values via a shared one of the respective weighting-value paths; generate a sequence of multiplication products by multiplying the input data value with the shared one of the weighting values; accumulate a sum of the multiplication products; and output the sum of the multiplication products to a respective one of a plurality of serially coupled storage elements within the shared shift-out path.
    Type: Application
    Filed: May 8, 2023
    Publication date: November 9, 2023
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Patent number: 11768790
    Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected MACs and/or one or more rows of interconnected MACs. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one MAC pipeline, wherein each pipeline includes a plurality of linearly connected multiplier-accumulator circuits. Each control/configure circuit may include one or more of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s) and (ii) a configurable output data path for the output data generated by the execution sequence (i.e., input data that was processed via the multiplier-accumulator circuits of the pipeline).
    Type: Grant
    Filed: August 16, 2022
    Date of Patent: September 26, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20230266968
    Abstract: An integrated circuit device includes broadcast data paths, a weighting-value memory, and multiply-accumulate (MAC) units. The MAC units are coupled in common to each of the broadcast data paths and coupled to receive respective weighting values from the weighting-value memory via respective weighting-value paths. Each of the MAC units includes a plurality of MAC circuits coupled respectively to the broadcast data paths, with each of the MAC circuits within a given one of the MAC units (i) receiving an input data value via a respective one of the broadcast data paths and a shared one of the weighting values via a shared one of the respective weighting-value paths, (ii) generating a sequence of multiplication products by multiplying the input data value with the shared one of the weighting values, and (iii) accumulating a sum of the multiplication products.
    Type: Application
    Filed: February 20, 2023
    Publication date: August 24, 2023
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Patent number: 11693625
    Abstract: An integrated circuit including a plurality of logarithmic addition-accumulator circuits, connected in series, to, in operation, perform logarithmic addition and accumulate operations, wherein each logarithmic addition-accumulator circuit includes: (i) a logarithmic addition circuit to add a first input data and a filter weight data, each having the logarithmic data format, and to generate and output first sum data having a logarithmic data format, and (ii) an accumulator, coupled to the logarithmic addition circuit of the associated logarithmic addition-accumulator circuit, to add a second input data and the first sum data output by the associated logarithmic addition circuit to generate first accumulation data. The integrated circuit may further include first data format conversion circuitry, coupled to the output of each logarithmic addition circuit, to convert the data format of the first sum data to a floating point data format wherein the accumulator may be a floating point type.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: July 4, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20230185531
    Abstract: Multiply-accumulate processors within a tensor processing unit simultaneously execute, in each of a sequence of multiply-accumulate cycles, respective multiply operations using a shared input data operand and respective weighting operands, each of the multiply-accumulate processors applying a new shared input data operand and respective weighting operand in each successive multiply-accumulate cycle to accumulate, as a component of an output tensor, a respective sum-of-multiplication-products.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Patent number: 11663016
    Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operation, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: May 30, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventor: Cheng C. Wang
  • Patent number: 11650824
    Abstract: An integrated circuit including memory to store image data and filter weights, and a plurality of multiply-accumulator execution pipelines, each multiply-accumulator execution pipeline coupled to the memory to receive (i) image data and (ii) filter weights, wherein each multiply-accumulator execution pipeline processes the image data, using associated filter weights, via a plurality of multiply and accumulate operations. In one embodiment, the multiply-accumulator circuitry of each multiply-accumulator execution pipeline, in operation, receives a different set of image data, each set including a plurality of image data, and, using filter weights associated with the received set of image data, processes the set of image data associated therewith, via performing a plurality of multiply and accumulate operations concurrently with the multiply-accumulator circuitry of the other multiply-accumulator execution pipelines, to generate output data.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: May 16, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20230131459
    Abstract: An integrated circuit including a multiplier-accumulator execution pipeline including a plurality of multiplier-accumulator circuits to, in operation, perform multiply and accumulate operations, wherein each multiplier-accumulator circuit includes: (i) a multiplier to multiply first input data, having a first floating point data format, by a filter weight data, having the first floating point data format, and generate and output a product data having a second floating point data format, and (ii) an accumulator, coupled to the multiplier of the associated MAC circuit, to add second input data and the product data output by the associated multiplier to generate sum data. The plurality of multiplier-accumulator circuits of the multiplier-accumulator execution pipeline may be connected in series and, in operation, perform a plurality of concatenated multiply and accumulate operations.
    Type: Application
    Filed: December 23, 2022
    Publication date: April 27, 2023
    Inventors: Frederick A. Ware, Cheng C. Wang, Fang-Li Yuan, Nitish U. Natu
  • Patent number: 11604645
    Abstract: An integrated circuit comprising a plurality of multiplier-accumulator circuits connected in series in a linear pipeline to perform a plurality of concatenated multiply and accumulate operations, wherein each multiplier-accumulator circuit of the plurality of multiplier-accumulator circuits includes: a multiplier to multiply first data by a multiplier weight data and generate a product data, and an accumulator, coupled to the multiplier of the associated multiplier-accumulator circuit, to add second data and the product data of the associated multiplier to generate sum data.
    Type: Grant
    Filed: July 15, 2021
    Date of Patent: March 14, 2023
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20220391343
    Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected MACs and/or one or more rows of interconnected MACs. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one MAC pipeline, wherein each pipeline includes a plurality of linearly connected multiplier-accumulator circuits. Each control/configure circuit may include one or more of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s) and (ii) a configurable output data path for the output data generated by the execution sequence (i.e., input data that was processed via the multiplier-accumulator circuits of the pipeline).
    Type: Application
    Filed: August 16, 2022
    Publication date: December 8, 2022
    Applicant: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang
  • Publication number: 20220385291
    Abstract: An integrated circuit comprising a plurality of MACs, connected to form a pipeline, to perform a plurality of multiply and accumulate operations, wherein each MAC includes: (A) a multiplier, coupled to memory to (i) receive the multiplier weight data, (ii) multiply first data and the multiplier weight data and (iii) output product data, (B) an accumulator, coupled to the multiplier of the MAC, to add second data and the first product data and output sum data, and (C) a load-store register, coupled to: (i) an output of the accumulator of the associated MAC and (ii) an input of the load-store register of an immediately successive MAC. Each load-store register may include two interconnected registers, and is configurable to, on the same clock cycle, (a) load the initialization data into the accumulator of the immediately successive MAC and (b) store the sum data from the associated MAC into the load-store register.
    Type: Application
    Filed: August 10, 2022
    Publication date: December 1, 2022
    Applicant: Flex Logix Technologies, Inc.
    Inventor: Cheng C. Wang
  • Publication number: 20220374492
    Abstract: An integrated circuit including a multiplier-accumulator execution pipeline including a plurality of multiplier-accumulator circuits to process the data, using filter weights, via a plurality of multiply and accumulate operations. The integrated circuit includes first conversion circuitry, coupled to the pipeline, having inputs to receive a plurality of sets of data, wherein each set of data includes a plurality of data, Winograd conversion circuitry to convert each set of data to a corresponding Winograd set of data, floating point format conversion circuitry, coupled to the Winograd conversion circuitry, to convert the data of each Winograd set of data to a floating point data format. In operation, the multiplier-accumulator circuits are configured to perform the plurality of multiply and accumulate operations using the data of the plurality of Winograd sets of data from the first conversion circuitry and the filter weights, and generate output data based on the multiply and accumulate operations.
    Type: Application
    Filed: August 1, 2022
    Publication date: November 24, 2022
    Applicant: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang