Patents Assigned to Vivante Corporation
  • Patent number: 11200495
    Abstract: A convolution neural network (CNN) model is trained and pruned at a pruning ratio. The model is then trained and pruned one or more times without constraining the model according to any previous pruning step. The pruning ratio may be increased at each iteration until a pruning target is reached. The model may then be trained again with pruned connections masked. The process of pruning, retraining, and adjusting the pruning ratio may also be repeated one or more times with a different pruning target.
    Type: Grant
    Filed: September 8, 2017
    Date of Patent: December 14, 2021
    Assignee: Vivante Corporation
    Inventors: Xin Wang, Shang-Hung Lin
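
The train-and-prune loop described in this abstract can be sketched roughly as follows. This is only an illustrative toy, not the patented method: the quadratic "training" objective, the magnitude-based pruning criterion, and the names `magnitude_mask`, `train`, and `iterative_prune` are all assumptions made for the example.

```python
import numpy as np

def magnitude_mask(weights, ratio):
    """Boolean mask that drops the smallest-magnitude `ratio` fraction of weights.
    Magnitude pruning is an assumed criterion; the abstract does not name one."""
    k = int(round(ratio * weights.size))  # number of weights to prune
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    return np.abs(weights) > threshold

def train(weights, target, mask=None, steps=20, lr=0.2):
    """Toy training: gradient descent on 0.5 * ||w - target||^2 (hypothetical loss)."""
    w = weights.copy()
    for _ in range(steps):
        w -= lr * (w - target)
        if mask is not None:
            w *= mask  # keep pruned connections masked at zero
    return w

def iterative_prune(weights, target, start_ratio=0.2, step=0.2, prune_target=0.6):
    ratio, w = start_ratio, weights
    while True:
        w = train(w, target)             # unconstrained: earlier masks are ignored
        mask = magnitude_mask(w, ratio)  # prune at the current ratio
        w = w * mask
        if ratio >= prune_target:
            break
        ratio = min(ratio + step, prune_target)  # raise the ratio each iteration
    return train(w, target, mask=mask), mask     # final retraining with mask applied

rng = np.random.default_rng(0)
final_w, final_mask = iterative_prune(rng.normal(size=10), rng.normal(size=10))
```

Because each intermediate training pass is unconstrained, weights pruned in one iteration may regrow and survive a later one; only the final pass holds the mask fixed.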
  • Patent number: 10585623
    Abstract: A computer system includes a hardware buffer controller. Memory access requests to a buffer do not include an address within the buffer and threads accessing the buffer do not access or directly update any pointers to locations within the buffer. The memory access requests are addressed to the hardware buffer controller, which determines an address from its current state and issues a memory access command to that address. The hardware buffer controller updates its state in response to the memory access requests. The hardware buffer controller evaluates its state and outputs events to a thread scheduler in response to overflow or underflow conditions or near-overflow or near-underflow conditions. The thread scheduler may then block threads from issuing memory access requests to the hardware buffer controller. The buffer implemented may be a FIFO or other type of buffer.
    Type: Grant
    Filed: December 11, 2015
    Date of Patent: March 10, 2020
    Assignee: VIVANTE CORPORATION
    Inventor: Mankit Lo
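
A small software model may help illustrate the idea of address-free buffer access. It is a sketch under stated assumptions, not the hardware design: the class name `BufferController`, the `near_margin` parameter, the event strings, and the use of a Python list in place of a thread scheduler are all hypothetical.

```python
class BufferController:
    """Toy FIFO controller: callers never supply or see buffer addresses."""

    def __init__(self, capacity, near_margin=1):
        self.mem = [None] * capacity
        self.read_ptr = 0            # internal state, hidden from threads
        self.write_ptr = 0
        self.count = 0
        self.near_margin = near_margin
        self.events = []             # stands in for signals to a thread scheduler

    def write(self, value):
        if self.count == len(self.mem):
            self.events.append("overflow")
            return False
        self.mem[self.write_ptr] = value  # address derived from controller state
        self.write_ptr = (self.write_ptr + 1) % len(self.mem)
        self.count += 1
        if self.count >= len(self.mem) - self.near_margin:
            self.events.append("near_overflow")
        return True

    def read(self):
        if self.count == 0:
            self.events.append("underflow")
            return None
        value = self.mem[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % len(self.mem)
        self.count -= 1
        if self.count <= self.near_margin:
            self.events.append("near_underflow")
        return value
```

A scheduler model could watch `events` and stop issuing `write` calls after a `near_overflow`, mirroring the blocking behaviour the abstract describes.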
  • Patent number: 10242311
    Abstract: A convolution engine, such as a convolution neural network, operates efficiently with respect to sparse kernels by implementing zero skipping. An input tile is loaded and accumulated sums are calculated for the input tile for non-zero coefficients by shifting the tile according to a row and column index of the coefficient in the kernel. Each coefficient is applied individually to the tile and the result written to an accumulation buffer before moving to the next non-zero coefficient. A 3D or 4D convolution may be implemented in this manner with separate regions of the accumulation buffer storing accumulated sums for different indexes along one dimension. Images are completely processed and results for each image are stored in the accumulation buffer before moving to the next image.
    Type: Grant
    Filed: August 8, 2017
    Date of Patent: March 26, 2019
    Assignee: VIVANTE CORPORATION
    Inventor: Mankit Lo
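
The zero-skipping scheme in this abstract can be approximated in software as below. This is a minimal 2D sketch assuming a "valid" (no padding) cross-correlation; the function name `conv2d_zero_skip` is hypothetical and the NumPy slicing merely stands in for the hardware tile shift.

```python
import numpy as np

def conv2d_zero_skip(tile, kernel):
    """Accumulate one shifted copy of the tile per non-zero kernel coefficient."""
    kh, kw = kernel.shape
    oh, ow = tile.shape[0] - kh + 1, tile.shape[1] - kw + 1
    acc = np.zeros((oh, ow))                  # accumulation buffer
    for r, c in zip(*np.nonzero(kernel)):     # zero coefficients are never visited
        # Shift the tile by the coefficient's (row, column) index and accumulate.
        acc += kernel[r, c] * tile[r:r + oh, c:c + ow]
    return acc

tile = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0],
                   [2.0, 0.0, 0.0]])
result = conv2d_zero_skip(tile, kernel)       # only 2 of 9 coefficients cost work
```

The work done scales with the number of non-zero coefficients rather than with the kernel size, which is the point of skipping zeros in a sparse kernel.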
  • Patent number: 9977619
    Abstract: A computer system processes instructions including an instruction code, source type, source address, destination type, and destination address. The source and destination type may indicate a memory device in which case data is read from the memory device at the source address and written to the destination address. One or both of the source type and destination type may include a transfer descriptor flag, in which case a transfer descriptor identified by the source or destination address is executed. A transfer descriptor referenced by a source address may be executed to obtain an intermediate result that is used for performing the operation indicated by the instruction code. The transfer descriptor referenced by a destination address may be executed to determine a location at which the result of the operation will be stored.
    Type: Grant
    Filed: November 6, 2015
    Date of Patent: May 22, 2018
    Assignee: Vivante Corporation
    Inventor: Mankit Lo
  • Patent number: 9928117
    Abstract: A computer system includes a hardware synchronization component (HSC). Multiple concurrent threads of execution issue instructions to update the state of the HSC. Multiple threads may update the state in the same clock cycle and a thread does not need to receive control of the HSC prior to updating its state. Instructions referencing the state received during the same clock cycle are aggregated and the state is updated according to the number of the instructions. The state is evaluated with respect to a threshold condition. If it is met, then the HSC outputs an event to a processor. The processor then identifies a thread impacted by the event and takes a predetermined action based on the event (e.g. blocking, branching, unblocking of the thread).
    Type: Grant
    Filed: December 11, 2015
    Date of Patent: March 27, 2018
    Assignee: Vivante Corporation
    Inventor: Mankit Lo
  • Patent number: 9875084
    Abstract: A circuit is disclosed that uses a four element dot product circuit (DP4) to approximate an argument t=x/pi for an input x. The argument is then input to a trigonometric function such as SinPi() or CosPi(). The DP4 circuit calculates x times a representation of the reciprocal of pi. The bits of the reciprocal of pi that are used are selected based on the magnitude of the exponent of x. The DP4 circuit includes four multipliers, two intermediate adders, and a final adder. The outputs of the multipliers, intermediate adders, and final adder are adjusted such that the output of the final adder is a value of the argument t that will provide an accurate output when input to the trigonometric function.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: January 23, 2018
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Guosong Li, Zhenyu Wang, Rui Zhao
  • Patent number: 9703530
    Abstract: Mathematical functions are computed in a single pipeline performing a polynomial approximation (e.g. a quadratic approximation, or the like) using data tables for RCP, SQRT, EXP or LOG selected according to opcodes. SIN and COS are also computed using the pipeline according to the approximation ((−1)^IntX)*Sin(π*Min(FracX, 1.0−FracX))/Min(FracX, 1.0−FracX). A pipeline portion approximates Sin(π*FracX) using tables and interpolation and a subsequent stage multiplies this approximation by FracX. For input arguments of x close to 1.0, LOG2(x−1)/(x−1) is computed using a first pipeline portion using tables and interpolation and subsequently multiplied by (x−1). A DIV operation may also be performed with input arguments scaled up to avoid underflow as needed. Inverse trigonometric functions may be calculated using a pre-processing stage and post processing stage in order to obtain multiple inverse trigonometric functions from a single pipeline.
    Type: Grant
    Filed: April 7, 2015
    Date of Patent: July 11, 2017
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Wei-Lun Kao
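
The range reduction behind the SIN approximation in this abstract can be checked numerically with a short sketch. It assumes the input is split as x = IntX + FracX using a floor; `math.sin` stands in for the table-and-interpolation stage, and the function name `sin_pi` is hypothetical.

```python
import math

def sin_pi(x):
    """Evaluate sin(pi * x) via the IntX / FracX reduction."""
    int_x = math.floor(x)
    frac_x = x - int_x                        # FracX in [0, 1)
    m = min(frac_x, 1.0 - frac_x)             # folded argument in [0, 0.5]
    # First stage: the ratio sin(pi*m)/m, which tends to pi as m -> 0.
    # (A hardware pipeline would use tables and interpolation here.)
    ratio = math.pi if m == 0.0 else math.sin(math.pi * m) / m
    sign = -1.0 if int_x % 2 else 1.0         # (-1)^IntX
    return sign * ratio * m                   # second stage multiplies by m

values = [sin_pi(x) for x in (0.0, 0.25, 0.5, 1.75, 3.1, -0.25)]
```

Computing the well-behaved ratio first and multiplying by the small argument afterwards keeps relative accuracy for inputs near an integer, where sin(pi*x) itself is close to zero.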
  • Patent number: 9600236
    Abstract: Mathematical functions are computed in a single pipeline performing a polynomial approximation (e.g. a quadratic approximation, or the like). One or more data tables corresponding to at least one of the RCP, SQRT, EXP or LOG functions are operable to be coupled to the single pipeline according to one or more opcodes, so that the single pipeline is operable for computing at least one of the RCP, SQRT, EXP or LOG functions according to the one or more opcodes. SIN and COS are also computed using the pipeline according to the approximation ((−1)^IntX)*Sin(π*Min(FracX, 1.0−FracX))/Min(FracX, 1.0−FracX). A pipeline portion approximates Sin(π*FracX) using tables and interpolation and a subsequent stage multiplies this approximation by FracX. For input arguments of x close to 1.0, LOG2(x−1)/(x−1) is computed using a first pipeline portion using tables and interpolation and subsequently multiplied by (x−1). A DIV operation may also be performed with input arguments scaled up to avoid underflow as needed.
    Type: Grant
    Filed: September 15, 2014
    Date of Patent: March 21, 2017
    Assignee: VIVANTE CORPORATION
    Inventors: Mike M. Cai, Lefan Zhong
  • Patent number: 9460525
    Abstract: Systems and methods for tile-based compression are disclosed. Image data, such as a frame, may be divided into tiles. The tiles may be sized based on a size of a line buffer. Tiles are compressed and decompressed individually. As portions of the image frame are updated, corresponding updated tiles may be compressed and stored. Likewise, as tiles are accessed they may be decompressed and streamed to a requesting device. In some embodiments, a decoder operable to decompress tiles may be interposed between a memory device and a requesting device. Data encoding one or more compressed tiles may be grouped to enable decompression at a rate of four pixels per clock cycle. Methods for compressing image data including both RGB and RGBα components are disclosed.
    Type: Grant
    Filed: June 17, 2013
    Date of Patent: October 4, 2016
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Halim Theny, Huiming Zhang
  • Patent number: 9349213
    Abstract: A system for blending includes a memory device, cache, cache controller, and a graphics processing device. The graphics processing device performs blending of a plurality of source images into a single destination image. The graphics processing device performs a method including, for each tile position in the plurality of source images, requesting tiles for the tile position from each source image, blending the tiles individually with a destination tile and overwriting the destination tile in the cache with the result of the blending after each individual blending. The destination tile may be written to memory after each source tile for each tile position has been blended with the destination tile, such as in response to a cache controller determining that the destination tile is a least recently used (LRU) entry in the cache.
    Type: Grant
    Filed: September 9, 2013
    Date of Patent: May 24, 2016
    Assignee: VIVANTE CORPORATION
    Inventors: Haomin Wu, Frido Garritsen
  • Patent number: 9077313
    Abstract: Disclosed are new approaches to multi-dimensional filtering with a reduced number of memory reads and writes. In one embodiment, a filter includes first and second coefficients. A block of data having width and height each equal to the number of one of the first or second coefficients is read from a memory device. Arrays of values from the block are filtered using the first filter coefficients and the results filtered using the second coefficients. The final result may be optionally blended with another data value and written to a memory device. Registers store results of filtering with the first coefficients. The block of data may be read from a location including a source coordinate. The final result of filtering may be written to a destination coordinate obtained by rotating and/or mirroring the source coordinate. The orientation of arrays filtered using the first coefficients varies according to a rotation mode.
    Type: Grant
    Filed: October 14, 2011
    Date of Patent: July 7, 2015
    Assignee: VIVANTE CORPORATION
    Inventors: Mike M. Cai, Huiming Zhang
  • Publication number: 20150070393
    Abstract: A system for blending is disclosed including a memory device, cache, cache controller, and a graphics processing device. The graphics processing device performs blending of a plurality of source images into a single destination image. The graphics processing device performs a method including, for each tile position in the plurality of source images, requesting tiles for the tile position from each source image, blending the tiles individually with a destination tile and overwriting the destination tile in the cache with the result of the blending after each individual blending. The destination tile may be written to memory after each source tile for each tile position has been blended with the destination tile, such as in response to a cache controller determining that the destination tile is a least recently used (LRU) entry in the cache.
    Type: Application
    Filed: September 9, 2013
    Publication date: March 12, 2015
    Applicant: Vivante Corporation
    Inventors: Haomin Wu, Frido Garritsen
  • Patent number: 8907964
    Abstract: A system to process a plurality of vertices to model an object. An embodiment of the system includes a processor, a front end unit coupled to the processor, and cache configuration logic coupled to the front end unit and the processor. The processor is configured to process the plurality of vertices. The front end unit is configured to communicate vertex data to the processor. The cache configuration logic is configured to establish a cache line size of a vertex cache based on a vertex size of a drawing command.
    Type: Grant
    Filed: April 10, 2007
    Date of Patent: December 9, 2014
    Assignee: Vivante Corporation
    Inventors: Keith Lee, Mike M. Cai
  • Patent number: 8553046
    Abstract: An apparatus and method for detecting and handling thin lines in a raster image includes reading depth values for each pixel of an n×m block of pixels surrounding a substantially central pixel. Differences are then calculated for selected depth values of the n×m block of pixels to yield multiple difference values. These difference values may then be compared with multiple pre-computed difference values associated with thin lines pre-determined to pass through the n×m block of pixels. If the difference values of the pixel block substantially match the difference values of one of the pre-determined thin lines, the pixel block may be deemed to describe a thin line. The apparatus and method may preclude application of an anti-aliasing filter to the substantially central pixel of the pixel block in the event it describes a thin line.
    Type: Grant
    Filed: November 9, 2007
    Date of Patent: October 8, 2013
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Abdulkadir Utku Diril
  • Patent number: 8554008
    Abstract: A system to reduce aliasing in a graphical image includes an edge detector configured to read image depth information from a depth buffer. The edge detector also applies edge detection procedures to detect an object edge within the image. An edge style detector is configured to identify a first edge end and a second edge end. The edge style detector also identifies an edge style associated with the detected edge based on the first edge end and the second edge end. The system also includes a restoration module configured to identify pixel data associated with the detected edge and a blending module configured to blend the pixel data associated with the detected edge.
    Type: Grant
    Filed: April 13, 2010
    Date of Patent: October 8, 2013
    Assignee: Vivante Corporation
    Inventors: Lefan Zhong, Mike M. Cai
  • Patent number: 8487948
    Abstract: A graphic processing system to compute a texture level of detail. An embodiment of the graphic processing system includes a memory device, a driver, and level of detail computation logic. The memory device is configured to implement a first lookup table. The first lookup table is configured to provide a first level of detail component. The driver is configured to calculate a log value of a second level of detail component. The level of detail computation logic is coupled to the memory device and the driver. The level of detail computation logic is configured to compute a level of detail for a texture mapping operation based on the first level of detail component from the lookup table and the second level of detail component from the driver. Embodiments of the graphic processing system facilitate a simple hardware implementation using operations other than multiplication, square, and square root operations.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: July 16, 2013
    Assignee: Vivante Corporation
    Inventors: Mike M. Cai, Jean-Didier Allegrucci, Anthony Ya-Nai Tai
  • Publication number: 20130097212
    Abstract: Disclosed are new approaches to multi-dimensional filtering with a reduced number of memory reads and writes. In one embodiment, a filter includes first and second coefficients. A block of data having width and height each equal to the number of one of the first or second coefficients is read from a memory device. Arrays of values from the block are filtered using the first filter coefficients and the results filtered using the second coefficients. The final result may be optionally blended with another data value and written to a memory device. Registers store results of filtering with the first coefficients. The block of data may be read from a location including a source coordinate. The final result of filtering may be written to a destination coordinate obtained by rotating and/or mirroring the source coordinate. The orientation of arrays filtered using the first coefficients varies according to a rotation mode.
    Type: Application
    Filed: October 14, 2011
    Publication date: April 18, 2013
    Applicant: Vivante Corporation
    Inventors: Mike M. Cai, Huiming Zhang
  • Publication number: 20130091189
    Abstract: Methods and apparatus are provided for computing mathematical functions, comprising a single pipeline for performing a polynomial approximation (e.g. a quadratic polynomial approximation, or the like); and one or more data tables corresponding to at least one of the RCP, SQRT, EXP or LOG functions operable to be coupled to the single pipeline according to one or more opcodes; wherein the single pipeline is operable for computing at least one of RCP, SQRT, EXP or LOG functions according to the one or more opcodes.
    Type: Application
    Filed: November 30, 2012
    Publication date: April 11, 2013
    Applicant: Vivante Corporation
    Inventor: Vivante Corporation
  • Patent number: 8416241
    Abstract: An apparatus and method for rasterizing a primitive in a graphics system is disclosed in one example of the invention as including scanning a first row of tiles, one tile at a time, starting from a first point and scanning in a first direction. Immediately after scanning the first row of tiles, the method includes moving from the first point to a second point in an orthogonal direction relative to the first row. Immediately after moving from the first point to the second point, the method includes scanning a second row of tiles, one tile at a time, starting from the second point and scanning in the first direction. By scanning rows in the same direction immediately prior to and after moving from one row to another, cache utilization is improved.
    Type: Grant
    Filed: July 21, 2011
    Date of Patent: April 9, 2013
    Assignee: Vivante Corporation
    Inventors: Abdulkadir Utku Diril, Frido Garritsen
  • Publication number: 20130002651
    Abstract: A graphic processing system to compute a texture level of detail. An embodiment of the graphic processing system includes a memory device, a driver, and level of detail computation logic. The memory device is configured to implement a first lookup table. The first lookup table is configured to provide a first level of detail component. The driver is configured to calculate a log value of a second level of detail component. The level of detail computation logic is coupled to the memory device and the driver. The level of detail computation logic is configured to compute a level of detail for a texture mapping operation based on the first level of detail component from the lookup table and the second level of detail component from the driver. Embodiments of the graphic processing system facilitate a simple hardware implementation using operations other than multiplication, square, and square root operations.
    Type: Application
    Filed: December 21, 2011
    Publication date: January 3, 2013
    Applicant: Vivante Corporation
    Inventors: Mike M. Cai, Jean-Didier Allegrucci, Anthony Ya-Nai Tai