Patents by Inventor Zhengya ZHANG

Zhengya ZHANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10853066
    Abstract: A memory processing unit can be configured to compute partial products between one or more elements of a first matrix stored in a first storage location and sequential bits of one or more elements of a second matrix stored in a second storage location. The partial products can be calculated utilizing zero bit skipping to increase throughput and/or reduce energy consumption. The partial products for each column of elements can be accumulated and bit shifted to compute the dot product of the first and second matrix.
    Type: Grant
    Filed: December 24, 2019
    Date of Patent: December 1, 2020
    Assignee: MemryX Incorporated
    Inventors: Chester Liu, Mohammed Zidan, Wei Lu, Zhengya Zhang
  • Publication number: 20200357459
    Abstract: A memory processing unit can be configured to compute partial products between one or more elements of a first matrix stored in a given row of a memory cell array and sequential bits of one or more elements of a second matrix. The partial products can be calculated first sequentially across the set of rows and second sequentially across the bit positions of the elements of the second matrix. Alternatively, the partial products can be calculated first sequentially across the bit positions of the elements of the second matrix and second sequentially across the set of rows. The partial products for each column of elements can be accumulated and bit shifted to compute the dot product of the first and second matrix.
    Type: Application
    Filed: December 24, 2019
    Publication date: November 12, 2020
    Inventors: Mohammed Zidan, Chester Liu, Zhengya Zhang, Wei Lu
  • Publication number: 20200174786
    Abstract: Many signal processing, machine learning and scientific computing applications require a large number of multiply-accumulate (MAC) operations. This type of operation is demanding in both computation and memory. Process in memory has been proposed as a new technique that computes directly on a large array of data in place, to eliminate expensive data movement overhead. To enable parallel multi-bit MAC operations, both width- and level-modulating memory word lines are applied. To improve performance and provide tolerance against process-voltage-temperature variations, a delay-locked loop is used to generate fine unit pulses for driving memory word lines and a dual-ramp single-slope ADC is used to convert bit line outputs. The concept is prototyped in a 180 nm CMOS test chip made of four 320×64 compute-SRAMs, each supporting 128× parallel 5 b×5 b MACs with 32 5 b output ADCs and consuming 16.6 mW at 200 MHz.
    Type: Application
    Filed: November 29, 2018
    Publication date: June 4, 2020
    Inventors: Zhengya ZHANG, Thomas CHEN, Jacob Christopher BOTIMER, Shiming SONG
  • Publication number: 20190228285
    Abstract: A configurable neuro-inspired convolution processor is designed as an array of neurons each operating in an independent clock domain. The processor implements a recurrent network using efficient sparse convolutions with zero-patch skipping for feedforward operations, and sparse spike-driven reconstruction for feedback operations. A globally asynchronous locally synchronous structure enables scalable design and load balancing to achieve a 22% reduction in power. Fabricated in 40 nm CMOS, the 2.56 mm² inference processor integrates 48 neurons, a hub and an OpenRISC processor. The chip achieves 718 GOPS at 380 MHz, and demonstrates applications in feature extraction from images and depth extraction from stereo images.
    Type: Application
    Filed: January 24, 2018
    Publication date: July 25, 2019
    Inventors: Zhengya ZHANG, Chester LIU
  • Publication number: 20180349764
    Abstract: A sparse video inference chip is designed to extract spatio-temporal features from videos for action classification and motion tracking. The core is a sparse video inference processor that implements a recurrent neural network in three layers of processing. High sparsity is enforced in each layer of processing, reducing the complexity by two orders of magnitude and allowing all multiply-accumulates (MAC) to be replaced by select-accumulates (SA). The design is demonstrated in a 3.98 mm² 40 nm CMOS chip with an OpenRISC processor providing software-defined control and classification.
    Type: Application
    Filed: June 5, 2018
    Publication date: December 6, 2018
    Inventors: Zhengya ZHANG, Ching-En LEE, Chester LIU, Thomas CHEN
  • Patent number: 10133553
    Abstract: A reciprocal unit for computing an estimated reciprocal of a number represented by a bit string. The unit comprises a first lookup table configured to receive one or more of the bits in the bit string and to output an initial estimate of the reciprocal of the number. The unit further comprises a second lookup table configured to receive one or more of the bits in the bit string and to output the square of the initial estimate of the reciprocal of the number. The unit still further comprises a multiplier circuit configured to multiply the square of the initial estimate by the number, and an adder-subtractor circuit for subtracting the product of the multiplication from a scaled value of the initial estimate to determine a final estimate of the reciprocal of the number.
    Type: Grant
    Filed: February 20, 2016
    Date of Patent: November 20, 2018
    Assignee: The Regents of the University of Michigan
    Inventors: Zhengya Zhang, Chia-Hsiang Chen
  • Publication number: 20170357889
    Abstract: An information processor is provided that includes an inference module configured to extract a subset of data from information in an input and a classification module configured to classify the information in the input based on the extracted data. The inference module includes a first plurality of convolvers acting in parallel to apply each of N1 convolution kernels to each of N2 portions of the input image in order to generate an interim sparse representation of the input and a second plurality of convolvers acting in parallel to apply each of N3 convolution kernels to each of N4 portions of the interim sparse representation to generate a final sparse representation containing the extracted data. In order to take advantage of sparsity in the interim sparse representation, N3 is greater than N4 to parallelize processing in a non-sparse dimension and/or the second plurality of convolvers comprise sparse convolvers.
    Type: Application
    Filed: June 13, 2017
    Publication date: December 14, 2017
    Inventors: Zhengya ZHANG, Chester LIU, Phil KNAG
  • Patent number: 9565581
    Abstract: A nonbinary iterative detector-decoder (IDD) system. The IDD system comprises a detector, a decoder, and a nonbinary interface electrically connected between the detector and decoder. The interface is operative to convert a soft symbol and variance that is output by the detector into a corresponding nonbinary log likelihood ratio (LLR) vector that comprises one or more nonbinary LLRs, and to provide the LLR vector to the decoder. The interface is further configured to convert a nonbinary LLR vector comprised of one or more nonbinary LLRs that is output by the decoder into a corresponding soft symbol and variance, and to provide the soft symbol and variance to the detector.
    Type: Grant
    Filed: February 20, 2016
    Date of Patent: February 7, 2017
    Assignee: The Regents of the University of Michigan
    Inventors: Zhengya Zhang, Chia-Hsiang Chen
  • Publication number: 20160358075
    Abstract: A sparse coding system. The sparse coding system comprises a neural network including a plurality of neurons each having a respective feature associated therewith and each being configured to be electrically connected to every other neuron in the network and to a portion of an input dataset. The plurality of neurons are arranged in a plurality of neuron clusters each comprising a respective subset of the plurality of neurons, and the neurons in each cluster are electrically connected to one another in a bus structure, and the plurality of clusters are electrically connected together in a ring structure. Also provided is a sparse coding system that comprises an inference module configured to extract features from an input image containing an object, wherein the inference module comprises an implementation of a sparse coding algorithm, and a classifier configured to classify the object in the input image based on the extracted features.
    Type: Application
    Filed: June 8, 2016
    Publication date: December 8, 2016
    Inventors: Zhengya Zhang, Thomas Chen, Jung Kuk Kim, Phil Knag
  • Publication number: 20160249234
    Abstract: A nonbinary iterative detector-decoder (IDD) system. The IDD system comprises a detector, a decoder, and a nonbinary interface electrically connected between the detector and decoder. The interface is operative to convert a soft symbol and variance that is output by the detector into a corresponding nonbinary log likelihood ratio (LLR) vector that comprises one or more nonbinary LLRs, and to provide the LLR vector to the decoder. The interface is further configured to convert a nonbinary LLR vector comprised of one or more nonbinary LLRs that is output by the decoder into a corresponding soft symbol and variance, and to provide the soft symbol and variance to the detector.
    Type: Application
    Filed: February 20, 2016
    Publication date: August 25, 2016
    Inventors: Zhengya ZHANG, Chia-Hsiang CHEN
  • Publication number: 20160246572
    Abstract: A reciprocal unit for computing an estimated reciprocal of a number represented by a bit string. The unit comprises a first lookup table configured to receive one or more of the bits in the bit string and to output an initial estimate of the reciprocal of the number. The unit further comprises a second lookup table configured to receive one or more of the bits in the bit string and to output the square of the initial estimate of the reciprocal of the number. The unit still further comprises a multiplier circuit configured to multiply the square of the initial estimate by the number, and an adder-subtractor circuit for subtracting the product of the multiplication from a scaled value of the initial estimate to determine a final estimate of the reciprocal of the number.
    Type: Application
    Filed: February 20, 2016
    Publication date: August 25, 2016
    Inventors: Zhengya ZHANG, Chia-Hsiang CHEN
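
The reciprocal unit described in patent 10133553 (and application 20160246572) can be illustrated with a short software sketch. This is not the patented circuit: it assumes the input is normalized to the range [1, 2), that both lookup tables are indexed by the same leading fraction bits, and that the "scaled value" of the initial estimate is twice the estimate, which makes the final step a single Newton-Raphson refinement. The names build_tables and reciprocal and the 6-bit index width are hypothetical choices made for the example.

```python
def build_tables(index_bits):
    """Build two tables indexed by the leading fraction bits of d in [1, 2).

    The first table holds an initial reciprocal estimate x0; the second holds
    x0 squared. Using the interval midpoint for x0 is an assumption of this
    sketch.
    """
    size = 1 << index_bits
    estimate = []
    estimate_squared = []
    for i in range(size):
        midpoint = 1.0 + (i + 0.5) / size
        x0 = 1.0 / midpoint
        estimate.append(x0)
        estimate_squared.append(x0 * x0)
    return estimate, estimate_squared


def reciprocal(d, estimate, estimate_squared):
    """Estimate 1/d for d in [1, 2) from two lookups, one multiply, one subtract."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must be normalized to the range [1, 2)")
    index = int((d - 1.0) * len(estimate))  # leading fraction bits of d
    product = estimate_squared[index] * d   # multiplier: x0^2 * d
    return 2.0 * estimate[index] - product  # subtractor: 2*x0 - d*x0^2


if __name__ == "__main__":
    est, est_sq = build_tables(6)
    for d in (1.0, 1.25, 1.5, 1.9999):
        print(d, reciprocal(d, est, est_sq), 1.0 / d)
```

Under these assumptions the relative error of the final estimate is the square of the relative error of the table estimate, so a 6-bit index already yields an error below 1e-4.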
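
The bit-serial computation with zero bit skipping described in patent 10853066 can likewise be sketched in software. The sketch below is an illustration under stated assumptions, not the claimed hardware: it handles a single dot product rather than a full matrix product, treats the second operand as unsigned integers of a fixed bit width, and counts skipped bits only to make the effect visible. The function name bit_serial_dot and its parameters are hypothetical.

```python
def bit_serial_dot(weights, activations, bits=8):
    """Dot product computed one activation bit position at a time.

    For each bit position, the weights whose activation has that bit set are
    accumulated into a partial sum; zero bits contribute nothing and are
    skipped. Each partial sum is then shifted by the bit position and added
    to the total. Returns (dot_product, number_of_skipped_bits).
    """
    if len(weights) != len(activations):
        raise ValueError("operands must have the same length")
    if any(a < 0 or a >= (1 << bits) for a in activations):
        raise ValueError("activations must be unsigned and fit in `bits` bits")

    total = 0
    skipped = 0
    for b in range(bits):
        partial = 0
        for w, a in zip(weights, activations):
            if (a >> b) & 1:
                partial += w      # 1-bit partial product equals the weight
            else:
                skipped += 1      # zero bit: no work needed
        total += partial << b     # weight the partial sum by 2**b
    return total, skipped


if __name__ == "__main__":
    print(bit_serial_dot([3, -1, 4], [5, 0, 2], bits=4))
```

A matrix product would repeat this accumulation for each column of the first matrix; how the hardware schedules rows and bit positions is outside the scope of the sketch.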