Patents by Inventor Peng OUYANG

Peng OUYANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11977894
    Abstract: The disclosure provides a method for distributing instructions in a reconfigurable processor. The reconfigurable processor includes an instruction fetch module, an instruction sync control module and an instruction queue module. The method includes: configuring a format of a Memory Sync ID Table of each instruction type, obtaining a first memory identification field and a second memory identification field of each instruction, obtaining one-hot encodings of first and second memory identification fields, obtaining a sync table and executing each instruction of a plurality of to-be-run instructions.
    Type: Grant
    Filed: May 7, 2021
    Date of Patent: May 7, 2024
    Assignee: BEIJING TSINGMICRO INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Baochuan Fei, Peng Ouyang, Shibin Tang, Liwei Deng
  • Patent number: 11954061
    Abstract: A mapping method for a reconfigurable array, including: S1: obtaining and analyzing a DDG; providing an initial interval; obtaining a reconfigurable architecture; copying the first adjacency matrix and the second adjacency matrix to form a mapping space; establishing an integer linear programming model, and mapping, with the integer linear programming model, a processing vertex, an intra-cycle edge, and an inter-cycle edge in the DDG, to the mapping space, respectively; obtaining a mapping relationship from the processing vertex and the edge in the DDG to the processing element and the link of extended TS_max layers; and generating configuration information by the mapping relationship modulo the initial interval.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: April 9, 2024
    Assignee: BEIJING TSINGMICRO INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Chongyang Wang, Zhen Zhang, Peng Ouyang
  • Patent number: 11928473
    Abstract: An instruction scheduling method and an instruction scheduling system for a reconfigurable array processor. The method includes: determining whether a fan-out of a vertex in a data flow graph (DFG) is less than an actual interconnection number of a processing unit in a reconfigurable array; establishing a corresponding relationship between the vertex and a correlation operator of the processing unit; introducing a register to a directed edge, acquiring a retiming value of each vertex; arranging instructions in such a manner that retiming values of the instruction vertexes are in ascending order, and acquiring transmission time and scheduling order of the instructions; folding the DFG, placing an instruction to an instruction vertex; inserting a register and acquiring a current DFG; and acquiring a common maximum subset of the current DFG and the reconfigurable array by a maximum clique algorithm, and distributing the instructions.
    Type: Grant
    Filed: March 22, 2022
    Date of Patent: March 12, 2024
    Assignee: BEIJING TSINGMICRO INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Kejia Zhu, Zhen Zhang, Peng Ouyang
  • Patent number: 11921668
    Abstract: The present disclosure provides a processor array and a multiple-core processor. The processor array includes a plurality of processing elements arranged in a two-dimensional array, a plurality of first load units correspondingly arranged and connected to the processing elements of the first edge row, respectively, a plurality of second load units correspondingly arranged and connected to the processing elements of the first edge column, respectively, a plurality of first store units correspondingly arranged and connected to the processing elements of the second edge column, respectively, a plurality of second store units correspondingly arranged and connected to the processing elements of the second edge row, respectively.
    Type: Grant
    Filed: July 15, 2021
    Date of Patent: March 5, 2024
    Assignee: BEIJING TSINGMICRO INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Peng Ouyang, Guozhi Song
  • Patent number: 11740832
    Abstract: A data storage method includes: obtaining memory banks of arithmetic data; generating undetermined memory bank numbers of the memory banks sequentially; scanning storage dimensions of the arithmetic data to obtain the undetermined memory bank numbers, and filling elements to make the undetermined memory bank numbers continuous if the undetermined memory bank numbers of two adjacent dimensions are not continuous; selecting, through a greedy algorithm, the determined transformation vector with the least conflict and the smallest number of filling elements as the current transformation vector; generating current memory bank numbers of the memory banks according to the current transformation vector; converting each of the current memory bank numbers into a physical storage bank address through an offset function to obtain a corresponding internal offset address; and storing the arithmetic data into the memory banks according to the current memory bank numbers and the internal offset addresses.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: August 29, 2023
    Assignee: BEIJING TSINGMICRO INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Cheng Li, Peng Ouyang, Zhen Zhang
  • Publication number: 20230068450
    Abstract: The disclosure provides a method and apparatus for processing sparse data. The method is applied to a reconfigurable processor that includes a PE array, and the PE array includes P×Q PE units. The method includes: dividing a sparse weight matrix to be calculated into at least one unit block; grouping a plurality of unit blocks into a computing group; and obtaining an effective weight address corresponding to each effective weight in the computing group.
    Type: Application
    Filed: May 27, 2021
    Publication date: March 2, 2023
    Inventors: Shibin TANG, Peng OUYANG
  • Publication number: 20230068463
    Abstract: The disclosure provides a method for distributing instructions in a reconfigurable processor. The reconfigurable processor includes an instruction fetch module, an instruction sync control module and an instruction queue module. The method includes: configuring a format of a Memory Sync ID Table of each instruction type, obtaining a first memory identification field and a second memory identification field of each instruction, obtaining one-hot encodings of first and second memory identification fields, obtaining a sync table and executing each instruction of a plurality of to-be-run instructions.
    Type: Application
    Filed: May 7, 2021
    Publication date: March 2, 2023
    Inventors: Baochuan FEI, Peng OUYANG, Shibin TANG, Liwei DENG
  • Publication number: 20220214883
    Abstract: An instruction scheduling method and an instruction scheduling system for a reconfigurable array processor. The method includes: determining whether a fan-out of a vertex in a data flow graph (DFG) is less than an actual interconnection number of a processing unit in a reconfigurable array; establishing a corresponding relationship between the vertex and a correlation operator of the processing unit; introducing a register to a directed edge, acquiring a retiming value of each vertex; arranging instructions in such a manner that retiming values of the instruction vertexes are in ascending order, and acquiring transmission time and scheduling order of the instructions; folding the DFG, placing an instruction to an instruction vertex; inserting a register and acquiring a current DFG; and acquiring a common maximum subset of the current DFG and the reconfigurable array by a maximum clique algorithm, and distributing the instructions.
    Type: Application
    Filed: March 22, 2022
    Publication date: July 7, 2022
    Inventors: Kejia ZHU, Zhen ZHANG, Peng OUYANG
  • Publication number: 20220206697
    Abstract: Provided are a memory coupled compiling method and system of a reconfigurable chip. The memory coupled compiling method includes: acquiring a cycle number of a data flow graph (DFG); acquiring a linear transformation vector of the cycle number through a mapping time difference; determining whether a linear array of the linear transformation vector is acquired by a heuristic algorithm; acquiring a memory mapping result through a current DFG or acquiring a cycle number of the current DFG until the linear array is acquired, depending on the determination result.
    Type: Application
    Filed: September 24, 2021
    Publication date: June 30, 2022
    Inventors: Zhen ZHANG, Peng OUYANG, Junbao HU
  • Publication number: 20220100698
    Abstract: The present disclosure provides a processor array and a multiple-core processor. The processor array includes a plurality of processing elements arranged in a two-dimensional array, a plurality of first load units correspondingly arranged and connected to the processing elements of the first edge row, respectively, a plurality of second load units correspondingly arranged and connected to the processing elements of the first edge column, respectively, a plurality of first store units correspondingly arranged and connected to the processing elements of the second edge column, respectively, a plurality of second store units correspondingly arranged and connected to the processing elements of the second edge row, respectively.
    Type: Application
    Filed: July 15, 2021
    Publication date: March 31, 2022
    Inventors: Peng OUYANG, Guozhi SONG
  • Publication number: 20220100699
    Abstract: A computing array includes a plurality of process element groups, and each of the plurality of the process element groups includes four process elements arranged in two rows and two columns and a merging unit. Each of the four process elements includes an input subunit; a fetch and decode subunit configured to obtain and compile the instruction to output a logic computing type; an operation subunit configured to obtain computing result data according to the logic computing type and the operation data; an output subunit configured to output the computing result data. The merging unit is connected to the output subunit of each of the four process elements, and configured to receive the computing result data output by the output subunit of each of the four process elements, merge the computing result data and output the merged computing result data.
    Type: Application
    Filed: September 23, 2021
    Publication date: March 31, 2022
    Inventors: Peng OUYANG, Yaxue ZHANG
  • Publication number: 20220100414
    Abstract: A data storage method includes: obtaining memory banks of arithmetic data; generating undetermined memory bank numbers of the memory banks sequentially; scanning storage dimensions of the arithmetic data to obtain the undetermined memory bank numbers, and filling elements to make the undetermined memory bank numbers continuous if the undetermined memory bank numbers of two adjacent dimensions are not continuous; selecting, through a greedy algorithm, the determined transformation vector with the least conflict and the smallest number of filling elements as the current transformation vector; generating current memory bank numbers of the memory banks according to the current transformation vector; converting each of the current memory bank numbers into a physical storage bank address through an offset function to obtain a corresponding internal offset address; and storing the arithmetic data into the memory banks according to the current memory bank numbers and the internal offset addresses.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 31, 2022
    Inventors: Cheng LI, Peng OUYANG, Zhen ZHANG
  • Publication number: 20220100521
    Abstract: A data loading and storage system includes a storage module, a buffering module, a control module, a plurality of data loading modules, a plurality of data storage modules and a multi-core processor array module. The data is stored contiguously in a DDR, and the data computed by the multi-core processor may be arranged contiguously or according to a certain rule. After the DMA reads the data into the DATA_BUF module in BURST mode, the data loading modules (i.e., load modules) are designed to support fast loading of the data into the multi-core processor array. The data storage modules (i.e., store modules) are designed to quickly store the computed results of the multi-core processor array into the DATA_BUF module according to a certain rule.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 31, 2022
    Inventors: Pengpeng ZHANG, Peng OUYANG
  • Publication number: 20220083495
    Abstract: A mapping method for a reconfigurable array, including: S1: obtaining and analyzing a DDG; providing an initial interval; obtaining a reconfigurable architecture; copying the first adjacency matrix and the second adjacency matrix to form a mapping space; establishing an integer linear programming model, and mapping, with the integer linear programming model, a processing vertex, an intra-cycle edge, and an inter-cycle edge in the DDG, to the mapping space, respectively; obtaining a mapping relationship from the processing vertex and the edge in the DDG to the processing element and the link of extended TS_max layers; and generating configuration information by the mapping relationship modulo the initial interval.
    Type: Application
    Filed: September 23, 2021
    Publication date: March 17, 2022
    Inventors: Chongyang WANG, Zhen ZHANG, Peng OUYANG
  • Publication number: 20220036521
    Abstract: An image correction method includes: capturing speckle patterns on two planes at different distances to obtain a first image of speckle projected on a first plane and a second image of speckle projected on a second plane; matching the first image with the second image to obtain sub-pixel matching points; obtaining, based on first physical coordinates of the sub-pixel matching points on the first image and second physical coordinates of the sub-pixel matching points on the second image, a mapping matrix between the first and second physical coordinates; obtaining a direction vector of a center of the speckle projector in a camera reference frame according to the mapping matrix; adjusting coordinate axis directions of the camera reference frame to align a horizontal axis direction with the direction vector, updating an imaging matrix of the camera; and mapping a target scene image through the imaging matrix to obtain a corrected image.
    Type: Application
    Filed: September 29, 2021
    Publication date: February 3, 2022
    Inventors: Kai ZHOU, Shouyi YIN, Shibin TANG, Peng OUYANG, Xiudong LI, Bo WANG
  • Patent number: 11151439
    Abstract: A computing in-memory system and computing in-memory method based on a skyrmion racetrack memory are provided. The system comprises a circuit architecture of SRM-CIM. The circuit architecture of the SRM-CIM comprises a row decoder, a column decoder, a voltage driver, a storage array, a modified sensor circuit, a counter (Bit-counter) and a mode controller. The voltage driver includes two NMOSs, and the two NMOSs are each connected with a selector MUX. The modified sensor circuit compares the resistance from a first node to a second node with that from a third node to a fourth node by using a pre-charge sense amplifier. The storage array is composed of the skyrmion racetrack memories. The computing in-memory architecture is designed by utilizing the skyrmion racetrack memory, so that storage is realized in the memory and computing operations can be carried out in the memory.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: October 19, 2021
    Assignees: HEFEI INNOVATION RESEARCH INSTITUTE, BEIHANG UNIVERSITY, BEIHANG UNIVERSITY
    Inventors: Peng Ouyang, Yu Pan, Youguang Zhang, Weisheng Zhao
  • Publication number: 20210019596
    Abstract: A computing in-memory system and computing in-memory method based on a skyrmion racetrack memory are provided. The system comprises a circuit architecture of SRM-CIM. The circuit architecture of the SRM-CIM comprises a row decoder, a column decoder, a voltage driver, a storage array, a modified sensor circuit, a counter (Bit-counter) and a mode controller. The voltage driver includes two NMOSs, and the two NMOSs are each connected with a selector MUX. The modified sensor circuit compares the resistance from a first node to a second node with that from a third node to a fourth node by using a pre-charge sense amplifier. The storage array is composed of the skyrmion racetrack memories. The computing in-memory architecture is designed by utilizing the skyrmion racetrack memory, so that storage is realized in the memory and computing operations can be carried out in the memory.
    Type: Application
    Filed: April 25, 2019
    Publication date: January 21, 2021
    Applicants: HEFEI INNOVATION RESEARCH INSTITUTE, BEIHANG UNIVERSITY, BEIHANG UNIVERSITY
    Inventors: Peng OUYANG, Yu PAN, Youguang ZHANG, Weisheng ZHAO
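The instruction-distribution abstract (patent 11977894) describes obtaining one-hot encodings of an instruction's two memory identification fields and combining them into a sync table. The sketch below illustrates that one-hot-and-merge step in isolation; the function names, the 8-bit width, and the OR-combination are illustrative assumptions, not details from the patent text:

```python
def one_hot(mem_id: int, width: int = 8) -> int:
    """Encode a memory identification field as a one-hot bitmask:
    bit `mem_id` of a `width`-bit word is set, all others clear."""
    assert 0 <= mem_id < width
    return 1 << mem_id

def sync_entry(first_mem_id: int, second_mem_id: int, width: int = 8) -> int:
    """Combine the one-hot encodings of an instruction's first and second
    memory identification fields into a single sync-table entry."""
    return one_hot(first_mem_id, width) | one_hot(second_mem_id, width)

def build_sync_table(instructions, width: int = 8):
    """Build a sync table: one combined bitmask per to-be-run instruction,
    recording which memories that instruction touches."""
    return [sync_entry(a, b, width) for a, b in instructions]
```

A bitmask per instruction makes the later dependency check a single AND: two instructions can conflict only if their masks overlap.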
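The instruction-scheduling abstract (patent 11928473) arranges instructions so that the retiming values of their vertices are in ascending order. That ordering step alone can be sketched as a stable sort keyed on the retiming value; the dictionary representation and function name are assumptions for illustration:

```python
def order_by_retiming(retiming: dict) -> list:
    """Arrange instruction vertices so their retiming values are in
    ascending order. Python's sort is stable, so vertices with equal
    retiming values keep their original (insertion) order."""
    return sorted(retiming, key=retiming.get)
```

The stable tie-break matters in practice: vertices at the same retiming level are independent in this respect, and keeping their prior order avoids gratuitously reshuffling the transmission sequence.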
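The mapping abstract (patent 11954061) generates configuration information "by the mapping relationship modulo the initial interval" — the standard modulo-scheduling fold, where an operation placed at time step t executes in configuration context t mod II. A minimal sketch of that fold, with the mapping representation chosen for illustration only:

```python
def fold_mapping(mapping: dict, ii: int):
    """Fold a time-extended placement into `ii` configuration contexts:
    an operation placed on processing element `pe` at time step `t`
    executes in context t % ii on that PE.

    `mapping` maps op name -> (pe, t); returns a list of `ii` contexts,
    each a list of (op, pe) pairs.
    """
    contexts = [[] for _ in range(ii)]
    for op, (pe, t) in mapping.items():
        contexts[t % ii].append((op, pe))
    return contexts
```

With II contexts cycling forever, successive loop iterations overlap: iteration k occupies the same contexts as iteration k+1, shifted by II time steps.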
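The data-storage abstracts (patent 11740832 / publication 20220100414) select, via a greedy algorithm, the transformation vector with the least bank conflict. The toy sketch below shows only that selection step, under two assumptions not taken from the patent text: banks are numbered by a linear transform (dot product of index and vector, modulo the bank count), and "conflict" means simultaneous accesses colliding in one bank:

```python
from itertools import product

def bank_of(index, vector, num_banks):
    """Assumed linear bank-numbering: dot(index, vector) mod num_banks."""
    return sum(i * v for i, v in zip(index, vector)) % num_banks

def conflict_count(accesses, vector, num_banks):
    """Number of simultaneous accesses that collide in the same bank."""
    banks = [bank_of(a, vector, num_banks) for a in accesses]
    return len(banks) - len(set(banks))

def greedy_vector(accesses, num_banks, dims):
    """Greedily pick the transformation vector with the fewest conflicts
    over one set of simultaneous accesses (ties broken by enumeration
    order). Exhaustive over a small vector space for clarity."""
    return min(product(range(num_banks), repeat=dims),
               key=lambda v: conflict_count(accesses, v, num_banks))
```

For a 2-D access pattern walking along the second dimension, any vector with a second component coprime to the bank count spreads the accesses across distinct banks, so the search finds a zero-conflict vector immediately.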