Patents by Inventor Yufei Zhang

Yufei Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220295080
    Abstract: The present disclosure relates to a method for computing, a computing device, and a computer-readable storage medium. The method includes: determining a pixel block set in a cache, a first pixel block in the pixel block set comprising an m×n pixel matrix having a first padding setting related to the original pixel data, m and n being positive integers; and storing the determined pixel block set in a buffer to enable a second pixel block to be read from the buffer based on the buffer initial address of the first pixel block and an address offset associated with the second pixel block, wherein the second pixel block has a second padding setting related to the original pixel data, and the first padding setting and the second padding setting have the same offset amount in a first direction relative to the original pixel data.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 15, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: YuFei ZHANG, Zhou HONG
  • Publication number: 20220284075
    Abstract: The embodiments of the disclosure relate to a computing device, a computing apparatus, and a method of warp accumulation and relate to the field of computers. The computing device includes a storage unit and an accumulation computing unit coupled to the storage unit. The accumulation computing unit is configured to receive, from a vector processing unit coupled to the computing device, a first warp accumulation instruction, a plurality of first values corresponding to a warp lane number, and a first storage address; generate a current accumulation result based on the plurality of first values in response to the first warp accumulation instruction; and store the current accumulation result in the first storage address in the storage unit to be read by the vector processing unit. In this way, accumulation in a warp may be decoupled to dedicated hardware for processing, and overall accumulation performance may thus be significantly improved.
    Type: Application
    Filed: March 4, 2022
    Publication date: September 8, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: YuFei ZHANG, Zhu LIANG, Min GAO
  • Publication number: 20220206749
    Abstract: A computing device and a method for reusing data are provided. The computing device includes a general register and an arithmetic unit coupled to the general register. The arithmetic unit includes a data reuse unit, which is coupled to multiple dot product data units. The data reuse unit is configured to read from the general register and temporarily store a data set used for multiple convolution operations, and determine multiple data subsets from the data set to be respectively inputted into the multiple dot product data units. Two data subsets inputted into two adjacent dot product data units include a portion of the same data. Each of the multiple dot product data units is configured to perform a dot product operation on the inputted data subset, so as to generate a dot product operation result.
    Type: Application
    Filed: November 11, 2021
    Publication date: June 30, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: YuFei ZHANG, Hao SHU
  • Publication number: 20220158929
    Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.
    Type: Application
    Filed: November 11, 2021
    Publication date: May 19, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: Qin ZHENG, Zhou HONG, YuFei ZHANG, Lin CHEN, ChengKun SUN, Tong SUN, ChengPing LUO, HaiChuan WANG
  • Publication number: 20220156128
    Abstract: The embodiments of the disclosure relate to a computing device, computing equipment, and a programmable scheduling method for data loading and execution, and relate to the field of computers. The computing device is coupled to a first computing core and a first memory. The computing device includes a scratchpad memory, a second computing core, a first hardware queue, a second hardware queue, and a synchronization unit. The second computing core is configured for acceleration in a specific field. The first hardware queue receives a load request from the first computing core. The second hardware queue receives an execution request from the first computing core. The synchronization unit is configured to make the triggering of the load request and the execution request cooperate with each other. In this manner, flexibility, throughput, and overall performance can be enhanced.
    Type: Application
    Filed: November 11, 2021
    Publication date: May 19, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: Zhou HONG, YuFei ZHANG, ChengKun SUN, Lin CHEN
  • Publication number: 20220147354
    Abstract: The embodiments of the disclosure relate to a computing device and a method for loading data. According to the method, the first processing unit sends a first instruction to the NMP unit. The first instruction includes a first address, a plurality of second addresses, and an operation type. In response to the first instruction, the NMP unit performs operations associated with the operation type on multiple data items on the multiple second addresses of the first memory, so as to generate the operation result. The NMP unit stores the operation result to the first address of the first memory. The first processing unit issues a flush instruction to make the operation result on the first address visible to the first processing unit. The first processing unit issues a read instruction to read the operation result on the first address to the first processing unit.
    Type: Application
    Filed: November 10, 2021
    Publication date: May 12, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: Zhou HONG, YuFei ZHANG
  • Publication number: 20220121444
    Abstract: The invention relates to an apparatus for configuring cooperative warps in a vector computing system. The apparatus includes general-purpose registers (GPRs); an arithmetic logical unit (ALU); and a warp instruction scheduler. The warp instruction scheduler is arranged operably to: allow each of a plurality of warps to access data of the whole or a designated portion of the GPRs through the ALU in accordance with a configuration set by software during execution; and complete calculations of each warp through the ALU.
    Type: Application
    Filed: July 2, 2021
    Publication date: April 21, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: Zhou HONG, YuFei ZHANG, ChengKun SUN, Lin CHEN, Hao SHU
  • Publication number: 20220121727
    Abstract: The invention relates to an apparatus for vector computing incorporating matrix multiply and accumulation (MMA) calculation. The apparatus includes a streaming multiprocessor (SM), and a block selector. The register space is divided into physical blocks, each of which includes register groups, and a general matrix multiply (GEMM) calculation unit. The SM includes a general-purpose register (GPR), and the GEMM calculation unit includes an instruction queue and an arithmetic logical unit (ALU). The ALU coupled to the GPR is arranged operably to perform MMA calculation according to a GEMM instruction stored in the instruction queue, and store a calculation result in the GPR.
    Type: Application
    Filed: July 2, 2021
    Publication date: April 21, 2022
    Applicant: Shanghai Biren Technology Co., Ltd
    Inventors: Zhou HONG, YuFei ZHANG
  • Publication number: 20220012053
    Abstract: A method of storing data in general purpose registers (GPRs) includes packing a tile of data items into GPRs, where the tile includes multiple channels. The tile of data items is read from memory. At least two channels of the data are stored in a first GPR, and at least two additional channels are stored in a second GPR. Auxiliary data is loaded into a third GPR. The auxiliary data and the tile data can be used together for performing convolution operations.
    Type: Application
    Filed: September 27, 2021
    Publication date: January 13, 2022
    Inventors: Lin Chen, Zhou Hong, Yufei Zhang
  • Publication number: 20210398339
    Abstract: Methodologies and architectures are provided for inter-thread sharing of data in a general purpose register (GPR) of a multiprocessor apparatus. In described embodiments, such data sharing is performed by a graphics processing unit (GPU) having at least one processing cluster, the at least one processing cluster including a plurality of processing cores (PCs) configured for parallel operation. Each PC of a cluster is configured to utilize a dedicated portion of the GPR. The GPU further includes a shared memory for the cluster, and a memory read/write hub coupled to the GPR and shared memory, the hub including a crossbar switch. A PC executes a move data instruction, the move data instruction including operands referencing a destination portion of the GPR and a source portion assigned to the PC, to retrieve data from the source portion. The memory read/write hub writes the data, via the crossbar switch, to the destination portion of the GPR without first writing the data to the shared memory.
    Type: Application
    Filed: September 1, 2021
    Publication date: December 23, 2021
    Inventors: Zhou HONG, Yufei ZHANG
  • Publication number: 20210272232
    Abstract: The disclosed technology relates to graphics processing units (GPU). In one aspect, a GPU includes a general purpose register (GPR) including registers, an arithmetic logic unit (ALU) reading pixels of an image independently of a shared memory, and a level 1 (L1) cache storing pixels to implement a pixel mapping that maps the pixels read from the L1 cache into the registers of the GPR. The pixel mapping includes separating pixels of an image into three regions, with each region including a set of pixels. A first and second set of the pixels are loaded into registers corresponding to two of the three regions horizontally, and a third set of the pixels are loaded into registers corresponding to the third of the three regions vertically. Each of the registers in the first, second, and third registers are loaded as a contiguous ordered number of registers in the GPR.
    Type: Application
    Filed: May 21, 2021
    Publication date: September 2, 2021
    Inventors: Zhou Hong, Yufei Zhang
  • Publication number: 20210264560
    Abstract: The disclosed technology generally relates to a graphics processing unit (GPU). In one aspect, a GPU includes a general purpose register (GPR) having registers, an arithmetic logic unit (ALU) configured to read pixels of an image independently of a shared memory, and a level 1 (L1) cache storing the pixels read by the ALU. The ALU can implement pixel mapping by fetching a quad of pixels, which includes pixels of first, second, third, and fourth pixel types, from the L1 cache, grouping the pixels of the different pixel types of the quad into four groups based on pixel type, and, for each group, separating the pixels included in the group into three regions that each have a set of pixels. The pixels for each group can then be loaded into the registers corresponding to the three regions.
    Type: Application
    Filed: May 13, 2021
    Publication date: August 26, 2021
    Inventors: Zhou Hong, Yufei Zhang
  • Publication number: 20210136544
    Abstract: A method for acquiring push information can be applied to a terminal device and include: transmitting a device model of the terminal device to a server; receiving a push software development kit corresponding to the device model, and a configuration file corresponding to the push software development kit transmitted by the server; initializing the push software development kit based on the configuration file, and completing a registration with the server; and receiving, through the push software development kit, push information transmitted by the server. The terminal device can therefore acquire its corresponding push software development kit according to the device model of the terminal device through a plug-in method, thereby increasing the push arrival rate and improving the quality of operational data.
    Type: Application
    Filed: July 24, 2020
    Publication date: May 6, 2021
    Applicant: Beijing Xiaomi Intelligent Technology Co., Ltd.
    Inventor: Yufei ZHANG
  • Publication number: 20210123582
    Abstract: Disclosed is a stage lamp lighting device. The stage lamp lighting device includes: a light-emitting device configured to emit monochromatic lights of at least two colors; a light combining device configured to combine the monochromatic lights of the at least two colors into one beam of light; and a light mixing device configured to mix the monochromatic lights of different colors in the beam of light emitted from the light combining device and convert the beam of light into a Gaussian beam. The light combining device and light mixing device are sequentially arranged in a light emission path of the light-emitting device. Because the Gaussian beam has high luminance at its center, the user requirement for center luminance at a specific distance is satisfied.
    Type: Application
    Filed: July 28, 2017
    Publication date: April 29, 2021
    Inventors: Quan ZHANG, Siyuan ZOU, Yufei ZHANG, Yi LI
  • Patent number: 10726516
    Abstract: A GPU comprises: a GPR comprising registers; an L1 cache coupled to the GPR and configured to implement a pixel mapping by: segregating pixels of an image into regions, the regions comprise a first region and a second region, the first region comprises first pixels, and the second region comprises second pixels, loading the first pixels into the GPR in a horizontal manner, and loading the second pixels into the GPR in a vertical manner; and an ALU configured to read the first pixels and the second pixels independently of a shared memory.
    Type: Grant
    Filed: October 11, 2018
    Date of Patent: July 28, 2020
    Assignee: Futurewei Technologies, Inc.
    Inventors: Zhou Hong, Yufei Zhang
  • Publication number: 20200118238
    Abstract: A GPU comprises: a GPR comprising registers; an L1 cache coupled to the GPR and configured to implement a pixel mapping by: segregating pixels of an image into regions, the regions comprise a first region and a second region, the first region comprises first pixels, and the second region comprises second pixels, loading the first pixels into the GPR in a horizontal manner, and loading the second pixels into the GPR in a vertical manner; and an ALU configured to read the first pixels and the second pixels independently of a shared memory.
    Type: Application
    Filed: October 11, 2018
    Publication date: April 16, 2020
    Inventors: Zhou Hong, Yufei Zhang
  • Publication number: 20200038478
    Abstract: Provided is a polypeptide having hypoglycemic and hypolipidemic activities, and the use thereof in the manufacture of a medicament for lowering the blood glucose and/or blood lipid level in a mammal, or for preventing and/or treating diabetes and/or hyperlipidemia in a mammal. Also provided is a composition comprising the polypeptide.
    Type: Application
    Filed: October 10, 2019
    Publication date: February 6, 2020
    Inventor: Yufei ZHANG
  • Patent number: 9574878
    Abstract: A terminal device is described that includes a housing configured to accommodate various components of the terminal device; a first sensing unit configured to collect first status information of the terminal device; a second sensing unit configured to collect second status information of the terminal device; and a processing unit configured to determine a manner that a user holds the terminal device based on the first status information and the second status information.
    Type: Grant
    Filed: July 16, 2013
    Date of Patent: February 21, 2017
    Assignee: LENOVO (BEIJING) CO., LTD.
    Inventors: Qian Zhao, Hanfeng Zheng, Hao Chen, Yufei Zhang, Chenghu Wu, Tao Cheng, Xiaofei Xu, Xiaoming Liu
  • Publication number: 20140013844
    Abstract: A terminal device is described that includes a housing configured to accommodate various components of the terminal device; a first sensing unit configured to collect first status information of the terminal device; a second sensing unit configured to collect second status information of the terminal device; and a processing unit configured to determine a manner that a user holds the terminal device based on the first status information and the second status information.
    Type: Application
    Filed: July 16, 2013
    Publication date: January 16, 2014
    Inventors: Qian Zhao, Hanfeng Zheng, Hao Chen, Yufei Zhang, Chenghu Wu, Tao Cheng, Xiaofei Xu, Xiaoming Liu
  • Patent number: 8077755
    Abstract: The present invention discloses a multi-mode coexistence method for a multi-mode communication device, comprising the steps of: setting priorities of frequency usage for all modes supported by the multi-mode communication device; determining a channel where a signal of a lower priority mode is interfered with by that of a higher priority mode; and performing, by the lower priority mode, frequency hopping to outside said determined channel. The invention allows signals of various modes to coexist in the multi-mode communication device without modifying the existing RF reception system, and thus reduces the system cost and implementation complexity.
    Type: Grant
    Filed: December 7, 2005
    Date of Patent: December 13, 2011
    Assignees: Beijing Lenovo Software Ltd., Lenovo (Beijing) Limited
    Inventors: Gongwei Wu, Wenying Shan, Yufei Zhang, Zheng Wang, Chunmei Pei
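
The three steps named in the abstract of patent 8077755 (set priorities, find the interfered channel, hop the lower-priority mode away from it) can be illustrated with a small sketch. This is not the patented implementation: the `Mode` class, its fields, the "larger number means higher priority" convention, and the simplification that interference means two modes sitting on the same channel index are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    """A hypothetical radio mode; the fields are illustrative, not from the patent."""
    name: str
    priority: int      # assumed convention: larger value = higher priority
    channel: int       # channel the mode is currently using
    allowed: list = field(default_factory=list)  # channels the mode may hop to


def resolve_coexistence(modes):
    """Return a channel per mode so no mode shares a channel with a higher-priority one."""
    occupied = set()
    assignment = {}
    # Step 1: visit modes in order of their frequency-usage priority, highest first.
    for mode in sorted(modes, key=lambda m: m.priority, reverse=True):
        channel = mode.channel
        # Step 2: the channel counts as interfered with if a higher-priority mode holds it.
        if channel in occupied:
            # Step 3: the lower-priority mode hops to a channel outside the occupied set.
            free = [c for c in mode.allowed if c not in occupied]
            if not free:
                raise RuntimeError(f"no interference-free channel for {mode.name}")
            channel = free[0]
        occupied.add(channel)
        assignment[mode.name] = channel
    return assignment


if __name__ == "__main__":
    # Hypothetical mode names and channel numbers.
    modes = [
        Mode("mode_a", priority=2, channel=6, allowed=[1, 6, 11]),
        Mode("mode_b", priority=1, channel=6, allowed=[5, 6, 7]),
    ]
    print(resolve_coexistence(modes))
```

In this sketch the higher-priority mode never moves; only the lower-priority mode changes channel, which mirrors the abstract's point that coexistence is handled by hopping rather than by changing the receive hardware.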