Patents by Inventor Yufei Zhang
Yufei Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220295080
Abstract: The present disclosure relates to a computing method, a computing device, and a computer-readable storage medium. The method includes: determining a pixel block set in a cache, a first pixel block in the pixel block set comprising an m×n pixel matrix having a first padding setting related to the original pixel data, m and n being positive integers; and storing the determined pixel block set in a buffer to enable a second pixel block to be read from the buffer based on the initial buffer address of the first pixel block and an address offset associated with the second pixel block, wherein the second pixel block has a second padding setting related to the original pixel data, and the first padding setting and the second padding setting have the same offset amount in a first direction relative to the original pixel data.
Type: Application
Filed: March 10, 2022
Publication date: September 15, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: YuFei ZHANG, Zhou HONG
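To make the addressing idea concrete, here is a minimal Python sketch, assuming pixel blocks are stored back to back in a flat buffer so that a second block can be located from the first block's base address plus an address offset. The block size, layout, and all function names are illustrative assumptions, not the patented implementation.

```python
# Illustrative sketch (not the patented implementation): m x n pixel blocks are
# stored contiguously in a flat buffer, so a second block can be read from the
# first block's base address plus a simple address offset.

M, N = 4, 4                      # block dimensions (m x n), chosen arbitrarily
BLOCK_SIZE = M * N               # elements occupied by one block in the buffer

def store_block_set(buffer, blocks, base_addr):
    """Store a set of m x n blocks contiguously starting at base_addr."""
    addr = base_addr
    for block in blocks:
        for value in block:
            buffer[addr] = value
            addr += 1
    return base_addr

def read_block(buffer, first_block_addr, block_index):
    """Read the block located block_index slots after the first block."""
    offset = block_index * BLOCK_SIZE          # address offset of the second block
    start = first_block_addr + offset
    return buffer[start:start + BLOCK_SIZE]

if __name__ == "__main__":
    buffer = [0] * 256
    blocks = [[b * 100 + i for i in range(BLOCK_SIZE)] for b in range(3)]
    base = store_block_set(buffer, blocks, base_addr=16)
    # The "second" block is recovered from the first block's address plus an offset.
    assert read_block(buffer, base, 1) == blocks[1]
```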
-
Publication number: 20220284075
Abstract: The embodiments of the disclosure relate to a computing device, a computing apparatus, and a method of warp accumulation and relate to the field of computers. The computing device includes a storage unit and an accumulation computing unit coupled to the storage unit. The accumulation computing unit is configured to receive, from a vector processing unit coupled to the computing device, a first warp accumulation instruction, a plurality of first values corresponding to a warp lane number, and a first storage address; generate a current accumulation result based on the plurality of first values in response to the first warp accumulation instruction; and store the current accumulation result in the first storage address in the storage unit to be read by the vector processing unit. In this way, accumulation in a warp may be decoupled to dedicated hardware for processing, and overall accumulation performance may thus be significantly improved.
Type: Application
Filed: March 4, 2022
Publication date: September 8, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: YuFei ZHANG, Zhu LIANG, Min GAO
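A toy software model of the decoupling described above, assuming a warp is a fixed group of lanes and that the accumulation unit simply sums the per-lane values and parks the result at a storage address for the vector unit to read back; the class and field names are invented for illustration.

```python
# Toy model (names invented for illustration): a dedicated accumulation unit
# receives a warp-accumulate instruction, the per-lane values, and a storage
# address, sums the values, and stores the result where the vector unit can
# later read it, instead of the vector unit reducing across lanes itself.

class AccumulationUnit:
    def __init__(self, storage_size=64):
        self.storage = [0] * storage_size          # storage unit inside the device

    def warp_accumulate(self, lane_values, storage_addr):
        result = sum(lane_values)                  # accumulate across the warp
        self.storage[storage_addr] = result        # store for the vector unit
        return result

class VectorProcessingUnit:
    def __init__(self, accumulation_unit):
        self.acc = accumulation_unit

    def issue_warp_accumulation(self, lane_values, storage_addr):
        # Offload the reduction to the accumulation unit ...
        self.acc.warp_accumulate(lane_values, storage_addr)
        # ... and read the result back later from the agreed address.
        return self.acc.storage[storage_addr]

if __name__ == "__main__":
    vpu = VectorProcessingUnit(AccumulationUnit())
    warp = list(range(32))                         # 32 lanes, one value per lane
    assert vpu.issue_warp_accumulation(warp, storage_addr=0) == sum(range(32))
```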
-
Publication number: 20220206749
Abstract: A computing device and a method for reusing data are provided. The computing device includes a general register and an arithmetic unit coupled to the general register. The arithmetic unit includes a data reuse unit, which is coupled to multiple dot product data units. The data reuse unit is configured to read from the general register and temporarily store a data set used for multiple convolution operations, and determine multiple data subsets from the data set to be respectively inputted into the multiple dot product data units. Two data subsets inputted into two adjacent dot product data units include a portion of the same data. Each of the multiple dot product data units is configured to perform a dot product operation on the inputted data subset, so as to generate a dot product operation result.
Type: Application
Filed: November 11, 2021
Publication date: June 30, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: YuFei ZHANG, Hao SHU
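A minimal sketch of the reuse pattern, assuming one row of input is read once and then handed to several dot-product units as overlapping windows so adjacent units share part of the same data; the window size, stride of one, and function names are assumptions for the example.

```python
# Illustrative sketch: a data-reuse stage reads one row of input once, then
# hands overlapping windows of it to several dot-product units, so adjacent
# units share part of the same data instead of each re-reading it.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def convolve_with_reuse(row, kernel, num_units):
    window = len(kernel)
    # One read of `row` is reused: adjacent windows overlap by window - 1 values.
    subsets = [row[i:i + window] for i in range(num_units)]
    return [dot(subset, kernel) for subset in subsets]

if __name__ == "__main__":
    row = [1, 2, 3, 4, 5, 6]
    kernel = [1, 0, -1]
    # Units 0 and 1 both consume row[1:3]; units 1 and 2 both consume row[2:4].
    print(convolve_with_reuse(row, kernel, num_units=4))   # [-2, -2, -2, -2]
```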
-
Publication number: 20220158929
Abstract: An information processing method, an interconnection device, and a computer-readable storage medium are provided. The interconnection device includes a request processing module configured for: receiving a data access request from at least one processor, wherein the data access request comprises a merge bit, a multicast group identifier (MGID), and a multicast transaction identifier (MTID); determining whether the data access request is a multicast request; determining whether the interconnection device receives other multicast requests if it is determined that the data access request is a multicast request based on the MGID, the MTID, and a static routing policy of a multicast group; and obtaining the other multicast requests if it is determined that the interconnection device receives the other multicast requests, merging the multicast request with the other multicast requests into a merged request, and forwarding the merged request to a next-hop device of the interconnection device.
Type: Application
Filed: November 11, 2021
Publication date: May 19, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Qin ZHENG, Zhou HONG, YuFei ZHANG, Lin CHEN, ChengKun SUN, Tong SUN, ChengPing LUO, HaiChuan WANG
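A simplified Python sketch of the merging step, assuming requests that carry the same MGID and MTID and have the merge bit set are combined into one request before forwarding; the request fields, the collect-then-forward order, and the absence of a real routing policy are simplifications for illustration.

```python
# Illustrative sketch: requests with the same multicast group ID (MGID) and
# multicast transaction ID (MTID) and with the merge bit set are combined into
# one request before being forwarded to the next-hop device.

from collections import defaultdict

def merge_and_forward(requests, forward):
    pending = defaultdict(list)                 # (MGID, MTID) -> waiting requests
    for req in requests:
        if not req.get("merge"):
            forward(req)                        # non-mergeable: pass through as-is
            continue
        key = (req["mgid"], req["mtid"])
        pending[key].append(req)
    for (mgid, mtid), group in pending.items():
        merged = {
            "merge": True,
            "mgid": mgid,
            "mtid": mtid,
            "sources": [r["source"] for r in group],   # remember who asked
        }
        forward(merged)                          # one request instead of many

if __name__ == "__main__":
    reqs = [
        {"merge": True, "mgid": 7, "mtid": 1, "source": "cpu0"},
        {"merge": True, "mgid": 7, "mtid": 1, "source": "cpu1"},
        {"merge": False, "source": "cpu2"},
    ]
    merge_and_forward(reqs, forward=print)
```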
-
Publication number: 20220156128
Abstract: The embodiments of the disclosure relate to a computing device, a computing equipment, and a programmable scheduling method for data loading and execution, and relate to the field of computers. The computing device is coupled to a first computing core and a first memory. The computing device includes a scratchpad memory, a second computing core, a first hardware queue, a second hardware queue, and a synchronization unit. The second computing core is configured for acceleration in a specific field. The first hardware queue receives a load request from the first computing core. The second hardware queue receives an execution request from the first computing core. The synchronization unit is configured to coordinate the triggering of the load request and the execution request. In this manner, flexibility, throughput, and overall performance can be enhanced.
Type: Application
Filed: November 11, 2021
Publication date: May 19, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Zhou HONG, YuFei ZHANG, ChengKun SUN, Lin CHEN
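A small Python model of the two-queue arrangement, assuming each execution request is paired with its load request by a tag and that the synchronization state is just a set of completed loads; the queue names, tags, and scratchpad dictionary are assumptions for the example.

```python
# Illustrative sketch: one queue receives load (data-movement) requests and
# another receives execution requests; a synchronization check makes sure an
# execution request only fires after the load it depends on has completed.

from collections import deque

class Scheduler:
    def __init__(self):
        self.load_queue = deque()     # requests to move data into the scratchpad
        self.exec_queue = deque()     # requests to run an operation on that data
        self.completed_loads = set()  # synchronization state
        self.scratchpad = {}

    def submit_load(self, tag, data):
        self.load_queue.append((tag, data))

    def submit_exec(self, tag, op):
        self.exec_queue.append((tag, op))

    def step(self):
        ran = None
        if self.exec_queue and self.exec_queue[0][0] in self.completed_loads:
            tag, op = self.exec_queue.popleft()   # only runs once its load is done
            ran = op(self.scratchpad[tag])
        if self.load_queue:
            tag, data = self.load_queue.popleft()
            self.scratchpad[tag] = data           # load into scratchpad memory
            self.completed_loads.add(tag)         # signal the synchronization unit
        return ran

if __name__ == "__main__":
    s = Scheduler()
    s.submit_exec("tile0", lambda xs: sum(xs))    # arrives before its data
    s.submit_load("tile0", [1, 2, 3])
    print([s.step() for _ in range(2)])           # [None, 6]
```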
-
Publication number: 20220147354
Abstract: The embodiments of the disclosure relate to a computing device and a method for loading data. According to the method, the first processing unit sends a first instruction to the NMP unit. The first instruction includes a first address, a plurality of second addresses, and an operation type. In response to the first instruction, the NMP unit performs operations associated with the operation type on multiple data items on the multiple second addresses of the first memory, so as to generate the operation result. The NMP unit stores the operation result to the first address of the first memory. The first processing unit issues a flush instruction to make the operation result on the first address visible to the first processing unit. The first processing unit issues a read instruction to read the operation result on the first address to the first processing unit.
Type: Application
Filed: November 10, 2021
Publication date: May 12, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Zhou HONG, YuFei ZHANG
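A minimal sketch of this near-memory-processing (NMP) flow, assuming the operation type is a simple reduction, the "flush" is modeled by dropping a cached copy, and memory is a plain dictionary; all structures are simplified stand-ins, not the patented hardware.

```python
# Illustrative sketch: the processing unit sends one instruction naming a
# destination address, several source addresses, and an operation; the NMP
# unit computes next to the memory and writes the result back; a flush then
# makes the result visible before the processing unit reads it.

class NMPUnit:
    def __init__(self, memory):
        self.memory = memory

    def execute(self, dest_addr, src_addrs, op_type):
        values = [self.memory[a] for a in src_addrs]   # operate near the memory
        result = sum(values) if op_type == "add" else max(values)
        self.memory[dest_addr] = result                # write the result in place

class ProcessingUnit:
    def __init__(self, memory, nmp):
        self.memory = memory
        self.nmp = nmp
        self.cache = {}                                # stale view until flushed

    def load_with_nmp(self, dest_addr, src_addrs, op_type):
        self.nmp.execute(dest_addr, src_addrs, op_type)   # 1. send NMP instruction
        self.cache.pop(dest_addr, None)                   # 2. flush: drop stale copy
        self.cache[dest_addr] = self.memory[dest_addr]    # 3. read the fresh result
        return self.cache[dest_addr]

if __name__ == "__main__":
    memory = {0: 0, 10: 3, 11: 4, 12: 5}
    pu = ProcessingUnit(memory, NMPUnit(memory))
    print(pu.load_with_nmp(dest_addr=0, src_addrs=[10, 11, 12], op_type="add"))  # 12
```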
-
Publication number: 20220121444
Abstract: The invention relates to an apparatus for configuring cooperative warps in a vector computing system. The apparatus includes general-purpose registers (GPRs), an arithmetic logical unit (ALU), and a warp instruction scheduler. The warp instruction scheduler is arranged operably to: allow each of a plurality of warps to access data of the whole or a designated portion of the GPRs through the ALU in accordance with a software configuration when being executed; and complete calculations of each warp through the ALU.
Type: Application
Filed: July 2, 2021
Publication date: April 21, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Zhou HONG, YuFei ZHANG, ChengKun SUN, Lin CHEN, Hao SHU
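A rough Python sketch of the configuration idea, assuming the software configuration is a per-warp (start, length) window into one register file; the slicing scheme and class names are assumptions made for the example, not the patented design.

```python
# Illustrative sketch: a software-provided configuration tells the scheduler
# which slice of the general-purpose register file each warp may touch, so
# cooperating warps can be given the whole file or only a designated portion.

class WarpScheduler:
    def __init__(self, num_registers=64):
        self.gpr = [0] * num_registers
        self.config = {}                    # warp id -> (start, length) it may access

    def configure(self, warp_id, start, length):
        self.config[warp_id] = (start, length)

    def write(self, warp_id, offset, value):
        start, length = self.config[warp_id]
        if offset >= length:
            raise IndexError("warp %d has no access to that register" % warp_id)
        self.gpr[start + offset] = value    # goes through the warp's designated window

    def read(self, warp_id, offset):
        start, length = self.config[warp_id]
        if offset >= length:
            raise IndexError("warp %d has no access to that register" % warp_id)
        return self.gpr[start + offset]

if __name__ == "__main__":
    sched = WarpScheduler()
    sched.configure(warp_id=0, start=0, length=64)    # whole GPR file
    sched.configure(warp_id=1, start=32, length=16)   # a designated portion
    sched.write(1, 0, 42)
    print(sched.read(0, 32))                           # 42: warp 0 sees warp 1's data
```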
-
Publication number: 20220121727
Abstract: The invention relates to an apparatus for vector computing that incorporates matrix multiply and accumulation (MMA) calculation. The apparatus includes a streaming multiprocessor (SM), a block selector, and a general matrix multiply (GEMM) calculation unit. The register space is divided into physical blocks, each of which includes register groups. The SM includes a general-purpose register (GPR), and the GEMM calculation unit includes an instruction queue and an arithmetic logical unit (ALU). The ALU coupled to the GPR is arranged operably to perform MMA calculation according to a GEMM instruction stored in the instruction queue, and store a calculation result in the GPR.
Type: Application
Filed: July 2, 2021
Publication date: April 21, 2022
Applicant: Shanghai Biren Technology Co., Ltd
Inventors: Zhou HONG, YuFei ZHANG
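A compact Python sketch of the queue-plus-ALU flow, assuming the GPR is modeled as named slots each holding a small matrix and that an instruction names the two operand slots and the accumulator slot; the slot layout and helper names are illustrative assumptions.

```python
# Illustrative sketch: GEMM instructions sit in a queue; the ALU pops one,
# reads its operand tiles from the general-purpose registers, performs the
# multiply-accumulate, and writes the result tile back into the GPR.

from collections import deque

def matmul_acc(a, b, c):
    """c += a @ b for small row-major lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    for i in range(rows):
        for j in range(cols):
            c[i][j] += sum(a[i][k] * b[k][j] for k in range(inner))
    return c

class GEMMUnit:
    def __init__(self, gpr):
        self.gpr = gpr                       # shared with the streaming multiprocessor
        self.instruction_queue = deque()

    def enqueue(self, a_slot, b_slot, c_slot):
        self.instruction_queue.append((a_slot, b_slot, c_slot))

    def step(self):
        a_slot, b_slot, c_slot = self.instruction_queue.popleft()
        self.gpr[c_slot] = matmul_acc(self.gpr[a_slot], self.gpr[b_slot], self.gpr[c_slot])

if __name__ == "__main__":
    gpr = {"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]], "C": [[0, 0], [0, 0]]}
    unit = GEMMUnit(gpr)
    unit.enqueue("A", "B", "C")
    unit.step()
    print(gpr["C"])                          # [[19, 22], [43, 50]]
```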
-
Publication number: 20220012053
Abstract: A method of storing data in general purpose registers (GPRs) includes packing a tile of data items into GPRs, where the tile includes multiple channels. The tile of data items is read from memory. At least two channels of the data are stored in a first GPR, and at least two additional channels are stored in a second GPR. Auxiliary data is loaded into a third GPR. The auxiliary data and the tile data can be used together for performing convolution operations.
Type: Application
Filed: September 27, 2021
Publication date: January 13, 2022
Inventors: Lin Chen, Zhou Hong, Yufei Zhang
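A small sketch of the packing scheme, assuming four channels, two packed per register, with auxiliary data (for example, convolution weights) in a third register; the channel counts, packing order, and names are arbitrary choices for the example.

```python
# Illustrative sketch: a tile read from memory carries several channels; two
# channels share the first register, two more share the second, and auxiliary
# data goes into a third register so both can be used together for convolution.

def pack_tile_into_gprs(tile_channels, auxiliary):
    """tile_channels: list of at least four per-channel value lists."""
    gpr0 = tile_channels[0] + tile_channels[1]   # two channels share one GPR
    gpr1 = tile_channels[2] + tile_channels[3]   # two more channels in a second GPR
    gpr2 = list(auxiliary)                       # auxiliary data in a third GPR
    return gpr0, gpr1, gpr2

if __name__ == "__main__":
    channels = [[c * 10 + i for i in range(4)] for c in range(4)]
    weights = [1, 0, -1]
    g0, g1, g2 = pack_tile_into_gprs(channels, weights)
    print(g0)   # channels 0 and 1 packed together
    print(g2)   # auxiliary data, usable alongside the tile data
```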
-
Publication number: 20210398339
Abstract: Methodologies and architectures are provided for inter-thread sharing of data in a general purpose register (GPR) of a multiprocessor apparatus. In described embodiments, such data sharing is performed by a graphics processing unit (GPU) having at least one processing cluster, the at least one processing cluster including a plurality of processing cores (PCs) configured for parallel operation. Each PC of a cluster is configured to utilize a dedicated portion of the GPR. The GPU further includes a shared memory for the cluster, and a memory read/write hub coupled to the GPR and shared memory, the hub including a crossbar switch. A PC executes a move data instruction, the move data instruction including operands referencing a destination portion of the GPR and a source portion assigned to the PC, to retrieve data from the source portion. The memory read/write hub writes the data, via the crossbar switch, to the destination portion of the GPR without first writing the data to the shared memory.
Type: Application
Filed: September 1, 2021
Publication date: December 23, 2021
Inventors: Zhou HONG, Yufei ZHANG
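A simplified sketch of the move-data idea, assuming each core owns a fixed-size slice of one register file and the hub copies a whole slice at once; the slice sizes, hub interface, and class names are assumptions for the example, not the described hardware.

```python
# Illustrative sketch: each processing core owns a portion of the register
# file; a "move" names a source portion and a destination portion, and a
# crossbar-style hub copies the data register-to-register without staging it
# in shared memory.

class RegisterHub:
    """Plays the role of the memory read/write hub with a crossbar switch."""

    def __init__(self, num_cores, regs_per_core):
        self.regs_per_core = regs_per_core
        self.gpr = [0] * (num_cores * regs_per_core)   # one GPR, partitioned per core

    def slice_of(self, core_id):
        start = core_id * self.regs_per_core
        return start, start + self.regs_per_core

    def move(self, src_core, dst_core):
        """Copy the source core's portion directly into the destination portion."""
        s0, s1 = self.slice_of(src_core)
        d0, d1 = self.slice_of(dst_core)
        self.gpr[d0:d1] = self.gpr[s0:s1]              # no shared-memory round trip

if __name__ == "__main__":
    hub = RegisterHub(num_cores=4, regs_per_core=8)
    s0, s1 = hub.slice_of(0)
    hub.gpr[s0:s1] = list(range(8))      # core 0 produces data in its own registers
    hub.move(src_core=0, dst_core=2)     # share it with core 2's registers directly
    d0, d1 = hub.slice_of(2)
    print(hub.gpr[d0:d1])                # [0, 1, 2, 3, 4, 5, 6, 7]
```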
-
Publication number: 20210272232
Abstract: The disclosed technology relates to graphics processing units (GPU). In one aspect, a GPU includes a general purpose register (GPR) including registers, an arithmetic logic unit (ALU) reading pixels of an image independently of a shared memory, and a level 1 (L1) cache storing pixels to implement a pixel mapping that maps the pixels read from the L1 cache into the registers of the GPR. The pixel mapping includes separating pixels of an image into three regions, with each region including a set of pixels. A first and second set of the pixels are loaded into registers corresponding to two of the three regions horizontally, and a third set of the pixels are loaded into registers corresponding to the third of the three regions vertically. The registers corresponding to the first, second, and third regions are each loaded as a contiguous, ordered run of registers in the GPR.
Type: Application
Filed: May 21, 2021
Publication date: September 2, 2021
Inventors: Zhou Hong, Yufei Zhang
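A sketch of the three-region mapping, assuming the image is split at a column boundary, the first two regions are walked row by row (horizontally), the third column by column (vertically), and each region lands in a contiguous run of registers; the split points and names are example choices only.

```python
# Illustrative sketch: split the image into three regions, walk two of them
# horizontally and one vertically, and append each region's pixels to the
# register list so every region occupies a contiguous, ordered run.

def map_pixels_to_registers(image, split_col):
    height = len(image)
    regions = (
        [image[r][c] for r in range(height // 2) for c in range(split_col)],           # region 1, horizontal
        [image[r][c] for r in range(height // 2, height) for c in range(split_col)],   # region 2, horizontal
        [image[r][c] for c in range(split_col, len(image[0])) for r in range(height)], # region 3, vertical
    )
    registers = []
    for region in regions:
        registers.extend(region)          # each region occupies contiguous registers
    return registers

if __name__ == "__main__":
    image = [[10 * r + c for c in range(4)] for r in range(4)]
    print(map_pixels_to_registers(image, split_col=2))
```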
-
Publication number: 20210264560
Abstract: The disclosed technology generally relates to a graphics processing unit (GPU). In one aspect, a GPU includes a general purpose register (GPR) having registers, an arithmetic logic unit (ALU) configured to read pixels of an image independently of a shared memory, and a level 1 (L1) cache storing the pixels read by the ALU. The ALU can implement pixel mapping by fetching a quad of pixels, which includes pixels of first, second, third, and fourth pixel types, from the L1 cache, grouping the pixels of the different pixel types of the quad into four groups based on pixel type, and, for each group, separating the pixels included in the group into three regions that each have a set of pixels. The pixels for each group can then be loaded into the registers corresponding to the three regions.
Type: Application
Filed: May 13, 2021
Publication date: August 26, 2021
Inventors: Zhou Hong, Yufei Zhang
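A small sketch of the quad-grouping step, assuming each quad is a tuple of four pixel types and that the per-group region split is just a pair of cut points; the type labels and split values are example choices, not the described mapping.

```python
# Illustrative sketch: each 2x2 quad carries one pixel of each of four types;
# fetched quads are regrouped so all pixels of the same type land together,
# and each group can then be split into three regions for register loading.

def group_quads_by_type(quads):
    """quads: list of (top_left, top_right, bottom_left, bottom_right) tuples."""
    groups = {"tl": [], "tr": [], "bl": [], "br": []}
    for tl, tr, bl, br in quads:                 # one pixel of each type per quad
        groups["tl"].append(tl)
        groups["tr"].append(tr)
        groups["bl"].append(bl)
        groups["br"].append(br)
    return groups

def split_into_regions(group, first, second):
    """Separate one group's pixels into three regions for register loading."""
    return group[:first], group[first:second], group[second:]

if __name__ == "__main__":
    quads = [(i, i + 1, i + 2, i + 3) for i in range(0, 24, 4)]
    groups = group_quads_by_type(quads)
    print(groups["tl"])                          # all top-left pixels together
    print(split_into_regions(groups["tl"], first=2, second=4))
```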
-
Publication number: 20210136544
Abstract: A method for acquiring push information can be applied to a terminal device and includes: transmitting a device model of the terminal device to a server; receiving, from the server, a push software development kit corresponding to the device model and a configuration file corresponding to the push software development kit; initializing the push software development kit based on the configuration file, and completing a registration with the server; and receiving, through the push software development kit, push information transmitted by the server. The terminal device can therefore acquire its corresponding push software development kit according to its device model through a plug-in method, thereby increasing the push arrival rate and improving the quality of operational data.
Type: Application
Filed: July 24, 2020
Publication date: May 6, 2021
Applicant: Beijing Xiaomi Intelligent Technology Co., Ltd.
Inventor: Yufei ZHANG
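A rough sketch of the client-side flow described above: report the device model, receive an SDK identifier plus its configuration, initialize, register, and then receive pushes. The server response, SDK object, and every name here are mocked for illustration; none of them come from an actual vendor API.

```python
# Illustrative sketch (all names mocked): acquire and initialize a push SDK
# matched to the device model, then register so pushes can be received.

class MockServer:
    def get_sdk_for(self, device_model):
        # In the described scheme, the server picks the SDK that matches the model.
        return {"sdk_name": "push_sdk_for_" + device_model, "config": {"app_key": "demo"}}

class PushSDK:
    def __init__(self, name):
        self.name = name
        self.ready = False

    def initialize(self, config):
        self.config = config
        self.ready = True

    def register(self, server):
        return self.ready                       # registration only after init

def setup_push(device_model, server):
    reply = server.get_sdk_for(device_model)    # 1. send model, 2. receive SDK + config
    sdk = PushSDK(reply["sdk_name"])
    sdk.initialize(reply["config"])             # 3. initialize from the config file
    registered = sdk.register(server)           # 4. complete registration with the server
    return sdk, registered

if __name__ == "__main__":
    sdk, ok = setup_push("model_x", MockServer())
    print(sdk.name, ok)
```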
-
Publication number: 20210123582
Abstract: Disclosed is a stage lamp lighting device. The stage lamp lighting device includes: a light-emitting device configured to emit monochromatic lights of at least two colors; a light combining device configured to combine the monochromatic lights of the at least two colors into one beam of light; and a light mixing device configured to mix the monochromatic lights of different colors in the beam of light emitted from the light combining device and convert the beam of light into a Gaussian beam. The light combining device and the light mixing device are sequentially arranged in a light emission path of the light-emitting device. As the Gaussian beam has high luminance at its center, the user requirement for center luminance at a specific distance is satisfied.
Type: Application
Filed: July 28, 2017
Publication date: April 29, 2021
Inventors: Quan ZHANG, Siyuan ZOU, Yufei ZHANG, Yi LI
-
Patent number: 10726516
Abstract: A GPU comprises: a GPR comprising registers; an L1 cache coupled to the GPR and configured to implement a pixel mapping by: segregating pixels of an image into regions, the regions comprise a first region and a second region, the first region comprises first pixels, and the second region comprises second pixels, loading the first pixels into the GPR in a horizontal manner, and loading the second pixels into the GPR in a vertical manner; and an ALU configured to read the first pixels and the second pixels independently of a shared memory.
Type: Grant
Filed: October 11, 2018
Date of Patent: July 28, 2020
Assignee: Futurewei Technologies, Inc.
Inventors: Zhou Hong, Yufei Zhang
-
Publication number: 20200118238
Abstract: A GPU comprises: a GPR comprising registers; an L1 cache coupled to the GPR and configured to implement a pixel mapping by: segregating pixels of an image into regions, the regions comprise a first region and a second region, the first region comprises first pixels, and the second region comprises second pixels, loading the first pixels into the GPR in a horizontal manner, and loading the second pixels into the GPR in a vertical manner; and an ALU configured to read the first pixels and the second pixels independently of a shared memory.
Type: Application
Filed: October 11, 2018
Publication date: April 16, 2020
Inventors: Zhou Hong, Yufei Zhang
-
Publication number: 20200038478
Abstract: Provided is a polypeptide having hypoglycemic and hypolipidemic activities, and the use thereof in the manufacture of a medicament for lowering the blood glucose and/or blood lipid level in a mammal, or for preventing and/or treating diabetes and/or hyperlipidemia in a mammal. Also provided is a composition comprising the polypeptide.
Type: Application
Filed: October 10, 2019
Publication date: February 6, 2020
Inventor: Yufei ZHANG
-
Patent number: 9574878
Abstract: A terminal device is described that includes a housing configured to accommodate various components of the terminal device; a first sensing unit configured to collect first status information of the terminal device; a second sensing unit configured to collect second status information of the terminal device; and a processing unit configured to determine a manner in which a user holds the terminal device based on the first status information and the second status information.
Type: Grant
Filed: July 16, 2013
Date of Patent: February 21, 2017
Assignee: LENOVO (BEIJING) CO., LTD.
Inventors: Qian Zhao, Hanfeng Zheng, Hao Chen, Yufei Zhang, Chenghu Wu, Tao Cheng, Xiaofei Xu, Xiaoming Liu
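A very rough sketch of combining two sensor readings to guess the hold manner, in the spirit of the two-sensing-unit scheme above; the sensor semantics, thresholds, and labels are entirely invented for illustration and are not taken from the patent.

```python
# Illustrative sketch (sensor meanings and thresholds invented): combine two
# status readings to classify how the device is being held.

def detect_hold_manner(first_status, second_status):
    """first_status: (left, right) grip pressure; second_status: tilt in degrees."""
    left_pressure, right_pressure = first_status
    tilt_degrees = second_status
    orientation = "landscape" if abs(tilt_degrees) > 60 else "portrait"
    hand = "left hand" if left_pressure > right_pressure else "right hand"
    return f"{orientation}, {hand}"

if __name__ == "__main__":
    print(detect_hold_manner(first_status=(0.8, 0.2), second_status=10))  # portrait, left hand
```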
-
Publication number: 20140013844
Abstract: A terminal device is described that includes a housing configured to accommodate various components of the terminal device; a first sensing unit configured to collect first status information of the terminal device; a second sensing unit configured to collect second status information of the terminal device; and a processing unit configured to determine a manner in which a user holds the terminal device based on the first status information and the second status information.
Type: Application
Filed: July 16, 2013
Publication date: January 16, 2014
Inventors: Qian Zhao, Hanfeng Zheng, Hao Chen, Yufei Zhang, Chenghu Wu, Tao Cheng, Xiaofei Xu, Xiaoming Liu
-
Patent number: 8077755
Abstract: The present invention discloses a multi-mode coexistence method for a multi-mode communication device, comprising the steps of: setting priorities of frequency usage for all modes supported by the multi-mode communication device; determining a channel where a signal of a lower priority mode is interfered with by that of a higher priority mode; and performing, by the lower priority mode, frequency hopping to outside the determined channel. The invention allows signals of various modes to coexist in the multi-mode communication device without modifying the existing RF reception system, and thus reduces the system cost and implementation complexity.
Type: Grant
Filed: December 7, 2005
Date of Patent: December 13, 2011
Assignees: Beijing Lenovo Software Ltd., Lenovo (Beijing) Limited
Inventors: Gongwei Wu, Wenying Shan, Yufei Zhang, Zheng Wang, Chunmei Pei
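A simplified Python sketch of the coexistence policy: assign each mode a priority, find the channels where a lower-priority mode would be interfered with by a higher-priority mode, and hop the lower-priority mode to channels outside that set. The channel numbers, data layout, and interference test are simplified assumptions for the example.

```python
# Illustrative sketch: higher-priority modes keep their channels; a
# lower-priority mode whose channels collide with them hops to free channels
# outside the interfered set.

def plan_frequency_hopping(modes, all_channels):
    """modes: list of (name, priority, channels_in_use); higher number = higher priority."""
    assignments = {}
    occupied_by_higher = set()
    for name, priority, channels in sorted(modes, key=lambda m: -m[1]):
        interfered = [ch for ch in channels if ch in occupied_by_higher]
        if interfered:
            free = [ch for ch in all_channels if ch not in occupied_by_higher]
            channels = free[:len(channels)]          # hop outside the interfered channels
        assignments[name] = channels
        occupied_by_higher.update(channels)          # these now constrain lower priorities
    return assignments

if __name__ == "__main__":
    modes = [("bluetooth", 1, [3, 4]), ("wlan", 2, [3, 4, 5])]
    print(plan_frequency_hopping(modes, all_channels=list(range(1, 12))))
    # wlan (higher priority) keeps channels 3-5; bluetooth hops to channels outside them.
```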