Patents by Inventor Shiliang Zhang

Shiliang Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Sound detection method

Patent number: 12170098

Abstract: The present disclosure discloses a sound detection method. The method includes: obtaining an initial sound signal and a spatial distribution spectrum of the initial sound signal; segmenting the initial sound signal, to obtain a target sound segment, and obtaining a timestamp corresponding to the target sound segment, the target sound segment including a speech of at least one object, and the timestamp being used for indicating a start time of the target sound segment and an end time of the target sound segment; segmenting the spatial distribution spectrum by using the timestamp, to obtain a spatial distribution spectrum segment corresponding to the target sound segment; and inputting the target sound segment and the spatial distribution spectrum segment into a sound detection model, to obtain a first sound detection result, the first sound detection result being used for describing whether sound of multiple objects exists in the initial sound signal.

Type: Grant

Filed: August 26, 2022

Date of Patent: December 17, 2024

Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd.

Inventors: Shiliang Zhang, Siqi Zheng, Weilong Huang
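The timestamp-based segmentation step named in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `slice_by_timestamp`, the frame rate, and the layout of the spectrum (one list of per-direction energies per frame) are assumptions made only for illustration.

```python
from typing import List, Sequence


def slice_by_timestamp(
    frames: Sequence[List[float]],
    start_s: float,
    end_s: float,
    frame_rate_hz: float,
) -> List[List[float]]:
    """Return the frames that fall inside [start_s, end_s).

    `frames` is a hypothetical spatial distribution spectrum: one entry per
    time frame, each entry holding one energy value per direction.
    """
    if end_s < start_s:
        raise ValueError("end time must not precede start time")
    # Convert the timestamp (seconds) to frame indices, clamped to the data.
    first = max(0, int(round(start_s * frame_rate_hz)))
    last = min(len(frames), int(round(end_s * frame_rate_hz)))
    return [list(f) for f in frames[first:last]]


# Assumed example: 10 frames at 10 frames per second, 4 directions per frame.
spectrum = [[float(t)] * 4 for t in range(10)]
segment = slice_by_timestamp(spectrum, start_s=0.2, end_s=0.5, frame_rate_hz=10.0)
```

In this sketch the same start and end times that delimit the target sound segment are reused to cut the spectrum, so the two inputs to a downstream model stay aligned in time.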
Audio information processing method, audio information processing apparatus, electronic device, and storage medium

Patent number: 12154545

Abstract: In various embodiments, this application provides an audio information processing method, an audio information processing apparatus, an electronic device, and a storage medium. An audio information processing method in an embodiment includes: obtaining a first audio feature corresponding to audio information; performing, based on an audio feature at a specified moment in the first audio feature and audio features adjacent to the audio feature at the specified moment, an encoding on the audio feature at the specified moment to obtain a second audio feature corresponding to the audio information; obtaining decoded text information corresponding to the audio information; and obtaining, based on the second audio feature and the decoded text information, text information corresponding to the audio information.

Type: Grant

Filed: January 8, 2021

Date of Patent: November 26, 2024

Assignee: Alibaba Group Holding Limited

Inventors: Shiliang Zhang, Ming Lei

Method and system for processing speech signal

Patent number: 11900958

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Grant

Filed: December 26, 2022

Date of Patent: February 13, 2024

Assignee: Alibaba Group Holding Limited

Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao
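The framing step mentioned in the abstract above (splitting a speech signal into frames using a frame shift) can be sketched generically as follows. This is a common textbook framing routine, not the method claimed in the patent; the 16 kHz sample rate, 25 ms frame length, and 10 ms frame shift are assumed values chosen only for the example.

```python
from typing import List, Sequence


def split_into_frames(
    samples: Sequence[float], frame_len: int, frame_shift: int
) -> List[List[float]]:
    """Cut `samples` into overlapping frames.

    Each frame holds `frame_len` samples and starts `frame_shift` samples
    after the previous one. A trailing partial frame is dropped.
    """
    if frame_len <= 0 or frame_shift <= 0:
        raise ValueError("frame_len and frame_shift must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += frame_shift
    return frames


# Assumed parameters: 16 kHz audio, 25 ms frames, 10 ms shift.
SAMPLE_RATE = 16000
FRAME_LEN = SAMPLE_RATE * 25 // 1000    # 400 samples
FRAME_SHIFT = SAMPLE_RATE * 10 // 1000  # 160 samples

one_second = [0.0] * SAMPLE_RATE
frames = split_into_frames(one_second, FRAME_LEN, FRAME_SHIFT)
```

A smaller frame shift yields more frames per second, and therefore more acoustic feature vectors for the acoustic model to process.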
CONTEXTUAL INSTANCE DECOUPLING-BASED MULTI-PERSON POSE ESTIMATION METHOD AND APPARATUS

Publication number: 20230360256

Abstract: The present application relates to the technical field of deep learning and pose estimation, and more particularly, to a contextual instance decoupling (CID)-based multi-person pose estimation (MPPE) method and apparatus. The method includes: acquiring a preset number of images containing multiple persons; inputting the images containing multiple persons, as a training sample, into a CID-based MPPE model for training; and performing pose estimation on a target image using the trained CID-based MPPE model, the CID-based MPPE model being provided with an instance information abstraction module, a global feature decoupling module and a heatmap estimation module. The method and apparatus of the present application can explore context clues over a greater range, thus being robust to spatial detection errors and superior in both accuracy and efficiency.

Type: Application

Filed: December 27, 2022

Publication date: November 9, 2023

Applicant: Peking University

Inventors: Shiliang ZHANG, Dongkai WANG

METHOD AND SYSTEM FOR PROCESSING SPEECH SIGNAL

Publication number: 20230245672

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Application

Filed: December 26, 2022

Publication date: August 3, 2023

Inventors: Shiliang ZHANG, Ming LEI, Wei LI, Haitao YAO

End-to-end modelling method and system

Patent number: 11651578

Abstract: A method and a system for end-to-end modeling are provided. The method includes: determining a topological structure of a target-based end-to-end model, where the topological structure includes an input layer, an encoding layer, a code enhancement layer, a filtering layer, a decoding layer and an output layer; the code enhancement layer adds information of a target unit to a feature sequence outputted by the encoding layer, and the filtering layer filters the feature sequence to which the information of the target unit has been added; collecting multiple pieces of training data; and training parameters of the target-based end-to-end model by using the multiple pieces of the training data.

Type: Grant

Filed: January 11, 2017

Date of Patent: May 16, 2023

Assignee: IFLYTEK CO., LTD.

Inventors: Jia Pan, Shiliang Zhang, Shifu Xiong, Si Wei, Guoping Hu

SOUND DETECTION METHOD

Publication number: 20230074906

Abstract: The present disclosure discloses a sound detection method.

Type: Application

Filed: August 26, 2022

Publication date: March 9, 2023

Inventors: Shiliang ZHANG, Siqi ZHENG, Weilong HUANG

Streaming End-to-End Speech Recognition Method, Apparatus and Electronic Device

Publication number: 20230064756

Abstract: A method, an apparatus, and an electronic device for streaming end-to-end speech recognition are described. The method includes: extracting and encoding speech acoustic features of a received voice stream in units of frames; performing block processing, and predicting a number of activation points included in a same block that need to be encoded and outputted; and determining position(s) of activation point(s) that need(s) to be decoded and outputted according to a prediction result, so that a decoder performs decoding at the position(s) of the activation point(s) and outputs a recognition result. Through the embodiments of the present disclosure, the robustness of a streaming end-to-end speech recognition system to noise can be improved, thereby improving the performance and the accuracy of the system.

Type: Application

Filed: October 28, 2022

Publication date: March 2, 2023

Inventors: Shiliang ZHANG, Zhifu GAO

Speech Processing method, Speech Encoder, Speech Decoder and Speech Recognition System

Publication number: 20230009633

Abstract: A speech processing method, a speech encoder, a speech decoder, and a speech recognition system are provided. The method includes: obtaining a speech signal to be processed; using a first neural network and a second neural network to process the speech signal to obtain first feature information and second feature information corresponding to the speech signal respectively, wherein a computational efficiency of the first neural network is higher than a computational efficiency of the second neural network, and an accuracy of the second feature information outputted by the second neural network is higher than an accuracy of the first feature information outputted by the first neural network; and determining target feature information used to represent semantics in the speech signal based on the first feature information and the second feature information.

Type: Application

Filed: September 23, 2022

Publication date: January 12, 2023

Inventors: Shiliang ZHANG, Zhifu GAO, Ming LEI

Method and system for processing speech signal

Patent number: 11538488

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Grant

Filed: November 27, 2019

Date of Patent: December 27, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao

METHOD AND SYSTEM FOR SPEECH RECOGNITION

Publication number: 20220028404

Abstract: Embodiments of the disclosure provide a method and system for speech recognition. The method comprises dividing space into a plurality of regions based on preset direction-of-arrival (DOA) angles to allocate a signal source to the plurality of regions, wherein signals in the plurality of regions are enhanced and recognized, the results of which are fused to obtain a recognition result of the signal source.

Type: Application

Filed: February 3, 2020

Publication date: January 27, 2022

Inventors: Shiliang ZHANG, Ming LEI
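The region-allocation idea in the abstract above (dividing space by preset DOA angles and assigning a source to a region) can be illustrated with a small sketch. The four 90-degree regions, the function name `region_index`, and the convention that each region starts at its boundary angle are assumptions for illustration only; the application does not specify them here.

```python
from bisect import bisect_right
from typing import Sequence


def region_index(doa_deg: float, boundaries: Sequence[float]) -> int:
    """Map a direction-of-arrival angle to the index of its region.

    `boundaries` is a sorted list of region start angles in degrees,
    beginning with 0. Region i covers [boundaries[i], boundaries[i + 1]),
    and the last region extends up to 360 degrees.
    """
    if not boundaries or boundaries[0] != 0:
        raise ValueError("boundaries must be non-empty and start at 0")
    angle = doa_deg % 360.0  # wrap negative or >= 360 angles into [0, 360)
    return bisect_right(boundaries, angle) - 1


# Assumed preset angles: four equal regions of 90 degrees each.
PRESET_BOUNDARIES = [0.0, 90.0, 180.0, 270.0]
front_region = region_index(45.0, PRESET_BOUNDARIES)
```

Once each source is assigned to a region, the per-region signals could be enhanced and recognized separately before their results are fused, as the abstract describes.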
Method and system for person re-identification

Patent number: 11182602

Abstract: The present application discloses a method and a system for person re-identification, the method including: inputting a training set to a model-to-be-trained, and determining a single-class label and memory features of each image data in the training set; determining multi-class labels through positive label prediction according to the single-class labels and a memory feature set; determining classification scores according to image features of each image data in the training set and the memory feature set; determining a multi-label classification loss according to the multi-class labels and the classification scores; and updating and training the model-to-be-trained to obtain a re-identification model according to the multi-label classification loss.

Type: Grant

Filed: September 15, 2020

Date of Patent: November 23, 2021

Assignee: Peking University

Inventors: Shiliang Zhang, Dongkai Wang

METHOD AND SYSTEM FOR PERSON RE-IDENTIFICATION

Publication number: 20210319215

Abstract: The present application discloses a method and a system for person re-identification, the method including: inputting a training set to a model-to-be-trained, and determining a single-class label and memory features of each image data in the training set; determining multi-class labels through positive label prediction according to the single-class labels and a memory feature set; determining classification scores according to image features of each image data in the training set and the memory feature set; determining a multi-label classification loss according to the multi-class labels and the classification scores; and updating and training the model-to-be-trained to obtain a re-identification model according to the multi-label classification loss.

Type: Application

Filed: September 15, 2020

Publication date: October 14, 2021

Applicant: Peking University

Inventors: Shiliang ZHANG, Dongkai WANG

Emergency brake control method and device, ECU and vehicle

Patent number: 10829119

Abstract: Disclosed are an emergency brake control method, device, ECU and vehicle. The method includes: when receiving a first trigger signal indicating that a vehicle enters a driving accompanying mode, activating the driving accompanying mode; and in the driving accompanying mode, when receiving an emergency brake command, controlling an Electronic Stability Program (ESP) to decelerate the vehicle at a preset deceleration in the driving accompanying mode, and sending a fuel cut-off request signal to an Engine Management System (EMS) to cut off a torque output of an engine.

Type: Grant

Filed: August 30, 2018

Date of Patent: November 10, 2020

Assignee: Great Wall Motor Company Limited

Inventors: Shiliang Zhang, Shenhong Liu, Fanmao Kong, Xianglu Meng, Libo Wang
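The control flow in the abstract above can be read as a two-state sequence: a trigger signal activates the driving accompanying mode, and an emergency brake command received in that mode produces two requests. The sketch below models only that flow. The class and field names, the deceleration value, and its unit are hypothetical, and no real ESP or EMS interface is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrakeActions:
    """Requests issued in response to an emergency brake command."""
    esp_deceleration: float  # requested deceleration (assumed unit: m/s^2)
    ems_fuel_cutoff: bool    # True when a fuel cut-off request is sent


class EmergencyBrakeController:
    """Hypothetical model of the mode-gated brake logic."""

    def __init__(self, preset_deceleration: float) -> None:
        self.preset_deceleration = preset_deceleration
        self.accompanying_mode = False

    def on_trigger_signal(self) -> None:
        # First trigger signal: enter the driving accompanying mode.
        self.accompanying_mode = True

    def on_emergency_brake_command(self) -> Optional[BrakeActions]:
        # Outside the accompanying mode, this sketch issues no requests.
        if not self.accompanying_mode:
            return None
        return BrakeActions(
            esp_deceleration=self.preset_deceleration,
            ems_fuel_cutoff=True,
        )


controller = EmergencyBrakeController(preset_deceleration=6.0)  # assumed value
ignored = controller.on_emergency_brake_command()
controller.on_trigger_signal()
actions = controller.on_emergency_brake_command()
```

Here `ignored` is `None` because the mode had not yet been activated, while `actions` carries both the deceleration request and the fuel cut-off request.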
EMERGENCY BRAKE CONTROL METHOD AND DEVICE, ECU AND VEHICLE

Publication number: 20200331475

Abstract: Disclosed are an emergency brake control method, device, ECU and vehicle. The method includes: when receiving a first trigger signal indicating that a vehicle enters a driving accompanying mode, activating the driving accompanying mode; and in the driving accompanying mode, when receiving an emergency brake command, controlling an Electronic Stability Program (ESP) to decelerate the vehicle at a preset deceleration in the driving accompanying mode, and sending a fuel cut-off request signal to an Engine Management System (EMS) to cut off a torque output of an engine.

Type: Application

Filed: August 30, 2018

Publication date: October 22, 2020

Applicant: GREAT WALL MOTOR COMPANY LIMITED

Inventors: Shiliang ZHANG, Shenhong LIU, Fanmao KONG, Xianglu MENG, Libo WANG

METHOD AND SYSTEM FOR PROCESSING SPEECH SIGNAL

Publication number: 20200176014

Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.

Type: Application

Filed: November 27, 2019

Publication date: June 4, 2020

Inventors: Shiliang ZHANG, Ming LEI, Wei LI, Haitao YAO

Cervical pillow

Patent number: D941610

Type: Grant

Filed: January 12, 2021

Date of Patent: January 25, 2022

Inventor: Shiliang Zhang

Cervical pillow

Patent number: D959875

Type: Grant

Filed: June 11, 2021

Date of Patent: August 9, 2022

Inventor: Shiliang Zhang

Cervical pillow

Patent number: D959876

Type: Grant

Filed: July 6, 2021

Date of Patent: August 9, 2022

Inventor: Shiliang Zhang

Camera

Patent number: D1072021

Type: Grant

Filed: December 17, 2024

Date of Patent: April 22, 2025

Inventor: Shiliang Zhang
