Patents by Inventor Shiliang Zhang

Shiliang Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12170098
    Abstract: The present disclosure discloses a sound detection method. The method includes: obtaining an initial sound signal and a spatial distribution spectrum of the initial sound signal; segmenting the initial sound signal, to obtain a target sound segment, and obtaining a timestamp corresponding to the target sound segment, the target sound segment including a speech of at least one object, and the timestamp being used for indicating a start time of the target sound segment and an end time of the target sound segment; segmenting the spatial distribution spectrum by using the timestamp, to obtain a spatial distribution spectrum segment corresponding to the target sound segment; and inputting the target sound segment and the spatial distribution spectrum segment into a sound detection model, to obtain a first sound detection result, the first sound detection result being used for describing whether sound of multiple objects exists in the initial sound signal.
    Type: Grant
    Filed: August 26, 2022
    Date of Patent: December 17, 2024
    Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd.
    Inventors: Shiliang Zhang, Siqi Zheng, Weilong Huang
  • Patent number: 12154545
    Abstract: In various embodiments, this application provides an audio information processing method, an audio information processing apparatus, an electronic device, and a storage medium. An audio information processing method in an embodiment includes: obtaining a first audio feature corresponding to audio information; performing, based on an audio feature at a specified moment in the first audio feature and audio features adjacent to the audio feature at the specified moment, an encoding on the audio feature at the specified moment to obtain a second audio feature corresponding to the audio information; obtaining decoded text information corresponding to the audio information; and obtaining, based on the second audio features and the decoded text information, text information corresponding to the audio information.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: November 26, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Shiliang Zhang, Ming Lei
  • Patent number: 11900958
    Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.
    Type: Grant
    Filed: December 26, 2022
    Date of Patent: February 13, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao
  • Publication number: 20230360256
    Abstract: The present application relates to the technical field of deep learning and pose estimation, and more particularly, to a contextual instance decoupling (CID)-based multi-person pose estimation (MPPE) method and apparatus. The method includes: acquiring a preset number of images containing multiple persons; inputting the images containing multiple persons, as a training sample, into a CID-based MPPE model for training; and performing pose estimation on a target image using the trained CID-based MPPE model, the CID-based MPPE model being provided with an instance information abstraction module, a global feature decoupling module and a heatmap estimation module. The method and apparatus of the present application can explore context clues over a greater range, thus being robust to spatial detection errors and superior in both accuracy and efficiency.
    Type: Application
    Filed: December 27, 2022
    Publication date: November 9, 2023
    Applicant: Peking University
    Inventors: Shiliang ZHANG, Dongkai WANG
  • Publication number: 20230245672
    Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.
    Type: Application
    Filed: December 26, 2022
    Publication date: August 3, 2023
    Inventors: Shiliang ZHANG, Ming LEI, Wei LI, Haitao YAO
  • Patent number: 11651578
    Abstract: A method and a system for end-to-end modeling are provided. The method includes: determining a topological structure of a target-based end-to-end model, where the topological structure includes an input layer, an encoding layer, a code enhancement layer, a filtering layer, a decoding layer and an output layer; the code enhancement layer adds information of a target unit to a feature sequence outputted by the encoding layer, the filtering layer filters a feature sequence added with the information of the target unit; collecting multiple pieces of training data; and training parameters of the target-based end-to-end model by using the multiple pieces of the training data.
    Type: Grant
    Filed: January 11, 2017
    Date of Patent: May 16, 2023
    Assignee: IFLYTEK CO., LTD.
    Inventors: Jia Pan, Shiliang Zhang, Shifu Xiong, Si Wei, Guoping Hu
  • Publication number: 20230074906
    Abstract: The present disclosure discloses a sound detection method.
    Type: Application
    Filed: August 26, 2022
    Publication date: March 9, 2023
    Inventors: Shiliang ZHANG, Siqi ZHENG, Weilong HUANG
  • Publication number: 20230064756
    Abstract: A method, an apparatus, and an electronic device for streaming end-to-end speech recognition are described. The method includes: extracting and encoding speech acoustic features of a received voice stream in units of frames; performing block processing, and predicting a number of activation points included in a same block that need to be encoded and outputted; and determining position(s) of activation point(s) that need to be decoded and outputted according to a prediction result, so that a decoder performs decoding at the position(s) of the activation point(s) and outputs a recognition result. Through the embodiments of the present disclosure, the robustness of a streaming end-to-end speech recognition system to noise can be improved, thereby improving the performance and the accuracy of the system.
    Type: Application
    Filed: October 28, 2022
    Publication date: March 2, 2023
    Inventors: Shiliang ZHANG, Zhifu GAO
  • Publication number: 20230009633
    Abstract: A speech processing method, a speech encoder, a speech decoder, and a speech recognition system are provided. The method includes: obtaining a speech signal to be processed; using a first neural network and a second neural network to process the speech signal to obtain first feature information and second feature information corresponding to the speech signal respectively, wherein a computational efficiency of the first neural network is higher than a computational efficiency of the second neural network, and an accuracy of the second feature information outputted by the second neural network is higher than an accuracy of the first feature information outputted by the first neural network; and determining target feature information used to represent semantics in the speech signal based on the first feature information and the second feature information.
    Type: Application
    Filed: September 23, 2022
    Publication date: January 12, 2023
    Inventors: Shiliang ZHANG, Zhifu GAO, Ming LEI
  • Patent number: 11538488
    Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: December 27, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Shiliang Zhang, Ming Lei, Wei Li, Haitao Yao
  • Publication number: 20220028404
    Abstract: Embodiments of the disclosure provide a method and system for speech recognition. The method comprises dividing space into a plurality of regions based on preset DOA angles to allocate a signal source to the plurality of regions, wherein signals in the plurality of regions are enhanced and recognized, the results of which are fused to obtain a recognition result of the signal source.
    Type: Application
    Filed: February 3, 2020
    Publication date: January 27, 2022
    Inventors: Shiliang ZHANG, Ming LEI
  • Patent number: 11182602
    Abstract: The present application discloses a method and a system for person re-identification, the method including: inputting a training set to a model-to-be-trained, and determining a single-class label and memory features of each image data in the training set; determining multi-class labels through positive label prediction according to the single-class labels and a memory feature set; determining classification scores according to image features of each image data in the training set and the memory feature set; determining a multi-label classification loss according to the multi-class labels and the classification scores; and updating and training the model-to-be-trained to obtain a re-identification model according to the multi-label classification loss.
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: November 23, 2021
    Assignee: Peking University
    Inventors: Shiliang Zhang, Dongkai Wang
  • Publication number: 20210319215
    Abstract: The present application discloses a method and a system for person re-identification, the method including: inputting a training set to a model-to-be-trained, and determining a single-class label and memory features of each image data in the training set; determining multi-class labels through positive label prediction according to the single-class labels and a memory feature set; determining classification scores according to image features of each image data in the training set and the memory feature set; determining a multi-label classification loss according to the multi-class labels and the classification scores; and updating and training the model-to-be-trained to obtain a re-identification model according to the multi-label classification loss.
    Type: Application
    Filed: September 15, 2020
    Publication date: October 14, 2021
    Applicant: Peking University
    Inventors: Shiliang ZHANG, Dongkai WANG
  • Patent number: 10829119
    Abstract: Disclosed are an emergency brake control method, device, ECU and vehicle. The method includes: when receiving a first trigger signal indicating that a vehicle enters a driving accompanying mode, activating the driving accompanying mode; and in the driving accompanying mode, when receiving an emergency brake command, controlling an Electronic Stability Program (ESP) to decelerate the vehicle at a preset deceleration in the driving accompanying mode, and sending a fuel cut-off request signal to an Engine Management System (EMS) to cut off a torque output of an engine.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: November 10, 2020
    Assignee: Great Wall Motor Company Limited
    Inventors: Shiliang Zhang, Shenhong Liu, Fanmao Kong, Xianglu Meng, Libo Wang
  • Publication number: 20200331475
    Abstract: Disclosed are an emergency brake control method, device, ECU and vehicle. The method includes: when receiving a first trigger signal indicating that a vehicle enters a driving accompanying mode, activating the driving accompanying mode; and in the driving accompanying mode, when receiving an emergency brake command, controlling an Electronic Stability Program (ESP) to decelerate the vehicle at a preset deceleration in the driving accompanying mode, and sending a fuel cut-off request signal to an Engine Management System (EMS) to cut off a torque output of an engine.
    Type: Application
    Filed: August 30, 2018
    Publication date: October 22, 2020
    Applicant: GREAT WALL MOTOR COMPANY LIMITED
    Inventors: Shiliang ZHANG, Shenhong LIU, Fanmao KONG, Xianglu MENG, Libo WANG
  • Publication number: 20200176014
    Abstract: Embodiments of the present disclosure provide methods and systems for processing a speech signal. The method can include: processing the speech signal to generate a plurality of speech frames; generating a first number of acoustic features based on the plurality of speech frames using a frame shift at a given frequency; and generating a second number of posteriori probability vectors based on the first number of acoustic features using an acoustic model, wherein each of the posteriori probability vectors comprises probabilities of the acoustic features corresponding to a plurality of modeling units, respectively.
    Type: Application
    Filed: November 27, 2019
    Publication date: June 4, 2020
    Inventors: Shiliang ZHANG, Ming LEI, Wei LI, Haitao YAO
  • Patent number: D941610
    Type: Grant
    Filed: January 12, 2021
    Date of Patent: January 25, 2022
    Inventor: Shiliang Zhang
  • Patent number: D959875
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: August 9, 2022
    Inventor: Shiliang Zhang
  • Patent number: D959876
    Type: Grant
    Filed: July 6, 2021
    Date of Patent: August 9, 2022
    Inventor: Shiliang Zhang
  • Patent number: D1072021
    Type: Grant
    Filed: December 17, 2024
    Date of Patent: April 22, 2025
    Inventor: Shiliang Zhang
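
Illustrative note on the speech-signal abstracts listed above (patent numbers 11900958 and 11538488, and the related applications): the first step they describe, processing a speech signal into a plurality of speech frames using a frame shift, can be sketched roughly as follows. This is a minimal sketch and not the patented method. The function name split_into_frames, its parameters, and the toy sample values are all assumptions made for illustration; the abstracts do not specify frame lengths, shift values, or how partial frames are handled.

```python
from typing import List, Sequence


def split_into_frames(
    samples: Sequence[float], frame_length: int, frame_shift: int
) -> List[List[float]]:
    """Split a signal into fixed-length frames that start every frame_shift samples.

    Hypothetical helper for illustration only; not taken from the patents.
    """
    if frame_length <= 0 or frame_shift <= 0:
        raise ValueError("frame_length and frame_shift must be positive")
    frames: List[List[float]] = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped
    # (an assumption, since the abstracts do not say how it is handled).
    while start + frame_length <= len(samples):
        frames.append(list(samples[start:start + frame_length]))
        start += frame_shift
    return frames


# Toy signal of 10 samples, frames of 4 samples, shifted by 2 samples.
signal = [float(i) for i in range(10)]
frames = split_into_frames(signal, frame_length=4, frame_shift=2)
print(len(frames))            # number of complete frames
print(frames[0], frames[-1])  # first and last frame
```

A smaller frame shift yields more, heavily overlapping frames, and a larger one yields fewer. In a real system each frame would then be turned into an acoustic feature vector before being passed to an acoustic model, which is the part the abstracts go on to describe.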