Patents by Inventor Feipeng Li
Feipeng Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11996114
Abstract: Disclosed is a multi-task machine learning model, such as a time-domain deep neural network (DNN), that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and an interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals, such as video signals of the target speaker, to improve the robustness of the enhanced target speech signal.
Type: Grant
Filed: May 15, 2021
Date of Patent: May 28, 2024
Assignee: Apple Inc.
Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
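A minimal sketch of the kind of multi-task, time-domain network this abstract describes: a learned encoder/decoder filterbank with a shared mask estimator, one head reconstructing the enhanced target speech and an auxiliary head predicting a per-frame VAD flag. The architecture, layer sizes, and names below are illustrative assumptions, not the patented design.

```python
# Multi-task time-domain enhancement sketch (illustrative only): encode the mixture,
# estimate a mask, reconstruct enhanced speech, and predict a per-frame VAD flag.
import torch
import torch.nn as nn

class MultiTaskEnhancer(nn.Module):
    def __init__(self, n_filters=256, kernel=16, stride=8):
        super().__init__()
        # Learned analysis filterbank (encoder) over the raw waveform.
        self.encoder = nn.Conv1d(1, n_filters, kernel, stride=stride, bias=False)
        # Mask estimator shared by both tasks.
        self.separator = nn.Sequential(
            nn.Conv1d(n_filters, n_filters, 3, padding=1), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, 3, padding=1), nn.Sigmoid(),
        )
        # Learned synthesis filterbank (decoder) back to the time domain.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel, stride=stride, bias=False)
        # Auxiliary head: per-frame voice-activity probability.
        self.vad_head = nn.Sequential(nn.Conv1d(n_filters, 1, 1), nn.Sigmoid())

    def forward(self, mixture):                 # mixture: (batch, 1, samples)
        feats = self.encoder(mixture)           # (batch, n_filters, frames)
        mask = self.separator(feats)            # multiplicative mask in [0, 1]
        masked = feats * mask                   # separate target from interference
        enhanced = self.decoder(masked)         # enhanced target speech waveform
        vad = self.vad_head(masked).squeeze(1)  # per-frame VAD flag
        return enhanced, vad

model = MultiTaskEnhancer()
speech, vad = model(torch.randn(2, 1, 16000))   # 1 s of 16 kHz audio, batch of 2
```

In a multi-channel or audio-visual variant, the extra microphone channels or fused video features would simply be concatenated at the encoder input before the shared mask estimator.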
-
Publication number: 20230111509
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: December 13, 2022
Publication date: April 13, 2023
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
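A hedged sketch of the control flow this family of applications describes: sample one signal per microphone, process them into multiple audio streams, score each stream for a spoken trigger, and only then start an assistant session. `beamform` and `trigger_score` are hypothetical stand-ins for a real spatial filter and keyword-spotting model; the toy versions at the bottom only make the example runnable.

```python
from typing import Callable, List, Sequence
import numpy as np

def detect_spoken_trigger(mic_signals: Sequence[np.ndarray],
                          beamform: Callable[[Sequence[np.ndarray]], List[np.ndarray]],
                          trigger_score: Callable[[np.ndarray], float],
                          threshold: float = 0.5) -> bool:
    """True if any processed audio stream appears to contain the spoken trigger."""
    streams = beamform(mic_signals)                       # plurality of audio streams
    return any(trigger_score(s) >= threshold for s in streams)

def handle_audio(mic_signals: Sequence[np.ndarray], beamform, trigger_score) -> None:
    if detect_spoken_trigger(mic_signals, beamform, trigger_score):
        print("spoken trigger detected -> initiate a session of the digital assistant")
    else:
        print("no trigger -> forego initiating a session")

# Toy usage: identity "beamformer" and an energy-based "trigger" score.
mics = [np.random.randn(16000) * 0.01 for _ in range(4)]  # four quiet microphones
handle_audio(mics,
             beamform=lambda sigs: list(sigs),
             trigger_score=lambda s: float(np.sqrt(np.mean(s ** 2))))
```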
-
Patent number: 11532306
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Grant
Filed: December 3, 2020
Date of Patent: December 20, 2022
Assignee: Apple Inc.
Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
-
Publication number: 20220366927
Abstract: Disclosed is a multi-task machine learning model, such as a time-domain deep neural network (DNN), that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and an interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals, such as video signals of the target speaker, to improve the robustness of the enhanced target speech signal.
Type: Application
Filed: May 15, 2021
Publication date: November 17, 2022
Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
-
Patent number: 11294086
Abstract: The present disclosure provides a method of high-resolution amplitude-preserving seismic imaging for a subsurface reflectivity model, including: performing reverse time migration (RTM) to obtain an initial imaging result, performing Born forward modeling on the initial imaging result to obtain seismic simulation data, and performing RTM on the seismic simulation data to obtain a second imaging result; performing curvelet transformation on the two imaging results, performing pointwise estimation in a curvelet domain, and using a Wiener solution that matches two curvelet coefficients as a solution of a matched filter; and applying the estimated matched filter to the initial imaging result to obtain a high-resolution amplitude-preserving seismic imaging result.
Type: Grant
Filed: April 28, 2021
Date of Patent: April 5, 2022
Assignee: XI'AN JIAOTONG UNIVERSITY
Inventors: Jinghuai Gao, Feipeng Li
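A sketch of the filtering step the abstract describes, under stated assumptions: the RTM, Born modeling, and curvelet transform are represented by hypothetical callables (`curvelet_forward`, `curvelet_inverse`), and only the pointwise Wiener matched filter, estimated between the two migrated images and applied back to the initial image, is spelled out.

```python
import numpy as np

def wiener_matched_filter(c_initial: np.ndarray,
                          c_remigrated: np.ndarray,
                          eps: float = 1e-8) -> np.ndarray:
    """Pointwise Wiener solution mapping remigrated coefficients toward the initial ones."""
    return (c_initial * np.conj(c_remigrated)) / (np.abs(c_remigrated) ** 2 + eps)

def high_resolution_image(initial_image, remigrated_image,
                          curvelet_forward, curvelet_inverse):
    # Transform both imaging results into the curvelet domain.
    c1 = curvelet_forward(initial_image)      # coefficients of the initial RTM image
    c2 = curvelet_forward(remigrated_image)   # coefficients of RTM(Born(initial image))
    # Estimate the matched filter coefficient by coefficient (Wiener solution).
    f = wiener_matched_filter(c1, c2)
    # Apply the filter to the initial imaging result and invert the transform.
    return curvelet_inverse(f * c1)
```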
-
Publication number: 20210341635
Abstract: The present disclosure provides a method of high-resolution amplitude-preserving seismic imaging for a subsurface reflectivity model, including: performing reverse time migration (RTM) to obtain an initial imaging result, performing Born forward modeling on the initial imaging result to obtain seismic simulation data, and performing RTM on the seismic simulation data to obtain a second imaging result; performing curvelet transformation on the two imaging results, performing pointwise estimation in a curvelet domain, and using a Wiener solution that matches two curvelet coefficients as a solution of a matched filter; and applying the estimated matched filter to the initial imaging result to obtain a high-resolution amplitude-preserving seismic imaging result.
Type: Application
Filed: April 28, 2021
Publication date: November 4, 2021
Inventors: Jinghuai GAO, Feipeng LI
-
Publication number: 20210097998
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: December 3, 2020
Publication date: April 1, 2021
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20200312315
Abstract: An acoustic environment aware method for selecting a high quality audio stream during multi-stream speech recognition is provided. A number of input audio streams are processed to determine whether a voice trigger is detected, and if so, a voice trigger score is calculated for each stream. An acoustic environment measurement is also calculated for each audio stream. The trigger score and acoustic environment measurement are combined for each audio stream to select, as the preferred audio stream, the stream with the highest combined score. The preferred audio stream is output to an automatic speech recognizer. Other aspects are also described and claimed.
Type: Application
Filed: March 28, 2019
Publication date: October 1, 2020
Inventors: Feipeng Li, Mehrez Souden, Joshua D. Atkins, John Bridle, Charles P. Clark, Stephen H. Shum, Sachin S. Kajarekar, Haiying Xia, Erik Marchi
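A minimal sketch of the stream selection step: combine each stream's voice-trigger score with an acoustic-environment measurement and hand the best stream to the recognizer. The crude frame-energy SNR estimate and the weighting are assumptions, not the published method.

```python
from typing import List, Tuple
import numpy as np

def snr_estimate(stream: np.ndarray, frame: int = 160) -> float:
    """Crude acoustic-environment measurement: loud-frame vs quiet-frame energy, in dB."""
    usable = stream[: len(stream) - len(stream) % frame]       # trim to whole 10 ms frames (16 kHz)
    energy = np.square(usable.reshape(-1, frame)).mean(axis=1)
    speech = np.percentile(energy, 90) + 1e-12
    noise = np.percentile(energy, 10) + 1e-12
    return float(10.0 * np.log10(speech / noise))

def select_preferred_stream(streams: List[np.ndarray],
                            trigger_scores: List[float],
                            snr_weight: float = 0.02) -> Tuple[int, np.ndarray]:
    """Index and samples of the audio stream with the highest combined score."""
    combined = [score + snr_weight * snr_estimate(s)
                for s, score in zip(streams, trigger_scores)]
    best = int(np.argmax(combined))
    return best, streams[best]          # forward streams[best] to the automatic speech recognizer
```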
-
Publication number: 20190074009
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: November 5, 2018
Publication date: March 7, 2019
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20180336892
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: March 13, 2018
Publication date: November 22, 2018
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20180120236
Abstract: A method for identifying the authenticity of natural jewelry comprises: scanning the entire appearance of each jewel by three-dimensional microscopic scanning and fully recording the appearance characteristics of the jewel; using a three-dimensional imaging technique to generate a three-dimensional entity image of the jewel, and using a digest algorithm to compress the three-dimensional entity image information into a unique digest code, the three-dimensional entity image and the digest code being published as archive information; if the receiver is a dealer, the same method is used to generate a digest code for the received jewel, which is compared with the published digest code; if the receiver is an individual buyer, the received jewel is magnified and compared with the published three-dimensional entity image, and if the comparison matches, the jewel is identified as authentic.
Type: Application
Filed: December 27, 2017
Publication date: May 3, 2018
Inventors: Shaoping Lu, Chen Xu, Zhenyu Di, Feipeng Li
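A sketch of the digest-code idea only. The publication does not specify the digest algorithm; SHA-256 over a centered, quantized, canonically ordered point cloud is an illustrative assumption.

```python
import hashlib
import numpy as np

def jewel_digest(points: np.ndarray, decimals: int = 3) -> str:
    """Compress a 3D scan (N x 3 point cloud) into a short, reproducible digest code."""
    pts = np.round(points - points.mean(axis=0), decimals)   # normalize position, quantize
    order = np.lexsort((pts[:, 2], pts[:, 1], pts[:, 0]))    # canonical point ordering
    return hashlib.sha256(pts[order].tobytes()).hexdigest()

def dealer_check(received_scan: np.ndarray, published_digest: str) -> bool:
    """A dealer regenerates the digest from a fresh scan and compares it to the published one."""
    return jewel_digest(received_scan) == published_digest
```

Because two independent scans of the same jewel will not produce bit-identical point clouds, a practical system would need a robust (tolerance-based or perceptual) digest rather than the exact hash match above; the exact comparison simply mirrors the abstract's description.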
-
Patent number: 8983832
Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
Type: Grant
Filed: July 2, 2009
Date of Patent: March 17, 2015
Assignee: The Board of Trustees of the University of Illinois
Inventors: Jont B. Allen, Feipeng Li
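A hedged sketch of the modification step: given a detected perceptual feature localized in time and frequency (for example a consonant burst), boost or attenuate just that region before resynthesis. The feature detector itself is out of scope here; the time and frequency bounds are taken as inputs, and the STFT-box gain is an illustrative choice, not the patented processing.

```python
import numpy as np
from scipy.signal import stft, istft

def modify_feature_region(x: np.ndarray, fs: int,
                          t_range: tuple, f_range: tuple,
                          gain_db: float = 6.0) -> np.ndarray:
    """Apply a gain (positive to enhance, negative to reduce) inside a time-frequency box."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    mask_t = (t >= t_range[0]) & (t <= t_range[1])
    mask_f = (f >= f_range[0]) & (f <= f_range[1])
    Z[np.ix_(mask_f, mask_t)] *= 10 ** (gain_db / 20)   # scale only the feature region
    _, y = istft(Z, fs=fs, nperseg=512)
    return y[: len(x)]

# Example: emphasize a 40 ms burst around 2-4 kHz in a 16 kHz recording.
# y = modify_feature_region(x, 16000, t_range=(0.310, 0.350), f_range=(2000, 4000))
```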
-
Publication number: 20110178799
Abstract: Methods and systems of identifying speech sound features within a speech sound are provided. The sound features may be identified using a multi-dimensional analysis that analyzes the time, frequency, and intensity at which a feature occurs within a speech sound, and the contribution of the feature to the sound. Information about sound features may be used to enhance spoken speech sounds to improve recognizability of the speech sounds by a listener.
Type: Application
Filed: July 24, 2009
Publication date: July 21, 2011
Applicant: The Board of Trustees of the University of Illinois
Inventors: Jont B. Allen, Feipeng Li
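A toy illustration of a time-frequency-intensity analysis, under the assumption that a simple spectrogram energy peak stands in for the feature localization: it reports where a candidate feature is strongest in time, frequency, and intensity, but does not model the perceptual contribution the publication also analyzes.

```python
import numpy as np
from scipy.signal import stft

def locate_dominant_feature(x: np.ndarray, fs: int):
    """Return (time_s, freq_hz, intensity_db) of the strongest time-frequency cell."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    intensity_db = 20 * np.log10(np.abs(Z) + 1e-12)
    i_f, i_t = np.unravel_index(np.argmax(intensity_db), intensity_db.shape)
    return t[i_t], f[i_f], intensity_db[i_f, i_t]
```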
-
Publication number: 20110153321
Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
Type: Application
Filed: July 2, 2009
Publication date: June 23, 2011
Applicant: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS
Inventors: Jont B. Allen, Feipeng LI
-
Publication number: 20080162125
Abstract: A method and apparatus for language independent voice searching in a mobile communication device are disclosed. The method may include receiving a search query from a user of the mobile communication device, converting speech parts in the search query into linguistic representations which cover at least one language, generating a search phoneme lattice based on the linguistic representations, extracting query features from the search phoneme lattice, generating query feature vectors based on the extracted features, performing a coarse search using the query feature vectors and the indexing feature vectors from an indexing database, performing a fine search using the results of the coarse search and the indexing phoneme lattices stored in the indexing database, and outputting the fine search results to a dialog manager.
Type: Application
Filed: December 28, 2006
Publication date: July 3, 2008
Applicant: Motorola, Inc.
Inventors: Changxue C. Ma, Feipeng Li
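A two-stage retrieval sketch under simplifying assumptions: the "feature vectors" are phoneme-bigram count vectors compared by cosine similarity (coarse search), and the "fine search" is an edit distance over best-path phone strings, which stands in for full lattice matching.

```python
from collections import Counter
from typing import Dict, List, Tuple
import math

def bigram_vector(phones: List[str]) -> Counter:
    """Query/indexing feature vector: counts of adjacent phone pairs."""
    return Counter(zip(phones, phones[1:]))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phone sequences (fine-search stand-in)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def search(query_phones: List[str], index: Dict[str, List[str]],
           coarse_k: int = 10) -> List[Tuple[str, int]]:
    qv = bigram_vector(query_phones)
    # Coarse search: rank indexed items by feature-vector similarity, keep a shortlist.
    coarse = sorted(index, key=lambda k: cosine(qv, bigram_vector(index[k])),
                    reverse=True)[:coarse_k]
    # Fine search: re-rank the shortlist with a more exact phone-sequence match.
    return sorted(((k, edit_distance(query_phones, index[k])) for k in coarse),
                  key=lambda kv: kv[1])
```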