Patents by Inventor Feipeng Li
Feipeng Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11996114
Abstract: Disclosed is a multi-task machine learning model, such as a time-domain deep neural network (DNN), that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and an interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals, such as video signals of the target speaker, to improve the robustness of the enhanced target speech signal.
Type: Grant
Filed: May 15, 2021
Date of Patent: May 28, 2024
Assignee: Apple Inc.
Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
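A minimal sketch of the kind of multi-task, time-domain network this abstract describes: a learned encoder/decoder filterbank with a shared mask estimator, one head reconstructing the enhanced target speech and an auxiliary head predicting a per-frame VAD flag. The architecture, layer sizes, and names below are illustrative assumptions, not the patented design.

```python
# Multi-task time-domain enhancement sketch (illustrative only): encode the mixture,
# estimate a mask, reconstruct enhanced speech, and predict a per-frame VAD flag.
import torch
import torch.nn as nn

class MultiTaskEnhancer(nn.Module):
    def __init__(self, n_filters=256, kernel=16, stride=8):
        super().__init__()
        # Learned analysis filterbank (encoder) over the raw waveform.
        self.encoder = nn.Conv1d(1, n_filters, kernel, stride=stride, bias=False)
        # Mask estimator shared by both tasks.
        self.separator = nn.Sequential(
            nn.Conv1d(n_filters, n_filters, 3, padding=1), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, 3, padding=1), nn.Sigmoid(),
        )
        # Learned synthesis filterbank (decoder) back to the time domain.
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel, stride=stride, bias=False)
        # Auxiliary head: per-frame voice-activity probability.
        self.vad_head = nn.Sequential(nn.Conv1d(n_filters, 1, 1), nn.Sigmoid())

    def forward(self, mixture):                 # mixture: (batch, 1, samples)
        feats = self.encoder(mixture)           # (batch, n_filters, frames)
        mask = self.separator(feats)            # multiplicative mask in [0, 1]
        masked = feats * mask                   # separate target from interference
        enhanced = self.decoder(masked)         # enhanced target speech waveform
        vad = self.vad_head(masked).squeeze(1)  # per-frame VAD flag
        return enhanced, vad

model = MultiTaskEnhancer()
speech, vad = model(torch.randn(2, 1, 16000))   # 1 s of 16 kHz audio, batch of 2
```

In a multi-channel or audio-visual variant, the extra microphone channels or fused video features would simply be concatenated at the encoder input before the shared mask estimator.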
-
Publication number: 20230111509
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: December 13, 2022
Publication date: April 13, 2023
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
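A hedged sketch of the control flow this family of applications describes: sample one signal per microphone, process them into multiple audio streams, score each stream for a spoken trigger, and only then start an assistant session. `beamform` and `trigger_score` are hypothetical stand-ins for a real spatial filter and keyword-spotting model; the toy versions at the bottom only make the example runnable.

```python
from typing import Callable, List, Sequence
import numpy as np

def detect_spoken_trigger(mic_signals: Sequence[np.ndarray],
                          beamform: Callable[[Sequence[np.ndarray]], List[np.ndarray]],
                          trigger_score: Callable[[np.ndarray], float],
                          threshold: float = 0.5) -> bool:
    """True if any processed audio stream appears to contain the spoken trigger."""
    streams = beamform(mic_signals)                       # plurality of audio streams
    return any(trigger_score(s) >= threshold for s in streams)

def handle_audio(mic_signals: Sequence[np.ndarray], beamform, trigger_score) -> None:
    if detect_spoken_trigger(mic_signals, beamform, trigger_score):
        print("spoken trigger detected -> initiate a session of the digital assistant")
    else:
        print("no trigger -> forego initiating a session")

# Toy usage: identity "beamformer" and an energy-based "trigger" score.
mics = [np.random.randn(16000) * 0.01 for _ in range(4)]  # four quiet microphones
handle_audio(mics,
             beamform=lambda sigs: list(sigs),
             trigger_score=lambda s: float(np.sqrt(np.mean(s ** 2))))
```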
-
Patent number: 11532306
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Grant
Filed: December 3, 2020
Date of Patent: December 20, 2022
Assignee: Apple Inc.
Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
-
Publication number: 20220366927
Abstract: Disclosed is a multi-task machine learning model, such as a time-domain deep neural network (DNN), that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and an interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals, such as video signals of the target speaker, to improve the robustness of the enhanced target speech signal.
Type: Application
Filed: May 15, 2021
Publication date: November 17, 2022
Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
-
Patent number: 11294086
Abstract: The present disclosure provides a method of high-resolution amplitude-preserving seismic imaging for a subsurface reflectivity model, including: performing reverse time migration (RTM) to obtain an initial imaging result, performing Born forward modeling on the initial imaging result to obtain seismic simulation data, and performing RTM on the seismic simulation data to obtain a second imaging result; performing curvelet transformation on the two imaging results, performing pointwise estimation in a curvelet domain, and using a Wiener solution that matches two curvelet coefficients as a solution of a matched filter; and applying the estimated matched filter to the initial imaging result to obtain a high-resolution amplitude-preserving seismic imaging result.
Type: Grant
Filed: April 28, 2021
Date of Patent: April 5, 2022
Assignee: XI'AN JIAOTONG UNIVERSITY
Inventors: Jinghuai Gao, Feipeng Li
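A sketch of the filtering step the abstract describes, under stated assumptions: the RTM, Born modeling, and curvelet transform are represented by hypothetical callables (`curvelet_forward`, `curvelet_inverse`), and only the pointwise Wiener matched filter, estimated between the two migrated images and applied back to the initial image, is spelled out.

```python
import numpy as np

def wiener_matched_filter(c_initial: np.ndarray,
                          c_remigrated: np.ndarray,
                          eps: float = 1e-8) -> np.ndarray:
    """Pointwise Wiener solution mapping remigrated coefficients toward the initial ones."""
    return (c_initial * np.conj(c_remigrated)) / (np.abs(c_remigrated) ** 2 + eps)

def high_resolution_image(initial_image, remigrated_image,
                          curvelet_forward, curvelet_inverse):
    # Transform both imaging results into the curvelet domain.
    c1 = curvelet_forward(initial_image)      # coefficients of the initial RTM image
    c2 = curvelet_forward(remigrated_image)   # coefficients of RTM(Born(initial image))
    # Estimate the matched filter coefficient by coefficient (Wiener solution).
    f = wiener_matched_filter(c1, c2)
    # Apply the filter to the initial imaging result and invert the transform.
    return curvelet_inverse(f * c1)
```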
-
Publication number: 20210341635
Abstract: The present disclosure provides a method of high-resolution amplitude-preserving seismic imaging for a subsurface reflectivity model, including: performing reverse time migration (RTM) to obtain an initial imaging result, performing Born forward modeling on the initial imaging result to obtain seismic simulation data, and performing RTM on the seismic simulation data to obtain a second imaging result; performing curvelet transformation on the two imaging results, performing pointwise estimation in a curvelet domain, and using a Wiener solution that matches two curvelet coefficients as a solution of a matched filter; and applying the estimated matched filter to the initial imaging result to obtain a high-resolution amplitude-preserving seismic imaging result.
Type: Application
Filed: April 28, 2021
Publication date: November 4, 2021
Inventors: Jinghuai GAO, Feipeng LI
-
Publication number: 20210097998
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: December 3, 2020
Publication date: April 1, 2021
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20200312315
Abstract: An acoustic environment aware method for selecting a high quality audio stream during multi-stream speech recognition is provided. A number of input audio streams are processed to determine whether a voice trigger is detected, and if so, a voice trigger score is calculated for each stream. An acoustic environment measurement is also calculated for each audio stream. The trigger score and acoustic environment measurement are combined for each audio stream to select, as the preferred audio stream, the stream with the highest combined score. The preferred audio stream is output to an automatic speech recognizer. Other aspects are also described and claimed.
Type: Application
Filed: March 28, 2019
Publication date: October 1, 2020
Inventors: Feipeng Li, Mehrez Souden, Joshua D. Atkins, John Bridle, Charles P. Clark, Stephen H. Shum, Sachin S. Kajarekar, Haiying Xia, Erik Marchi
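A minimal sketch of the stream selection step: combine each stream's voice-trigger score with an acoustic-environment measurement and hand the best stream to the recognizer. The crude frame-energy SNR estimate and the weighting are assumptions, not the published method.

```python
from typing import List, Tuple
import numpy as np

def snr_estimate(stream: np.ndarray, frame: int = 160) -> float:
    """Crude acoustic-environment measurement: loud-frame vs quiet-frame energy, in dB."""
    usable = stream[: len(stream) - len(stream) % frame]       # trim to whole 10 ms frames (16 kHz)
    energy = np.square(usable.reshape(-1, frame)).mean(axis=1)
    speech = np.percentile(energy, 90) + 1e-12
    noise = np.percentile(energy, 10) + 1e-12
    return float(10.0 * np.log10(speech / noise))

def select_preferred_stream(streams: List[np.ndarray],
                            trigger_scores: List[float],
                            snr_weight: float = 0.02) -> Tuple[int, np.ndarray]:
    """Index and samples of the audio stream with the highest combined score."""
    combined = [score + snr_weight * snr_estimate(s)
                for s, score in zip(streams, trigger_scores)]
    best = int(np.argmax(combined))
    return best, streams[best]          # forward streams[best] to the automatic speech recognizer
```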
-
Publication number: 20190074009
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: November 5, 2018
Publication date: March 7, 2019
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20180336892
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.
Type: Application
Filed: March 13, 2018
Publication date: November 22, 2018
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20180120236
Abstract: A method for identifying the authenticity of natural jewelry comprises: scanning the entire appearance of each jewel by three-dimensional microscopic scanning and fully recording the appearance characteristics of the jewel; using a three-dimensional imaging technique to generate a three-dimensional entity image of the jewel, and using a digest algorithm to compress the three-dimensional entity image information into a unique digest code, the three-dimensional entity image and the digest code being published as archive information; if the receiver is a dealer, the same method is used to generate a digest code for the received jewel, which is compared with the published digest code; if the receiver is an individual buyer, the received jewel is magnified and compared with the published three-dimensional entity image, and if the comparison matches, the jewel is identified as authentic.
Type: Application
Filed: December 27, 2017
Publication date: May 3, 2018
Inventors: Shaoping Lu, Chen Xu, Zhenyu Di, Feipeng Li
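A sketch of the digest-code idea only. The publication does not specify the digest algorithm; SHA-256 over a centered, quantized, canonically ordered point cloud is an illustrative assumption.

```python
import hashlib
import numpy as np

def jewel_digest(points: np.ndarray, decimals: int = 3) -> str:
    """Compress a 3D scan (N x 3 point cloud) into a short, reproducible digest code."""
    pts = np.round(points - points.mean(axis=0), decimals)   # normalize position, quantize
    order = np.lexsort((pts[:, 2], pts[:, 1], pts[:, 0]))    # canonical point ordering
    return hashlib.sha256(pts[order].tobytes()).hexdigest()

def dealer_check(received_scan: np.ndarray, published_digest: str) -> bool:
    """A dealer regenerates the digest from a fresh scan and compares it to the published one."""
    return jewel_digest(received_scan) == published_digest
```

Because two independent scans of the same jewel will not produce bit-identical point clouds, a practical system would need a robust (tolerance-based or perceptual) digest rather than the exact hash match above; the exact comparison simply mirrors the abstract's description.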
-
Patent number: 8983832
Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
Type: Grant
Filed: July 2, 2009
Date of Patent: March 17, 2015
Assignee: The Board of Trustees of the University of Illinois
Inventors: Jont B. Allen, Feipeng Li
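A hedged sketch of the modification step: given a detected perceptual feature localized in time and frequency (for example a consonant burst), boost or attenuate just that region before resynthesis. The feature detector itself is out of scope here; the time and frequency bounds are taken as inputs, and the STFT-box gain is an illustrative choice, not the patented processing.

```python
import numpy as np
from scipy.signal import stft, istft

def modify_feature_region(x: np.ndarray, fs: int,
                          t_range: tuple, f_range: tuple,
                          gain_db: float = 6.0) -> np.ndarray:
    """Apply a gain (positive to enhance, negative to reduce) inside a time-frequency box."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    mask_t = (t >= t_range[0]) & (t <= t_range[1])
    mask_f = (f >= f_range[0]) & (f <= f_range[1])
    Z[np.ix_(mask_f, mask_t)] *= 10 ** (gain_db / 20)   # scale only the feature region
    _, y = istft(Z, fs=fs, nperseg=512)
    return y[: len(x)]

# Example: emphasize a 40 ms burst around 2-4 kHz in a 16 kHz recording.
# y = modify_feature_region(x, 16000, t_range=(0.310, 0.350), f_range=(2000, 4000))
```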
-
Publication number: 20110178799
Abstract: Methods and systems of identifying speech sound features within a speech sound are provided. The sound features may be identified using a multi-dimensional analysis that analyzes the time, frequency, and intensity at which a feature occurs within a speech sound, and the contribution of the feature to the sound. Information about sound features may be used to enhance spoken speech sounds to improve recognizability of the speech sounds by a listener.
Type: Application
Filed: July 24, 2009
Publication date: July 21, 2011
Applicant: The Board of Trustees of the University of Illinois
Inventors: Jont B. Allen, Feipeng Li
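A toy illustration of a time-frequency-intensity analysis, under the assumption that a simple spectrogram energy peak stands in for the feature localization: it reports where a candidate feature is strongest in time, frequency, and intensity, but does not model the perceptual contribution the publication also analyzes.

```python
import numpy as np
from scipy.signal import stft

def locate_dominant_feature(x: np.ndarray, fs: int):
    """Return (time_s, freq_hz, intensity_db) of the strongest time-frequency cell."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    intensity_db = 20 * np.log10(np.abs(Z) + 1e-12)
    i_f, i_t = np.unravel_index(np.argmax(intensity_db), intensity_db.shape)
    return t[i_t], f[i_f], intensity_db[i_f, i_t]
```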
-
Publication number: 20110153321
Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
Type: Application
Filed: July 2, 2009
Publication date: June 23, 2011
Applicant: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS
Inventors: Jont B. Allen, Feipeng LI
-
Publication number: 20080162125
Abstract: A method and apparatus for language independent voice searching in a mobile communication device are disclosed. The method may include receiving a search query from a user of the mobile communication device, converting speech parts in the search query into linguistic representations which cover at least one language, generating a search phoneme lattice based on the linguistic representations, extracting query features from the search phoneme lattice, generating query feature vectors based on the extracted features, performing a coarse search using the query feature vectors and the indexing feature vectors from an indexing database, performing a fine search using the results of the coarse search and the indexing phoneme lattices stored in the indexing database, and outputting the fine search results to a dialog manager.
Type: Application
Filed: December 28, 2006
Publication date: July 3, 2008
Applicant: Motorola, Inc.
Inventors: Changxue C. Ma, Feipeng Li
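A two-stage retrieval sketch under simplifying assumptions: the "feature vectors" are phoneme-bigram count vectors compared by cosine similarity (coarse search), and the "fine search" is an edit distance over best-path phone strings, which stands in for full lattice matching.

```python
from collections import Counter
from typing import Dict, List, Tuple
import math

def bigram_vector(phones: List[str]) -> Counter:
    """Query/indexing feature vector: counts of adjacent phone pairs."""
    return Counter(zip(phones, phones[1:]))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phone sequences (fine-search stand-in)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def search(query_phones: List[str], index: Dict[str, List[str]],
           coarse_k: int = 10) -> List[Tuple[str, int]]:
    qv = bigram_vector(query_phones)
    # Coarse search: rank indexed items by feature-vector similarity, keep a shortlist.
    coarse = sorted(index, key=lambda k: cosine(qv, bigram_vector(index[k])),
                    reverse=True)[:coarse_k]
    # Fine search: re-rank the shortlist with a more exact phone-sequence match.
    return sorted(((k, edit_distance(query_phones, index[k])) for k in coarse),
                  key=lambda kv: kv[1])
```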