Patents by Inventor Joshua D. Atkins

Joshua D. Atkins has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

End-to-end time-domain multitask learning for ML-based speech enhancement

Patent number: 11996114

Abstract: Disclosed is a multi-task machine learning model such as a time-domain deep neural network (DNN) that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal to jointly estimate the target signal and the target audio parameters, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals such as video signals of the target speaker to improve the robustness of the enhanced target speech signal.

Type: Grant

Filed: May 15, 2021

Date of Patent: May 28, 2024

Assignee: Apple Inc.

Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins

Audio Encoding with Compressed Ambience

Publication number: 20240163609

Abstract: An audio device can sense sound in a physical environment using a plurality of microphones to generate a plurality of microphone signals. Clean speech can be extracted from microphone signals. Ambience can be extracted from the microphone signals. The clean speech can be encoded at a first compression level. The ambience can be encoded at a second compression level that is higher than the first compression level. Other aspects are also described and claimed.

Type: Application

Filed: January 26, 2024

Publication date: May 16, 2024

Inventors: Tomlinson Holman, Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder
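
As a rough illustration of the two-level encoding this abstract describes, the Python sketch below uses uniform quantization at two bit depths as a stand-in for the two compression levels. The function names, the default bit depths, and the use of bit depth as a proxy for compression level are all assumptions made for illustration; the abstract does not name a codec.

```python
def quantize(samples, bits):
    """Uniformly quantize samples in [-1.0, 1.0] to signed integers."""
    levels = 2 ** (bits - 1)
    # Clamp so that +1.0 does not overflow the signed range.
    return [max(-levels, min(levels - 1, round(s * levels))) for s in samples]


def encode_scene(speech, ambience, speech_bits=16, ambience_bits=6):
    """Encode ambience more coarsely (higher compression) than speech.

    speech_bits and ambience_bits are hypothetical defaults.
    """
    if ambience_bits >= speech_bits:
        raise ValueError("ambience must use fewer bits than speech")
    return {
        "speech": quantize(speech, speech_bits),
        "ambience": quantize(ambience, ambience_bits),
    }
```

Calling `encode_scene(speech, ambience)` with two lists of samples returns both streams, with the ambience stored at the coarser resolution.
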
Processing sound in an enhanced reality environment

Patent number: 11956623

Abstract: Processing sound in an enhanced reality environment can include generating, based on an image of a physical environment, an acoustic model of the physical environment. Audio signals captured by a microphone array can capture a sound in the physical environment. Based on these audio signals, one or more measured acoustic parameters of the physical environment can be generated. A target audio signal can be processed using the model of the physical environment and the measured acoustic parameters, resulting in a plurality of output audio channels having a virtual sound source with a virtual location. The output audio channels can be used to drive a plurality of speakers. Other aspects are also described and claimed.

Type: Grant

Filed: June 28, 2021

Date of Patent: April 9, 2024

Assignee: Apple Inc.

Inventors: Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder

Spatial Capture with Noise Mitigation

Publication number: 20240107259

Abstract: A device may include a plurality of microphones worn on a head of a user. The device may include a processor configured to obtain microphone signals from the plurality of microphones. The processor may attenuate breathing sound from the user by processing the microphone signals, resulting in attenuated microphone signals. The processor may render one or more output audio channels based on the attenuated microphone signals.

Type: Application

Filed: August 30, 2023

Publication date: March 28, 2024

Inventors: Yoo Mi Hur, Ashrith Deshpande, Prateek Murgai, Joshua D. Atkins, Symeon Delikaris Manias

Spatial Blending of Audio

Publication number: 20240098442

Abstract: An audio processing system may obtain a size of a visual object to present to a display. The audio processing system may determine a virtual placement for each of a plurality of virtual speakers at least based on the size of the visual object. Each of the plurality of virtual speakers may be spatially rendered at each virtual placement through binaural audio, for playback through head-worn speakers. Other aspects are also described and claimed.

Type: Application

Filed: August 29, 2023

Publication date: March 21, 2024

Inventors: Shai Messingher Lang, Joshua D. Atkins, Scott A. Wardle, Symeon Delikaris Manias

Method and system for maintaining track length for pre-rendered spatial audio

Patent number: 11937063

Abstract: A method performed by a programmed processor of an audio system includes receiving a sound track that has a track length; producing a binaural audio version of the sound track, the binaural audio version having an extended track length; performing a fading operation upon the binaural audio version to gradually reduce a signal level of the binaural audio version to below a signal threshold level at a time along the extended track length that corresponds to an end time of the track length of the sound track; and storing the binaural audio version having the track length of the sound track in memory for later transmission to an audio playback device for driving one or more speakers.

Type: Grant

Filed: May 5, 2022

Date of Patent: March 19, 2024

Assignee: Apple Inc.

Inventors: Juha O. Merimaa, Abdullah Fahim, Andrey D. Del Pozo, Joshua D. Atkins
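
The fading step in this abstract can be pictured with a minimal Python sketch, assuming the rendered signal is a plain list of samples and the threshold is expressed as a gain in dB relative to the unfaded signal. The function name, the linear-in-dB ramp, and the -60 dB default are hypothetical choices for illustration, not details taken from the patent.

```python
def fade_to_track_length(rendered, track_len, fade_len, floor_db=-60.0):
    """Trim a pre-rendered signal back to track_len samples.

    rendered  : list of float samples, at least track_len long
                (for example, a render with an added reverb tail)
    track_len : length of the original sound track, in samples
    fade_len  : number of samples over which the gain ramps down
    floor_db  : gain reached at the last kept sample (hypothetical default)
    """
    if not 0 < fade_len <= track_len <= len(rendered):
        raise ValueError("need 0 < fade_len <= track_len <= len(rendered)")
    out = list(rendered[:track_len])
    start = track_len - fade_len
    for i in range(fade_len):
        # Gain in dB falls linearly from just under 0 dB to floor_db.
        gain_db = floor_db * (i + 1) / fade_len
        out[start + i] *= 10.0 ** (gain_db / 20.0)
    return out
```

The returned list has exactly `track_len` samples, so the stored version keeps the original track length while the tail is attenuated before the cut.
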
Audio encoding with compressed ambience

Patent number: 11930337

Abstract: An audio device can sense sound in a physical environment using a plurality of microphones to generate a plurality of microphone signals. Clean speech can be extracted from microphone signals. Ambience can be extracted from the microphone signals. The clean speech can be encoded at a first compression level. The ambience can be encoded at a second compression level that is higher than the first compression level. Other aspects are also described and claimed.

Type: Grant

Filed: June 28, 2021

Date of Patent: March 12, 2024

Assignee: Apple Inc.

Inventors: Tomlinson Holman, Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder

SYSTEMS AND METHODS FOR ECHO MITIGATION

Publication number: 20230410828

Abstract: Disclosed is a reference-less echo mitigation or cancellation technique. The technique enables suppression of echoes from an interference signal when a reference version of the interference signal conventionally used for echo mitigation may not be available. A first stage of the technique may use a machine learning model to model a target audio area surrounding a device so that a target audio signal estimated as originating from within the target audio area may be accepted. In contrast, audio signals such as playback of media content on a TV or other interfering signals estimated as originating from outside the target audio area may be suppressed. A second stage of the technique may be a level-based suppressor that further attenuates the residual echo from the output of the first stage based on an audio level threshold. Side information may be provided to adjust the target audio area or the audio level threshold.

Type: Application

Filed: June 21, 2022

Publication date: December 21, 2023

Inventors: Ramin Pishehvar, Mehrez Souden, Sean A. Ramprashad, Jason Wung, Ante Jukic, Joshua D. Atkins
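
The second stage described in this abstract, a level-based suppressor, might look roughly like the Python sketch below. It assumes the first-stage output arrives as a list of frames of samples in [-1, 1] and treats the audio level threshold as a per-frame RMS level in dBFS. The function name and both default values are hypothetical; the machine learning first stage is not modeled here.

```python
import math


def level_suppress(frames, threshold_db=-40.0, attenuation_db=-20.0):
    """Attenuate frames whose RMS level falls below threshold_db.

    threshold_db and attenuation_db are hypothetical defaults.
    """
    gain = 10.0 ** (attenuation_db / 20.0)
    out = []
    for frame in frames:
        rms = math.sqrt(sum(x * x for x in frame) / len(frame)) if frame else 0.0
        level_db = 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")
        if level_db < threshold_db:
            # Low-level frame: treat as residual echo and scale it down.
            out.append([x * gain for x in frame])
        else:
            # Frame is loud enough to pass through unchanged.
            out.append(list(frame))
    return out
```

Raising `threshold_db` makes the suppressor more aggressive, which is one way the side information mentioned in the abstract could adjust its behavior.
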
Spatially informed acoustic echo cancelation

Patent number: 11849291

Abstract: A plurality of microphone signals can be captured with a plurality of microphones of a device. One or more echo dominant audio signals can be determined based on a pick-up beam directed towards one or more speakers of a playback device. Sound that is emitted from the one or more speakers and sensed by the plurality of microphones can be removed from the plurality of microphone signals by using the one or more echo dominant audio signals as a reference, resulting in clean audio.

Type: Grant

Filed: May 17, 2021

Date of Patent: December 19, 2023

Assignee: Apple Inc.

Inventors: Mehrez Souden, Jason Wung, Ante Jukic, Ramin Pishehvar, Joshua D. Atkins
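
To illustrate how an echo dominant signal could serve as a reference, the Python sketch below runs a normalized least-mean-squares (NLMS) adaptive filter on a single microphone signal. NLMS is a standard textbook choice substituted here for illustration; the abstract does not say which adaptive algorithm is used. The beamforming step is omitted and the reference is assumed to be given. The function name, tap count, and step size are hypothetical.

```python
def nlms_cancel(mic, reference, taps=4, mu=0.5, eps=1e-8):
    """Subtract the part of mic that is predictable from reference.

    mic, reference : equal-length lists of float samples
    taps, mu, eps  : hypothetical filter length, step size, regularizer
    Returns the residual (echo-reduced) signal.
    """
    w = [0.0] * taps        # adaptive filter weights
    buf = [0.0] * taps      # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = d - y                                    # residual
        norm = sum(b * b for b in buf) + eps
        # Normalized gradient step toward a smaller residual.
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

When the microphone signal contains only a filtered copy of the reference, the residual shrinks toward zero as the weights converge; any near-end speech that is uncorrelated with the reference is left in the output.
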
Spatial audio file format for storing capture metadata

Patent number: 11841899

Abstract: A device with microphones can generate microphone signals during an audio recording. The device can store, in an electronic audio data file, the microphone signals, and metadata that includes impulse responses of the microphones. Other aspects are described and claimed.

Type: Grant

Filed: June 11, 2020

Date of Patent: December 12, 2023

Assignee: Apple Inc.

Inventors: Jonathan D. Sheaffer, Symeon Delikaris Manias, Gaetan R. Lorho, Peter A. Raffensperger, Eric A. Allamanche, Frank Baumgarte, Dipanjan Sen, Joshua D. Atkins, Juha O. Merimaa

Spatial headphone transparency

Patent number: 11818561

Abstract: Digital audio signal processing techniques are used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic-to-electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.

Type: Grant

Filed: November 9, 2022

Date of Patent: November 14, 2023

Assignee: Apple Inc.

Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow

DETECTING A TRIGGER OF A DIGITAL ASSISTANT

Publication number: 20230111509

Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.

Type: Application

Filed: December 13, 2022

Publication date: April 13, 2023

Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden

DETERMINING A VIRTUAL LISTENING ENVIRONMENT

Publication number: 20230104111

Abstract: One or more acoustic parameters of a current acoustic environment of a user may be determined based on sensor signals captured by one or more sensors of the device. One or more preset acoustic parameters may be determined based on the one or more acoustic parameters of the current acoustic environment of the user and an acoustic environment of an audio file comprising audio signals that is determined based on the audio signals of the audio file or metadata of the audio file. The audio signals may be spatially rendered by applying spatial filters that include the one or more preset acoustic parameters to the audio signals, resulting in binaural audio signals. The binaural audio signals may be used to drive speakers of a headset. Other aspects are described and claimed.

Type: Application

Filed: August 19, 2022

Publication date: April 6, 2023

Inventors: Prateek Murgai, John E. Arthur, Joshua D. Atkins, Juha O. Merimaa, Dipanjan Sen, Brandon J. Rice, Alexander Singh Alvarado, Jonathan D. Sheaffer, Benjamin Bernard, David E. Romblom

Detecting a trigger of a digital assistant

Patent number: 11532306

Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.

Type: Grant

Filed: December 3, 2020

Date of Patent: December 20, 2022

Assignee: Apple Inc.

Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden

METHOD AND SYSTEM FOR MAINTAINING TRACK LENGTH FOR PRE-RENDERED SPATIAL AUDIO

Publication number: 20220394406

Abstract: A method performed by a programmed processor of an audio system includes receiving a sound track that has a track length; producing a binaural audio version of the sound track, the binaural audio version having an extended track length; performing a fading operation upon the binaural audio version to gradually reduce a signal level of the binaural audio version to below a signal threshold level at a time along the extended track length that corresponds to an end time of the track length of the sound track; and storing the binaural audio version having the track length of the sound track in memory for later transmission to an audio playback device for driving one or more speakers.

Type: Application

Filed: May 5, 2022

Publication date: December 8, 2022

Inventors: Juha O. Merimaa, Abdullah Fahim, Andrey D. Del Pozo, Joshua D. Atkins

Spatially informed audio signal processing for user speech

Patent number: 11514928

Abstract: A device implementing a system for processing speech in an audio signal includes at least one processor configured to receive an audio signal corresponding to at least one microphone of a device, and to determine, using a first model, a first probability that a speech source is present in the audio signal. The at least one processor is further configured to determine, using a second model, a second probability that an estimated location of a source of the audio signal corresponds to an expected position of a user of the device, and to determine a likelihood that the audio signal corresponds to the user of the device based on the first and second probabilities.

Type: Grant

Filed: December 9, 2019

Date of Patent: November 29, 2022

Assignee: Apple Inc.

Inventors: Mehrez Souden, Ante Jukic, Jason Wung, Ashrith Deshpande, Joshua D. Atkins
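
One simple way to picture the final step of this abstract, combining the two probabilities into a single likelihood, is the Python sketch below. It multiplies the two values, which assumes the two model outputs are independent. That product rule and the function name are assumptions for illustration; the abstract only states that the likelihood is based on both probabilities.

```python
def user_speech_likelihood(p_speech, p_location):
    """Combine two model outputs into one score in [0, 1].

    p_speech   : probability that a speech source is present
    p_location : probability that the estimated source location matches
                 the expected position of the user
    The product assumes the two outputs are independent (an assumption).
    """
    for p in (p_speech, p_location):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_speech * p_location
```

With this rule the score is high only when both models agree, so speech from another direction or non-speech sound at the expected position both yield a low likelihood.
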
Microphone array based deep learning for time-domain speech signal extraction

Patent number: 11508388

Abstract: A device for processing audio signals in a time-domain includes a processor configured to receive multiple audio signals corresponding to respective microphones of two or more microphones of the device, at least one of the multiple audio signals comprising speech of a user of the device. The processor is configured to provide the multiple audio signals to a machine learning model, the machine learning model having been trained based at least in part on an expected position of the user of the device and expected positions of the respective microphones on the device. The processor is configured to provide an audio signal that is enhanced with respect to the speech of the user relative to the multiple audio signals, wherein the audio signal is a waveform output from the machine learning model.

Type: Grant

Filed: November 20, 2020

Date of Patent: November 22, 2022

Assignee: Apple Inc.

Inventors: Mehrez Souden, Symeon Delikaris Manias, Joshua D. Atkins, Ante Jukic, Ramin Pishehvar

End-To-End Time-Domain Multitask Learning for ML-Based Speech Enhancement

Publication number: 20220366927

Abstract: Disclosed is a multi-task machine learning model such as a time-domain deep neural network (DNN) that jointly generates an enhanced target speech signal and target audio parameters from a mixed signal of target speech and interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the masks to separate the target speech from the interference signal to jointly estimate the target signal and the target audio parameters, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signals and multi-modal signals such as video signals of the target speaker to improve the robustness of the enhanced target speech signal.

Type: Application

Filed: May 15, 2021

Publication date: November 17, 2022

Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins

SPATIALLY INFORMED ACOUSTIC ECHO CANCELATION

Publication number: 20220369030

Abstract: A plurality of microphone signals can be captured with a plurality of microphones of a device. One or more echo dominant audio signals can be determined based on a pick-up beam directed towards one or more speakers of a playback device. Sound that is emitted from the one or more speakers and sensed by the plurality of microphones can be removed from the plurality of microphone signals by using the one or more echo dominant audio signals as a reference, resulting in clean audio.

Type: Application

Filed: May 17, 2021

Publication date: November 17, 2022

Inventors: Mehrez Souden, Jason Wung, Ante Jukic, Ramin Pishehvar, Joshua D. Atkins

Spatial headphone transparency

Patent number: 11503409

Abstract: Digital audio signal processing techniques are used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic-to-electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.

Type: Grant

Filed: March 12, 2021

Date of Patent: November 15, 2022

Assignee: Apple Inc.

Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow
