Patents by Inventor Joshua D. Atkin
Joshua D. Atkin has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11996114Abstract: Disclosed is a multi-task machine learning model such as a time-domain deep neural network (DNN) that jointly generate an enhanced target speech signal and target audio parameters from a mixed signal of target speech and interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the mask to separate the target speech from the interference signal to jointly estimate the target signal and the target audio parameters, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signal and multi-modal signals such as video signals of the target speaker to improve the robustness of the enhanced target speech signal.Type: GrantFiled: May 15, 2021Date of Patent: May 28, 2024Assignee: Apple Inc.Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
-
Publication number: 20240163609Abstract: An audio device can sense sound in a physical environment using a plurality of microphones to generate a plurality of microphone signals. Clean speech can be extracted from microphone signals. Ambience can be extracted from the microphone signals. The clean speech can be encoded at a first compression level. The ambience can be encoded at a second compression level that is higher than the first compression level. Other aspects are also described and claimed.Type: ApplicationFiled: January 26, 2024Publication date: May 16, 2024Inventors: Tomlinson Holman, Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder
-
Patent number: 11956623Abstract: Processing sound in an enhanced reality environment can include generating, based on an image of a physical environment, an acoustic model of the physical environment. Audio signals captured by a microphone array, can capture a sound in the physical environment. Based on these audio signals, one or more measured acoustic parameters of the physical environment can be generated. A target audio signal can be processed using the model of the physical environment and the measured acoustic parameters, resulting in a plurality of output audio channels having a virtual sound source with a virtual location. The output audio channels can be used to drive a plurality of speakers. Other aspects are also described and claimed.Type: GrantFiled: June 28, 2021Date of Patent: April 9, 2024Assignee: Apple Inc.Inventors: Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder
-
Publication number: 20240107259Abstract: A device may include microphones worn on a head of a user. The device may include a processor, configured to obtain microphone signals from the plurality of microphones. The processor may attenuate breathing sound from the user by processing the microphone signals, resulting in attenuated microphone signals. The processor may render one or more output audio channels based on the plurality of attenuated microphone signals.Type: ApplicationFiled: August 30, 2023Publication date: March 28, 2024Inventors: Yoo Mi Hur, Ashrith Deshpande, Prateek Murgai, Joshua D. Atkins, Symeon Delikaris Manias
-
Publication number: 20240098442Abstract: An audio processing system may obtain a size of a visual object to present to a display. The audio processing system may determine a virtual placement for each of a plurality of virtual speakers at least based on the size of the visual object. Each of the plurality of virtual speakers may be spatially rendered at each virtual placement through binaural audio, for playback through head-worn speakers. Other aspects are also described and claimed.Type: ApplicationFiled: August 29, 2023Publication date: March 21, 2024Inventors: Shai Messingher Lang, Joshua D. Atkins, Scott A. Wardle, Symeon Delikaris Manias
-
Patent number: 11937063Abstract: A method performed by a programmed processor of an audio system, the method includes receiving a sound track that has a track length, producing a binaural audio version of a sound track, the binaural audio version having an extended track length performing a fading operation upon the binaural audio version to gradually reduce a signal level of the binaural audio version to below a signal threshold level at a time along the extended track length that corresponds to an end time of the track length of the sound track; and storing the binaural audio version having the track length of the sound track in memory for later transmission to an audio playback device for driving one or more speakers.Type: GrantFiled: May 5, 2022Date of Patent: March 19, 2024Assignee: Apple Inc.Inventors: Juha O. Merimaa, Abdullah Fahim, Andrey D. Del Pozo, Joshua D. Atkins
-
Patent number: 11930337Abstract: An audio device can sense sound in a physical environment using a plurality of microphones to generate a plurality of microphone signals. Clean speech can be extracted from microphone signals. Ambience can be extracted from the microphone signals. The clean speech can be encoded at a first compression level. The ambience can be encoded at a second compression level that is higher than the first compression level. Other aspects are also described and claimed.Type: GrantFiled: June 28, 2021Date of Patent: March 12, 2024Assignee: Apple IncInventors: Tomlinson Holman, Christopher T. Eubank, Joshua D. Atkins, Soenke Pelzer, Dirk Schroeder
-
Publication number: 20230410828Abstract: Disclosed is a reference-less echo mitigation or cancellation technique. The technique enables suppression of echoes from an interference signal when a reference version of the interference signal conventionally used for echo mitigation may not be available. A first stage of the technique may use a machine learning model to model a target audio area surrounding a device so that a target audio signal estimated as originating from within the target audio area may be accepted. In contrast, audio signals such as playback of media content on a TV or other interfering signals estimated as originating from outside the target audio area may be suppressed. A second stage of the technique may be a level-based suppressor that further attenuates the residual echo from the output of the first stage based on an audio level threshold. Side information may be provided to adjust the target audio area or the audio level threshold.Type: ApplicationFiled: June 21, 2022Publication date: December 21, 2023Inventors: Ramin Pishehvar, Mehrez Souden, Sean A. Ramprashad, Jason Wung, Ante Jukic, Joshua D. Atkins
-
Patent number: 11849291Abstract: A plurality of microphone signals can be captured with a plurality of microphones of the device. One or more echo dominant audio signals can be determined based on a pick-up beam directed towards one or more speakers of a playback device. Sound that is emitted from the one or more speakers and sensed by the plurality of microphones can be removed from plurality of microphone signals, by using the one or more echo dominant audio signals as a reference, resulting in clean audio.Type: GrantFiled: May 17, 2021Date of Patent: December 19, 2023Assignee: Apple Inc.Inventors: Mehrez Souden, Jason Wung, Ante Jukic, Ramin Pishehvar, Joshua D. Atkins
-
Patent number: 11841899Abstract: A device with microphones can generate microphone signals during an audio recording. The device can store, in an electronic audio data file, the microphone signals, and metadata that includes impulse responses of the microphones. Other aspects are described and claimed.Type: GrantFiled: June 11, 2020Date of Patent: December 12, 2023Assignee: Apple Inc.Inventors: Jonathan D. Sheaffer, Symeon Delikaris Manias, Gaetan R. Lorho, Peter A. Raffensperger, Eric A. Allamanche, Frank Baumgarte, Dipanjan Sen, Joshua D. Atkins, Juha O. Merimaa
-
Patent number: 11818561Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.Type: GrantFiled: November 9, 2022Date of Patent: November 14, 2023Assignee: Apple Inc.Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow
-
Publication number: 20230111509Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.Type: ApplicationFiled: December 13, 2022Publication date: April 13, 2023Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20230104111Abstract: One or more acoustic parameters of a current acoustic environment of a user may be determined based on sensor signals captured by one or more sensors of the device. One or more preset acoustic parameters may be determined based on the one or more acoustic parameters of the current acoustic environment of the user and an acoustic environment of an audio file comprising audio signals that is determined based on the audio signals of the audio file or metadata of the audio file. The audio signals may be spatially rendered by applying spatial filters that include the one or more preset acoustic parameters to the audio signals, resulting in binaural audio signals. The binaural audio signals may be used to drive speakers of a headset. Other aspects are described and claimed.Type: ApplicationFiled: August 19, 2022Publication date: April 6, 2023Inventors: Prateek Murgai, John E. Arthur, Joshua D. Atkins, Juha O. Merimaa, Dipanjan Sen, Brandon J. Rice, Alexander Singh Alvarado, Jonathan D. Sheaffer, Benjamin Bernard, David E. Romblom
-
Patent number: 11532306Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, foregoing initiating a session of the digital assistant.Type: GrantFiled: December 3, 2020Date of Patent: December 20, 2022Assignee: Apple Inc.Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
-
Publication number: 20220394406Abstract: A method performed by a programmed processor of an audio system, the method includes receiving a sound track that has a track length, producing a binaural audio version of a sound track, the binaural audio version having an extended track length performing a fading operation upon the binaural audio version to gradually reduce a signal level of the binaural audio version to below a signal threshold level at a time along the extended track length that corresponds to an end time of the track length of the sound track; and storing the binaural audio version having the track length of the sound track in memory for later transmission to an audio playback device for driving one or more speakers.Type: ApplicationFiled: May 5, 2022Publication date: December 8, 2022Inventors: Juha O. Merimaa, Abdullah Fahim, Andrey D. Del Pozo, Joshua D. Atkins
-
Patent number: 11514928Abstract: A device implementing a system for processing speech in an audio signal includes at least one processor configured to receive an audio signal corresponding to at least one microphone of a device, and to determine, using a first model, a first probability that a speech source is present in the audio signal. The at least one processor is further configured to determine, using a second model, a second probability that an estimated location of a source of the audio signal corresponds to an expected position of a user of the device, and to determine a likelihood that the audio signal corresponds to the user of the device based on the first and second probabilities.Type: GrantFiled: December 9, 2019Date of Patent: November 29, 2022Assignee: Apple Inc.Inventors: Mehrez Souden, Ante Jukic, Jason Wung, Ashrith Deshpande, Joshua D. Atkins
-
Patent number: 11508388Abstract: A device for processing audio signals in a time-domain includes a processor configured to receive multiple audio signals corresponding to respective microphones of at least two or more microphones of the device, at least one of the multiple audio signals comprising speech of a user of the device. The processor is configured to provide the multiple audio signals to a machine learning model, the machine learning model having been trained based at least in part on an expected position of the user of the device and expected positions of the respective microphones on the device. The processor is configured to provide an audio signal that is enhanced with respect to the speech of the user relative to the multiple audio signals, wherein the audio signal is a waveform output from the machine learning model.Type: GrantFiled: November 20, 2020Date of Patent: November 22, 2022Assignee: Apple Inc.Inventors: Mehrez Souden, Symeon Delikaris Manias, Joshua D. Atkins, Ante Jukic, Ramin Pishehvar
-
Publication number: 20220366927Abstract: Disclosed is a multi-task machine learning model such as a time-domain deep neural network (DNN) that jointly generate an enhanced target speech signal and target audio parameters from a mixed signal of target speech and interference signal. The DNN may encode the mixed signal, determine masks used to jointly estimate the target signal and the target audio parameters based on the encoded mixed signal, apply the mask to separate the target speech from the interference signal to jointly estimate the target signal and the target audio parameters, and decode the masked features to enhance the target speech signal and to estimate the target audio parameters. The target audio parameters may include a voice activity detection (VAD) flag of the target speech. The DNN may leverage multi-channel audio signal and multi-modal signals such as video signals of the target speaker to improve the robustness of the enhanced target speech signal.Type: ApplicationFiled: May 15, 2021Publication date: November 17, 2022Inventors: Ramin Pishehvar, Ante Jukic, Mehrez Souden, Jason Wung, Feipeng Li, Joshua D. Atkins
-
Publication number: 20220369030Abstract: A plurality of microphone signals can be captured with a plurality of microphones of the device. One or more echo dominant audio signals can be determined based on a pick-up beam directed towards one or more speakers of a playback device. Sound that is emitted from the one or more speakers and sensed by the plurality of microphones can be removed from plurality of microphone signals, by using the one or more echo dominant audio signals as a reference, resulting in clean audio.Type: ApplicationFiled: May 17, 2021Publication date: November 17, 2022Inventors: Mehrez Souden, Jason Wung, Ante Jukic, Ramin Pishehvar, Joshua D. Atkins
-
Patent number: 11503409Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.Type: GrantFiled: March 12, 2021Date of Patent: November 15, 2022Assignee: APPLE INC.Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow