Patents by Inventor Joshua D. Atkins

Joshua D. Atkins has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHODS AND DEVICES FOR USER DETECTION BASED SPATIAL AUDIO PLAYBACK

Publication number: 20200084560

Abstract: A presence of a person within a camera field of view of an electronic device is determined by digitally processing images captured by a camera. A position of a body member of the person with respect to the electronic device is also computed by digitally processing the camera captured images. A crosstalk cancellation (XTC) signal is adjusted based on the computed position of the body member. Adjusting the XTC signal includes adjusting a first predetermined model location, which includes a location at which a user should be in order to achieve a desired virtual acoustics effect. Program audio is processed based on the adjusted XTC signal to generate audio signals that drive speakers. Other aspects are also described and claimed.

Type: Application

Filed: September 10, 2018

Publication date: March 12, 2020

Inventors: Darius A. Satongar, Joshua D. Atkins, Justin D. Crosby, Lance F. Reichert, Martin E. Johnson, Sawyer Cohen
Deep learning driven multi-channel filtering for speech enhancement

Patent number: 10546593

Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.

Type: Grant

Filed: December 4, 2017

Date of Patent: January 28, 2020

Assignee: APPLE INC.

Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
System to move a virtual sound away from a listener using a crosstalk canceler

Patent number: 10524080

Abstract: An audio processing system has one or more processors that process an audio signal on three paths. The first path has a direct gain and a direct virtual source algorithm operating on the audio signal. The second path has a plurality of early reflection gains operating on the audio signal. Operation with the early reflection gains produces a plurality of early reflections. Each of the early reflection signals may be subjected to a delay and may be processed according to an early reflections virtual source algorithm. The third path has a reverb gain and binaural reverb filters operating on the audio signal. The third path also has a crosstalk canceler. A mixer combines left and right channel outputs of each of the first path, second path and third path. The mixer produces a left loudspeaker signal and a right loudspeaker signal.

Type: Grant

Filed: August 23, 2018

Date of Patent: December 31, 2019

Assignee: APPLE INC.

Inventors: Martin E. Johnson, Darius A. Satongar, Stuart J. Wood, Lance F. Reichert, Juha O. Merimaa, Joshua D. Atkins
Coordination of beamformers for noise estimation and noise suppression

Patent number: 10482899

Abstract: An audio system has a housing in which are integrated a number of microphones. A programmed processor accesses the microphone signals and produces a number of acoustic pick up beams based on groups of microphones, an estimation of voice activity and an estimation of noise characteristics on each beam. Two or more beams including a voice beam that is used to pick up a desired voice and a noise beam that is used to provide information to estimate ambient noise are adaptively selected from among the plurality of beams, based on thresholds for voice separation and thresholds for noise-matching. Other embodiments are also described and claimed.

Type: Grant

Filed: August 1, 2016

Date of Patent: November 19, 2019

Assignee: Apple Inc.

Inventors: Sean A. Ramprashad, Esge B. Andersen, Joshua D. Atkins, Sorin V. Dusan, Vasu Iyengar, Tarun Pruthi, Lalin S. Theverapperuma
Multi-channel speech signal enhancement for robust voice trigger detection and automatic speech recognition

Patent number: 10403299

Abstract: A digital speech enhancement system that performs a specific chain of digital signal processing operations upon multi-channel sound pick up, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer. The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.

Type: Grant

Filed: June 2, 2017

Date of Patent: September 3, 2019

Assignee: Apple Inc.

Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
Recording musical instruments using a microphone array in a device

Patent number: 10390131

Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.

Type: Grant

Filed: September 29, 2017

Date of Patent: August 20, 2019

Assignee: Apple Inc.

Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
INTELLIGENT AUDIO RENDERING FOR VIDEO RECORDING

Publication number: 20190222950

Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.

Type: Application

Filed: January 4, 2019

Publication date: July 18, 2019

Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
Machine learning based sound field analysis

Patent number: 10334357

Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into time-frequency domain. One or more sub-band directional features is estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.

Type: Grant

Filed: September 29, 2017

Date of Patent: June 25, 2019

Assignee: Apple Inc.

Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
DEEP LEARNING DRIVEN MULTI-CHANNEL FILTERING FOR SPEECH ENHANCEMENT

Publication number: 20190172476

Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.

Type: Application

Filed: December 4, 2017

Publication date: June 6, 2019

Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
MACHINE LEARNING BASED SOUND FIELD ANALYSIS

Publication number: 20190104357

Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into time-frequency domain. One or more sub-band directional features is estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.

Type: Application

Filed: September 29, 2017

Publication date: April 4, 2019

Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
RECORDING MUSICAL INSTRUMENTS USING A MICROPHONE ARRAY IN A DEVICE

Publication number: 20190104359

Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.

Type: Application

Filed: September 29, 2017

Publication date: April 4, 2019

Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
SYSTEM AND METHOD FOR PERFORMING PANNING FOR AN ARBITRARY LOUDSPEAKER SETUP

Publication number: 20190104364

Abstract: Placement of one or two placed virtual loudspeakers within a loudspeaker setup that includes real loudspeakers is determined and vector base amplitude panning (VBAP) gains including the gains of the real loudspeakers and placed one or two virtual loudspeakers are also then determined. Gains of one or two placed virtual loudspeakers are redistributed to the real loudspeakers to ensure preservation of total energy. Real loudspeakers in the loudspeaker setup have redistributed gains of one or two placed virtual loudspeakers. Loudspeaker outputs are generated and transmitted to the real loudspeakers to be played back. When received audio content is ambisonics content, a predetermined grid is generated and HOA content is projected to the grid. Other aspects are also described.

Type: Application

Filed: June 5, 2018

Publication date: April 4, 2019

Inventors: Ismael NAWFAL, Symeon DELIKARIS MANIAS, Joshua D. ATKINS
DETECTING A TRIGGER OF A DIGITAL ASSISTANT

Publication number: 20190074009

Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.

Type: Application

Filed: November 5, 2018

Publication date: March 7, 2019

Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
SPATIAL HEADPHONE TRANSPARENCY

Publication number: 20190058952

Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.

Type: Application

Filed: July 6, 2018

Publication date: February 21, 2019

Inventors: Ismael H. NAWFAL, Joshua D. ATKINS, Stephen J. NIMICK, Guy C. NICHOLSON, Jason M. HARLOW
Intelligent audio rendering for video recording

Patent number: 10178490

Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.

Type: Grant

Filed: June 30, 2017

Date of Patent: January 8, 2019

Assignee: Apple Inc.

Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
Intelligent Audio Rendering for Video Recording

Publication number: 20190007780

Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.

Type: Application

Filed: June 30, 2017

Publication date: January 3, 2019

Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
Multi-Channel Speech Signal Enhancement for Robust Voice Trigger Detection and Automatic Speech Recognition

Publication number: 20180350379

Abstract: A digital speech enhancement system that performs a specific chain of digital signal processing operations upon multi-channel sound pick up, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer. The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.

Type: Application

Filed: June 2, 2017

Publication date: December 6, 2018

Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
DETECTING A TRIGGER OF A DIGITAL ASSISTANT

Publication number: 20180336892

Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.

Type: Application

Filed: March 13, 2018

Publication date: November 22, 2018

Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
System and method for performing speech enhancement using a deep neural network-based signal

Patent number: 10074380

Abstract: Method for performing speech enhancement using a Deep Neural Network (DNN)-based signal starts with training DNN offline by exciting a microphone using target training signal that includes signal approximation of clean speech. Loudspeaker is driven with a reference signal and outputs loudspeaker signal. Microphone then generates microphone signal based on at least one of: near-end speaker signal, ambient noise signal, or loudspeaker signal. Acoustic-echo-canceller (AEC) generates AEC echo-cancelled signal based on reference signal and microphone signal. Loudspeaker signal estimator generates estimated loudspeaker signal based on microphone signal and AEC echo-cancelled signal. DNN receives microphone signal, reference signal, AEC echo-cancelled signal, and estimated loudspeaker signal and generates a speech reference signal that includes signal statistics for residual echo or for noise.

Type: Grant

Filed: August 3, 2016

Date of Patent: September 11, 2018

Assignee: Apple Inc.

Inventors: Jason Wung, Ramin Pishehvar, Daniele Giacobello, Joshua D. Atkins
Spatial headphone transparency

Patent number: 10034092

Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.

Type: Grant

Filed: September 22, 2016

Date of Patent: July 24, 2018

Assignee: Apple Inc.

Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow
