Patents by Inventor Joshua D. Atkins
Joshua D. Atkins has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20200084560
Abstract: A presence of a person within a camera field of view of an electronic device is determined by digitally processing images captured by a camera. A position of a body member of the person with respect to the electronic device is also computed by digitally processing the camera captured images. A crosstalk cancellation (XTC) signal is adjusted based on the computed position of the body member. Adjusting the XTC signal includes adjusting a first predetermined model location, which includes a location at which a user should be in order to achieve a desired virtual acoustics effect. Program audio is processed based on the adjusted XTC signal to generate audio signals that drive speakers. Other aspects are also described and claimed.
Type: Application
Filed: September 10, 2018
Publication date: March 12, 2020
Inventors: Darius A. Satongar, Joshua D. Atkins, Justin D. Crosby, Lance F. Reichert, Martin E. Johnson, Sawyer Cohen
-
Patent number: 10546593
Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability (SPP) value is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.
Type: Grant
Filed: December 4, 2017
Date of Patent: January 28, 2020
Assignee: APPLE INC.
Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
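The abstract above does not say how the SPP value configures the multi-channel filter. One common approach, assumed here purely for illustration, is to let the SPP control how fast a noise covariance estimate is updated, so that frames likely to contain speech barely change the noise statistics a filter would later be built from. The function name `update_noise_covariance` and the smoothing constant `alpha` are hypothetical and do not come from the patent.

```python
from typing import List

Matrix = List[List[complex]]


def update_noise_covariance(noise_cov: Matrix, frame: List[complex],
                            spp: float, alpha: float = 0.9) -> Matrix:
    """One SPP-weighted recursive update of a noise covariance matrix.

    noise_cov: current estimate for one frequency bin (channels x channels).
    frame:     multi-channel spectrum values for that bin in the current frame.
    spp:       speech presence probability in [0, 1] (e.g. a DNN output).
    alpha:     baseline smoothing factor (illustrative value, an assumption).
    """
    if not 0.0 <= spp <= 1.0:
        raise ValueError("spp must lie in [0, 1]")
    # When speech is likely (spp near 1) the effective smoothing approaches 1,
    # which freezes the noise estimate for this frame.
    alpha_eff = alpha + (1.0 - alpha) * spp
    n = len(frame)
    return [
        [alpha_eff * noise_cov[i][j]
         + (1.0 - alpha_eff) * frame[i] * frame[j].conjugate()
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
    print(update_noise_covariance(identity, [1 + 0j, 1j], spp=0.0))
```

With `spp=1.0` the estimate is left essentially unchanged, and with `spp=0.0` the update reduces to ordinary exponential averaging of the frame's outer product.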
-
Patent number: 10524080
Abstract: An audio processing system has one or more processors that process an audio signal on three paths. The first path has a direct gain and a direct virtual source algorithm operating on the audio signal. The second path has a plurality of early reflection gains operating on the audio signal. Operation with the early reflection gains produces a plurality of early reflections. Each of the early reflection signals may be subjected to a delay and may be processed according to an early reflections virtual source algorithm. The third path has a reverb gain and binaural reverb filters operating on the audio signal. The third path also has a crosstalk canceler. A mixer combines left and right channel outputs of each of the first path, second path and third path. The mixer produces a left loudspeaker signal and a right loudspeaker signal.
Type: Grant
Filed: August 23, 2018
Date of Patent: December 31, 2019
Assignee: APPLE INC.
Inventors: Martin E. Johnson, Darius A. Satongar, Stuart J. Wood, Lance F. Reichert, Juha O. Merimaa, Joshua D. Atkins
-
Patent number: 10482899
Abstract: An audio system has a housing in which are integrated a number of microphones. A programmed processor accesses the microphone signals and produces a number of acoustic pick up beams based on groups of microphones, an estimation of voice activity and an estimation of noise characteristics on each beam. Two or more beams including a voice beam that is used to pick up a desired voice and a noise beam that is used to provide information to estimate ambient noise are adaptively selected from among the plurality of beams, based on thresholds for voice separation and thresholds for noise-matching. Other embodiments are also described and claimed.
Type: Grant
Filed: August 1, 2016
Date of Patent: November 19, 2019
Assignee: Apple Inc.
Inventors: Sean A. Ramprashad, Esge B. Andersen, Joshua D. Atkins, Sorin V. Dusan, Vasu Iyengar, Tarun Pruthi, Lalin S. Theverapperuma
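The threshold-based beam selection described above could be sketched roughly as follows. The abstract does not define the thresholds, so this example assumes one plausible reading: the voice beam must lead the runner-up in voice activity by a separation margin, and the noise beam must have a noise level within a matching tolerance of the voice beam's. `BeamStats`, `select_beams`, and both default thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class BeamStats:
    voice_activity: float   # estimated voice activity in [0, 1]
    noise_level_db: float   # estimated noise level on the beam, in dB


def select_beams(beams: Sequence[BeamStats],
                 voice_separation: float = 0.2,
                 noise_match_db: float = 3.0) -> Optional[Tuple[int, int]]:
    """Return (voice_beam_index, noise_beam_index), or None if no pair qualifies."""
    if len(beams) < 2:
        return None
    # Rank beams from most to least voice activity.
    order = sorted(range(len(beams)),
                   key=lambda i: beams[i].voice_activity, reverse=True)
    voice, runner_up = order[0], order[1]
    # Voice separation: the best beam must clearly stand out from the rest.
    lead = beams[voice].voice_activity - beams[runner_up].voice_activity
    if lead < voice_separation:
        return None
    # Noise matching: keep beams whose noise level resembles the voice beam's.
    candidates = [
        i for i in order[1:]
        if abs(beams[i].noise_level_db - beams[voice].noise_level_db) <= noise_match_db
    ]
    if not candidates:
        return None
    # Among matching beams, prefer the one that picks up the least voice.
    noise = min(candidates, key=lambda i: beams[i].voice_activity)
    return voice, noise


if __name__ == "__main__":
    stats = [BeamStats(0.9, -40.0), BeamStats(0.3, -41.0),
             BeamStats(0.1, -30.0), BeamStats(0.2, -39.0)]
    print(select_beams(stats))
```

Returning `None` when no pair passes both checks lets a caller keep its previous selection, which is one simple way to make the choice adaptive from frame to frame.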
-
Patent number: 10403299
Abstract: A digital speech enhancement system that performs a specific chain of digital signal processing operations upon multi-channel sound pick up, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer. The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.
Type: Grant
Filed: June 2, 2017
Date of Patent: September 3, 2019
Assignee: Apple Inc.
Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
-
Patent number: 10390131
Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.
Type: Grant
Filed: September 29, 2017
Date of Patent: August 20, 2019
Assignee: Apple Inc.
Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
-
Publication number: 20190222950
Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
Type: Application
Filed: January 4, 2019
Publication date: July 18, 2019
Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
-
Patent number: 10334357
Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into the time-frequency domain. One or more sub-band directional features are estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.
Type: Grant
Filed: September 29, 2017
Date of Patent: June 25, 2019
Assignee: Apple Inc.
Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
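The database-generation step above (convolving source signals with measured impulse responses) can be illustrated with a minimal sketch. It uses plain Python lists and direct-form convolution; the names `convolve` and `build_database`, and the dictionary layout keyed by (source, impulse response) name, are assumptions made for this example rather than details from the patent.

```python
from typing import Dict, List, Tuple


def convolve(signal: List[float], impulse_response: List[float]) -> List[float]:
    """Full linear convolution of a source signal with an impulse response."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def build_database(sources: Dict[str, List[float]],
                   impulse_responses: Dict[str, List[float]]
                   ) -> Dict[Tuple[str, str], List[float]]:
    """Pair every source with every impulse response (hypothetical layout)."""
    return {
        (src_name, ir_name): convolve(src, ir)
        for src_name, src in sources.items()
        for ir_name, ir in impulse_responses.items()
    }


if __name__ == "__main__":
    db = build_database({"click": [1.0, 2.0, 3.0]}, {"mic0": [1.0, 0.5]})
    print(db[("click", "mic0")])  # [1.0, 2.5, 4.0, 1.5]
```

For realistic signal lengths an FFT-based convolution would be much faster; the direct form is used here only to keep the example dependency-free.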
-
Publication number: 20190172476
Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability (SPP) value is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.
Type: Application
Filed: December 4, 2017
Publication date: June 6, 2019
Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
-
Publication number: 20190104357
Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into the time-frequency domain. One or more sub-band directional features are estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.
Type: Application
Filed: September 29, 2017
Publication date: April 4, 2019
Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
-
Publication number: 20190104359
Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.
Type: Application
Filed: September 29, 2017
Publication date: April 4, 2019
Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
-
Publication number: 20190104364
Abstract: Placement of one or two placed virtual loudspeakers within a loudspeaker setup that includes real loudspeakers is determined and vector base amplitude panning (VBAP) gains including the gains of the real loudspeakers and placed one or two virtual loudspeakers are then determined. Gains of one or two placed virtual loudspeakers are redistributed to the real loudspeakers to ensure preservation of total energy. Real loudspeakers in the loudspeaker setup have redistributed gains of one or two placed virtual loudspeakers. Loudspeaker outputs are generated and transmitted to the real loudspeakers to be played back. When received audio content is ambisonics content, a predetermined grid is generated and HOA content is projected to the grid. Other aspects are also described.
Type: Application
Filed: June 5, 2018
Publication date: April 4, 2019
Inventors: Ismael NAWFAL, Symeon DELIKARIS MANIAS, Joshua D. ATKINS
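As a rough illustration of the two ideas in the abstract above, the sketch below computes standard two-dimensional VBAP gains for a loudspeaker pair and then folds a virtual loudspeaker's gain into the real loudspeakers while keeping the sum of squared gains constant. The equal split of the virtual loudspeaker's energy is an assumption made for this example; the abstract does not state the actual redistribution rule. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def vbap_pair_gains(source_deg: float, spk1_deg: float,
                    spk2_deg: float) -> Tuple[float, float]:
    """Energy-normalized 2-D VBAP gains for a source panned between two speakers."""
    px, py = math.cos(math.radians(source_deg)), math.sin(math.radians(source_deg))
    ax, ay = math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg))
    bx, by = math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg))
    det = ax * by - bx * ay
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    # Solve p = g1 * a + g2 * b with Cramer's rule.
    g1 = (px * by - bx * py) / det
    g2 = (ax * py - px * ay) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm


def redistribute_virtual_gain(real_gains: List[float],
                              virtual_gain: float) -> List[float]:
    """Share a virtual speaker's energy equally among real speakers (assumed rule).

    Assumes non-negative gains. The sum of squared gains after redistribution
    equals the sum of squared real gains plus the squared virtual gain.
    """
    if not real_gains:
        raise ValueError("at least one real loudspeaker is required")
    share = virtual_gain ** 2 / len(real_gains)
    return [math.sqrt(g * g + share) for g in real_gains]


if __name__ == "__main__":
    print(vbap_pair_gains(0.0, 30.0, -30.0))
    print(redistribute_virtual_gain([0.6, 0.0], 0.8))
```

A source midway between two symmetric loudspeakers receives equal gains on both, and a source aligned with one loudspeaker is routed entirely to it.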
-
Publication number: 20190074009
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.
Type: Application
Filed: November 5, 2018
Publication date: March 7, 2019
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Publication number: 20190058952
Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.
Type: Application
Filed: July 6, 2018
Publication date: February 21, 2019
Inventors: Ismael H. NAWFAL, Joshua D. ATKINS, Stephen J. NIMICK, Guy C. NICHOLSON, Jason M. HARLOW
-
Patent number: 10178490
Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
Type: Grant
Filed: June 30, 2017
Date of Patent: January 8, 2019
Assignee: Apple Inc.
Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
-
Publication number: 20190007780
Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
Type: Application
Filed: June 30, 2017
Publication date: January 3, 2019
Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
-
Publication number: 20180350379
Abstract: A digital speech enhancement system that performs a specific chain of digital signal processing operations upon multi-channel sound pick up, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer. The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.
Type: Application
Filed: June 2, 2017
Publication date: December 6, 2018
Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
-
Publication number: 20180336892
Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.
Type: Application
Filed: March 13, 2018
Publication date: November 22, 2018
Inventors: Yoon KIM, John BRIDLE, Joshua D. ATKINS, Feipeng LI, Mehrez SOUDEN
-
Patent number: 10074380
Abstract: Method for performing speech enhancement using a Deep Neural Network (DNN)-based signal starts with training DNN offline by exciting a microphone using target training signal that includes signal approximation of clean speech. Loudspeaker is driven with a reference signal and outputs loudspeaker signal. Microphone then generates microphone signal based on at least one of: near-end speaker signal, ambient noise signal, or loudspeaker signal. Acoustic-echo-canceller (AEC) generates AEC echo-cancelled signal based on reference signal and microphone signal. Loudspeaker signal estimator generates estimated loudspeaker signal based on microphone signal and AEC echo-cancelled signal. DNN receives microphone signal, reference signal, AEC echo-cancelled signal, and estimated loudspeaker signal and generates a speech reference signal that includes signal statistics for residual echo or for noise.
Type: Grant
Filed: August 3, 2016
Date of Patent: September 11, 2018
Assignee: Apple Inc.
Inventors: Jason Wung, Ramin Pishehvar, Daniele Giacobello, Joshua D. Atkins
-
Patent number: 10034092
Abstract: Digital audio signal processing techniques used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.
Type: Grant
Filed: September 22, 2016
Date of Patent: July 24, 2018
Assignee: Apple Inc.
Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow