Patents by Inventor Joshua D. Atkins

Joshua D. Atkins has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200084560
    Abstract: A presence of a person within a camera field of view of an electronic device is determined by digitally processing images captured by a camera. A position of a body member of the person with respect to the electronic device is also computed by digitally processing the camera captured images. A crosstalk cancellation (XTC) signal is adjusted based on the computed position of the body member. Adjusting the XTC signal includes adjusting a first predetermined model location, which includes a location at which a user should be in order to achieve a desired virtual acoustics effect. Program audio is processed based on the adjusted XTC signal to generate audio signals that drive speakers. Other aspects are also described and claimed.
    Type: Application
    Filed: September 10, 2018
    Publication date: March 12, 2020
    Inventors: Darius A. Satongar, Joshua D. Atkins, Justin D. Crosby, Lance F. Reichert, Martin E. Johnson, Sawyer Cohen
  • Patent number: 10546593
    Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability (SPP) is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.
    Type: Grant
    Filed: December 4, 2017
    Date of Patent: January 28, 2020
    Assignee: Apple Inc.
    Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
  • Patent number: 10524080
    Abstract: An audio processing system has one or more processors that process an audio signal on three paths. The first path has a direct gain and a direct virtual source algorithm operating on the audio signal. The second path has a plurality of early reflection gains operating on the audio signal. Operation with the early reflection gains produces a plurality of early reflections. Each of the early reflection signals may be subjected to a delay and may be processed according to an early reflections virtual source algorithm. The third path has a reverb gain and binaural reverb filters operating on the audio signal. The third path also has a crosstalk canceler. A mixer combines left and right channel outputs of each of the first path, second path and third path. The mixer produces a left loudspeaker signal and a right loudspeaker signal.
    Type: Grant
    Filed: August 23, 2018
    Date of Patent: December 31, 2019
    Assignee: Apple Inc.
    Inventors: Martin E. Johnson, Darius A. Satongar, Stuart J. Wood, Lance F. Reichert, Juha O. Merimaa, Joshua D. Atkins
  • Patent number: 10482899
    Abstract: An audio system has a housing in which are integrated a number of microphones. A programmed processor accesses the microphone signals and produces a number of acoustic pickup beams based on groups of microphones, an estimation of voice activity, and an estimation of noise characteristics on each beam. Two or more beams, including a voice beam that is used to pick up a desired voice and a noise beam that is used to provide information to estimate ambient noise, are adaptively selected from among the plurality of beams, based on thresholds for voice separation and thresholds for noise-matching. Other embodiments are also described and claimed.
    Type: Grant
    Filed: August 1, 2016
    Date of Patent: November 19, 2019
    Assignee: Apple Inc.
    Inventors: Sean A. Ramprashad, Esge B. Andersen, Joshua D. Atkins, Sorin V. Dusan, Vasu Iyengar, Tarun Pruthi, Lalin S. Theverapperuma
  • Patent number: 10403299
    Abstract: A digital speech enhancement system performs a specific chain of digital signal processing operations upon multi-channel sound pickup, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer (ASR). The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: September 3, 2019
    Assignee: Apple Inc.
    Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
  • Patent number: 10390131
    Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: August 20, 2019
    Assignee: Apple Inc.
    Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
  • Publication number: 20190222950
    Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
    Type: Application
    Filed: January 4, 2019
    Publication date: July 18, 2019
    Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
  • Patent number: 10334357
    Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into the time-frequency domain. One or more sub-band directional features are estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: June 25, 2019
    Assignee: Apple Inc.
    Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
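The database-generation step in the abstract above (convolving source signals with measured impulse responses) can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the dictionary layout, and the toy signals are all hypothetical, and real impulse responses would be measured rather than typed in.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two finite sequences (direct form)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def build_database(sources, impulse_responses):
    """Convolve every source with every impulse response.

    The (source name, response name) keying is an assumption made for
    illustration; the abstract does not say how the database is organized.
    """
    return {
        (src_name, ir_name): convolve(src, ir)
        for src_name, src in sources.items()
        for ir_name, ir in impulse_responses.items()
    }


# Toy data standing in for source signals and measured device responses.
sources = {"click": [1.0, 0.0, 0.0], "step": [1.0, 1.0, 1.0]}
impulse_responses = {"mic0": [1.0, 0.5], "mic1": [0.0, 1.0]}
database = build_database(sources, impulse_responses)
```

Each entry simulates what one microphone of the device would record for one source; the later steps in the abstract (time-frequency transform, feature estimation, DNN training) are outside this sketch.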
  • Publication number: 20190172476
    Abstract: A number of features are extracted from a current frame of a multi-channel speech pickup and from side information that is a linear echo estimate, a diffuse signal component, or a noise estimate of the multi-channel speech pickup. A DNN-based speech presence probability (SPP) is produced for the current frame, where the SPP value is produced in response to the extracted features being input to the DNN. The DNN-based SPP value is applied to configure a multi-channel filter whose input is the multi-channel speech pickup and whose output is a single audio signal. In one aspect, the system is designed to run online, at low enough latency for real-time applications such as voice trigger detection. Other aspects are also described and claimed.
    Type: Application
    Filed: December 4, 2017
    Publication date: June 6, 2019
    Inventors: Jason Wung, Mehrez Souden, Ramin Pishehvar, Joshua D. Atkins
  • Publication number: 20190104357
    Abstract: Impulse responses of a device are measured. A database of sound files is generated by convolving source signals with the impulse responses of the device. The sound files from the database are transformed into the time-frequency domain. One or more sub-band directional features are estimated at each sub-band of the time-frequency domain. A deep neural network (DNN) is trained for each sub-band based on the estimated one or more sub-band directional features and a target directional feature.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Joshua D. Atkins, Mehrez Souden, Symeon Delikaris-Manias, Peter Raffensperger
  • Publication number: 20190104359
    Abstract: A microphone array included in a portable electronic device is used to generate various virtual studio microphones by combining one or more microphone signals to produce one or more acoustic pickup beams. An error is determined in a position of the microphone array relative to an audio source to be recorded. An interface is displayed to instruct a user on repositioning the microphone array relative to the instrument and the instrument is recorded using the repositioned microphone array.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Jonathan D. Sheaffer, Darius A. Satongar, Joshua D. Atkins, Martin E. Johnson
  • Publication number: 20190104364
    Abstract: Placement of one or two virtual loudspeakers within a loudspeaker setup that includes real loudspeakers is determined, and vector base amplitude panning (VBAP) gains, including the gains of the real loudspeakers and of the one or two placed virtual loudspeakers, are then determined. Gains of the one or two placed virtual loudspeakers are redistributed to the real loudspeakers to ensure preservation of total energy. Real loudspeakers in the loudspeaker setup have redistributed gains of the one or two placed virtual loudspeakers. Loudspeaker outputs are generated and transmitted to the real loudspeakers to be played back. When received audio content is ambisonics content, a predetermined grid is generated and higher-order ambisonics (HOA) content is projected to the grid. Other aspects are also described.
    Type: Application
    Filed: June 5, 2018
    Publication date: April 4, 2019
    Inventors: Ismael Nawfal, Symeon Delikaris-Manias, Joshua D. Atkins
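The idea in the abstract above, computing VBAP gains for a pair that includes a virtual loudspeaker and then folding the virtual gain back into the real loudspeakers without changing total energy, can be sketched as follows. This is a simplified 2-D illustration under stated assumptions, not the method claimed in the application: the function names and angles are hypothetical, and the equal split of the virtual loudspeaker's energy across all real loudspeakers is an assumed redistribution rule chosen only because it keeps the sum of squared gains constant.

```python
import math


def vbap_pair_gains(source_deg, spk_a_deg, spk_b_deg):
    """2-D VBAP for one loudspeaker pair.

    Solves g_a * l_a + g_b * l_b = p for unit vectors l_a, l_b, p
    (Cramer's rule), then normalizes so that g_a**2 + g_b**2 == 1.
    """
    def unit(deg):
        return math.cos(math.radians(deg)), math.sin(math.radians(deg))

    p, a, b = unit(source_deg), unit(spk_a_deg), unit(spk_b_deg)
    det = a[0] * b[1] - a[1] * b[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    g_a = (p[0] * b[1] - p[1] * b[0]) / det
    g_b = (a[0] * p[1] - a[1] * p[0]) / det
    norm = math.hypot(g_a, g_b)
    return g_a / norm, g_b / norm


def redistribute(real_gains, virtual_gains):
    """Fold virtual-loudspeaker gains into the real loudspeakers.

    Assumed rule (not from the source): the virtual energy is shared
    equally, so sum(g**2) is the same before and after.
    """
    share = sum(v * v for v in virtual_gains) / len(real_gains)
    return [math.sqrt(g * g + share) for g in real_gains]


# Real loudspeakers at +30 and -30 degrees, one virtual loudspeaker at 150.
# A source at 90 degrees is panned between the +30 real and the virtual one.
g_real, g_virtual = vbap_pair_gains(90.0, 30.0, 150.0)
final_gains = redistribute([g_real, 0.0], [g_virtual])
```

Here the second real loudspeaker starts with zero gain because it is not part of the active pair; after redistribution both real loudspeakers carry signal and the total energy is still 1.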
  • Publication number: 20190074009
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.
    Type: Application
    Filed: November 5, 2018
    Publication date: March 7, 2019
    Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
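The decision step in the abstract above (start a session if any audio stream corresponds to the spoken trigger, otherwise forgo it) reduces to a simple predicate. The sketch below is only an illustration: the per-stream scores, the threshold value, and the function name are hypothetical, and the abstract does not say how a stream is matched against the trigger.

```python
def should_start_session(stream_scores, threshold=0.5):
    """Return True if at least one stream's trigger score meets the threshold.

    `stream_scores` stands in for the output of some trigger detector run on
    each audio stream; both the scores and the threshold are assumptions.
    """
    return any(score >= threshold for score in stream_scores)


# One hypothetical score per processed audio stream.
start = should_start_session([0.12, 0.81, 0.30])
skip = should_start_session([0.12, 0.05, 0.30])
```

With these toy scores the first call would initiate a session because one stream exceeds the threshold, and the second would forgo it.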
  • Publication number: 20190058952
    Abstract: Digital audio signal processing techniques are used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.
    Type: Application
    Filed: July 6, 2018
    Publication date: February 21, 2019
    Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow
  • Patent number: 10178490
    Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: January 8, 2019
    Assignee: Apple Inc.
    Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
  • Publication number: 20190007780
    Abstract: Image analysis of a video signal is performed to produce first metadata, and audio analysis of a multi-channel sound track associated with the video signal is performed to produce second metadata. A number of time segments of the sound track are processed, wherein each time segment is processed by either (i) spatial filtering of the audio signals or (ii) spatial rendering of the audio signals, not both, wherein for each time segment a decision was made to select between the spatial filtering or the spatial rendering, in accordance with the first and second metadata. A mix of the processed sound track and the video signal is generated. Other embodiments are also described and claimed.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventors: Jonathan D. Sheaffer, Joshua D. Atkins, Martin E. Johnson, Stuart J. Wood
  • Publication number: 20180350379
    Abstract: A digital speech enhancement system performs a specific chain of digital signal processing operations upon multi-channel sound pickup, to result in a single, enhanced speech signal. The operations are designed to be computationally less complex yet as a whole yield an enhanced speech signal that produces accurate voice trigger detection and low word error rates by an automatic speech recognizer (ASR). The constituent operations or components of the system have been chosen so that the overall system is robust to changing acoustic conditions, and can deliver the enhanced speech signal with low enough latency so that the system can be used online (enabling real-time voice trigger detection and streaming ASR). Other embodiments are also described and claimed.
    Type: Application
    Filed: June 2, 2017
    Publication date: December 6, 2018
    Inventors: Jason Wung, Joshua D. Atkins, Ramin Pishehvar, Mehrez Souden
  • Publication number: 20180336892
    Abstract: Systems and processes for operating an intelligent automated assistant are provided. In accordance with one example, a method includes, at an electronic device with one or more processors, memory, and a plurality of microphones, sampling, at each of the plurality of microphones of the electronic device, an audio signal to obtain a plurality of audio signals; processing the plurality of audio signals to obtain a plurality of audio streams; and determining, based on the plurality of audio streams, whether any of the plurality of audio signals corresponds to a spoken trigger. The method further includes, in accordance with a determination that the plurality of audio signals corresponds to the spoken trigger, initiating a session of the digital assistant; and in accordance with a determination that the plurality of audio signals does not correspond to the spoken trigger, forgoing initiating a session of the digital assistant.
    Type: Application
    Filed: March 13, 2018
    Publication date: November 22, 2018
    Inventors: Yoon Kim, John Bridle, Joshua D. Atkins, Feipeng Li, Mehrez Souden
  • Patent number: 10074380
    Abstract: A method for performing speech enhancement using a Deep Neural Network (DNN)-based signal starts with training the DNN offline by exciting a microphone using a target training signal that includes a signal approximation of clean speech. A loudspeaker is driven with a reference signal and outputs a loudspeaker signal. The microphone then generates a microphone signal based on at least one of: a near-end speaker signal, an ambient noise signal, or the loudspeaker signal. An acoustic echo canceller (AEC) generates an AEC echo-cancelled signal based on the reference signal and the microphone signal. A loudspeaker signal estimator generates an estimated loudspeaker signal based on the microphone signal and the AEC echo-cancelled signal. The DNN receives the microphone signal, the reference signal, the AEC echo-cancelled signal, and the estimated loudspeaker signal and generates a speech reference signal that includes signal statistics for residual echo or for noise.
    Type: Grant
    Filed: August 3, 2016
    Date of Patent: September 11, 2018
    Assignee: Apple Inc.
    Inventors: Jason Wung, Ramin Pishehvar, Daniele Giacobello, Joshua D. Atkins
  • Patent number: 10034092
    Abstract: Digital audio signal processing techniques are used to provide an acoustic transparency function in a pair of headphones. A number of transparency filters can be computed at once, using optimization techniques or using a closed form solution, that are based on multiple re-seatings of the headphones and that are as a result robust for a population of wearers. In another embodiment, a transparency hearing filter of a headphone is computed by an adaptive system that takes into consideration the changing acoustic to electrical path between an earpiece speaker and an interior microphone of that headphone while worn by a user. Other embodiments are also described and claimed.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: July 24, 2018
    Assignee: Apple Inc.
    Inventors: Ismael H. Nawfal, Joshua D. Atkins, Stephen J. Nimick, Guy C. Nicholson, Jason M. Harlow