Patents by Inventor Joseph Caroselli

Joseph Caroselli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11798533
    Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user; determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.
    Type: Grant
    Filed: April 2, 2021
    Date of Patent: October 24, 2023
    Assignee: GOOGLE LLC
    Inventors: Joseph Caroselli, Jr., Yiteng Huang, Arun Narayanan
  • Publication number: 20230298612
    Abstract: A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of self-attention blocks, a stacked input including the single channel cleaned input signal and a single channel noisy input signal, and generates, as output, from a final block of the stack of self-attention blocks, an un-masked output. The masking layer receives, as input, the single channel noisy input signal and the un-masked output, and generates, as output, enhanced input speech features corresponding to a target utterance.
    Type: Application
    Filed: February 20, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Joseph Caroselli, Arun Narayanan, Tom O'Malley
  • Patent number: 11699453
    Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: July 11, 2023
    Assignee: GOOGLE LLC
    Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
  • Publication number: 20220319498
    Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user; determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.
    Type: Application
    Filed: April 2, 2021
    Publication date: October 6, 2022
    Inventors: Joseph Caroselli, Jr., Yiteng Huang, Arun Narayanan
  • Publication number: 20200395029
    Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).
    Type: Application
    Filed: August 28, 2020
    Publication date: December 17, 2020
    Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
  • Patent number: 10762914
    Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).
    Type: Grant
    Filed: July 11, 2018
    Date of Patent: September 1, 2020
    Assignee: GOOGLE LLC
    Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
  • Publication number: 20190272840
    Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).
    Type: Application
    Filed: July 11, 2018
    Publication date: September 5, 2019
    Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
  • Patent number: 7440497
    Abstract: A multi-phase adaptive decision feedback equalizer minimizes post-cursor inter-symbol interference in a current data bit based on values of subsequent data bits in a data communication system. In one form, the receiver includes a plurality of modules each having a respective adaptive decision feedback equalizer. A processor responsive to output signals from each of the plurality of modules generates a plurality of coefficient values. The adaptive decision feedback equalizer has a plurality of taps receiving a respective output signal from one of the modules and a respective coefficient value to generate a respective correction signal. The correction signals are summed with the data signal and processed to recover the data. Pre-calculation of coefficients permits rapid selection of data. Multi-phase operation permits higher data frequencies.
    Type: Grant
    Filed: November 1, 2004
    Date of Patent: October 21, 2008
    Assignee: LSI Corporation
    Inventors: Vishnu Balan, Joseph Caroselli, Jr., Ye Liu, Chintan M. Desai, Jenn-Gang Chern
  • Publication number: 20060093028
    Abstract: A multi-phase adaptive decision feedback equalizer minimizes post-cursor inter-symbol interference in a current data bit based on values of subsequent data bits in a data communication system. In one form, the receiver includes a plurality of modules each having a respective adaptive decision feedback equalizer. A processor responsive to output signals from each of the plurality of modules generates a plurality of coefficient values. The adaptive decision feedback equalizer has a plurality of taps receiving a respective output signal from one of the modules and a respective coefficient value to generate a respective correction signal. The correction signals are summed with the data signal and processed to recover the data. Pre-calculation of coefficients permits rapid selection of data. Multi-phase operation permits higher data frequencies.
    Type: Application
    Filed: November 1, 2004
    Publication date: May 4, 2006
    Applicant: LSI Logic Corporation
    Inventors: Vishnu Balan, Joseph Caroselli, Ye Liu, Chintan Desai, Jenn-Gang Chern
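
Illustrative note: the decision feedback equalization described in the last two abstracts can be sketched in a few lines of Python. This is a simplified, single-phase example and is not the multi-phase architecture claimed in the patent. It assumes binary (+1/-1) signalling, a noiseless channel, and a least-mean-squares (LMS) rule for adapting the coefficients. The LMS rule is a common textbook choice swapped in here for illustration, since the abstract does not say how the coefficients are generated. The function names, the channel values, and the step size are all hypothetical.

```python
import random


def simulate_channel(bits, channel):
    """Convolve the bit stream with a channel impulse response.

    channel[0] is the main cursor; later entries model post-cursor ISI.
    """
    out = []
    for k in range(len(bits)):
        out.append(sum(h * bits[k - i] for i, h in enumerate(channel) if k - i >= 0))
    return out


def dfe_equalize(received, num_taps=2, step=0.01):
    """Single-phase decision feedback equalizer with LMS tap adaptation.

    Assumption: an illustrative textbook DFE, not the patented design.
    """
    taps = [0.0] * num_taps   # feedback coefficients (estimated post-cursor ISI)
    past = [0.0] * num_taps   # most recent decisions, newest first
    decisions = []
    for sample in received:
        # Correction signal: weighted sum of earlier decisions.
        correction = sum(c * d for c, d in zip(taps, past))
        corrected = sample - correction
        # Slicer: recover the data bit from the corrected sample.
        decision = 1.0 if corrected >= 0 else -1.0
        # LMS update: nudge each tap so the residual error shrinks.
        error = corrected - decision
        taps = [c + step * error * d for c, d in zip(taps, past)]
        past = [decision] + past[:-1]
        decisions.append(decision)
    return decisions, taps


if __name__ == "__main__":
    random.seed(0)
    bits = [random.choice((-1.0, 1.0)) for _ in range(2000)]
    rx = simulate_channel(bits, [1.0, 0.5, 0.2])  # hypothetical channel
    decisions, taps = dfe_equalize(rx)
    print("bit errors:", sum(d != b for d, b in zip(decisions, bits)))
    print("adapted taps:", [round(t, 3) for t in taps])
```

In this sketch each feedback tap multiplies one earlier decision by one coefficient, and the products are subtracted from the incoming sample before slicing, which loosely mirrors the abstract's "correction signals are summed with the data signal". The pre-calculation of coefficients and the multi-phase operation mentioned in the abstract are not modelled.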