Patents by Inventor Joseph Caroselli
Joseph Caroselli has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11798533Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user, determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.Type: GrantFiled: April 2, 2021Date of Patent: October 24, 2023Assignee: GOOGLE LLCInventors: Joseph Caroselli, Jr., Yiteng Huang, Arun Narayanan
-
Publication number: 20230298612Abstract: A multichannel neural frontend speech enhancement model for speech recognition includes a speech cleaner, a stack of self-attention blocks each having a multi-headed self attention mechanism, and a masking layer. The speech cleaner receives, as input, a multichannel noisy input signal and a multichannel contextual noise signal, and generates, as output, a single channel cleaned input signal. The stack of self-attention blocks receives, as input, at an initial block of the stack of self-attention blocks, a stacked input including the single channel cleaned input signal and a single channel noisy input signal, and generates, as output, from a final block of the stack of self-attention blocks, an un-masked output. The masking layer receives, as input, the single channel noisy input signal and the un-masked output, and generates, as output, enhanced input speech features corresponding to a target utterance.Type: ApplicationFiled: February 20, 2023Publication date: September 21, 2023Applicant: Google LLCInventors: Joseph Caroselli, Arun Narayanan, Tom O'malley
-
Patent number: 11699453Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).Type: GrantFiled: August 28, 2020Date of Patent: July 11, 2023Assignee: GOOGLE LLCInventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
-
Publication number: 20220319498Abstract: Implementations disclosed herein are directed to initializing and utilizing a beamformer in processing of audio data received at a computing device. The computing device can: receive audio data that captures a spoken utterance of a user, determine that a first audio data segment of the audio data includes one or more particular words or phrases; obtain a preceding audio data segment that precedes the first audio data segment; estimate a spatial correlation matrix based on the first audio data segment and based on the preceding audio data segment; initialize the beamformer based on the estimated spatial correlation matrix; and cause the initialized beamformer to be utilized in processing of at least a second audio data segment of the audio data. Additionally, or alternatively, the computing device can transmit the spatial correlation matrix to server(s), and the server(s) can transmit the initialized beamformer back to the computing device.Type: ApplicationFiled: April 2, 2021Publication date: October 6, 2022Inventors: Joseph Caroselli, JR., Yiteng Huang, Arun Narayanan
-
Publication number: 20200395029Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).Type: ApplicationFiled: August 28, 2020Publication date: December 17, 2020Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
-
Patent number: 10762914Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).Type: GrantFiled: July 11, 2018Date of Patent: September 1, 2020Assignee: GOOGLE LLCInventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
-
Publication number: 20190272840Abstract: Utilizing an adaptive multichannel technique to mitigate reverberation present in received audio signals, prior to providing corresponding audio data to one or more additional component(s), such as automatic speech recognition (ASR) components. Implementations disclosed herein are “adaptive”, in that they utilize a filter, in the reverberation mitigation, that is online, causal and varies depending on characteristics of the input. Implementations disclosed herein are “multichannel”, in that a corresponding audio signal is received from each of multiple audio transducers (also referred to herein as “microphones”) of a client device, and the multiple audio signals (e.g., frequency domain representations thereof) are utilized in updating of the filter—and dereverberation occurs for audio data corresponding to each of the audio signals (e.g., frequency domain representations thereof) prior to the audio data being provided to ASR component(s) and/or other component(s).Type: ApplicationFiled: July 11, 2018Publication date: September 5, 2019Inventors: Joseph Caroselli, Arun Narayanan, Izhak Shafran, Richard Rose
-
Patent number: 7440497Abstract: A multi-phase adaptive decision feedback equalizer minimizes post-cursor inter-symbol interference in a current data bit based on values of subsequent data bits in a data communication system. In one form, the receiver includes a plurality of modules each having a respective adaptive decision feedback equalizer. A processor responsive to output signals from each of the plurality of modules generates a plurality of coefficient values. The adaptive decision feedback equalizer has a plurality of taps receiving a respective output signal from one of the modules and a respective coefficient value to generate a respective correction signal. The correction signals are summed with the data signal and processed to recover the data. Pre-calculation of coefficients permits rapid selection of data. Multi-phase operation permits higher data frequencies.Type: GrantFiled: November 1, 2004Date of Patent: October 21, 2008Assignee: LSI CorporationInventors: Vishnu Balan, Joseph Caroselli, Jr., Ye Liu, Chintan M. Desai, Jenn-Gang Chern
-
Publication number: 20060093028Abstract: A multi-phase adaptive decision feedback equalizer minimizes post-cursor inter-symbol interference in a current data bit based on values of subsequent data bits in a data communication system. In one form, the receiver includes a plurality of modules each having a respective adaptive decision feedback equalizer. A processor responsive to output signals from each of the plurality of modules generates a plurality of coefficient values. The adaptive decision feedback equalizer has a plurality of taps receiving a respective output signal from one of the modules and a respective coefficient value to generate a respective correction signal. The correction signals are summed with the data signal and processed to recover the data. Pre-calculation of coefficients permits rapid selection of data. Multi-phase operation permits higher data frequencies.Type: ApplicationFiled: November 1, 2004Publication date: May 4, 2006Applicant: LSI Logic CorporationInventors: Vishnu Balan, Joseph Caroselli, Ye Liu, Chintan Desai, Jenn-Gang Chern