Patents by Inventor Jonathan Le Roux
Jonathan Le Roux has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210247201
Abstract: A navigation system configured to provide driving instructions to a driver of a moving vehicle based on real-time description of objects in a scene pertinent to driving the vehicle is provided.
Type: Application
Filed: February 6, 2020
Publication date: August 12, 2021
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Chiori Hori, Anoop Cherian, Siheng Chen, Tim Marks, Jonathan Le Roux, Takaaki Hori, Bret Harsham, Anthony Vetro, Alan Sullivan
-
Patent number: 11086918
Abstract: A method for performing multi-label classification includes extracting a feature vector from an input vector including input data by a feature extractor, determining, by a label predictor, a relevant vector including relevant labels having relevant scores based on the feature vector, updating a binary masking vector by masking pre-selected labels having been selected in previous label selections, applying the updated binary masking vector to the relevant vector such that the relevant label vector is updated to exclude the pre-selected labels from the relevant labels, and selecting a relevant label from the updated relevant label vector based on the relevant scores of the updated relevant label vector.
Type: Grant
Filed: December 7, 2016
Date of Patent: August 10, 2021
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Takaaki Hori, Chiori Hori, Shinji Watanabe, John Hershey, Bret Harsham, Jonathan Le Roux
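The masking step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `select_labels` is hypothetical, and the relevance scores are assumed to be fixed numbers supplied by the caller, whereas the abstract has a label predictor derive them from a feature vector.

```python
def select_labels(relevance_scores, num_labels):
    """Pick labels one at a time, masking out labels already chosen.

    relevance_scores: list of floats, one score per candidate label
                      (assumed given; a real system would predict them).
    num_labels: how many labels to select.
    """
    # Binary masking vector: 1 = still selectable, 0 = already selected.
    mask = [1] * len(relevance_scores)
    selected = []
    for _ in range(min(num_labels, len(relevance_scores))):
        # Apply the mask so previously selected labels are excluded.
        candidates = [i for i, m in enumerate(mask) if m == 1]
        # Select the remaining label with the highest relevance score.
        best = max(candidates, key=lambda i: relevance_scores[i])
        selected.append(best)
        # Update the mask for the next selection round.
        mask[best] = 0
    return selected
```

Because the mask is updated after every pick, the same label index can never be returned twice, which is the property the abstract attributes to the binary masking vector.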
-
Publication number: 20210233550
Abstract: A speech separation device (12) of a speech separation system includes a feature amount extraction unit (121) configured to extract time-series data of a speech feature amount of mixed speech, a block division unit (122) configured to divide the time-series data of the speech feature amount into blocks having a certain time width, a speech separation neural network (1b) configured to create time-series data of a mask of each of a plurality of speakers from the time-series data of the speech feature amount divided into blocks, and a speech restoration unit (123) configured to restore the speech data of each of the plurality of speakers from the time-series data of the mask and the time-series data of the speech feature amount of the mixed speech.
Type: Application
Filed: January 12, 2021
Publication date: July 29, 2021
Applicants: MITSUBISHI ELECTRIC CORPORATION, MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC.
Inventors: Ryo AIHARA, Toshiyuki HANAZAWA, Yohei OKATO, Gordon P WICHERN, Jonathan LE ROUX
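As a rough illustration of the block division unit only, the sketch below splits a feature time series into blocks of a fixed width. The function name `divide_into_blocks` is hypothetical, and non-overlapping blocks with a shorter final block are assumptions; the abstract says only that blocks have "a certain time width".

```python
def divide_into_blocks(frames, block_width):
    """Split a time series of feature frames into consecutive blocks.

    frames: list of per-frame feature values (any type).
    block_width: number of frames per block (assumed fixed, non-overlapping).
    """
    if block_width <= 0:
        raise ValueError("block_width must be positive")
    # Step through the series one block at a time; the last block may be short.
    return [frames[i:i + block_width] for i in range(0, len(frames), block_width)]
```

Each block could then be passed to a separation network on its own, which is one way block-wise processing allows output to be produced before the full utterance has been received.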
-
Publication number: 20210183373
Abstract: A speech recognition system successively processes each encoder state of encoded acoustic features with a frame-synchronous decoder (FSD) and label-synchronous decoder (LSD) modules. Upon identifying an encoder state carrying information about new transcription output, the system expands a current list of FSD prefixes with FSD module, evaluates the FSD prefixes with LSD module, and prunes the FSD prefixes according to joint FSD and LSD scores. FSD and LSD modules are synchronized by having LSD module to process the portion of the encoder states including new transcription output identified by the FSD module and to produce LSD scores for the FSD prefixes determined by the FSD module.
Type: Application
Filed: December 12, 2019
Publication date: June 17, 2021
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
-
Publication number: 20210116894
Abstract: A system for controlling an operation of a machine including a plurality of actuators assisting one or multiple tools to perform one or multiple tasks, in response to receiving an acoustic mixture of signals generated by the tool performing a task and by the plurality of actuators actuating the tool, submit the acoustic mixture of signals into a neural network trained to separate from the acoustic mixture a signal generated by the tool performing the task from signals generated by the actuators actuating the tool to extract the signal generated by the tool performing the task from the acoustic mixture of signals, analyze the extracted signal to produce a state of performance of the task, and execute a control action selected according to the state of performance of the task.
Type: Application
Filed: October 17, 2019
Publication date: April 22, 2021
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Gordon Wichern, Jonathan Le Roux, Fatemeh Pishdadian
-
Patent number: 10811000
Abstract: Systems and methods for a speech recognition system for recognizing speech including overlapping speech by multiple speakers. The system including a hardware processor. A computer storage memory to store data along with having computer-executable instructions stored thereon that, when executed by the processor is to implement a stored speech recognition network. An input interface to receive an acoustic signal, the received acoustic signal including a mixture of speech signals by multiple speakers, wherein the multiple speakers include target speakers. An encoder network and a decoder network of the stored speech recognition network are trained to transform the received acoustic signal into a text for each target speaker. Such that the encoder network outputs a set of recognition encodings, and the decoder network uses the set of recognition encodings to output the text for each target speaker. An output interface to transmit the text for each target speaker.
Type: Grant
Filed: April 13, 2018
Date of Patent: October 20, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Jonathan Le Roux, Takaaki Hori, Shane Settle, Hiroshi Seki, Shinji Watanabe, John Hershey
-
Publication number: 20200312306
Abstract: A speech recognition system includes an encoder to convert an input acoustic signal into a sequence of encoder states, an alignment decoder to identify locations of encoder states in the sequence of encoder states that encode transcription outputs, a partition module to partition the sequence of encoder states into a set of partitions based on the locations of the identified encoder states, and an attention-based decoder to determine the transcription outputs for each partition of encoder states submitted to the attention-based decoder as an input. Upon receiving the acoustic signal, the system uses the encoder to produce the sequence of encoder states, partitions the sequence of encoder states into the set of partitions based on the locations of the encoder states identified by the alignment decoder, and submits the set of partitions sequentially into the attention-based decoder to produce a transcription output for each of the submitted partitions.
Type: Application
Filed: March 25, 2019
Publication date: October 1, 2020
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
-
Patent number: 10726856
Abstract: Systems and methods for audio signal processing including an input interface to receive a noisy audio signal including a mixture of target audio signal and noise. An encoder to map each time-frequency bin of the noisy audio signal to one or more phase-related value from one or more phase quantization codebook of phase-related values indicative of the phase of the target signal. Calculate, for each time-frequency bin of the noisy audio signal, a magnitude ratio value indicative of a ratio of a magnitude of the target audio signal to a magnitude of the noisy audio signal. A filter to cancel the noise from the noisy audio signal based on the phase-related values and the magnitude ratio values to produce an enhanced audio signal. An output interface to output the enhanced audio signal.
Type: Grant
Filed: August 16, 2018
Date of Patent: July 28, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Jonathan Le Roux, Shinji Watanabe, John Hershey, Gordon Wichern
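The two per-bin quantities named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented encoder: the helper names `quantize_phase` and `magnitude_ratio` are hypothetical, the codebook is a caller-supplied list of angles in radians, and the phase to quantize is assumed to be known, whereas the abstract has an encoder estimate it from the noisy signal.

```python
import cmath
import math


def quantize_phase(phase, codebook):
    """Return the codebook angle closest to `phase` on the unit circle."""
    def circular_distance(a, b):
        # Wrap the difference into (-pi, pi] before taking its size.
        return abs(cmath.phase(cmath.exp(1j * (a - b))))
    return min(codebook, key=lambda c: circular_distance(phase, c))


def magnitude_ratio(target_bin, noisy_bin):
    """Ratio of target magnitude to noisy magnitude for one time-frequency bin."""
    noisy_mag = abs(noisy_bin)
    # Guard against silent bins (an assumption of this sketch).
    return abs(target_bin) / noisy_mag if noisy_mag > 0 else 0.0


# Hypothetical four-entry codebook of phase values.
CODEBOOK = [0.0, math.pi / 2, math.pi, -math.pi / 2]
```

With this codebook, a phase of -3.0 radians maps to the entry π rather than -π/2, because distances are measured around the circle rather than along the number line.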
-
Patent number: 10593321
Abstract: A method for training a multi-language speech recognition network includes providing utterance datasets corresponding to predetermined languages, inserting language identification (ID) labels into the utterance datasets, wherein each of the utterance datasets is labelled by each of the language ID labels, concatenating the labeled utterance datasets, generating initial network parameters from the utterance datasets, selecting the initial network parameters according to a predetermined sequence, and training, iteratively, an end-to-end network with a series of the selected initial network parameters and the concatenated labeled utterance datasets until a training result reaches a threshold.
Type: Grant
Filed: December 15, 2017
Date of Patent: March 17, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Shinji Watanabe, Takaaki Hori, Hiroshi Seki, Jonathan Le Roux, John Hershey
-
Patent number: 10592800
Abstract: A method transforms input signals by first defining a model for transforming the input signals, wherein the model is specified by constraints and a set of model parameters. An iterative inference procedure is derived from the model and the set of model parameters and unfolded into a set of layers, wherein there is one layer for each iteration of the procedure, and wherein a same set of network parameters is used by all layers. A neural network is formed by untying the set of network parameters such that there is one set of network parameters for each layer and each set of network parameters is separately maintainable and separately applicable to the corresponding layer. The neural network is trained to obtain a trained neural network, and then input signals are transformed using the trained neural network to obtain output signals.
Type: Grant
Filed: November 3, 2016
Date of Patent: March 17, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: John Hershey, Jonathan Le Roux, Felix Weninger
-
Publication number: 20200058314
Abstract: Systems and methods for audio signal processing including an input interface to receive a noisy audio signal including a mixture of target audio signal and noise. An encoder to map each time-frequency bin of the noisy audio signal to one or more phase-related value from one or more phase quantization codebook of phase-related values indicative of the phase of the target signal. Calculate, for each time-frequency bin of the noisy audio signal, a magnitude ratio value indicative of a ratio of a magnitude of the target audio signal to a magnitude of the noisy audio signal. A filter to cancel the noise from the noisy audio signal based on the phase-related values and the magnitude ratio values to produce an enhanced audio signal. An output interface to output the enhanced audio signal.
Type: Application
Filed: August 16, 2018
Publication date: February 20, 2020
Inventors: Jonathan Le Roux, Shinji Watanabe, John Hershey, Gordon Wichern
-
Patent number: 10529349
Abstract: Systems and methods for an audio signal processing system for transforming an input audio signal. A processor implements steps of a module by inputting an input audio signal into a spectrogram estimator to extract an audio feature sequence, and process the audio feature sequence to output a set of estimated spectrograms. Processing the set of estimated spectrograms and the audio feature sequence using a spectrogram refinement module, to output a set of refined spectrograms. Wherein the processing of the spectrogram refinement module is based on an iterative reconstruction algorithm. Processing the set of refined spectrograms for the one or more target audio signals using a signal refinement module, to obtain the target audio signal estimates. An output interface to output the optimized target audio signal estimates. Wherein the module is optimized by minimizing an error using an optimizer stored in the memory.
Type: Grant
Filed: May 18, 2018
Date of Patent: January 7, 2020
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Jonathan Le Roux, John R Hershey, Zhongqiu Wang, Gordon P Wichern
-
Publication number: 20190318725
Abstract: Systems and methods for a speech recognition system for recognizing speech including overlapping speech by multiple speakers. The system including a hardware processor. A computer storage memory to store data along with having computer-executable instructions stored thereon that, when executed by the processor is to implement a stored speech recognition network. An input interface to receive an acoustic signal, the received acoustic signal including a mixture of speech signals by multiple speakers, wherein the multiple speakers include target speakers. An encoder network and a decoder network of the stored speech recognition network are trained to transform the received acoustic signal into a text for each target speaker. Such that the encoder network outputs a set of recognition encodings, and the decoder network uses the set of recognition encodings to output the text for each target speaker. An output interface to transmit the text for each target speaker.
Type: Application
Filed: April 13, 2018
Publication date: October 17, 2019
Inventors: Jonathan Le Roux, Takaaki Hori, Shane Settle, Hiroshi Seki, Shinji Watanabe, John Hershey
-
Publication number: 20190318754
Abstract: Systems and methods for an audio signal processing system for transforming an input audio signal. A processor implements steps of a module by inputting an input audio signal into a spectrogram estimator to extract an audio feature sequence, and process the audio feature sequence to output a set of estimated spectrograms. Processing the set of estimated spectrograms and the audio feature sequence using a spectrogram refinement module, to output a set of refined spectrograms. Wherein the processing of the spectrogram refinement module is based on an iterative reconstruction algorithm. Processing the set of refined spectrograms for the one or more target audio signals using a signal refinement module, to obtain the target audio signal estimates. An output interface to output the optimized target audio signal estimates. Wherein the module is optimized by minimizing an error using an optimizer stored in the memory.
Type: Application
Filed: May 18, 2018
Publication date: October 17, 2019
Inventors: Jonathan Le Roux, John R Hershey, Zhongqiu Wang, Gordon P Wichern
-
Publication number: 20190189111
Abstract: A method for training a multi-language speech recognition network includes providing utterance datasets corresponding to predetermined languages, inserting language identification (ID) labels into the utterance datasets, wherein each of the utterance datasets is labelled by each of the language ID labels, concatenating the labeled utterance datasets, generating initial network parameters from the utterance datasets, selecting the initial network parameters according to a predetermined sequence, and training, iteratively, an end-to-end network with a series of the selected initial network parameters and the concatenated labeled utterance datasets until a training result reaches a threshold.
Type: Application
Filed: December 15, 2017
Publication date: June 20, 2019
Inventors: Shinji Watanabe, Takaaki Hori, Hiroshi Seki, Jonathan Le Roux, John Hershey
-
Publication number: 20180157743
Abstract: A method for performing multi-label classification includes extracting a feature vector from an input vector including input data by a feature extractor, determining, by a label predictor, a relevant vector including relevant labels having relevant scores based on the feature vector, updating a binary masking vector by masking pre-selected labels having been selected in previous label selections, applying the updated binary masking vector to the relevant vector such that the relevant label vector is updated to exclude the pre-selected labels from the relevant labels, and selecting a relevant label from the updated relevant label vector based on the relevant scores of the updated relevant label vector.
Type: Application
Filed: December 7, 2016
Publication date: June 7, 2018
Applicant: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Takaaki Hori, Chiori Hori, Shinji Watanabe, John Hershey, Bret Harsham, Jonathan Le Roux
-
Patent number: 9881631
Abstract: A method transforms a noisy audio signal to an enhanced audio signal, by first acquiring the noisy audio signal from an environment. The noisy audio signal is processed by an enhancement network having network parameters to jointly produce a magnitude mask and a phase estimate. Then, the magnitude mask and the phase estimate are used to obtain the enhanced audio signal.
Type: Grant
Filed: February 12, 2015
Date of Patent: January 30, 2018
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Hakan Erdogan, John Hershey, Shinji Watanabe, Jonathan Le Roux
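One plausible way a magnitude mask and a phase estimate could be combined is sketched below. It is an assumption for illustration, not the patented procedure: the function name `enhance_bins` is hypothetical, the mask and phase values are taken as given rather than produced by an enhancement network, and the combination rule (masked noisy magnitude paired with the estimated phase) is one common choice that the abstract does not spell out.

```python
import cmath


def enhance_bins(noisy_bins, magnitude_mask, phase_estimate):
    """Combine a magnitude mask and phase estimate into enhanced complex bins.

    noisy_bins: complex time-frequency values of the noisy signal.
    magnitude_mask: per-bin gains (assumed given, typically in [0, 1]).
    phase_estimate: per-bin phase of the clean signal in radians (assumed given).
    """
    return [
        # Scale the noisy magnitude by the mask, then attach the estimated phase.
        mask * abs(noisy) * cmath.exp(1j * phase)
        for noisy, mask, phase in zip(noisy_bins, magnitude_mask, phase_estimate)
    ]
```

The enhanced bins would then be passed to an inverse short-time Fourier transform to obtain a waveform; that step is omitted here.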
-
Patent number: 9685155
Abstract: A method distinguishes components of a signal by processing the signal to estimate a set of analysis features, wherein each analysis feature defines an element of the signal and has feature values that represent parts of the signal, processing the signal to estimate input features of the signal, and processing the input features using a deep neural network to assign an associative descriptor to each element of the signal, wherein a degree of similarity between the associative descriptors of different elements is related to a degree to which the parts of the signal represented by the elements belong to a single component of the signal. The similarities between associative descriptors are processed to estimate correspondences between the elements of the signal and the components in the signal. Then, the signal is processed using the correspondences to distinguish component parts of the signal.
Type: Grant
Filed: May 5, 2016
Date of Patent: June 20, 2017
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: John Hershey, Jonathan Le Roux, Shinji Watanabe, Zhuo Chen
-
Patent number: 9679559
Abstract: A method estimates source signals from a mixture of source signals by first training an analysis model and a reconstruction model using training data. The analysis model is applied to the mixture of source signals to obtain an analysis representation of the mixture of source signals, and the reconstruction model is applied to the analysis representation to obtain an estimate of the source signals, wherein the analysis model utilizes an analysis linear basis representation, and the reconstruction model utilizes a reconstruction linear basis representation.
Type: Grant
Filed: May 29, 2014
Date of Patent: June 13, 2017
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Jonathan Le Roux, John R. Hershey, Felix Weninger, Shinji Watanabe
-
Patent number: 9661414
Abstract: In an acoustic apparatus, an acoustic transducer is arranged in a substrate. Multiple acoustic pathways in the substrate have predetermined lengths, wherein a proximal end of each pathway forms an opening in a front surface of the substrate, and a distal end terminates at the acoustic transducer. The predetermined lengths of the acoustic pathways are designed to form an acoustic spatial filter that selectively passes acoustic signals from or to different locations. The transducer can convert electric energy to acoustic energy when the apparatus operates as a speaker, or the transducer can convert acoustic energy to electric energy and operate as a microphone.
Type: Grant
Filed: June 10, 2015
Date of Patent: May 23, 2017
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Jonathan Le Roux, John R Hershey, William S. Yerazunis, Petros T Boufounos, Laurent Daudet